ASME STB-1:2020 pdf free download - Guideline on Big Data/Digital Transformation Workflows and Applications for the Oil and Gas Industry

02-18-2022 comment

2.3.1 Types and Usage
Unstructured data is data that cannot be organized into objects, tables, or other easily recognized structures. Text and photographs make up most unstructured data, but it may also contain dates, numbers, facts, and references that do not organize themselves naturally. When people discuss Big Data, unstructured data is usually what they have in mind. Beyond its lack of structure, the volume to be stored is enormous and may account for as much as 80% of most organizations’ data. Mining unstructured data requires tools that can process natural language and/or recognize patterns, and that can do so across large quantities of both. In this sense the data does contain a form of structure, but that structure is not easily recognizable.
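As a minimal sketch of the pattern recognition described above, the following Python snippet pulls dates and pressure readings out of a free-text note using regular expressions. The note text, the equipment tag, the `extract_facts` helper, and the choice of ISO-style dates and "psi" units are all assumptions made for illustration; they do not come from the guideline, and real tooling would typically rely on fuller natural language processing.

```python
import re

# Hypothetical free-text field note; the wording and values are invented.
NOTE = (
    "Inspected pump P-101 on 2021-03-14. Discharge pressure read 1450 psi, "
    "up from 1380 psi on 2021-02-28. Seal leak suspected."
)

# Assumed patterns: ISO-style dates (YYYY-MM-DD) and numbers followed by "psi".
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
PRESSURE_PATTERN = re.compile(r"\b(\d+(?:\.\d+)?)\s*psi\b")


def extract_facts(text: str) -> dict:
    """Return the dates and pressure readings found in a block of free text."""
    return {
        "dates": DATE_PATTERN.findall(text),
        "pressures_psi": [float(value) for value in PRESSURE_PATTERN.findall(text)],
    }


facts = extract_facts(NOTE)
print(facts)
```

The note has no columns or fields, yet the dates and readings can still be recovered once a pattern is known, which is the sense in which unstructured data carries hidden structure.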
2.3.2 Respective Databases
(a) Data Lakes
Sometimes loosely grouped with data warehouses (which typically hold structured data), and called data swamps when poorly governed, data lakes are large repositories of unstructured data, sometimes referred to as “raw” data. Because of the storage requirements, most data lakes are kept in the cloud unless the organization owning the data has considerable onsite storage. Examples of platforms used to host data lakes are Google Cloud, Amazon S3, and Apache Hadoop. Hadoop is an open-source software framework for reliable, scalable, distributed computing. It provides massive storage capacity and substantial computing power. It is not a programming language but rather an ecosystem that facilitates moving and organizing Big Data. Hadoop-based storage can hold information generated by Internet of Things (IoT) devices.
(b) NoSQL
NoSQL databases, originally referred to as “non-SQL” or “non-relational” databases, store data in non-tabular form. This set includes the previously discussed object database management systems (ODBMS), key-value stores, document stores, and graph databases. Each of these database types uses its own method to map data, documents, dictionaries, or relationships through tags. Examples of NoSQL databases include Apache Ignite, Couchbase, Oracle NoSQL, Amazon DynamoDB, and many others. These databases arrange data based on correlations of values rather than in tables.
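The difference between a key-value store and a document store can be sketched with plain Python dictionaries. This is an in-memory illustration only, not the API of any of the products named above; the well identifiers, field names, and the `find_by_tag` helper are hypothetical.

```python
# Key-value store: an opaque value is looked up by a unique key.
key_value_store = {
    "well:W-17:status": "producing",
    "well:W-22:status": "shut-in",
}

# Document store: each record is a self-describing document, and documents
# in the same collection need not share the same fields.
document_store = {
    "W-17": {"type": "well", "status": "producing", "tags": ["offshore", "gas-lift"]},
    "W-22": {"type": "well", "status": "shut-in", "last_inspection": "2021-02-28"},
}


def find_by_tag(store: dict, tag: str) -> list:
    """Return the ids of documents whose optional 'tags' field contains tag."""
    return sorted(
        doc_id for doc_id, doc in store.items() if tag in doc.get("tags", [])
    )


print(key_value_store["well:W-22:status"])
print(find_by_tag(document_store, "offshore"))
```

Neither structure requires a fixed table schema: the second document has a field the first lacks, and the query simply skips documents without tags.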
(c) Graph Databases
Graph databases are a subset of NoSQL databases that are gaining popularity for complex data mapping. They map data elements on a graph and have a finite number of relations. Graph databases have nodes that hold data and edges that describe relationships. Each node can have many edges and can therefore describe many relationships. These databases are suited to data sets with a wide variety of both structured and unstructured data. Examples of graph databases include AllegroGraph, Neo4j, and InfiniteGraph. These databases manage data using languages such as SPARQL, Java, and Cypher.
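The node-and-edge model can be sketched in a few lines of Python. This is an in-memory illustration under assumed names, not the interface of any graph database listed above; the node ids, relationship labels, and the `related` helper are hypothetical.

```python
from collections import defaultdict

# Nodes hold data (hypothetical assets and documents).
nodes = {
    "pump_P101": {"kind": "pump"},
    "well_W17": {"kind": "well"},
    "report_0042": {"kind": "inspection_report"},
}

# Edges describe relationships as (source, relationship, target) triples.
edges = [
    ("pump_P101", "INSTALLED_ON", "well_W17"),
    ("report_0042", "DESCRIBES", "pump_P101"),
    ("report_0042", "MENTIONS", "well_W17"),
]

# Index outgoing edges by source node so one node can carry many relationships.
outgoing = defaultdict(list)
for source, relationship, target in edges:
    outgoing[source].append((relationship, target))


def related(node_id: str, relationship: str = None) -> list:
    """Return targets reachable from node_id, optionally filtered by relationship."""
    return [
        target
        for rel, target in outgoing.get(node_id, [])
        if relationship is None or rel == relationship
    ]


print(related("report_0042"))
print(related("report_0042", "MENTIONS"))
```

Here an unstructured item (an inspection report) and structured assets (a pump and a well) sit in the same graph, linked only by their edges.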
2.4.1 Responsibility of the Enterprise
The enterprise team of engineers, planners, project managers, and data professionals is responsible for solving business challenges using both internal and external sources of data. Many of the internal sources are proprietary to the business and are key to its competitive advantage. Some external sources are subscription-based and should ideally be treated as confidential. The Data Scientist has a core responsibility to safeguard information provided in confidence. Guarding this data requires both active protection (e.g., not sharing it with individuals who are not expressly bound by the same confidentiality) and passive protection (e.g., following corporate protocols, using protection software, and backing up data sets).
