
Unstructured data is information, in many different forms, that doesn't hew to
conventional data models and thus typically isn't a good fit for a mainstream
relational database. Thanks to the emergence of alternative platforms for
storing and managing such data, it is increasingly prevalent in IT systems and
is used by organizations in a variety of business intelligence
and analytics applications.

Traditional structured data, such as the transaction data in financial systems
and other business applications, conforms to a rigid format to ensure
consistency in processing and analyzing it. Sets of unstructured data, on the
other hand, can be maintained in formats that aren't uniform, freeing analytics
teams to work with all of the available data without necessarily having to
consolidate and standardize it first. That enables more comprehensive
analyses than would otherwise be possible.

Types of unstructured data

One of the most common types of unstructured data is text. Unstructured text is
generated and collected in a wide range of forms, including Word documents, email
messages, PowerPoint presentations, survey responses, transcripts of call center
interactions, and posts from blogs and social media sites.

Other types of unstructured data include images, audio and video files. Machine data is
another category, one that's growing quickly in many organizations. For example, log
files from websites, servers, networks and applications -- particularly mobile ones --
yield a trove of activity and performance data. In addition, companies increasingly
capture and analyze data from sensors on manufacturing equipment and other internet
of things (IoT) connected devices.

In some cases, such data may be considered to be semi-structured -- for example,
if metadata tags are added to provide information and context about the content of the
data. The line between unstructured and semi-structured data isn't absolute, though;
some data management consultants contend that all data, even the unstructured kind,
has some level of structure.
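
To illustrate how metadata tags can move a piece of free text toward the semi-structured end of that spectrum, here is a minimal sketch. The `tag_document` function and its field names (`source`, `language`, `word_count`) are hypothetical choices made for this example, not a standard format.

```python
import json

def tag_document(text, source, language):
    """Wrap raw text in a record that carries descriptive metadata tags.

    The field names are illustrative assumptions, not a standard schema.
    """
    return {
        "metadata": {
            "source": source,
            "language": language,
            "word_count": len(text.split()),
        },
        "content": text,  # the free-form text itself stays untouched
    }

record = tag_document("The delivery arrived two days late.", "survey", "en")
print(json.dumps(record, indent=2))
```

The content remains free-form, but the added tags give a query or analytics tool something predictable to filter on.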

Unstructured data analytics

Because of its nature, unstructured data isn't suited to transaction processing
applications, which are the province of structured data. Instead, it's primarily used for
BI and analytics. One popular application is customer analytics. Retailers,
manufacturers and other companies analyze unstructured data to improve customer
relationship management processes and enable more-targeted marketing; they also do
sentiment analysis to identify both positive and negative views of products, customer
service and corporate entities, as expressed by customers on social networks and in
other forums.
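
As a rough illustration of the idea behind sentiment analysis, the sketch below scores a comment by counting words from two small word lists. The lists and the `label` function are made up for this example; production tools typically rely on far larger lexicons or trained models.

```python
import re

# Tiny, hand-picked word lists; purely illustrative assumptions.
POSITIVE = {"great", "love", "fast", "helpful"}
NEGATIVE = {"slow", "broken", "terrible", "rude"}

def sentiment_score(text):
    """Return (# positive words) minus (# negative words) in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def label(text):
    """Map the numeric score to a coarse sentiment label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

for comment in [
    "Great product, love the fast shipping",
    "Support was rude and the app is broken",
    "It arrived on Tuesday",
]:
    print(label(comment), "|", comment)
```

A word-counting approach like this misses negation and sarcasm, which is one reason more advanced language processing is often used instead.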

Predictive maintenance is an emerging analytics use case for unstructured data. For
example, manufacturers can analyze sensor data to try to detect equipment failures
before they occur in plant-floor systems or finished products in the field. Energy
pipelines can also be monitored and checked for potential problems using
unstructured data collected from IoT sensors.
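
One simple way to look for early warning signs in sensor data is to compare each new reading with a trailing window of recent readings. The sketch below does that with a z-score; the `flag_anomalies` function, the window size, the threshold and the sample vibration values are all assumptions chosen for illustration, not a recommended configuration.

```python
from statistics import mean, stdev

def flag_anomalies(readings, window=5, threshold=3.0):
    """Return indexes of readings that deviate sharply from the
    preceding `window` readings (z-score above `threshold`)."""
    flagged = []
    for i in range(window, len(readings)):
        baseline = readings[i - window:i]
        mu, sigma = mean(baseline), stdev(baseline)
        # Skip flat baselines to avoid dividing by zero.
        if sigma > 0 and abs(readings[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged

# Hypothetical vibration readings with one obvious spike.
vibration = [1.0, 1.1, 0.9, 1.0, 1.05, 1.0, 0.95, 4.2, 1.0, 1.1]
print(flag_anomalies(vibration))
```

Real predictive maintenance systems generally combine many sensors and learned models, but the underlying question is the same: does this reading look unlike the recent past?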

Analyzing log data from IT systems highlights usage trends, identifies capacity
limitations and pinpoints the cause of application errors, system crashes, performance
bottlenecks and other issues. Unstructured data analytics also aids
regulatory compliance efforts, particularly in helping organizations understand what
corporate documents and records contain.
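
Log analysis usually starts by pulling a few fields out of each free-form line. The following sketch assumes a hypothetical line layout of `timestamp LEVEL component: message`; real log formats vary widely, so the pattern would need to be adapted.

```python
import re
from collections import Counter

# Assumed layout: "<timestamp> <LEVEL> <component>: <message>"
LOG_PATTERN = re.compile(
    r"^(?P<ts>\S+) (?P<level>[A-Z]+) (?P<component>[\w.-]+): (?P<message>.*)$"
)

def count_by_level_and_component(lines):
    """Count log entries per (level, component); skip unparseable lines."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if match:
            counts[(match["level"], match["component"])] += 1
    return counts

sample = [
    "2024-01-01T10:00:00Z INFO web: request served",
    "2024-01-01T10:00:01Z ERROR db: connection timed out",
    "2024-01-01T10:00:02Z ERROR db: connection timed out",
    "malformed line",
]
counts = count_by_level_and_component(sample)
for (level, component), n in counts.most_common():
    print(level, component, n)
```

Grouping errors by component in this way is a first step toward spotting which part of a system is behind a run of failures.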

Unstructured data techniques and platforms

Analyst firms report that the vast majority of new data being generated is
unstructured. In the past, that type of information often was locked away in siloed
document management systems, individual manufacturing devices and the like --
making it what's known as dark data, unavailable for analysis.

But things changed with the development of big data platforms,
primarily Hadoop clusters, NoSQL databases and the Amazon Simple Storage Service
(S3). They provide the required infrastructure for processing, storing and managing
large volumes of unstructured data without the imposition of a common data model
and a single database schema, as in relational databases and data warehouses.

A variety of analytics techniques and tools are used to analyze unstructured data in big
data environments. Text analytics tools look for patterns, keywords and sentiment in
textual data; at a more advanced level, natural language processing technology is a form
of artificial intelligence that seeks to understand meaning and context in text and
human speech, increasingly with the aid of deep learning algorithms that use neural
networks to analyze data. Other techniques that play roles in unstructured data
analytics include data mining, machine learning and predictive analytics.
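
At its most basic, looking for keywords in text comes down to counting which terms recur across documents. The sketch below shows that idea with a small stopword list; the `top_keywords` function, the stopwords and the sample reviews are assumptions made for this example.

```python
import re
from collections import Counter

# A deliberately short stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "and", "is", "was", "to", "of", "in", "it", "my"}

def top_keywords(docs, n=3):
    """Return the n most frequent non-stopword terms across all docs."""
    counts = Counter()
    for doc in docs:
        words = re.findall(r"[a-z]+", doc.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts.most_common(n)

reviews = [
    "The battery died in a day",
    "Battery life is poor",
    "Screen is great and the battery is fine",
]
print(top_keywords(reviews, n=1))
```

Frequency counts only surface what is mentioned often; understanding what is being said about those terms is where natural language processing comes in.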
