
Veracity

Veracity refers to the trustworthiness of the data. Can the manager rely on the data being
representative? Every good manager knows that there are inherent discrepancies in
all types of data.
Although there is widespread consensus about the potential value of Big Data, the data may be
worthless if it is not accurate. This is particularly true for programmes that involve
automated decision-making or that feed the data into an unsupervised machine learning
algorithm. Because there is no human intervention, the results of such programmes are only
as good as the data they are working with.
Sean Owen, Senior Director of Data Science at Cloudera, expanded upon this: "Let's say
that, in theory, you have customer behaviour data and want to predict purchase intent. In
practice what you have are log files in four formats from six systems, some incomplete, with
noise and errors. These have to be copied, translated and unified." Owen's US counterpart,
Josh Wills, said their job revolves so much around cleaning up messy data that he was
more a "data janitor" than a data scientist.
One important thing to understand is that Big Data is messy, bulky and noisy in nature, and
a huge amount of work goes into extracting accurate and usable data from it. Handling large
volumes of rapidly arriving information is of no use if that information is not accurate.
Inaccurate information can cause considerable problems for organisations as well as for
customers. Organisations therefore need to ensure that both the data and the analyses
performed on it are correct. This matters particularly in automated decision-making, where
there is no human intervention to catch errors. If an organisation is to become
information-centric, it should take measures to ensure that it can trust its data as
well as its analysis.
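As a minimal sketch of such a safeguard, the following Python snippet rejects records that fail basic checks before they reach an automated decision step. The field names (`customer_id`, `amount`), the plausibility limit, and the function names are assumptions chosen for illustration; they do not come from any particular system.

```python
from typing import Any

# Hypothetical plausibility limit for a purchase amount; an assumption for illustration.
MAX_PLAUSIBLE_AMOUNT = 10_000.0


def validation_errors(record: dict[str, Any]) -> list[str]:
    """Return a list of problems found in one record (empty if it looks usable)."""
    errors = []
    if not record.get("customer_id"):
        errors.append("missing customer_id")
    amount = record.get("amount")
    if isinstance(amount, bool) or not isinstance(amount, (int, float)):
        errors.append("amount is not a number")
    elif not (0 <= amount <= MAX_PLAUSIBLE_AMOUNT):
        # Also catches NaN, because every comparison with NaN is False.
        errors.append("amount outside plausible range")
    return errors


def split_records(records):
    """Separate records that pass the checks from those that need review."""
    trusted, rejected = [], []
    for record in records:
        problems = validation_errors(record)
        if problems:
            rejected.append((record, problems))
        else:
            trusted.append(record)
    return trusted, rejected


if __name__ == "__main__":
    # Made-up sample records, including incomplete and noisy ones.
    sample = [
        {"customer_id": "c1", "amount": 25.0},
        {"customer_id": "", "amount": 12.5},
        {"customer_id": "c3", "amount": "n/a"},
        {"customer_id": "c4", "amount": -3},
    ]
    trusted, rejected = split_records(sample)
    print(len(trusted), "trusted,", len(rejected), "rejected")
```

Only the trusted records would be passed on to the automated step; the rejected ones are kept together with the reasons so that a person can review them.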
Forecasting models often face challenges when their assumption that data is available in real
time turns out to be invalid. While volume, velocity, and variety characterize the quantitative
aspects of big data, veracity refers to its qualitative aspect. Without addressing veracity, big
data solutions risk degradation in performance and inaccurate interpretation of generated
insights. An example of a veracity problem is the smart electricity grid, where high-volume
electricity consumption data is collected by smart meters at consumer premises and securely
transmitted back to the electric utility over wireless or broadband networks to be used for
forecasting. Due to physical limitations of existing transmission networks, data is not
readily available from all smart meters in real time. This can lead to incorrect estimates of
power consumption.
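The effect of partial data can be illustrated with a small Python sketch. The meter readings below are made-up values, and scaling by the share of reporting meters is only one simple assumption for compensating for missing meters (it presumes the silent meters resemble the reporting ones); it is not the method used by any particular utility.

```python
from typing import Optional


def naive_total(readings: list[Optional[float]]) -> float:
    """Sum only the readings that arrived, silently ignoring missing meters."""
    return sum(r for r in readings if r is not None)


def scaled_total(readings: list[Optional[float]]) -> float:
    """Scale the observed total by the share of meters that reported.

    Assumes missing meters consume, on average, as much as reporting ones.
    """
    observed = [r for r in readings if r is not None]
    if not observed:
        raise ValueError("no readings available")
    return sum(observed) * len(readings) / len(observed)


if __name__ == "__main__":
    # Hypothetical kWh readings from five meters; None marks a delayed reading.
    readings = [2.0, None, 3.0, None, 1.0]
    print("naive estimate:", naive_total(readings))
    print("scaled estimate:", scaled_total(readings))
```

The naive sum underestimates consumption whenever readings are missing, while the scaled estimate is only as good as the assumption behind it.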

Variability
Variability refers to data whose meaning is constantly changing. This is especially evident when
data gathering relies on language processing. Variability is sometimes confused with variety.
A bakery makes the difference clear: suppose you go to a bakery that sells
fifteen different breads. That is variety. Now imagine you go to that bakery three days in a
row and every day you buy the same type of bread, but each day it tastes and smells different.
That is variability.
Variability is thus very relevant to sentiment analysis, especially on social media.
Variability means that the meaning is changing, often rapidly. In almost identical social media
updates, the same word can mean different things. To perform proper sentiment analysis,
algorithms need to be able to understand the context and decipher the exact
meaning of a statement.
Hopkins (a principal analyst at Forrester) cited the supercomputer Watson as a prime example
of this. To participate in the game show Jeopardy!, Watson had to dissect an answer into its
meaning and [...] to figure out what the right question was. Words don't have static
definitions, and their meaning can vary wildly with context.
Say a company was trying to gauge sentiment towards a sweet house using these tweets:

"Delicious sweets from the @Lakshmi Mishthan Bhandar - what a great way to start the day!"
"Greatly disappointed that my local Lakshmi Mishthan Bhandar have stopped stocking BLTs."
"Had to wait in line for 45 minutes at the Lakshmi Mishthan Bhandar today. Great, well there's my lunchbreak gone"
Evidently, "great" on its own is not a sufficient indicator of positive sentiment.
Instead, companies have to develop sophisticated programmes which can
comprehend context and decode the precise meaning of words through it. Although this looks
difficult, it is not impossible; for example, last year Bloomberg launched a programme for
Wall Street that gauged the social media buzz around companies.
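To see why keyword matching falls short, consider this minimal Python sketch. It applies a deliberately naive rule (any text containing the stem "great" counts as positive) to the three example tweets. The function name and the rule are assumptions for illustration, not a real sentiment analysis method.

```python
TWEETS = [
    "Delicious sweets from the @Lakshmi Mishthan Bhandar - what a great way to start the day!",
    "Greatly disappointed that my local Lakshmi Mishthan Bhandar have stopped stocking BLTs.",
    "Had to wait in line for 45 minutes at the Lakshmi Mishthan Bhandar today. "
    "Great, well there's my lunchbreak gone",
]


def naive_is_positive(text: str) -> bool:
    """Label a text positive if it contains the stem 'great' (case-insensitive)."""
    return "great" in text.lower()


if __name__ == "__main__":
    for tweet in TWEETS:
        print(naive_is_positive(tweet), "-", tweet)
```

The naive rule labels all three tweets as positive, although only the first one expresses positive sentiment. The second matches through "Greatly disappointed" and the third through a sarcastic "Great", which is exactly the context a keyword rule cannot see.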

Application in Healthcare
Big Data has tremendous potential to add value in all healthcare settings. Big Data solutions
can help organizations personalize care, engage patients, reduce variability and costs, and
improve quality. Once Big Data is managed and integrated, organizations can apply analytics
to better understand the clinical and operational states of their business based on historical
and current trends, and predict what might occur in the future with a trusted level of
reliability.
Big Data solutions can result in personalized medicine that makes a dramatic difference by
redirecting the care of a patient toward the most favourable outcome before the patient
sustains a predicted adverse clinical event.
These new insights can improve health at many levels. With Big Data, best practices are more
readily identified, variability decreases, and costs and quality improve, allowing providers
to deliver a truly personalized patient experience.
EMR Example
The EMR (Electronic Medical Record) has revolutionized the healthcare industry. Despite the
improvement, there remains a significant gap in doctors' knowledge about patient status,
which hinders their ability to improve outcomes and lower costs. Because the EMR is limited
to the provider's record of care, it does not collect other types of essential data that are
needed to get better outcomes, which results in problems such as:
Root causes for events like admissions and readmissions are not recorded.
Clinical information is almost always provider-supplied, leaving out patient perspectives.
Critical staging or other data on the acuteness of conditions is often not included, such as cancer staging or ejection fraction in cardiac cases.
Intervention measurement is lacking, so the effectiveness of treatments is not always clear.
Certain diagnoses are missing, especially mental illness.


Variability in data recorded in structured fields is a significant problem with some EMRs.
Even when an EMR is well implemented, there may be other hazards in the data caused by
too much reliance on templates and structured data. As a result, one
patient looks almost exactly like another, and it becomes deceptively easy for providers to
check off items on the template that are not accurate.
