BDI: An R&D Perspective

 Big data is data whose scale, diversity, and complexity require new architectures, techniques, algorithms, and
analytics to manage it and to extract value and hidden knowledge from it.
 In other words, big data is characterised by volume, variety (structured and unstructured data), velocity (high
rate of change), and veracity (uncertainty and incompleteness).
 A new methodology is required for transforming Big Data stored in heterogeneous data sources (e.g., legacy
systems, the Web, scientific data repositories, sensor and stream databases, social networks) into a structured,
and hence readily interpretable, format for the target data analytics. As a consequence, data-driven
approaches in biology, medicine, public policy, the social sciences, and the humanities can replace the
traditional hypothesis-driven research in science.
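
As a rough illustration of what such a transformation might look like, the sketch below maps records from three
hypothetical sources (a legacy CSV export, a JSON sensor feed, and a free-text social post) onto one common
schema. The source layouts, the field names, and the Record type are all assumptions made for illustration; they
do not describe any particular methodology.

```python
import csv
import io
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Record:
    """Common target schema (hypothetical) for downstream analytics."""
    source: str
    entity_id: str
    timestamp: str  # ISO 8601 date or datetime string
    value: str


def from_legacy_csv(text: str) -> List[Record]:
    # Legacy export is assumed to have the columns: id, date, reading
    rows = csv.DictReader(io.StringIO(text))
    return [Record("legacy", r["id"], r["date"], r["reading"]) for r in rows]


def from_sensor_json(text: str) -> List[Record]:
    # Sensor feed is assumed to be a JSON array of {"sensor", "ts", "v"} objects
    return [Record("sensor", str(d["sensor"]), d["ts"], str(d["v"]))
            for d in json.loads(text)]


def from_social_post(user: str, ts: str, body: str) -> Record:
    # Unstructured text is kept whole; only its metadata becomes structured
    return Record("social", user, ts, body.strip())


legacy = "id,date,reading\nA1,2016-01-05,42\n"
sensor = '[{"sensor": 7, "ts": "2016-01-05T10:00:00", "v": 3.5}]'

records = (from_legacy_csv(legacy)
           + from_sensor_json(sensor)
           + [from_social_post("u99", "2016-01-05T11:30:00", " Power cut again! ")])

for rec in records:
    print(rec)
```

Once every source lands in the same shape, the analytics layer only has to understand one format, whatever the
origin of the data.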

Big Data: Science & Technology - Challenges

1. The IT challenge: Storage and computational power
2. The computer science challenge: Algorithm design, visualization, scalability (machine learning, network and graph
analysis, data streaming, and text mining), distributed data, architectures, data dimension reduction, and implementation
3. The mathematical science challenge: Statistics, optimisation, uncertainty quantification, model development
(statistical, ab initio, simulation), analysis, and systems theory
4. The multi-disciplinary approach: Contextual problem solving

Big data analytics and the India equation

Key actions for the analytics ecosystem in India would be:
1. Talent Pool - Create industry-academia partnerships to groom the talent pool in universities, and develop a
strong internal training curriculum to advance analytical depth.
2. Collaborate - Form analytics forums across organizational boundaries to discuss the pain points of the
practitioner community and share best practices for scaling analytics organizations.
3. Capability Development - Invest in long-term skills and capabilities that form the basis for differentiation and
value creation. An innovation culture is needed to facilitate IP creation and asset development.
4. Value Creation - Build the rigor to measure the impact of analytics deployments, which is critical to earning
legitimacy within the organization.

How Big Data Is Going To Stop Power Theft In Rural India

 The biggest problem for discoms in India, arguably, is Aggregate Technical and Commercial (AT&C) losses.
 Transmitted power is lost partly for technical reasons over the power networks, but mostly because of
theft: people use illegal, unpaid-for hooks to divert available power to their homes and establishments.
 Some states lose as much as 50 percent of transmitted power to AT&C losses.
 ‘regional distributors lose almost 23% of the electricity they buy through theft, unmetered usage and
dissipation through old wires’.
 The good news is that something to that effect is soon to happen in rural India, as the government turns to
villages to look for ‘missing’ power.
 With sensors installed at literally the last mile of power consumption, it becomes possible to identify
spots where power leakage or theft is rampant. Since all this data will be publicly available, it forces the
discoms (and the political authorities who hold them back) to act against theft. The next time there is a big
political rally in your village and the local politician chooses to steal power instead of paying for it, there will
be a way for you to nail him. All you may have to do is use your mobile phone.
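
To make the idea concrete, here is a minimal sketch of how last-mile readings could be used to flag likely
leakage. It assumes that each feeder reports the energy it sent out and that consumer meters report what was
billed; the feeder names, the figures, and the 30% threshold are invented for illustration. The AT&C formula
used is the commonly cited one: 1 - (billing efficiency x collection efficiency).

```python
def atc_loss(energy_input_kwh, energy_billed_kwh, amount_billed, amount_collected):
    """AT&C loss as a fraction: 1 - billing efficiency * collection efficiency."""
    billing_eff = energy_billed_kwh / energy_input_kwh
    collection_eff = amount_collected / amount_billed
    return 1.0 - billing_eff * collection_eff


def flag_feeders(readings, threshold=0.30):
    """Return names of feeders whose unbilled share of energy exceeds threshold."""
    flagged = []
    for name, (sent_kwh, billed_kwh) in readings.items():
        unbilled_share = (sent_kwh - billed_kwh) / sent_kwh
        if unbilled_share > threshold:
            flagged.append(name)
    return flagged


# Hypothetical monthly readings: feeder -> (energy sent out, energy billed), in kWh
readings = {
    "village_a": (10_000, 9_200),
    "village_b": (8_000, 4_100),
    "village_c": (12_000, 8_000),
}

print(flag_feeders(readings))
print(round(atc_loss(1000, 800, 4000, 3600), 2))
```

The unbilled share only shows where energy goes missing; telling theft apart from technical dissipation would
need further data, such as line length and load.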

Big Data: Possibilities and challenges

 Big Data and The Internet of Things (IoT) are intimately connected: billions of Internet-connected ‘things’
will, by definition, generate massive amounts of data.
 Companies are using Big Data analytics for everything from driving growth, reducing cost, improving
operational excellence, and recruiting better people to completely transforming their business strategy.

 A confluence of labelled data and highly affordable parallel processors started changing the way work
was done.

 In 2015, Google and Microsoft each reached a milestone. First, Google demonstrated that machines can
recognise images better than human beings.

 Microsoft went one step further and built a far deeper neural network. Earlier networks such as AlexNet had
8 layers; Microsoft's 2015 network, ResNet, had 152. It became clear what computers working alongside
human beings could do that humans alone could not.

 The issue is, where do we start?

How do you address these two very important concerns, security and privacy, when everything is connected?

 Our systems are as hard to hack as any of your phones today, and I am pretty certain you have more critical
data on your phone today than you will have on our system. Added to that, we give you the option to
disengage the location services, so you don't need to share your location with us. The rest of the data is
really not so critical.
 More than 50% of our customers come back and tell us they are fine with that. The second thing is that
there are no prevailing laws in India.
 The biggest challenge we’re facing today is skillsets. We are an industrial company. We need people who
understand both the engineering side of things and the computer side of things, and that is a very difficult
skillset to find. So we try to hire engineers who do computer science and co-locate them with core engineers
so that the domain knowledge transfer can seamlessly go back and forth.

Why the only thing better than big data is bigger data

Dense Data: You survey a handful of customers in depth, as in a customer survey comprising dozens of
questions. This is “dense” data, with lots of information on every person, object or event you’re cataloging.

Sparse Data: This is the kind of data the web’s giants, like Google and Facebook, gather all the time. It’s “sparse”
because all you’re getting is a few data points from any one person, when you could be getting thousands or even
millions.
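
The contrast can be sketched with two toy tables. The code below, using made-up respondents and users,
measures how many of the possible fields each record actually fills in. The names, sizes, and click pattern are
illustrative assumptions only.

```python
def density(rows, all_fields):
    """Fraction of possible (row, field) cells that actually hold a value."""
    filled = sum(len(fields) for fields in rows.values())
    return filled / (len(rows) * len(all_fields))


# Dense: a handful of survey respondents, each answering every question.
questions = [f"q{i}" for i in range(1, 25)]  # 24 questions
survey = {f"respondent_{n}": {q: "answer" for q in questions} for n in range(5)}

# Sparse: many users, each leaving only a couple of signals out of many possible.
signals = [f"page_{i}" for i in range(1000)]
clicks = {f"user_{n}": {signals[n % 1000]: 1, signals[(n * 7) % 1000]: 1}
          for n in range(10_000)}

print(f"survey density: {density(survey, questions):.3f}")
print(f"clicks density: {density(clicks, signals):.4f}")
```

The survey table is completely filled in but covers five people; the click table is almost empty per user but
covers thousands of them, which is where its value comes from.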

Applications

Big data covers problems that involve very large data sets and solutions that require connecting the dots in
complex ways. You can see such problems everywhere.

1. Quora and Facebook use Big Data tools to understand more about you and provide you with a feed that you
should, in theory, find interesting. The fact that the feed is not interesting shows how hard the
problem is.
2. Credit card companies analyze millions of transactions to find patterns of fraud. Maybe if you bought a Pepsi
on the card and followed it with a big-ticket purchase, it could be a fraudster?
3. My cousin works for a Big Data startup that analyzes weather data to help farmers sow the right seeds at the
right time. The startup got acquired by Monsanto for big $$.
4. A friend of mine works for a Big Data startup that analyzes customer behavior in real time to alert retailers
about when they should stock up.
There are similar problems in defense, retail, genomics, pharma, and healthcare that require solutions.
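
The credit card example in item 2 can be sketched as a simple rule: flag a card when a small "test" purchase is
followed shortly by a large one. This is a toy heuristic under assumed thresholds (a small purchase of at most 5,
a large one of at least 1,000, within 30 minutes); real fraud systems learn such patterns from millions of
transactions rather than hard-coding them.

```python
from datetime import datetime, timedelta


def suspicious_cards(transactions, small=5.0, large=1000.0,
                     window=timedelta(minutes=30)):
    """Flag cards where a small purchase is followed by a large one within window.

    transactions: iterable of (card_id, timestamp, amount) tuples.
    """
    last_small = {}  # card_id -> time of the most recent small purchase
    flagged = set()
    for card, ts, amount in sorted(transactions, key=lambda t: t[1]):
        if amount <= small:
            last_small[card] = ts
        elif amount >= large and card in last_small:
            if ts - last_small[card] <= window:
                flagged.add(card)
    return flagged


t0 = datetime(2016, 1, 5, 12, 0)
transactions = [
    ("card_1", t0, 1.50),                            # small test purchase
    ("card_1", t0 + timedelta(minutes=10), 2400.0),  # big-ticket item soon after
    ("card_2", t0, 2.00),
    ("card_2", t0 + timedelta(hours=5), 1800.0),     # too long after to match
    ("card_3", t0, 1500.0),                          # large, but no small purchase first
]

print(suspicious_cards(transactions))
```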

Summary:

Big Data is a group of problems and technologies related to the availability of extremely large volumes of data that
businesses want to connect and understand. The sector is hot now because the data and tools have
reached a critical mass. This occurred in parallel with years of educational effort that have convinced organizations
that they must do something with their data treasure.

Here is a real-world application in security: with the right analytics provider, you can track patterns in
camera footage, detect movement, or flag suspicious activities. For example, an ATM processes three different
transactions with different cards, and at the same time the camera captures only one person standing at the ATM.
Separately, these two facts are not abnormal at all. But taken together, you see the whole context and understand
that it might actually be a case of ATM fraud. At this point, the analytics platform can flag the event and notify
you to investigate it further.
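
A minimal sketch of that correlation step follows. It assumes two hypothetical feeds, a transaction log and
per-minute people counts from the camera analytics, and flags any time window in which several distinct cards
were used while the camera saw only one person. The record layout and the three-card threshold are assumptions
for illustration.

```python
from collections import defaultdict


def flag_atm_windows(transactions, people_counts, min_cards=3):
    """Return (atm_id, minute) windows with many cards but a single person.

    transactions:  list of (atm_id, minute, card_id)
    people_counts: dict mapping (atm_id, minute) -> people seen by the camera
    """
    cards = defaultdict(set)
    for atm_id, minute, card_id in transactions:
        cards[(atm_id, minute)].add(card_id)

    return sorted(
        window for window, used in cards.items()
        if len(used) >= min_cards and people_counts.get(window) == 1
    )


atm_transactions = [
    ("atm_7", "10:01", "card_a"),
    ("atm_7", "10:01", "card_b"),
    ("atm_7", "10:01", "card_c"),  # three different cards in one window
    ("atm_9", "10:01", "card_d"),  # one card, one person: nothing unusual
]
people_counts = {("atm_7", "10:01"): 1, ("atm_9", "10:01"): 1}

print(flag_atm_windows(atm_transactions, people_counts))
```

Neither feed raises an alarm by itself; the flag only appears once the two are joined on ATM and time window.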