
MAPR GUIDE TO BIG DATA

IN TELECOMMUNICATIONS

DATA CONVERGENCE
IN TELECOMMUNICATIONS
INTRODUCTION
Few industries have more to offer and to gain from big data than telecommunications. For decades,
communications service providers (CSPs) have transported and captured huge volumes of
information about customer calling patterns, wireless data usage, location, network bandwidth
statistics, and even the individual apps and webpages accessed by mobile devices.

Until recently, much of that data was discarded. There simply was no efficient way to mine value
from it and storing it was expensive. However, this is all changing. Big data technologies, when
combined with streaming analytics and analytics at scale, are enabling telecommunications
companies to uncover significant new insights about their infrastructure and their customers.
These insights are leading to massive changes in the ways they do business. In addition, with new
legislation that opens the door for internet service providers to sell data about their customers’
online behavior, data may become a significant new revenue source.

As the range of telecommunications services has expanded with the adoption of mobile data, so
have the potential applications to improve efficiency and generate new business. Providers can
leverage this information to better understand their own business. For example, usage pattern data
can guide bandwidth allocation and the positioning of equipment such as cell towers. It can also
help identify new services to offer customers and even open up new revenue streams in areas such
as targeted advertising.

Big data technologies are revolutionizing telecommunications. Tools like Apache Hadoop, streaming
analytics, and machine learning are opening up new opportunities for CSPs to gain insights from
data sets that were previously unwieldy.

This industry guide looks at the trends driving the big data revolution in telecommunications, how
different segments of the industry are being affected, and how big data is being put to work in the
field to change the competitive equation. This guide outlines a number of issues the industry faces
and discusses how these trends are driving new solution areas. It also looks at use cases that show
how big data and analytics are yielding game-changing breakthroughs for telecom providers today.

INDUSTRY TRENDS
To say the telecommunications industry is in transition is an understatement. Once a highly
regulated business with fixed prices, monopoly market control, and little customer choice, the
telecom market has been upended by digitization and mobility.

Customers now expect access from any device, anywhere. Their bandwidth needs have expanded to
include video and high-speed data. Customers also have plenty of carriers to choose from, ranging
from traditional phone service providers to cable companies to VoIP services. Thanks to full mobile
phone number portability, customers can move their service from one carrier to another with
virtually no disruption, enabling them to play carriers against each other.

In short, a once-staid industry is now hypercompetitive. This has created pressure on the large
incumbents and opportunities for nimble niche players, though there’s opportunity for the big
players as well. Telecom firms are aggressively seeking new lines of business, ranging from
advertising to cloud hosting to original programming. They are leveraging their infrastructure to
deliver new high-margin information and entertainment services to mobile customers, homes,
and businesses. Smartphones and programming are profitable new revenue sources. Although
competition has increased, customers are writing bigger checks to telecommunication service
providers than they ever have.

Some of the priorities today’s telecom providers must address include the following:
• System-wide cost reduction and optimization

• Improving customer loyalty, acquisition, and retention

• Monetizing data and analytics

• Mass personalization

System-Wide Cost Reduction and Optimization


Customer demand for data is growing at an accelerating rate, putting pressure on all service
providers to optimize infrastructure. Operational expenses typically consume 30-40% of revenue,
and network operations account for about 45% of that cost.1 A single cell tower can cost $250,000 to
build, and equipment must be continually maintained and enhanced to support new protocols and
services. For example, upgrading existing networks to the new 5G network infrastructure
is expected to cost more than $100 billion over the next 10 years.2 As a result, service providers are
always looking for ways to squeeze more capacity out of existing infrastructure, reduce the cost of
expansion, and find new ways to leverage existing infrastructure profitably.

They are also seeking to better understand customer behavior in order to maximize margins. Just
1% of mobile users consume half of the world’s bandwidth.3 Carriers want to identify these heavy
users and charge them appropriately.
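
The concentration of usage described above can be measured directly from per-subscriber totals. The following is a minimal sketch, assuming usage has already been summed per subscriber; the function name `top_share` and the sample figures are illustrative and do not come from any carrier data.

```python
import math


def top_share(usage_by_subscriber, top_fraction=0.01):
    """Return the fraction of total usage consumed by the heaviest users.

    usage_by_subscriber: dict mapping a subscriber id to bytes used.
    top_fraction: the slice of subscribers treated as "heavy" (1% by default).
    """
    if not usage_by_subscriber:
        return 0.0
    totals = sorted(usage_by_subscriber.values(), reverse=True)
    # Always include at least one subscriber in the "top" group.
    top_count = max(1, math.ceil(len(totals) * top_fraction))
    total = sum(totals)
    return sum(totals[:top_count]) / total if total else 0.0


# Hypothetical sample: one very heavy user among 100 subscribers.
sample = {f"sub{i}": 1 for i in range(99)}
sample["heavy"] = 99
print(top_share(sample))
```

A result near 0.5 for the top 1% would match the pattern the cited research reports.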

Two factors that will influence their planning are the move to software-defined networking (SDN)
and the surge of new bandwidth demand associated with the growth of the Internet of Things (IoT).
SDN promises to lower costs and increase flexibility, but there are big migration expenses involved.
IoT will create new pressure on capacity planning, but also yield attractive new sources of revenue.
Both technologies promise to create massive structural change in existing networks, and they will
require careful planning and ROI analysis.

Improving Customer Loyalty, Acquisition, and Retention


The consumer market is essentially saturated, with more phones on the planet than people. This
means growth must come from selling more products and services to existing customers and
stealing customers away from competitors. Factors such as service quality, price, speed, and
customer service are key variables in this equation.

Research shows that customers perceive little difference between telecommunication service providers.
This means the ability to understand customer needs at a fine-grained level, provide fast and friendly
service, and customize service plans for businesses and individual customers is crucial for success.

1. Banerjee, Ari, Big Data & Advanced Analytics in Telecom: A Multi-Billion-Dollar Revenue Opportunity, Heavy Reading, December 2013.
2. Goovaerts, Diana, iGR Study Forecasts $104B Cost to Upgrade LTE Networks, Build Out 5G Network, Wireless Week, December 7, 2015.
3. O’Brien, Kevin, Top 1% of Mobile Users Consume Half of World’s Bandwidth, and Gap Is Growing, The New York Times, January 5, 2012.

KEY INDUSTRY STAKEHOLDERS


Telecommunications industry diversification has created a host of new stakeholders beyond carriers
and their customers. Each has different information needs.

Carriers
Service providers need to capture detailed information about infrastructure, service quality levels,
demand patterns, and other operational data in order to optimize resources and deliver high-quality
user experiences. They also need to mine customer behavior data in order to fine-tune their product
offerings and take advantage of new opportunities like advertising and paid information services.

Customers
In a market with a profusion of service offerings, many of them based on use, customers appreciate
having up-to-the-minute data on account status and warnings of additional charges they may incur.
CSPs can enhance customer satisfaction by offering complete reports and recommendations for
account changes that can optimize each customer’s spending.

Suppliers
The high cost of infrastructure gives service providers plenty of incentive to understand the ROI of
the dollars they spend on equipment and service. Sharing this information with suppliers can help
them negotiate more favorable contracts and ensure that suppliers are delivering on service-level
agreements.

Companies that supply end-user devices such as handsets and accessories want detailed sales
information as well as recommendations for promotions and other incentives that can improve their
bottom lines.

Regulators
Government overseers demand full transparency about service levels, rates, customer satisfaction,
and other metrics. They also want evidence that service providers are staying within customer
privacy and confidentiality guidelines. CSPs must not only capture this data, but tag and index it for
rapid access since compliance audits may require a response in 48 hours or less.

Content Providers
This new but important constituency provides programming and information services that open
up new revenue streams. Content providers require information on how their content is being
catalogued, promoted, and consumed as well as details on royalties, licenses, and other forms of
compensation.

Advertisers
Advertisers are another new and promising business opportunity. Ad buyers want clickstream data
such as views, click-throughs, downloads, registrations, view times, and other engagement metrics.
They also may demand this information within the context of location, time of day, app use, and other
variables.

TELECOMMUNICATIONS DATA SOURCES AND STANDARDS


There is no shortage of data available to telecom providers both from within their networks and from
external sources. Here are a few of the sources they can mine:

Internal systems used to provision services and manage customer accounts throw off huge volumes
of data, as do the endpoint devices that subscribers use. CSPs can choose to capture literally every
bit that goes over their networks and correlate it with external factors such as time and location
as well as the identity of individual subscribers. While regulations limit what carriers can do with
individual subscriber data, carriers can anonymize this information and pull it together into profiles
that are useful for everything from network optimization to advertising.
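
One common building block for this kind of masking is to replace subscriber identifiers with keyed hashes before records reach analysts. The sketch below is illustrative only: the field names (`subscriber_id`, `name`) and the key handling are assumptions, and keyed hashing is pseudonymization rather than full anonymization, so it would not by itself satisfy every regulatory definition.

```python
import hashlib
import hmac


def pseudonymize(subscriber_id: str, secret_key: bytes) -> str:
    """Replace a subscriber identifier with a keyed hash (HMAC-SHA256)."""
    return hmac.new(secret_key, subscriber_id.encode("utf-8"), hashlib.sha256).hexdigest()


def mask_record(record: dict, secret_key: bytes) -> dict:
    """Return a copy of the record without direct identifiers.

    The hypothetical 'subscriber_id' field is replaced by a stable token and
    the hypothetical 'name' field is dropped; other fields pass through.
    """
    masked = {k: v for k, v in record.items() if k not in ("subscriber_id", "name")}
    masked["subscriber_token"] = pseudonymize(record["subscriber_id"], secret_key)
    return masked


# Hypothetical record and key, for illustration only.
record = {"subscriber_id": "15551230000", "name": "Pat", "plan": "unlimited"}
print(mask_record(record, b"example-key"))
```

Because the same identifier always maps to the same token under one key, records can still be joined into profiles without exposing the underlying number.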

Independent research organizations are too numerous to list here. Scores of independent analyst
firms as well as captive research organizations owned by the carriers themselves monitor various
aspects of the industry. The U.S. Library of Congress maintains a sample list of such sources.

Industry associations are also too numerous to list. Major U.S. groups include:

• Telecommunications Industry Association
• NCTA (The Internet & Television Association)
• Alliance for Telecommunications Industry Solutions (ATIS)
• Cellular Telecommunications Industry Association (CTIA)
• Rural Broadband Association (NTCA)

Many countries have equivalent bodies. Nearly all of these organizations publish research and data.

Government and regulatory agencies exist in every international market. In the U.S., the big
four are the Federal Communications Commission, CTIA, NCTA and the National Association of
Regulatory Utility Commissioners. Wikipedia lists more than 200 others.

Suppliers can be useful sources of information about equipment usage, technology trends, and
advice on getting more bang for the buck from their products.

Competitors may publish their own research about their markets, and their public regulatory filings
can yield insights on their own operations.

TELECOMMUNICATIONS DATA TYPES


Telecom carriers collect so much data about their customers that one of their biggest challenges
is deciding what information not to keep. Fortunately, big data tools make it possible to build
inexpensive data lakes and then decide what is most useful. Some of the data types carriers already
collect that could be analyzed include the following:

Data Source / Description

CALL DATA RECORD (CDR)
Analysis of individual customers can help carriers design service packages that appeal to different
categories of customers, such as frequent callers, heavy users of international services, or those
who call the same numbers often. When aggregated, these records can yield insight into usage
patterns that allow carriers to more effectively manage their infrastructure and to optimize capacity.

MOBILE INTERNET USAGE
Data is the most expensive service telecom carriers provide, and it is also the most highly valued by
customers. Understanding how different classes of customers use data can help CSPs be more
creative about designing service plans for different categories of users.

SMART DEVICE/IoT USAGE
By tracking the types of apps and sites that customers access from their smart devices, mobile
providers can make informed choices about providing alternative sources or forming partnerships
that could result in bonus revenue streams via subscriptions or transaction fees. As the Internet of
Things ramps up, carriers will increasingly be able to understand the types of devices that are
accessing their network and design service packages around them.

AUTOMOTIVE DATA
Autonomous and semi-autonomous vehicles are essentially computers on wheels, generating
enormous amounts of data that needs to be transmitted between vehicles and to central control
hubs. Telecom providers are already partnering with automotive companies to outfit vehicles with
the wireless capabilities required by smart cars. This will tax existing networks, but also create new
revenue opportunities.

NETWORK EQUIPMENT DATA
Analyzing data from infrastructure equipment (such as voltage and current levels, outages, and
operational efficiency) can help providers deploy those resources more efficiently, detect trouble
areas, and perform preventive maintenance to avoid downtime and expensive after-hours repairs.

SERVER, NETWORK, AND APPLICATION LOGS
These can yield a bounty of information about customer behaviors that can be used to optimize
bandwidth, improve service levels, and correlate customer behavior to external events such as
storms and breaking news.

BILLING DATA
This can be a source of both insight and competitive advantage. Understanding how and when
customers pay bills can help service providers reduce delinquency rates and create services that
make payments easier for customers. Billing data can also be used to proactively optimize
customers’ accounts and present them with information that helps them make more cost-effective
use of services.

MACHINE-TO-MACHINE DATA
Taking advantage of existing networks to connect equipment within the service provider’s
infrastructure can help providers balance resources to reduce slowdowns and outages. As the
Internet of Things takes hold, services optimized to connect machines, such as medical equipment
and automobiles, could uncover new revenue streams.

SOCIAL NETWORK DATA
Tapping into conversations on social networks is one of the best ways to ensure customer
satisfaction, detect potential defections, and gain intelligence on competitors.
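
As an illustration of the aggregation described for call data records, the sketch below counts calls by hour of day to find the busiest period. The record layout (a `start` field holding an ISO 8601 timestamp) is an assumption made for the example, since real CDR formats vary by vendor.

```python
from collections import Counter
from datetime import datetime


def calls_per_hour(cdrs):
    """Count calls by hour of day from records with an ISO 8601 'start' field."""
    counts = Counter()
    for cdr in cdrs:
        counts[datetime.fromisoformat(cdr["start"]).hour] += 1
    return counts


def busiest_hour(cdrs):
    """Return the hour (0-23) with the most calls, or None if there are no records."""
    counts = calls_per_hour(cdrs)
    return counts.most_common(1)[0][0] if counts else None


# Hypothetical records, for illustration only.
sample_cdrs = [
    {"start": "2017-03-01T09:15:00"},
    {"start": "2017-03-01T09:45:00"},
    {"start": "2017-03-01T18:05:00"},
]
print(busiest_hour(sample_cdrs))
```

The same grouping could be keyed by cell tower or region instead of hour to support capacity planning.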

USE CASE EXAMPLES


All these resources have been used in real-life scenarios to change the telecommunications game.

Customer Loyalty and Acquisition


Once a highly regulated industry, telecom is now a virtual free-for-all. With the market nearly
saturated, carriers focus much of their efforts on stealing customers from each other. And thanks
to phone number portability and the decline of service agreements, it’s never been easier for
customers to switch.

The telecommunications industry suffers the second-lowest customer satisfaction ratings after
government.4 In the U.S., turnover rates for the major carriers typically run around 20% per year.
Globally, the churn rate is much higher.5 At the same time, the cost of acquiring a single new
telecommunications customer has been pegged at more than $300.6 With more than 17 million
customers signing up for new plans or switching carriers each year, acquisition costs add up to
more than $5 billion annually.

This has made customer service a critical competency for all providers.
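
The acquisition-cost estimate above follows from simple multiplication. As a quick check, using the lower-bound figures quoted in the text (the variable names are illustrative):

```python
# Lower-bound figures quoted in the text.
new_or_switching_customers = 17_000_000  # customers per year
cost_per_acquisition = 300  # US dollars per customer

annual_cost = new_or_switching_customers * cost_per_acquisition
print(f"${annual_cost / 1e9:.1f} billion")  # $5.1 billion
```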

Big data and analytics tools help carriers understand customer behaviors and interests at a fine
level. For example, CSPs can use a variety of internal and external metrics for churn analysis,
which alerts companies to signals that a customer may be about to defect. Evidence might include
declining usage, repeated calls to the customer support center, or frequent dropped calls. Social
media also presents valuable new ways to detect at-risk customers. By monitoring online sentiment
and matching usernames to customer accounts, telcos can pinpoint dissatisfied customers and
extend individualized retention offers. Social media is also a valuable feedback mechanism for
new products and services. Customer reactions to new equipment, service plans, or offers can be
captured almost immediately and used to adjust pricing or marketing plans proactively.

Churn analysis, driven by big data and analytics, enables telecom companies to identify potential
defectors quickly and to target their retention strategies more selectively. For example, a
company can look at a large pool of recent lost customers and cross-tabulate the data with other
characteristics, such as marital status, age, location, volume of use, or payment delinquency. The
same analysis can be performed on customers who have increased spending with their providers.
This analysis yields “buckets” of customers that can be categorized according to their likely future
spending patterns. Offers and incentives can then be targeted to groups of customers. Likely
defectors can be intercepted, even if they have expressed no explicit plans to leave. This is important
since research indicates that only 5% of dissatisfied customers ever overtly express dissatisfaction
and about 80% of defectors give no reason for leaving.7
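
The cross-tabulation described above can be sketched in a few lines. This is a minimal illustration, assuming each customer record is a dictionary with a boolean `churned` field plus descriptive attributes; the field names and sample data are hypothetical.

```python
from collections import defaultdict


def churn_rate_by_segment(customers, attribute):
    """Cross-tabulate churn against one attribute.

    customers: iterable of dicts, each with a boolean 'churned' field.
    attribute: the field to group by, such as an age band or location.
    Returns a dict mapping each attribute value to its churn rate.
    """
    lost = defaultdict(int)
    total = defaultdict(int)
    for customer in customers:
        key = customer[attribute]
        total[key] += 1
        if customer["churned"]:
            lost[key] += 1
    return {key: lost[key] / total[key] for key in total}


# Hypothetical sample data.
sample_customers = [
    {"age_band": "18-29", "churned": True},
    {"age_band": "18-29", "churned": False},
    {"age_band": "30-49", "churned": False},
    {"age_band": "30-49", "churned": False},
]
print(churn_rate_by_segment(sample_customers, "age_band"))
```

Segments with unusually high rates are the "buckets" that retention offers would target first.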

4. Benchmarks by Sector, American Customer Satisfaction Index, 2017.
5. Dobardziev, Angel, Ovum’s global survey shows that telecoms operators could lose up to half their customers in the next year, Ovum TMT Intelligence, November 6, 2014.
6. Safko, Lon, How Much Did That New Customer Cost You?, Entrepreneur, January 14, 2013.
7. Barlow, Janelle M. & Moller, Claus, A Complaint Is a Gift: Recovering Customer Loyalty When Things Go Wrong, 2nd edition (Oakland, Calif.: Berrett-Koehler Publishers, 2008).

The impact of churn analysis can be substantial. McKinsey cited one telecom provider’s use of
machine learning to combine sociodemographic data, information from customer calls, social
media interactions, and network usage data to identify the customers who were most likely to defect or
have trouble paying their bills. It cut churn by 3% and improved payment recovery by 35%.8

A key element of customer retention is delivering individualized service and promotions. Multi-
channel call center automation software now makes it possible for companies to create unified
views of their customers composed of feedback from phone calls, email, social media, and even
in-store visits. By applying analytics to these rich profiles, telecom companies can group customers
by category and customize marketing programs and offers to them. For example, heavy data users
may be presented with bonus bandwidth or coupons for free streaming video, while customers with
modest data needs may be offered discount upgrade incentives to get them to the next level. These
tactics work; half of the business-to-business customers surveyed by Forrester Consulting rated
personalized recommendations as the feature they would most like suppliers to offer.9

Capturing data from multiple sources in a reference database, as illustrated below, makes it
possible for that information to be used in a variety of use cases ranging from searchable customer
records to model training for machine learning applications.10 The ability of Hadoop and NoSQL
databases to combine and perform analytics on a mix of both structured and unstructured data
makes applications practical that were previously impossible.

[Figure: Web logs, transactions, and CSR notes and logs feed a reference database, which supplies a search index, model training data, and a masked extract.]

Customers also vote with their clicks, and this activity can be captured and analyzed to understand
customers’ content needs. For example, customers who upload a lot of photographs or streaming
video may be offered free accounts on media-sharing services or cloud storage space. Frequent
music listeners may be offered gift cards for streaming music services. The cost of these giveaways
is a pittance compared to the cost of acquiring a new customer.

8. Bughin, Jacques, Telcos: The untapped promise of big data, McKinsey Quarterly, June 2016.
9. How B2B Vendors Are Working to Meet Buyers’ Omni-Channel Desires, MarketingCharts, November 17, 2014.
10. Dunning, Ted & Friedman, Ellen, Prototypical Hadoop Use Cases, MapR Technologies e-book, 2010.

Social media sentiment analysis cuts both ways, and it can be useful in identifying at-risk customers
of competitors. By monitoring negative comments about rivals, telecom companies can offer timely
incentives to make the switch. Online real estate transactions or openings of new businesses
can also trigger offers to customers who are new to the area or those who are leaving but can be
retained. The same profiling techniques used to extend retention offers to customers based upon
demographic characteristics can also customize offers to potential new customers.

Big data permits much finer levels of targeted marketing. Instead of creating direct mail according
to geography, companies can segment customers by combinations of demographic and behavioral
data. Email and display advertising campaigns can be customized based on the characteristics
of individual customers, and tactics like online A/B testing quickly provide feedback on the most
effective offers and messages.
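
One common way to read the results of an A/B test is a two-proportion z statistic comparing the conversion rates of two offers. The sketch below is a generic statistical illustration, not a method described in this guide; the function name and sample counts are hypothetical.

```python
import math


def ab_test_z(conv_a, n_a, conv_b, n_b):
    """Two-proportion z statistic comparing conversion rates of offers A and B.

    conv_a, conv_b: number of customers who accepted each offer.
    n_a, n_b: number of customers shown each offer.
    A positive value means offer B converted better than offer A.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0  # No variation at all, so no measurable difference.
    return (p_b - p_a) / se


# Hypothetical campaign: 50 of 1,000 accept offer A, 80 of 1,000 accept offer B.
print(round(ab_test_z(50, 1000, 80, 1000), 2))
```

An absolute value above roughly 1.96 is conventionally treated as significant at the 5% level.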

Customers increasingly expect to be treated as markets of one. Big data and analytics are enabling
this goal to be realized.

Product and Service Quality


Telecommunications is a capital-intensive business with global CSP capital expenditures expected to
total more than $2 trillion by 2019.11 Providers are under intense pressure to minimize dropped calls
and data dead zones, which are among the biggest sources of customer dissatisfaction. Breadth
of coverage is also a competitive advantage, so telecoms need to maximize reach while constantly
monitoring their networks for outages, capacity thresholds, and other service quality issues.

The amount of data generated by network equipment is enormous and getting bigger. The advent of
4G mobile networks alone increased the data volumes from mobile devices about tenfold,12 and
the arrival of 5G networks over the next two years promises to do the same. Other growth factors
include location-based data, streaming media, IPv6 addressing, and the arrival of an estimated 50
billion connected IoT devices by 2020. Much of this data will need to be analyzed in real time, both for
service-level compliance and to realize the promise of new revenue sources through services like
location-based and contextual marketing.

Before the arrival of big data systems like Hadoop, Spark, Flink, and the MapR Converged Data
Platform, it was impractical for carriers to analyze more than a fraction of that information. But with
the price/performance improvements that big data tools have introduced, carriers can now afford to
sift through a much larger amount of activity on their networks. For example, Razorsight, a provider
of analytics services that helps telecommunication companies optimize their sales and marketing
activities, has seen the total cost of storage and processing drop from up to $20,000 per terabyte
in a traditional data warehouse to less than $3,000 per terabyte with a converged solution from
MapR Technologies.13

11. Ovum forecasts CSP capex over 2014–19 period will surpass US$2tn, Ovum TMT Intelligence, December 14, 2014.
12. Big Data for the Telecommunications Industry, Informatica Corp., 2012.
13. Nemschoff, Michele, Hadoop In Action: Razorsight Offers Telecom Clients Predictive Analytics Solutions Based On Hadoop And Apache Spark, MapR blog, September 8, 2015.

Big data ecosystems like the MapR Converged Data Platform are changing the cost calculus by
drastically reducing the cost of managing the millions of records that flow from telecom systems
and networks every second. Here are some examples of how individual components can be applied:

• Data platform (the MapR Converged Data Platform) can store and manage billions/trillions of files
and petabytes of raw data.

• Event streaming (Apache Kafka or MapR Streams) can handle millions of messages from
connected devices for storage and processing.

• Apache Flume is capable of ingesting millions of CDRs per second into a NoSQL database like
MapR-DB or Cassandra.

• Apache Storm can process streaming data in real time and identify irregular or troublesome
patterns.

• Apache Spark or Mahout can be applied to create machine learning models that anticipate
capacity problems, usage spikes, and even equipment outages.

• Apache Flink is a true real-time processing engine that can analyze data streaming over
the network.

When historical data is combined with stream processing and predictive analytics, telecom providers
can optimize their networks’ performance to unprecedented levels. They can also reduce costs
through predictive maintenance, which enables equipment problems to be diagnosed earlier and
prevents expensive field repairs. IoT will be a major factor here. By capturing data streaming from
connected devices, providers can pinpoint potential trouble areas and dispatch repair crews before a
problem occurs.

Other operational benefits that can be realized through the use of big data include:

• Call routing efficiency can be improved to reduce customer hold times and optimize service
representative efficiency.

• Demand forecasting can better prepare carriers for infrastructure upgrades or new services.

• Real-time call detail record analysis identifies service problems and forecasts capacity needs.

• Proactive customer care alerts customers to service problems or offers incentives for specific
usage scenarios.

• Service plans can be optimized based upon actual use.

Analysis of operational data has traditionally been a batch process, but streaming analytics tools
such as Kafka and Spark extend the same kind of analytics capabilities to data flowing across the
network. Not only are there operational benefits to capturing this kind of data, but CSPs can use
streaming technology to deliver on-the-spot promotions or alerts. We’ve all heard stories of users
being blindsided by large overage charges for services they weren’t even aware they were using, such as
global roaming. Streaming analytics and mobile alerts can help prevent these unpleasant surprises
from damaging customer satisfaction.
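
The roaming scenario above can be sketched as a running total kept per subscriber, with an alert raised the first time a threshold is crossed. This is a plain-Python stand-in for what a streaming engine would do at scale; the event layout, the function name, and the 100 MB threshold are all assumptions made for illustration.

```python
from collections import defaultdict


def roaming_alerts(events, threshold_mb=100):
    """Yield one alert per subscriber when roaming usage first crosses the threshold.

    events: iterable of (subscriber_id, roaming, megabytes) tuples in arrival order,
            where roaming is True for usage outside the home network.
    Yields (subscriber_id, cumulative_megabytes) pairs.
    """
    usage = defaultdict(float)
    alerted = set()
    for subscriber_id, roaming, megabytes in events:
        if not roaming:
            continue  # Home-network usage does not count toward the alert.
        usage[subscriber_id] += megabytes
        if usage[subscriber_id] >= threshold_mb and subscriber_id not in alerted:
            alerted.add(subscriber_id)
            yield subscriber_id, usage[subscriber_id]


# Hypothetical event stream.
stream = [("a", True, 60), ("b", False, 500), ("a", True, 50), ("a", True, 10)]
for alert in roaming_alerts(stream):
    print(alert)
```

Because the function is a generator, each alert is produced as soon as the triggering event arrives rather than after the whole batch has been read.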

Telecom companies that best leverage streaming technologies will gain a competitive edge.

Security and Compliance


Telecommunications is a regulated industry, and service provider networks are prime targets for
attackers. Again, big data has an important role to play.

Service providers must comply with many standards in areas like service levels, availability, pricing,
and coverage. The penalties for failing to capture this information can be steep, and audit demands
may carry deadlines of 48 hours or less. Historically, responding to regulatory requests has involved
digging through racks of data archived on tape. Thanks to Hadoop, much of this data can now be
stored online for rapid retrieval. Telecom providers can also use the technologies of big data to
better understand their own operations, flag potential regulatory violations, and correct them.

Security is a perpetual cat-and-mouse game in which analytics is playing a growing role. For
example, security information and event management (SIEM) is a growing class of real-time
analytics tools that monitors security alerts generated by network hardware and applications.
It constantly compares network activity to normal traffic patterns and flags anomalies that may
indicate penetration or fraud. The concept has existed for more than a decade, but the new breed of
machine learning and predictive analytics tools, combined with large data stores, promises to make
this technology far more effective.
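
The core idea of comparing activity to normal traffic patterns can be illustrated with a simple z-score check. This is a deliberately basic sketch, not how any particular SIEM product works; the function name, the three-standard-deviation threshold, and the sample figures are assumptions.

```python
import statistics


def flag_anomalies(baseline, observations, z_threshold=3.0):
    """Return observations that deviate sharply from a baseline of normal traffic.

    baseline: historical measurements (at least two), such as requests per minute.
    observations: new measurements to check.
    z_threshold: number of standard deviations treated as anomalous.
    """
    mean = statistics.mean(baseline)
    stdev = statistics.stdev(baseline)
    if stdev == 0:
        # A perfectly flat baseline: any different value is unusual.
        return [x for x in observations if x != mean]
    return [x for x in observations if abs(x - mean) / stdev > z_threshold]


# Hypothetical traffic figures.
normal_traffic = [100, 102, 98, 101, 99]
print(flag_anomalies(normal_traffic, [100, 103, 250]))
```

Production systems typically replace the fixed threshold with learned models, which is where the machine learning tools mentioned above come in.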

Telecom providers have long had the ability to capture all the data that streams across their
networks, but they can now do so much more affordably thanks to big data and streaming analytics.
The potential bottom-line impact is clear. The industry loses about $38 billion to fraud each year,14
or about 1.7% of total revenues.

The benefits of strong security go beyond just revenue impacts. As CSPs increasingly expand into
cloud hosting, software-as-a-service, and managed services, their ability to secure their networks
will be an increasingly important factor in customer satisfaction. For example, Macquarie Telecom,
which provides secure communications services for 42% of government agencies in Australia, is
using the MapR Converged Data Platform to monitor hundreds of systems and to aggregate logs
that produce data in multiple formats. Data volumes have increased exponentially in recent years,
and the company found the MapR Converged Data Platform best suited to its capacity and speed
requirements. The combination of real-time analytics and predictive security enables government
employees to access internet information without worrying about malicious payloads. It has also
made reporting more timely. Reports that used to take 14 days now take only two hours.15

14. Global Fraud Loss Survey, Communications Fraud Control Association, 2015.
15. Macquarie Telecom Deploys MapR to Secure Australian Government Communications, MapR press release, November 24, 2015.

THE MAPR CONVERGED DATA PLATFORM IN TELECOMMUNICATIONS


By pursuing our data-centric vision for a new generation of applications, MapR has created an
applications platform that converges the management of data of any size, speed, and format. It
was for this work that the company was recently awarded a patent (US9207930). This is the MapR
Converged Data Platform.

[Figure: MapR Converged Data Platform. Processing layer: open source engines and tools; commercial engines and applications (search and others, cloud and managed services, custom apps); unified management and monitoring. Access APIs: HDFS, POSIX, HBase, JSON, Kafka. Data layer: web-scale storage (MapR-FS), database (MapR-DB), event streaming (MapR Streams). Enterprise-grade platform services: high availability, real time, unified security, multi-tenancy, disaster recovery, global namespace.]

Open Source Innovation on a Trusted Platform


The MapR Converged Data Platform is designed to deliver utility-grade data services and
commercially supported open source innovations to development teams, IT operations, business
analysts, and data scientists. Open source technology is a fantastic creative driver for the
sophisticated new challenges that big data, and especially new data, uncovers.

Without a converged data platform, critical information can get stuck in data silos, and inefficient
use of hardware resources can result in a costly cluster sprawl of underutilized servers and storage.
With the MapR Platform, businesses can enjoy real-time insights based on secure, protected, high-
fidelity data.

Seamless Integration with Existing Enterprise Systems


One of the most profound design decisions made by MapR was to create an enterprise-grade file
and storage system to house the data in the big data ecosystem. The MapR File System, based on
the trusted POSIX/NFS standard, makes it easier to get data in and out of the MapR Platform using
familiar enterprise tools. MapR provides open APIs for developer access to data with standard
interfaces like SQL, HDFS, HBase, JSON, and Kafka.
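
Because the file system is exposed through POSIX/NFS, ordinary file I/O should be enough to move
data in and out. The following is a minimal sketch under that assumption. The mount path mentioned
in the comment is hypothetical, and a temporary directory stands in for it so the example runs on
any machine; the file name and column names are also illustrative.

```python
import csv
import tempfile
from pathlib import Path


def write_usage_records(directory: Path, records: list[dict]) -> Path:
    """Write records as CSV using nothing but standard file I/O."""
    target = directory / "usage.csv"
    with target.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=["subscriber", "megabytes"])
        writer.writeheader()
        writer.writerows(records)
    return target


def total_megabytes(path: Path) -> int:
    """Read the same file back with the standard csv module."""
    with path.open(newline="") as handle:
        return sum(int(row["megabytes"]) for row in csv.DictReader(handle))


# Hypothetical: on a real client this would be an NFS mount point such as
# Path("/mapr/<cluster-name>/telecom"); a temp dir stands in for it here.
mount_point = Path(tempfile.mkdtemp())
sample = [
    {"subscriber": "a", "megabytes": 120},
    {"subscriber": "b", "megabytes": 80},
]
csv_path = write_usage_records(mount_point, sample)
print(total_megabytes(csv_path))
```

Nothing in the sketch is specific to a big data platform, which is the point: any tool that can read
and write files on a mounted path can take part.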

Continuous Trusted Operations


With its consistent focus on the integrity of data, MapR has created a hardened, clustered
platform that can withstand multiple hardware failures, data center outages, malicious attacks,
and intrusions from cybercriminals. Many proven methods of data protection, such as failover,
redundancy, and access controls, are built into the MapR Platform.


Big Data with Enterprise Stability


Game-changing big data applications and analytics will continue to rely upon open-source software.
As a company founded in and contributing to the open-source world of Hadoop and Spark, MapR
continues to define enterprise requirements and best practices for successfully using the latest
open source innovations. We deliver monthly updates to open source software packages to ensure
you have the latest innovations.

MapR Telecommunications Architecture

[Figure: MapR telecommunications architecture. Data sources (billing data, call data records, mobile
usage, smart device data, network data, server logs, M2M data, social network data) enter the MapR
Converged Data Platform through streaming data ingest and POSIX/NFS file ingest. Insights (data
exploration, dashboards, analytics, applications, search) reach stakeholders (carriers, customers,
suppliers, regulators, content providers, advertisers). Use cases: upselling and cross-selling,
customer segmentation, fraud detection, targeted marketing, telemetry, capacity planning, call
network optimization, recommendation engine, product and service quality, security and compliance,
personalized offers, customer loyalty and acquisition.]

Data-Driven Improvement of Services or Product


Telecom companies need to share data between cell towers, users, and processing centers.
Because the volumes can be very large, it’s important to process data from the source and efficiently
transfer it to various data centers for further use. MapR Streams, a new distributed messaging
system, is able to transport huge amounts of data and make it available with reliable geo-distributed
replication across multiple data centers. With MapR Streams, you can replicate streams in a
master-slave, many-to-one, or multi-master configuration between thousands of geographically
distributed clusters.

[Figure: Topics A, B, and C in a stream being replicated to another cluster.]


For example, one MapR customer uses MapR Streams to collect real-time data from all of its
regional data centers and bring it to its central data center. Previously, the customer used FTP
to transfer data from antennas to regional data centers and from there to headquarters, but the
process suffered from extreme latency.

[Figure: Previous FTP-based flow. Regional aggregation data centers A, B, and C transferred files
over FTP to a staging file server at the central data center, which fed the monitoring and reporting
systems and, again over FTP, the dashboards for regional data centers A, B, and C. Indicated delay:
20-30 minutes.]

Now data is collected at regional data centers with MapR Streams and made available in real time to
regional dashboards.

[Figure: Regional pipeline. A Java producer monitors a directory on the file server, parses CSV
files, and publishes messages to a topic. A Java consumer subscribes to the topic, applies a
filtering configuration, parses master data, joins tables, and aggregates the results into an
Elasticsearch index that feeds a Kibana dashboard.]

MapR Streams topics at regional data centers are replicated in a many-to-one configuration to
the central data center, making events available in real time to the headquarters dashboard. The
company can now monitor global performance and react quickly to improve customer services.
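
The flow described above (parse CSV files, publish to a regional topic, replicate many-to-one, then
join and aggregate on the consumer side) can be sketched in a few lines. This is an illustration
only: plain Python lists stand in for MapR Streams topics, and the field names, region names, and
the `replicate_many_to_one` helper are hypothetical rather than part of any MapR API.

```python
import csv
import io
from collections import defaultdict


def publish_csv(csv_text: str, topic: list, region: str) -> None:
    """Producer: parse CSV rows and append each one to a topic as a message."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        topic.append({"region": region,
                      "cell_id": row["cell_id"],
                      "dropped_calls": int(row["dropped_calls"])})


def replicate_many_to_one(regional_topics: list[list]) -> list:
    """Stand-in for many-to-one replication: merge regional topics centrally."""
    central = []
    for topic in regional_topics:
        central.extend(topic)
    return central


def aggregate(topic: list, master_data: dict) -> dict:
    """Consumer: join each message with master data, then sum per site."""
    totals = defaultdict(int)
    for message in topic:
        site = master_data.get(message["cell_id"], "unknown")
        totals[(message["region"], site)] += message["dropped_calls"]
    return dict(totals)


north, south = [], []
publish_csv("cell_id,dropped_calls\nc1,3\nc2,1\nc1,2\n", north, "north")
publish_csv("cell_id,dropped_calls\nc9,4\n", south, "south")

# Master data maps a cell to the site that hosts it (hypothetical values).
master = {"c1": "stadium", "c2": "harbour", "c9": "airport"}
central_topic = replicate_many_to_one([north, south])
summary = aggregate(central_topic, master)
print(summary)
```

In a real deployment the producer and consumer would talk to the streaming layer through a
Kafka-compatible client, and the aggregated results would feed a dashboard instead of being printed.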


[Figure: Network components and other data sources send events to streams (topics) at the regional
data centers. Event stream replication carries the topics to the central data center, where
performance and other monitoring-related data supports a real-time dashboard, ad-hoc analysis,
real-time analysis, and reporting.]

Being able to process high-throughput, geo-distributed events in real time enables the company
to understand how and where service issues are trending and how they affect customers.
Crowd-based antenna optimization enables monitoring of rapidly changing network usage patterns,
with the ability to reconfigure network support to handle short-term surges, such as heavy usage
near a stadium during a sporting event.
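
One simple way to spot such a short-term surge is to compare each new load sample for a cell with
that cell's recent average. The sketch below is an assumption-laden illustration, not the customer's
actual method: the `SurgeDetector` class, the window size, the threshold factor, and the load
figures are all hypothetical.

```python
from collections import deque


class SurgeDetector:
    """Flag a cell when its load jumps well above its recent average."""

    def __init__(self, window: int = 4, factor: float = 2.0):
        self.window = window      # number of past samples kept per cell
        self.factor = factor      # how far above the average counts as a surge
        self.history = {}         # cell_id -> deque of recent load samples

    def observe(self, cell_id: str, load: float) -> bool:
        """Record one load sample; return True if it looks like a surge."""
        samples = self.history.setdefault(cell_id, deque(maxlen=self.window))
        # Only judge once a full window of baseline samples is available.
        is_surge = (len(samples) == self.window
                    and load > self.factor * (sum(samples) / len(samples)))
        samples.append(load)
        return is_surge


detector = SurgeDetector(window=4, factor=2.0)
loads = [100, 110, 90, 100, 105, 320, 330]
flags = [detector.observe("stadium-cell", load) for load in loads]
print(flags)
```

Because surge samples enter the baseline once they are recorded, a sustained surge stops being
flagged after a few samples. A production detector would likely need a more robust baseline, such
as the same hour on previous days.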

Service optimization through equipment monitoring, capacity planning, and preventative
maintenance cuts down on dropped calls, network coverage gaps, bandwidth issues, slow download
times, long service wait times, and frequency switching.

Customer 360
Using data science to better understand and predict customer behavior is an iterative process that
involves the following steps:

1. Data discovery, collection, correlation, and analysis of data across multiple data sources,
including new data sources that traditional analytics or databases can’t use.

2. Application of machine learning algorithms to get value out of the data.

3. Use of models in production to make predictions.

4. Updating models with new data.
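
The four steps above can be sketched with a deliberately small model. The example uses a
nearest-centroid classifier in pure Python as a stand-in for whatever machine learning algorithm a
real project would choose. The features (support calls per month, dropped calls per week), the
labels, and all data values are made up for illustration.

```python
import math


def centroid(rows):
    """Mean of each feature column."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def train(features, labels):
    """Step 2: fit one centroid per class (a nearest-centroid classifier)."""
    model = {}
    for label in set(labels):
        rows = [f for f, y in zip(features, labels) if y == label]
        model[label] = centroid(rows)
    return model


def predict(model, row):
    """Step 3: assign the class whose centroid is closest."""
    return min(model, key=lambda label: math.dist(model[label], row))


def accuracy(model, features, labels):
    """Share of held-out rows that the model labels correctly."""
    hits = sum(predict(model, f) == y for f, y in zip(features, labels))
    return hits / len(labels)


# Step 1: collected data, split into a training set and a test set.
# Features: [support calls per month, dropped calls per week]
train_x = [[0, 1], [1, 0], [1, 2], [6, 8], [7, 9], [8, 7]]
train_y = ["stay", "stay", "stay", "churn", "churn", "churn"]
test_x = [[1, 1], [7, 8]]
test_y = ["stay", "churn"]

model = train(train_x, train_y)
test_accuracy = accuracy(model, test_x, test_y)
prediction = predict(model, [5, 6])
print(test_accuracy, prediction)

# Step 4: retrain when new labelled data arrives.
model = train(train_x + [[4, 4]], train_y + ["stay"])
```

Holding back a test set, as here, is what makes the process iterative: each retrained model can be
evaluated against data it has not seen before it replaces the one in production.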


[Figure: Customer modeling workflow. Data discovery: customer data (CRM), call center records,
application logs, and web clickstream data form the historical data. Model creation: feature
extraction, a training set for training and model building, and a test set for evaluating
predictions. Analyses include prediction modeling, cohort analysis, customer lifetime value
analysis, attrition modeling, response modeling, and churn modeling. Production: feature extraction
on new data feeds the deployed model, which produces predictions.]

Factors that can be analyzed to better understand the customer include:

• Customer demographic data (age, marital status, etc.)

• Sentiment analysis of social media

• Customer usage patterns

• Geographic usage trends

• Calling-circle data

• Browsing behavior from clickstream logs

• Support center statistics

• Historical data that shows patterns of behavior that suggest churn

With this analysis, telecom companies can gain insights that help them predict and enhance the
customer experience, prevent churn, and tailor marketing campaigns.

The architecture below shows how batch processing on different data sources can be used to build
and update a model, which can then be used for real-time predictions on streaming data.

[Figure: Data sources (customer data/CRM, call center records, application logs, web clickstream)
are collected into a stream topic. Stream processing derives features and applies the model to
serve data. Batch processing performs feature extraction and machine learning to build and update
the model.]
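
The split between batch model building and real-time scoring described above can be sketched as
follows. This is a minimal illustration under stated assumptions: the "model" is just a churn rate
per bucket of support calls, and the `build_model` function, the `StreamScorer` class, the event
fields, and the data are all hypothetical.

```python
from collections import Counter, defaultdict


def build_model(history):
    """Batch step: churn rate per feature bucket from labelled history."""
    seen, churned = Counter(), Counter()
    for support_calls, did_churn in history:
        bucket = min(support_calls, 3)   # 3 means "three or more"
        seen[bucket] += 1
        churned[bucket] += did_churn
    return {bucket: churned[bucket] / seen[bucket] for bucket in seen}


class StreamScorer:
    """Stream step: derive a feature per event and score it with the model."""

    def __init__(self, model):
        self.model = model
        self.support_calls = defaultdict(int)   # running count per customer

    def update_model(self, model):
        """Swap in a model rebuilt by the batch job."""
        self.model = model

    def process(self, event):
        """Update the customer's derived feature and return a churn score."""
        customer = event["customer"]
        if event["type"] == "support_call":
            self.support_calls[customer] += 1
        bucket = min(self.support_calls[customer], 3)
        return customer, self.model.get(bucket, 0.0)


# Labelled history: (support calls, 1 if the customer churned else 0).
history = [(0, 0), (0, 0), (1, 0), (1, 0), (1, 0), (1, 1),
           (2, 1), (2, 0), (3, 1), (4, 1)]
scorer = StreamScorer(build_model(history))
events = [{"customer": "u1", "type": "support_call"},
          {"customer": "u2", "type": "page_view"},
          {"customer": "u1", "type": "support_call"},
          {"customer": "u1", "type": "support_call"}]
scores = [scorer.process(event) for event in events]
print(scores)
```

Keeping `update_model` separate from `process` mirrors the architecture: the batch job can rebuild
the model on its own schedule without interrupting the stream.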


Data Warehouse Optimization


A leading telecommunications provider plans to improve reporting and analytics on all aspects
of customer usage and billing with the expectation that it can reduce churn by identifying and
addressing network hotspots. MapR is the only Converged Data Platform that can scale to meet
the data volume needs of this company while also satisfying reporting requirements by reducing
workload on existing data warehouse systems.

[Figure: Optimized data architecture. Data sources (relational, SaaS, and mainframe systems;
documents and emails; blogs, tweets, and link data; log files, clickstreams, and sensors) flow into
MapR-FS, MapR-DB, and MapR Streams for data transformation, enrichment, and integration, with ETL
into operational reporting formats (e.g., Parquet) for the data warehouse. Outputs: operational
apps; machine learning (recommendations, fraud detection, logistics); analytics (search, schema-less
data exploration); and BI, reporting, and ad-hoc integrated analytics.]

Threat Detection
Solutionary, a subsidiary of NTT Group, is a leader in managed security services. It provides threat
intelligence, incident response, compliance and vulnerability management as a service, using a
platform that collects and correlates vast amounts of data from logs, endpoints, firewalls, and
network devices.

The company needed to improve scalability as data volume grew, but the task was cost-prohibitive
using its existing Oracle database solution. It couldn’t process unstructured log data at scale, and
there were also major performance issues.

Solutionary replaced its RDBMS solution with the MapR Converged Data Platform to achieve
scalability while still meeting reliability requirements. The new solution combines machine learning
algorithms, complex event processing, and predictive analytics to detect security threats in real time.

[Figure: Sources (security feeds, HTTP, syslog, firewall, other) are ingested into topics, then
flow through stream processing and NoSQL storage to serving.]
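
To make the idea of complex event processing concrete, the sketch below implements one sliding-window
rule: raise an alert when a single source produces several failed logins within a short period. It is
a generic illustration and not Solutionary's detection logic. The `FailedLoginRule` class, the event
fields, the threshold, and the window length are all assumptions, and events are assumed to arrive in
timestamp order.

```python
from collections import defaultdict, deque


class FailedLoginRule:
    """Alert when one source produces too many failures in a short window."""

    def __init__(self, threshold: int = 3, window_seconds: int = 60):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self.failures = defaultdict(deque)   # source IP -> failure timestamps

    def on_event(self, event: dict) -> bool:
        """Feed one log event; return True if the rule fires."""
        if event["action"] != "login_failed":
            return False
        times = self.failures[event["src"]]
        times.append(event["ts"])
        # Drop failures that have slid out of the time window.
        while times and event["ts"] - times[0] > self.window_seconds:
            times.popleft()
        return len(times) >= self.threshold


rule = FailedLoginRule(threshold=3, window_seconds=60)
log = [
    {"ts": 0, "src": "10.0.0.5", "action": "login_failed"},
    {"ts": 10, "src": "10.0.0.5", "action": "login_failed"},
    {"ts": 15, "src": "10.0.0.9", "action": "login_ok"},
    {"ts": 20, "src": "10.0.0.5", "action": "login_failed"},
    {"ts": 200, "src": "10.0.0.5", "action": "login_failed"},
]
alerts = [event["ts"] for event in log if rule.on_event(event)]
print(alerts)
```

The rule keeps only the timestamps inside the current window for each source, so its memory use
stays bounded even on a long-running stream.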


MACQUARIE SECURE DATA SERVICES


Macquarie Telecom’s Government Division secures telecommunications for 42% of government
agencies in Australia. Macquarie Telecom secures, monitors, and analyzes hundreds of system logs
and other data that together comprise about one billion events every day. The company uses this
information to predict and prevent cyber attacks. The division needed to transition from standard
tools to big data analytics to provide more real-time analysis on a rapidly growing amount
of data.

Macquarie chose MapR to collect internet traffic that travels through its gateways into a centralized
data lake. This repository stores information about when users open email attachments, visit
websites, and download software to their devices. The MapR Platform then runs analytics to predict
when and where attacks might come from and to enable insights about how to anticipate threats and
proactively secure the government’s system.

[Figure: Data sources (time series and relational structured data, JSON, server logs, unstructured
data such as email and social, NFS/raw files, real-time event data) feed MapR-FS, MapR-DB, and MapR
Streams on the MapR Converged Data Platform, with ETL into operational reporting formats (e.g.,
Parquet) and agile, self-service data exploration. Platform controls: multi-tenancy (job/data
placement control, volumes); access controls at file, table, column, column family, document, and
sub-document levels; auditing (compliance, analysis of user access); snapshots (tracking data
lineage and history); and global multi-master table replication for business continuity.]

The MapR Platform provides Macquarie with cost-effective scalable storage, analysis, and better
performance. Macquarie can now provide timely, tailored reports to its government clients, allowing
them to get more value from their data, make better predictions, and be more responsive to
citizens.


All of the components of the use cases just discussed can run on the same cluster with the MapR
Converged Data Platform, providing advantages such as:

• Less complexity, fewer moving parts, and fewer things to manage because multiple clusters for
Streams/HBase/Spark/Hadoop can be merged into one cluster

• Joining data sources into one core data mediation platform so that applications consume data in
an easier way

• Unified security

• High reliability and high availability with replication from data center to data center

[Figure: Sources and apps feed bulk processing and stream processing on a single MapR Converged Data
Platform cluster: MapR-FS (web-scale storage), MapR-DB (database), and MapR Streams (event
streaming).]

Telecom is a classic example of the big data issues of huge volume and velocity, but CSPs also have
demanding requirements for quick response, security, and reliability. The use cases we just described
show how telecom companies can not only address these requirements, but also unlock value from
data that was previously inaccessible or impractical to use.


CONCLUSION
In a recent survey of 273 global telecom companies,[16] McKinsey identified a strong appetite for big
data projects, but relatively little use. While nearly half of respondents said their companies are
considering investments in big data and analytics, only 30% had actually made them. Many of those
reported disappointing results, with little incremental profit improvement. However, these results
were mostly blamed on poor data quality, lack of talent, and under-investment.

In contrast, a small group of telecom providers had achieved “outsized benefit” from their
investments. For example, one had used analytics models to predict the periods of heaviest network
use resulting from customer video streaming. It was able to take steps to relieve congestion and
reduce its planned capital expenditures by 15%. “The potential for companies that apply data science
effectively is substantial,” McKinsey researcher Jacques Bughin wrote.

Effective use of big data requires commitment, a clear understanding of goals, and an investment in
skills and technologies. There can be no question that telecommunications companies have many
potential use cases that can significantly improve their understanding of customers and their own
infrastructure. The best approach for early adopters is to identify projects with measurable
short-term opportunities, then deploy a scalable platform that can adapt to a wide variety of data
types and tools.

MORE INFORMATION AND USE CASES


• Big Data and MapR for Telecommunications

• Churn Prediction with PySpark using MLlib and ML Packages

• How to Use Data Science and Machine Learning to Revolutionize 360° Customer Views

• MapR Streams Apache Apex Telecom use case

• NTT Comware Deploys MapR to Power Hadoop-as-a-Service for SmartCloud®

• Razorsight Offers Telecom Clients Predictive Analytics Solutions based on Hadoop and
Apache Spark

• Macquarie Telecom deploys MapR technology

• Quantium Delivers Lightning-Fast Customer Analytics Using Hadoop and Apache Spark

[16] Jacques Bughin, Telcos: The untapped promise of big data

MapR and the MapR logo are registered trademarks of MapR and its subsidiaries in the United States and other countries. Other marks and brands may be claimed
as the property of others. The product plans, specifications, and descriptions herein are provided for information only and subject to change without notice, and are
provided without warranty of any kind, express or implied. Copyright © 2017 MapR Technologies, Inc.

For more information visit mapr.com
