Introduction
Telefónica
Fifth largest telecommunications company in the world
Operations in Europe (7 countries), the United States and Latin America (15 countries)
Telefónica Digital
Web and mobile digital contents and services division
Telefónica PDI
Services: IPTV service, mobile service, music tickets service, location-based offers
Each service asked for its own copy of overlapping profile data: gender, film and music preferences, address, permission to contact by SMS
"So you want to know my address AGAIN?!"
Features:
Flexible profile definition, classified in services
Profile sharing options between different services
Real time API
Supplementary offline batch interface
Authorization system
High availability
Inexpensive solution & hardware
Data model
Services, users and their profiles
Services defined a set of attributes (their profile), each with a default value and data type
Users were registered in services
Users defined values for some of the services' attributes
Each attribute value had an update date to avoid overwriting newer changes through batch loads
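The data model above can be pictured with a small sketch. This is only an illustration under assumptions: the class names, field names and the in-memory storage are hypothetical, and registration of users in services is left out for brevity.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, Tuple


@dataclass
class AttributeDef:
    """An attribute declared by a service: data type plus default value."""
    data_type: type
    default: Any


@dataclass
class Service:
    name: str
    attributes: Dict[str, AttributeDef] = field(default_factory=dict)


@dataclass
class User:
    user_id: str
    # (service name, attribute name) -> (value, update date)
    values: Dict[Tuple[str, str], Tuple[Any, datetime]] = field(default_factory=dict)

    def get(self, service: Service, attr: str) -> Any:
        """Return the user's value, or the service default if none is set."""
        stored = self.values.get((service.name, attr))
        return stored[0] if stored else service.attributes[attr].default

    def set(self, service: Service, attr: str, value: Any, updated: datetime) -> bool:
        """Store a value unless an equally new or newer one is already present."""
        definition = service.attributes[attr]
        if not isinstance(value, definition.data_type):
            raise TypeError(f"{attr} expects {definition.data_type.__name__}")
        current = self.values.get((service.name, attr))
        if current and current[1] >= updated:
            return False  # stale batch row: keep the newer change
        self.values[(service.name, attr)] = (value, updated)
        return True


iptv = Service("iptv", {"gender": AttributeDef(str, "unknown")})
user = User("u1")
user.set(iptv, "gender", "female", datetime(2012, 5, 2))        # real-time change
stale = user.set(iptv, "gender", "male", datetime(2012, 5, 1))  # older batch row
```

The second call is rejected because its update date is older than the stored one, which is the behaviour the update date was introduced for.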
Services profile sharing matrix
Services could access attributes declared inside other services
There were sharing rights for read or read and write
The user had to be registered in both services
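A minimal sketch of how such a sharing matrix could be checked, assuming a flat lookup keyed by owner service, attribute and accessing service. The service names, attribute names and user IDs are made up for the example.

```python
from typing import Dict, Set, Tuple

READ, READ_WRITE = "r", "rw"

# (owner service, attribute, accessing service) -> sharing right (hypothetical data)
SHARING: Dict[Tuple[str, str, str], str] = {
    ("mobile", "address", "tickets"): READ,
    ("iptv", "music_preferences", "tickets"): READ_WRITE,
}

# user id -> services the user is registered in (hypothetical data)
REGISTRATIONS: Dict[str, Set[str]] = {
    "u1": {"mobile", "tickets"},
    "u2": {"mobile"},
}


def can_access(user_id: str, accessor: str, owner: str, attr: str,
               write: bool = False) -> bool:
    """Check whether `accessor` may read (or write) `attr` owned by `owner`."""
    registered = REGISTRATIONS.get(user_id, set())
    # The user had to be registered in both services.
    if owner not in registered or accessor not in registered:
        return False
    if accessor == owner:
        return True  # a service always sees its own attributes
    right = SHARING.get((owner, attr, accessor))
    if right is None:
        return False
    return right == READ_WRITE or not write
```

For example, `can_access("u1", "tickets", "mobile", "address")` is allowed for reading but not for writing, and the same request for `u2` is refused because that user is not registered in the tickets service.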
Authorization system
Everything that could be accessed in the PS was a resource
Roles defined access rights (read or read and write) on resources
Auth users had roles
Roles could include other roles
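The role model above, including roles that include other roles, can be sketched as follows. The role names, resource names and data layout are assumptions made for the example, not the original schema.

```python
from typing import Dict, Iterable

READ, READ_WRITE = 1, 2

# Hypothetical roles: direct rights on resources plus included roles.
ROLES = {
    "tickets_reader": {"rights": {"service:tickets": READ}, "includes": set()},
    "tickets_admin": {"rights": {"service:tickets": READ_WRITE},
                      "includes": {"tickets_reader"}},
    "operator": {"rights": {"service:mobile": READ},
                 "includes": {"tickets_admin"}},
}

# Hypothetical auth users and the roles assigned to them.
AUTH_USERS = {"batch_loader": {"operator"}}


def effective_rights(role_names: Iterable[str]) -> Dict[str, int]:
    """Collect the strongest right per resource across roles and included roles."""
    rights: Dict[str, int] = {}
    seen = set()
    stack = list(role_names)
    while stack:
        name = stack.pop()
        if name in seen:
            continue  # guards against cycles in role inclusion
        seen.add(name)
        role = ROLES[name]
        for resource, level in role["rights"].items():
            rights[resource] = max(level, rights.get(resource, 0))
        stack.extend(role["includes"])
    return rights


def is_allowed(auth_user: str, resource: str, write: bool = False) -> bool:
    """True if the auth user holds a sufficient right on the resource."""
    needed = READ_WRITE if write else READ
    granted = effective_rights(AUTH_USERS.get(auth_user, set()))
    return granted.get(resource, 0) >= needed
```

Here `batch_loader` can write to `service:tickets` only through the included `tickets_admin` role, while it can read but not write `service:mobile`.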
Bonus features!
Multiple IDs:
A user's profile could be accessed with different equivalent IDs depending on the service
Each user ID was defined by an ID type (phone number, email, portal ID, hash) and the ID value
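One way to picture equivalent IDs is a lookup from (ID type, ID value) to a single internal user key. The sketch below assumes such an internal key and a simple normalization step; both are illustrative choices rather than details from the slides.

```python
from typing import Dict, Optional, Tuple

# (ID type, ID value) -> internal user key; every entry for a user is equivalent
ID_INDEX: Dict[Tuple[str, str], str] = {}


def _normalize(id_value: str) -> str:
    """Assumed normalization: trim whitespace and ignore case."""
    return id_value.strip().lower()


def register_id(internal_key: str, id_type: str, id_value: str) -> None:
    """Attach one more equivalent ID to a user."""
    key = (id_type, _normalize(id_value))
    existing = ID_INDEX.get(key)
    if existing is not None and existing != internal_key:
        raise ValueError(f"{id_type} {id_value!r} already belongs to another user")
    ID_INDEX[key] = internal_key


def resolve(id_type: str, id_value: str) -> Optional[str]:
    """Return the internal user key for any of the user's equivalent IDs."""
    return ID_INDEX.get((id_type, _normalize(id_value)))


register_id("user-1", "phone", "600111222")
register_id("user-1", "email", "Ana@Example.com")
```

A mobile service could then look the user up by phone number and a web portal by email, and both would reach the same profile.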
Integration
Planned integration
Problems arise
Batch
Full DWH customers' profile import: > 24 hours
Delta extractions: 4 - 6 hours
Loads and extractions performance proportional to data size
API:
Response time with average traffic: 110ms
The SQL solution: second version
High level logical architecture
Batch processes
DB Batch processing
Our DBAs
New DB-based batch loading process
Validate format, existence of services and attributes, and value data types
Generate an intermediate file structured like the target DB table
Load the intermediate file (Oracle's SQL*Loader) into a temporary table
Switch the DB to deferred writing, storing all incoming modifications
Merge the temporary table and the final table, checking each value's update date
Replace the old users' attribute values table with the merge result
Apply the deferred writing operations
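The last four steps of this loading process (deferred writing, merge by update date, table replacement, replay of deferred writes) can be sketched in memory. This is not the original PL/SQL; the class and method names are hypothetical and plain dictionaries stand in for the database tables.

```python
from datetime import datetime
from typing import Any, Dict, List, Tuple

Key = Tuple[str, str]        # (user id, attribute name)
Row = Tuple[Any, datetime]   # (value, update date)


def merge(final: Dict[Key, Row], temp: Dict[Key, Row]) -> Dict[Key, Row]:
    """Build a new table keeping, for each key, the row with the latest update date."""
    merged = dict(final)
    for key, row in temp.items():
        if key not in merged or row[1] > merged[key][1]:
            merged[key] = row
    return merged


class ProfileStore:
    def __init__(self) -> None:
        self.table: Dict[Key, Row] = {}
        self.deferred: List[Tuple[Key, Row]] = []
        self.loading = False

    def write(self, key: Key, value: Any, updated: datetime) -> None:
        """API write: queued while a batch load is running, applied otherwise."""
        if self.loading:
            self.deferred.append((key, (value, updated)))
        else:
            self._apply(key, (value, updated))

    def _apply(self, key: Key, row: Row) -> None:
        current = self.table.get(key)
        if current is None or row[1] > current[1]:
            self.table[key] = row

    def begin_load(self) -> None:
        """Switch to deferred writing."""
        self.loading = True

    def finish_load(self, temp: Dict[Key, Row]) -> None:
        """Replace the table with the merge result, then replay deferred writes."""
        self.table = merge(self.table, temp)
        self.loading = False
        for key, row in self.deferred:
            self._apply(key, row)
        self.deferred.clear()
```

A write that arrives through the API during the load is held back and replayed afterwards, so it still wins over an older batch row for the same attribute.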
New batch extraction process
Loop over the whole temporary table for final formatting (empty fields)
From the batch side, loop across the whole table (SELECT * FROM )
Write each retrieved row as a line in the resulting file
API:
Ireland requirement: < 500ms
Response time with average traffic: 80ms
Response time while loading was unpredictable: >300ms
The SQL solution: third version
Speed up DB Batch processes
Our DBAs (again)
New (second) DB-based batch loading process
Load the validated file (Oracle's SQL*Loader) into a temporary table
Loop over the temporary table, merging the values into the final table and checking each value's update date and data type
Use several concurrent writing jobs
Enhancements to extraction process
Loop over the whole temporary table for final formatting (empty fields)
Download and write lines directly inside Oracle's sqlplus
No SELECT * FROM query from the batch side!
Batch
Full DWH customers' profile import: 1:10 hours (vs. 2:30 hours)
Three Delta extractions: 2:15 hours (vs. 3:00 hours)
Loads and extractions performance proportional to data size
Concurrent batch processes not so harmful
API:
Response time with average traffic: 110ms
Response time while loading: 400ms
F**K YEAH
Batch
Two Delta imports: < 2:00 hours
Two Delta extractions: < 2:00 hours
Loads and extractions performance proportional to data size
API:
Response time with average traffic: 90ms
2nd version: 65Gb + > 15Gb; full import 2:30 hours; three Delta extractions 3:00 hours; response time 110ms; response time while loading unpredictable
20 database tables
API: several queries with up to 35 joins and even some unions
Authorization: 5 joins to validate auth users' access
Batch:
Load: 1700 lines of PL/SQL
Extraction: 1200 lines of PL/SQL
Mission completed?
20M customers, 200 profile attributes, 10 services
Mexico time window: 4:00 hours
Full DWH load!
Additional Delta feeds loads
At least two Delta extractions
Only 5 collections
API: typically 2 accesses (services and users collections)
Authorization: access only 1 collection to grant access
Batch: all processing done outside DB
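To illustrate why an API read typically needs only two accesses, the sketch below uses plain dictionaries as stand-ins for the services and users collections. The document shapes and field names are assumptions made for the example; the slides do not show the real schema.

```python
from typing import Any, Dict

# Stand-in for the services collection (hypothetical document shape).
services: Dict[str, Dict[str, Any]] = {
    "tickets": {
        "_id": "tickets",
        "attributes": {
            "music_preferences": {"type": "string", "default": ""},
            "sms_allowed": {"type": "bool", "default": False},
        },
    },
}

# Stand-in for the users collection: values are embedded per service.
users: Dict[str, Dict[str, Any]] = {
    "u1": {
        "_id": "u1",
        "services": {
            "tickets": {
                "music_preferences": {"value": "jazz", "updated": "2012-05-02"},
            },
        },
    },
}


def get_profile(user_id: str, service_name: str) -> Dict[str, Any]:
    """Return the user's profile for one service, filling in defaults."""
    service = services[service_name]   # access 1: services collection
    user = users[user_id]              # access 2: users collection
    stored = user["services"].get(service_name)
    if stored is None:
        raise KeyError(f"{user_id} is not registered in {service_name}")
    return {
        name: stored.get(name, {}).get("value", definition["default"])
        for name, definition in service["attributes"].items()
    }
```

Because the user's values live inside the user document, no join is needed: one lookup fetches the attribute definitions and one fetches every stored value.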
NoSQL version
High level logical architecture
Batch
Full DWH customers' profile import: 0:12 hours (vs. 1:10 hours)
Three Delta extractions: 0:40 hours (vs. 2:15 hours)
Loads and extractions performance proportional to data size
Concurrent batch processes with no performance impact
API:
Response time with average traffic: < 10ms (vs. 110ms)
Response time while loading: the same
High load (600 TPS) response time while loading: 300ms
Batch
Two Delta imports: < 0:40 hours (vs. 2:00 hours)
Loads and extractions performance proportional to data size
Batch
Initial Full import (20M, 40 attributes): 2:00 hours
Small Full import (20M, 6 attributes): 0:40 hours
API:
Response time with average traffic: < 10ms (vs. 90ms)
Response time while loading: the same
High load (500 TPS) response time while loading: 270ms
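SQL version: 80Gb; full import 1:10 hours; three Delta extractions 2:15 hours; response time while loading 400ms; high load + loading: timeout / failure
SQL version: 700Gb; two Delta imports < 2:00 hours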
Mission completed?
The bad
To keep secondary nodes in sync we needed an oplog of 16 or 24Gb
We had to disable journaling for the first migrations
Respect the unwritten law of at least 70% of size in RAM
Take care with compound indexes: order matters
You can save one index or you can have problems
Put the most important key (never nullable) first
If we had enough RAM for all data, Oracle would outperform MongoDB
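The point about compound index order can be illustrated without a database. A compound index behaves much like a list sorted by its keys in order, so a query on the leading key finds a contiguous range, while a query on the second key alone gets no help from the sort order. The sketch below is a plain Python analogy, not MongoDB code, and the key names and data are hypothetical.

```python
import bisect
from typing import List, Tuple

# Analogy for a compound index on (id_type, id_value): entries sorted by that tuple.
index: List[Tuple[str, str]] = sorted([
    ("email", "ana@example.com"),
    ("email", "luis@example.com"),
    ("phone", "600111222"),
    ("phone", "600333444"),
])


def find_by_leading_key(id_type: str) -> List[Tuple[str, str]]:
    """Leading key known: two binary searches bound a contiguous range."""
    lo = bisect.bisect_left(index, (id_type,))
    hi = bisect.bisect_left(index, (id_type + "\x00",))
    return index[lo:hi]


def find_by_second_key(id_value: str) -> List[Tuple[str, str]]:
    """Leading key unknown: the sort order does not help, so scan every entry."""
    return [entry for entry in index if entry[1] == id_value]
```

This is why one well-ordered compound index can replace a separate single-key index on its leading key, and why putting the wrong key first leads to scans.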
The ugly
Full import adding 30 new attribute values: 10:00 hours
Full import adding 150 new attribute values: 40:00 hours
Solutions?
Avoid this situation at all cost. Run away!
Normalize users' values; move them to a new individual collection
Preallocate the size with a faux field
You could waste space!
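The preallocation idea can be sketched as two helpers that only build the documents involved, assuming the slowdown comes from user documents growing when new attribute values are added. The field name `_padding`, the `values` sub-document and the helper names are hypothetical; `$set` and `$unset` are standard MongoDB update operators.

```python
from typing import Any, Dict

PADDING_FIELD = "_padding"  # hypothetical name for the faux field


def padded_document(user_id: str, values: Dict[str, Any],
                    reserve_bytes: int) -> Dict[str, Any]:
    """Build the initial user document with a filler string reserving space."""
    doc: Dict[str, Any] = {"_id": user_id, "values": dict(values)}
    doc[PADDING_FIELD] = "x" * reserve_bytes
    return doc


def grow_update(new_values: Dict[str, Any]) -> Dict[str, Any]:
    """Build an update that adds new attribute values and drops the filler."""
    return {
        "$set": {f"values.{name}": value for name, value in new_values.items()},
        "$unset": {PADDING_FIELD: ""},
    }


doc = padded_document("u1", {"gender": "female"}, reserve_bytes=16)
update = grow_update({"music_preferences": "jazz"})
```

The trade-off named on the slide applies directly: any user that never receives the extra attributes keeps carrying the reserved bytes, so the filler size is a guess that can waste space.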
Conclusions
Questions?
Scale horizontally adding more BE or DB servers or disks in the SAN
Virtualized or physical servers depending on the deployment
MongoDB arbiters running on BE servers
Scale horizontally adding more BE servers or disks in the SAN
Sharding may already be configured to scale by adding more replica sets