Beruflich Dokumente
Kultur Dokumente
Mike Carey
mjcarey@ics.uci.edu
Order (id, custName, custCity, total) Product (sku, name, listPrice, size, power)
Item-Order Item-Product
Order
123 Fred LA 25.97 Product
401 Garfield T-Shirt 9.99 XL -
Item-Order
Item Item Product
1 2 9.99 2 1 3.99 544 USB Charger 5.99 - 115V
Item-Product
Item-Product 5
Michael Carey/Padhraic Smyth, UC Irvine: Stats 170A/B, Winter 2018
Enter the Relational DB Era
Order (id, custName, custCity, total)
Product (sku, name, listPrice, size, power)
123 Fred LA 25.97
401 Garfield T-Shirt 9.99 XL null
Item (order-id, ino, product-sku, qty, price) 544 USB Charger 5.99 null 115V
123 1 401 2 9.99
Item Product
1 2 9.99 Item 544 USB Charger 5.99 - 115V
2 1 3.99
• Notice that:
• Data model contains objects and pointers (OIDs)
• The world is no longer flat – the Order and Product
schemas now have set(Item) and Product in them,
respectively
Michael Carey/Padhraic Smyth, UC Irvine: Stats 170A/B, Winter 2018 8
What OODBs Sought to Offer
Item (order-id, ino, product-sku, qty, price) 401 Garfield T-Shirt 9.99 XL
• Be sure to notice:
• “One size fits all!”
• UDTs/UDFs, table hierarchies, references, ...
• But the world got flatter again...
(Timing lagged OODBs by just a few years)
http://asterixdb.apache.org/
Michael Carey/Padhraic Smyth, UC Irvine: Stats 170A/B, Winter 2018 20
Just How Big is “Big Data”?
Cores
Main
Memory This
is big
Disks data!
Michael Carey/Padhraic Smyth, UC Irvine: Stats 170A/B, Winter 2018 21
AsterixDB: “One Size Fits a Bunch”
Semistructured BDMS Desiderata:
Data Management
• Able to manage data
• Flexible data model
• Full query capability
• Continuous data
ingestion
• Efficient and robust
parallel runtime
• Cost proportional to task
at hand
• Support “Big Data data
types”
Parallel 1st Generation •
•
Database Systems “Big Data” Systems •
Highlights include:
• JSON++ based data model
• Rich type support (spatial, temporal, …)
• Records, lists, bags
• Open vs. closed types
Michael Carey/Padhraic Smyth, UC Irvine: Stats 170A/B, Winter 2018 23
ASTERIX Data Model (ADM)
CREATE DATAVERSE TinySocial; CREATE TYPE EmploymentType AS {
USE TinySocial; organizationName: string,
startDate: date,
CREATE TYPE GleambookUserType AS { endDate: date?
id: int };
}; CREATE DATASET GleambookUsers
(GleambookUserType)
PRIMARY KEY id;
Highlights include:
• JSON++ based data model
• Rich type support (spatial, temporal, …)
• Records, lists, bags
• Open vs. closed types
Michael Carey/Padhraic Smyth, UC Irvine: Stats 170A/B, Winter 2018 24
ASTERIX Data Model (ADM)
CREATE DATAVERSE TinySocial; CREATE TYPE EmploymentType AS {
USE TinySocial; organizationName: string,
startDate: date,
CREATE TYPE GleambookUserType AS { endDate: date?
id: int };
}; CREATE DATASET GleambookUsers
(GleambookUserType)
CREATE TYPE GleambookMessageType AS { PRIMARY KEY id;
messageId: int,
authorId: int, CREATE DATASET GleambookMessages
inResponseTo: int?, (GleambookMessageType)
senderLocation: point?, PRIMARY KEY messageId;
message: string
}; Highlights include:
• JSON++ based data model
• Rich type support (spatial, temporal, …)
• Records, lists, bags
• Open vs. closed types
Michael Carey/Padhraic Smyth, UC Irvine: Stats 170A/B, Winter 2018 25
Ex: GleambookUsers Data
{"id”:1, "alias":"Margarita", "name":"MargaritaStoddard", "nickname":"Mags”,
"userSince":datetime("2012-08-20T10:10:00"), "friendIds":{{2,3,6,10}},
"employment": [ {"organizationName":"Codetechno”, "startDate":date("2006-08-06")},
{"organizationName":"geomedia" , "startDate":date("2010-06-17"),
"endDate":date("2010-01-26")} ],
"gender":"F”
},