Research Authors: James C. Corbett, Jeffrey Dean, Michael Epstein, Andrew Fikes, Christopher Frost, JJ Furman, Sanjay Ghemawat, Andrey Gubarev, Christopher Heiser, Peter Hochschild, Wilson Hsieh, Sebastian Kanthak, Eugene Kogan, Hongyi Li, Alexander Lloyd, Sergey Melnik, David Mwaura, David Nagle, Sean Quinlan, Rajesh Rao, Lindsay Rolig, Yasushi Saito, Michal Szymaniak, Christopher Taylor, Ruth Wang, Dale Woodford
In this paper, the authors present a database system called Spanner. Spanner is a scalable, globally distributed relational database designed by Google that shards and replicates data across datacenters around the world and supports externally consistent distributed transactions. Spanner provides consistency guarantees that match or exceed those of a traditional RDBMS while storing far more data than a single datacenter can hold: the paper reports that it scales to trillions of database rows across millions of machines spread over hundreds of datacenters. When data is read, Spanner directs the user to a replica in a datacenter geographically close to that user; when data is written, it is replicated to multiple datacenters. If a datacenter fails, a read is simply served from another datacenter that holds a replica of the data.
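The read path described above can be sketched very simply. The datacenter names, latencies, and function below are invented for illustration and are not part of Spanner's API; they only show the idea of serving a read from the nearest reachable replica and failing over when a datacenter is down.

```python
# Hypothetical sketch of nearest-replica reads with failover.
# Distances are assumed round-trip latencies (ms) from one user.
REPLICAS = {"us-east": 100, "eu-west": 40, "asia-east": 180}

def read(key, available):
    """Serve a read of `key` from the nearest reachable replica.

    `available` is the set of datacenters currently up; if the
    closest one is down, the read falls over to the next closest.
    """
    candidates = [dc for dc in REPLICAS if dc in available]
    if not candidates:
        raise RuntimeError("no replica of %r is reachable" % key)
    return min(candidates, key=REPLICAS.get)

print(read("row-42", {"us-east", "eu-west"}))  # nearest replica: eu-west
print(read("row-42", {"us-east"}))             # eu-west down: us-east serves
```

Because every datacenter in the set holds a full replica, the failover requires no recovery step: any surviving replica can answer the read.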
Further enhancements were made to increase scalability and performance. To offer this degree of geographic redundancy while letting applications read (and, to a lesser extent, write) data without incurring huge latencies, the developers introduced TrueTime, which provides accurate time synchronization in a distributed system by representing the uncertainty of time explicitly. Data is versioned, and each version is automatically stamped with its commit time by the TrueTime API; as a result, applications can read, write, and replicate data across countries and continents while keeping reads extremely fast. One current user of this database is F1, the backend of Google's advertising business, which can specify which datacenters contain which pieces of data so that frequently read data is located nearer to its users, reducing read latency.
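The core TrueTime idea can be illustrated with a short sketch. The epsilon value, clock source, and function names below are assumptions made for illustration, not Spanner's actual implementation: the point is that the API returns an interval guaranteed to contain absolute time, and a commit waits out that uncertainty before its timestamp is exposed.

```python
import time

# Assumed worst-case clock uncertainty (the paper's epsilon is dynamic;
# a fixed 4 ms is invented here purely for illustration).
EPSILON = 0.004

def tt_now():
    """Return an interval (earliest, latest) guaranteed to bracket
    true absolute time, mimicking TrueTime's TT.now()."""
    t = time.time()
    return (t - EPSILON, t + EPSILON)

def commit(version_store, key, value):
    """Stamp a write with the interval's latest bound, then wait until
    that timestamp is certainly in the past ("commit wait") before
    making the version visible."""
    _, ts = tt_now()
    while tt_now()[0] < ts:      # wait out the clock uncertainty
        time.sleep(EPSILON / 4)
    version_store.setdefault(key, []).append((ts, value))
    return ts
```

A usage sketch: `store = {}; ts = commit(store, "row-1", "v1")` leaves `store["row-1"]` holding the single version `(ts, "v1")`. Because every version carries such a timestamp, a reader can be given a snapshot timestamp and read a consistent view from any replica that is sufficiently up to date.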
A present limitation of this solution is that, although Spanner is scalable in the number of nodes, its node-local data structures perform relatively poorly on complex SQL queries because they were designed for simple key-value accesses (Corbett et al., 2012, p. 263). In addition, there is a need to move client-application processes between datacenters in an automated, coordinated fashion; moving processes raises the even more difficult problem of managing resource acquisition and allocation between datacenters (Corbett et al., 2012, p. 263).
In closing, Spanner was built over five years of intensive development by combining, blending, and extending ideas from a multitude of research communities. From the database community, Spanner took familiar, easy-to-use transactions and a SQL-based query language, while integrating the concepts of scalability, wide-area distribution, failure resistance, automatic sharding, data replication, and consistency from other research communities. In addition, Spanner incorporated TrueTime, which provides accurate time synchronization in a distributed system by expressing the uncertainty of time explicitly in the time API (Corbett et al., 2012, p. 263). As a result, Spanner achieved its goal of a scalable, globally distributed database, something that had previously been impossible with Bigtable in globally distributed environments.
References

Corbett, J. C., Dean, J., Epstein, M., Fikes, A., Frost, C., Furman, J. J., Ghemawat, S., Gubarev, A., Heiser, C., Hochschild, P., Hsieh, W., Kanthak, S., Kogan, E., Li, H., Lloyd, A., Melnik, S., Mwaura, D., Nagle, D., Quinlan, S., Rao, R., Rolig, L., Saito, Y., Szymaniak, M., Taylor, C., Wang, R., & Woodford, D. (2012). Spanner: Google's Globally-Distributed Database. Proc. of OSDI 2012.

Shute, J., et al. (2012). F1: The Fault-Tolerant Distributed RDBMS Supporting Google's Ad Business. Proc. of SIGMOD, May 2012, pp. 777-778.
BSc Final Year Project 2010, University of the West Indies, Agricultural Economics and Extension Department, Natural Sciences, Agricultural Extension Portal