NoSQL databases are non-relational and are designed for storing unstructured data. Because NoSQL is built
specifically to meet the requirements of big data, mobile applications, and the Internet of Things (IoT), it
provides the flexibility, scalability, and performance many businesses need to drive new, innovative services
and revenue streams.
Most NoSQL solutions are open-source, a particularly attractive option if cost and vendor independence are
important to you.
Making a move to an open-source NoSQL technology could be right for your organization, depending on your
specific needs, your applications, and the type and volume of your data. To help you decide if it's the right
choice, either as a complementary technology to an existing RDBMS or as a complete replacement, we've
created this quick reference guide. It outlines five of the top open-source NoSQL databases, and provides
overviews, use cases, limitations, and support options.
www.pythian.com
CASSANDRA
OVERVIEW
Apache Cassandra is an eventually consistent distributed database designed
to accept very high write rates and to operate over a geographically
distributed environment.
Data is stored in partitions and rows. Partitions are used for sharding and rows
are very similar to RDBMS tuples or rows. For example, a partition could be a
user_id and rows could be a user's class grades and test scores.
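As a rough illustration of the partition and row layout described above, the sketch below models it with plain Python data structures. It is not Cassandra's API: the names grades_by_user, add_grade, and rows_for, along with the sample values, are hypothetical. The partition key (user_id) selects the partition, and rows inside it are kept sorted by class name and test name.

```python
from bisect import insort
from collections import defaultdict

# Hypothetical in-memory model: partition key -> sorted list of rows.
# In Cassandra the partition key decides which nodes hold the data;
# here a dict lookup stands in for that routing step.
grades_by_user = defaultdict(list)

def add_grade(user_id, class_name, test, score):
    """Insert one row into the partition for user_id, ordered by (class_name, test)."""
    insort(grades_by_user[user_id], (class_name, test, score))

def rows_for(user_id, class_name=None):
    """Return all rows in a partition, optionally restricted to one class."""
    rows = grades_by_user.get(user_id, [])
    return [r for r in rows if class_name is None or r[0] == class_name]

add_grade(42, "math", "midterm", 88)
add_grade(42, "biology", "quiz-1", 93)
add_grade(42, "math", "final", 91)
add_grade(7, "math", "midterm", 75)

print(rows_for(42))          # every row in partition 42, sorted by class, then test
print(rows_for(42, "math"))  # only the math rows from that partition
```

Every read in this sketch names a single user_id, which mirrors the idea that a query is served from one partition rather than by scanning all of them.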
SUPPORT OPTIONS
Cassandra is available as open-source software via the Apache 2.0 license.
DataStax offers a licensed product called DataStax Enterprise Edition, which is based on open-source Cassandra.
COUCHBASE SERVER
OVERVIEW
Couchbase Server is a document-store NoSQL solution that provides fast reads and writes. A Couchbase
document is in JavaScript Object Notation (JSON) format and is similar to a row or tuple in traditional RDBMS
terminology. However, Couchbase allows much more complex data representations than traditional RDBMS rows.
Couchbase is a great tool for providing built-in high-availability and sharding capabilities for data sets that do
not require full transactional ACID compliance.
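To make the comparison between a JSON document and a flat row more concrete, here is a minimal sketch using only Python's standard json module. It does not use the Couchbase SDK; the document fields and the user::42 key format are assumptions made for illustration.

```python
import json

# Hypothetical user document; field names are illustrative only.
user_doc = {
    "type": "user",
    "user_id": 42,
    "name": "Ada",
    "emails": ["ada@example.com", "ada@work.example.com"],
    "classes": [
        {"name": "math", "scores": {"midterm": 88, "final": 91}},
        {"name": "biology", "scores": {"quiz-1": 93}},
    ],
}

# A document store keeps the whole structure under one key as JSON text.
key = f"user::{user_doc['user_id']}"
stored = json.dumps(user_doc)

# Reading it back returns the nested structure in one lookup,
# where a normalized RDBMS design would typically join several tables.
loaded = json.loads(stored)
math_final = loaded["classes"][0]["scores"]["final"]
print(key, math_final)
```

The nested lists and objects are what a single flat row cannot hold directly; in a relational schema they would usually be split into separate tables.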
SUPPORT OPTIONS
Couchbase is available as open-source software via the Couchbase community license; it is available only in
object code form from the Couchbase website.
Couchbase is also available via the Couchbase enterprise license.
HBASE
OVERVIEW
Apache HBase is a distributed database that provides strongly consistent reads and writes at the row level. It is designed to accept very high write rates and to
operate on top of a Hadoop Distributed File System (HDFS) cluster. HBase is conceptually very similar to Cassandra.
Data is stored in large rows very similar to Google Cloud Bigtable. Rows are similar to RDBMS tuples or rows
except there is no need to define each column before using it. For example, a row key could be a user_id and
columns in the row could be a user's class grades and test scores.
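The row layout described above, where columns can be added without declaring them first, can be sketched with nested Python dictionaries. This is an in-memory stand-in, not the HBase client API: the names table, put, and get are hypothetical, and the family:qualifier column naming is an assumption that mirrors HBase's usual convention.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for a table: row key -> {column name: value}.
table = defaultdict(dict)

def put(row_key, column, value):
    """Write one cell; the column does not have to be declared beforehand."""
    table[row_key][column] = value

def get(row_key):
    """Fetch every cell in a row, sorted by column name."""
    return dict(sorted(table.get(row_key, {}).items()))

put("user-42", "scores:math-midterm", 88)
put("user-42", "grades:math", "A-")
put("user-7", "grades:biology", "B+")  # different rows can hold different columns

print(get("user-42"))
print(get("user-7"))
```

All access in this sketch goes through the row key, which matches the guide's point that HBase suits data that is looked up by key.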
REASONS TO USE INSTEAD OF AN RDBMS
Scales linearly by adding nodes to the underlying HDFS cluster.
Requires a minimal schema.
Supports very high write rates.
Is accessible from Apache Hive, Apache Pig, Spark SQL, and MapReduce.
Is a good choice for storing data accessible by key if the data needs to be accessed by other Hadoop tools.
LIMITATIONS
No secondary index support.
Very limited schema features.
Very limited support for aggregates.
Requires HDFS and Apache ZooKeeper to operate; if access by other Hadoop ecosystem tools is not required, Cassandra is probably a better choice.
Access by programming languages outside of the Java virtual machine (JVM) ecosystem is limited to a poorly defined Thrift interface.
SUPPORT OPTIONS
Available as open-source software via the Apache 2.0 license.
Enterprise licenses available from Cloudera, Hortonworks, and MapR Technologies.
MONGODB
OVERVIEW
MongoDB is a cross-platform NoSQL solution that uses a document-oriented data model. A MongoDB collection
is analogous to a table in a traditional RDBMS, and a MongoDB document is analogous to a row in an SQL table.
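The table-to-collection and row-to-document analogy can be sketched in plain Python as follows. This is not the MongoDB driver API: the students list and the find helper are hypothetical stand-ins, and the helper only matches top-level fields by equality.

```python
# Hypothetical stand-in: a collection is modeled as a list of documents (dicts).
students = [
    {"_id": 1, "name": "Ada", "grades": {"math": 91}},
    {"_id": 2, "name": "Lin", "grades": {"math": 78}, "clubs": ["chess"]},
]

def find(collection, **criteria):
    """Return documents whose top-level fields equal every given criterion."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]

print(find(students, name="Lin"))
print(find(students, name="Nobody"))
```

Unlike rows in an SQL table, the two documents here do not share the same set of fields: only the second one has a clubs field.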
SUPPORT OPTIONS
MongoDB is available as open-source software via the AGPL v3.0 license.
MongoDB, Inc. provides an enterprise license that includes monitoring, deployment, backups, and support.
Additional features such as encryption are only available with the enterprise license.
Percona Server for MongoDB is an open-source solution that provides features similar to MongoDB's enterprise-only
features, as well as additional storage engines not found in the upstream open-source version.
NEO4J
OVERVIEW
Neo Technology's Neo4j is a popular graph database. In a graph database, like a relational database, there are
entities and relationships between entities. However, instead of focusing on the entities, graph databases
focus on the relationships. There can be thousands or even millions of relationships between entities
represented by a graph database.
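The relationship-first view described above can be illustrated with a small in-memory graph in Python. This is a sketch only and does not use Neo4j or its query language; the node names, relationship types, and helper functions are hypothetical.

```python
from collections import defaultdict

# Hypothetical model: relationships are stored as first-class, typed edges.
relationships = defaultdict(list)  # start node -> [(relationship type, end node)]

def relate(start, rel_type, end):
    """Record a directed, typed relationship between two entities."""
    relationships[start].append((rel_type, end))

def neighbors(node, rel_type):
    """Follow one relationship type outward from a node."""
    return [end for kind, end in relationships.get(node, []) if kind == rel_type]

def friends_of_friends(node):
    """Two-hop traversal, excluding the start node and its direct friends."""
    direct = set(neighbors(node, "FRIEND_OF"))
    second = {f2 for f in direct for f2 in neighbors(f, "FRIEND_OF")}
    return sorted(second - direct - {node})

relate("alice", "FRIEND_OF", "bob")
relate("bob", "FRIEND_OF", "carol")
relate("alice", "WORKS_AT", "acme")

print(neighbors("alice", "FRIEND_OF"))
print(friends_of_friends("alice"))
```

The two-hop query walks relationships directly from node to node, whereas the equivalent relational query would typically need a self-join for each hop.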
SUPPORT OPTIONS
Neo4j is available in a free community edition under GPL v3, but it is limited to running on a single node because
it lacks clustering. The community edition also has no hot backups.
An enterprise version of Neo4j is available from Neo Technology.
A government edition extends the enterprise edition, adding extra government-specific services.
ABOUT THE AUTHORS
Derek Downey is the Practice Advocate for the Open-source Database practice
at Pythian, helping to align technical and business objectives for the company and
for our clients. Derek loves automating MySQL, implementing visualization
strategies, and creating repeatable training environments.
Follow Derek on Twitter @derek_downey
ABOUT PYTHIAN
Pythian is a global technology-enabled IT services company that helps businesses compete by
adopting disruptive technologies such as advanced analytics, big data, cloud, databases, DevOps
and infrastructure management to advance innovation and increase agility. Specializing in
designing, implementing, and managing systems that directly contribute to revenue growth and
business success, Pythian's highly skilled technical teams work as an integrated extension of our
clients' organizations to deliver solutions that enable strategic use of data, accelerate software
delivery, and ensure reliable, scalable IT systems.
Pythian, The Pythian Group, love your data, pythian.com, and Adminiscope are trademarks of The Pythian Group Inc. Other
product and company names mentioned herein may be trademarks or registered trademarks of their respective owners. The
information presented is subject to change without notice. Copyright 2016. The Pythian Group Inc. All rights reserved.