New Features for Web-Scale Performance with Carrier-Grade Availability
Table of Contents

Introduction
New Feature Overview
MySQL Cluster Architecture
New Features in MySQL Cluster 7.2
    Adaptive Query Localization (AQL): 70x speedup for complex joins
    NoSQL with native memcached API
    Performance enhancements
    Multi-site clustering
    Simplified active-active geographic replication & conflict resolution
    Consolidated user privileges
    MySQL 5.5 server integration
    Support for Virtual Machines
    MySQL Enterprise Monitor: further enhancements for MySQL Cluster
MySQL Cluster Carrier-Grade Edition Key Components
    MySQL Cluster Manager
    Oracle Premier Support
Conclusion
Additional Resources
Introduction
MySQL Cluster is a scalable, real-time, ACID-compliant transactional database, combining 99.999% availability with the low TCO of open source. Designed around a distributed, multi-master architecture with no single point of failure, MySQL Cluster scales horizontally on commodity hardware with auto-sharding to serve read- and write-intensive workloads, accessed via SQL and NoSQL interfaces.

Originally designed as an embedded telecoms database for in-network applications demanding carrier-grade availability and real-time performance, MySQL Cluster has been rapidly enhanced with new feature sets that extend its use cases into web and enterprise applications, including:
- High-volume OLTP
- Real-time analytics
- Ecommerce: inventory management, shopping carts, payment processing, fulfillment tracking, etc.
- Financial trading with fraud detection
- Mobile and micro-payments
- Session management & caching
- Feed streaming, analysis and recommendations
- Content management and delivery
- Massively Multiplayer Online Games
- Communications and presence services
- Subscriber/user profile management and entitlements

The purpose of this whitepaper is to explore the latest enhancements delivered as part of the MySQL Cluster 7.2 release, enabling users to:
- Support the development of highly dynamic services: 70x higher complex query performance, a native memcached API, 4x higher data node scalability, integration with the latest MySQL 5.5 server, and support for Virtual Machine (VM) environments
- Enhance cross-data-center scalability: new multi-site clustering and active/active replication
- Simplify provisioning and administration: consolidated user privileges
Figure 1: The MySQL Cluster architecture

MySQL Cluster comprises three node types which collectively provide service to the application. The application itself doesn't need to know anything about these different nodes; it simply connects to the cluster, and service is seamlessly provided. The node types making up MySQL Cluster are as follows:

Data nodes manage the storage of, and access to, data. Tables are automatically sharded across the data nodes, which also transparently handle load balancing, replication, failover and self-healing.
Application nodes provide connectivity from the application logic to the data nodes. Multiple APIs are presented to the application. MySQL provides a standard SQL interface, including connectivity to all of the leading web development languages and frameworks. There is also a whole range of NoSQL interfaces, including Memcached, C++ (NDB API), Java, JPA and REST/HTTP.

Management nodes are used to configure the cluster and provide arbitration in the event of a network partition.
For a more detailed guide to the architecture of MySQL Cluster, please refer to the following whitepaper: http://mysql.com/why-mysql/white-papers/mysql_wp_scaling_web_databases.php
http://messagepassing.blogspot.com/2011/01/low-latency-distributed-parallel-joins.html
Using the new functionality this can be performed with a single network round trip where the second read operation is dependent on the results of the first:
ndbapi> read column @b:=b from t1 where pk = 22;
        read column c from t2 where pk = @b;
[single round trip]
[ return b = 15, c = 30 ]
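The value of collapsing dependent reads into a single round trip can be approximated with a simple latency model. This is an illustrative sketch only; the round-trip time and function names are invented, not MySQL code:

```python
# Illustrative latency model: N dependent key lookups, each waiting on the
# previous result, pay one network round trip apiece; a linked (pushed-down)
# read ships the whole chain to the data nodes in a single request.

ROUND_TRIP_MS = 0.5  # assumed SQL-node-to-data-node round-trip time

def nested_loop_join_latency(num_lookups: int, rtt_ms: float = ROUND_TRIP_MS) -> float:
    """Each dependent lookup waits for the previous one: one RTT per lookup."""
    return num_lookups * rtt_ms

def pushed_join_latency(num_lookups: int, rtt_ms: float = ROUND_TRIP_MS) -> float:
    """A pushed (linked) join resolves the whole chain in one round trip."""
    return rtt_ms

# A chain of 11 dependent lookups, as in an 11-table join:
print(nested_loop_join_latency(11))  # 5.5
print(pushed_join_latency(11))       # 0.5
```

The model only counts round trips; in practice the gains compound further because each lookup may itself touch many rows.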
This complexity is completely hidden from you when using SQL, but if you're using the NDB API in your application, then it is useful to understand how to use these parameterized queries. In addition, MySQL Cluster now provides the MySQL Server with better information on the available indexes, which allows the MySQL optimizer to automatically produce better query execution plans. Previously it was up to the user to provide hints to the optimizer.

Real-world testing has demonstrated typical performance gains of 70x across a range of queries. One example comes from an online content store and management system that involved a complex query joining 11 tables. This query took over 87 seconds to complete with MySQL Cluster 7.1, and just 1.2 seconds when tested with MySQL Cluster 7.2. You can review the actual query, test environment and full results here: http://www.clusterdb.com/mysql-cluster/70x-faster-joins-with-aql-in-mysql-cluster-7-2-dmr/

How to use AQL

The use of AQL is controlled by a global variable - ndb_join_pushdown - which is on by default. In order for a join to be able to exploit AQL (in other words, to be pushed down to the data nodes), it must meet the following conditions:
- Any columns to be joined must use exactly the same data type. (For example, if an INT and a BIGINT column are joined, the join cannot be pushed down.)
- Queries referencing BLOB or TEXT columns are not supported.
- Explicit locking is not supported; however, the NDB storage engine's characteristic implicit row-based locking is enforced.
- Child tables in the join must be accessed using one of the ref, eq_ref, or const access methods, or some combination of these methods.
- Joins referencing tables explicitly partitioned by [LINEAR] HASH, LIST, or RANGE currently cannot be pushed down.
You can check whether a given join can be pushed down by running it through EXPLAIN; when the join can be pushed down, you will see references to the pushed join in the Extra column of the output. To provide an aggregated summary of how often AQL can be exploited, a number of new entries have been added to the counters table in the ndbinfo database. These counters provide an insight into how frequently the AQL functionality is able to be used and can be queried using SQL:
mysql> SELECT node_id, counter_name, val
    ->   FROM ndbinfo.counters WHERE block_name= "DBSPJ";
+---------+-----------------------------------+------+
| node_id | counter_name                      | val  |
+---------+-----------------------------------+------+
|       2 | READS_RECEIVED                    |    0 |
|       2 | LOCAL_READS_SENT                  |    0 |
|       2 | REMOTE_READS_SENT                 |    0 |
|       2 | READS_NOT_FOUND                   |    0 |
|       2 | TABLE_SCANS_RECEIVED              |    0 |
|       2 | LOCAL_TABLE_SCANS_SENT            |    0 |
|       2 | RANGE_SCANS_RECEIVED              |    0 |
|       2 | LOCAL_RANGE_SCANS_SENT            |    0 |
|       2 | REMOTE_RANGE_SCANS_SENT           |    0 |
|       2 | SCAN_BATCHES_RETURNED             |    0 |
|       2 | SCAN_ROWS_RETURNED                |    0 |
|       2 | PRUNED_RANGE_SCANS_RECEIVED       |    0 |
|       2 | CONST_PRUNED_RANGE_SCANS_RECEIVED |    0 |
|       3 | READS_RECEIVED                    |    0 |
|       3 | LOCAL_READS_SENT                  |    0 |
|       3 | REMOTE_READS_SENT                 |    0 |
|       3 | READS_NOT_FOUND                   |    0 |
|       3 | TABLE_SCANS_RECEIVED              |    0 |
|       3 | LOCAL_TABLE_SCANS_SENT            |    0 |
|       3 | RANGE_SCANS_RECEIVED              |    0 |
|       3 | LOCAL_RANGE_SCANS_SENT            |    0 |
|       3 | REMOTE_RANGE_SCANS_SENT           |    0 |
|       3 | SCAN_BATCHES_RETURNED             |    0 |
|       3 | SCAN_ROWS_RETURNED                |    0 |
|       3 | PRUNED_RANGE_SCANS_RECEIVED       |    0 |
|       3 | CONST_PRUNED_RANGE_SCANS_RECEIVED |    0 |
+---------+-----------------------------------+------+
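The pushdown conditions listed earlier can be summarized as a checklist. The following sketch encodes them over a simplified join description; the data structures and function name are invented for illustration, and the real decision is made by the MySQL optimizer (and verified with EXPLAIN):

```python
# A checklist over a simplified join description, mirroring the documented
# AQL pushdown rules. Purely illustrative; not part of any MySQL API.

PUSHABLE_ACCESS = {"ref", "eq_ref", "const"}

def join_can_be_pushed(join_columns, referenced_types, child_access_methods,
                       uses_explicit_locking=False, explicit_partitioning=False):
    # Joined columns must use exactly the same type (INT vs BIGINT disqualifies).
    for left_type, right_type in join_columns:
        if left_type != right_type:
            return False
    # Any BLOB or TEXT column referenced by the query disqualifies the join.
    if any(t in ("BLOB", "TEXT") for t in referenced_types):
        return False
    # Explicit locking is not supported for pushed joins.
    if uses_explicit_locking:
        return False
    # Child tables must be accessed via ref, eq_ref or const.
    if not all(m in PUSHABLE_ACCESS for m in child_access_methods):
        return False
    # Explicit [LINEAR] HASH / LIST / RANGE partitioning disqualifies the join.
    if explicit_partitioning:
        return False
    return True

print(join_can_be_pushed([("INT", "INT")], ["INT"], ["eq_ref", "const"]))   # True
print(join_can_be_pushed([("INT", "BIGINT")], ["INT"], ["eq_ref"]))         # False
```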
Figure 3: Multiple APIs to the data held in MySQL Cluster

MySQL Cluster 7.2 adds a native Memcached protocol to the rich diversity of APIs available to access the database. Using the Memcached API, web services can directly access MySQL Cluster without transformations to SQL, ensuring low latency and high throughput for read/write queries. Operations such as SQL parsing are eliminated, and more of the server's hardware resources (CPU, memory and I/O) are dedicated to servicing the query within the storage engine itself.
Copyright 2012, Oracle and/or its affiliates. All rights reserved.
Beyond performance, there are a number of other advantages to integrating Memcached natively with MySQL Cluster, including the ability to consolidate the caching and database tiers into a single layer, reducing the complexity of cache invalidation and consistency checking. These benefits are discussed in more detail in the following Developer Zone article: http://dev.mysql.com/tech-resources/articles/nosql-to-mysql-with-memcached.html

Like memcached, MySQL Cluster provides a distributed hash table with in-memory performance for caching, accessed via the simple memcached API. MySQL Cluster extends memcached functionality by adding support for write-intensive workloads, a full relational model with ACID compliance (including persistence), rich query support, transparent partitioning for scale-out and 99.999% availability, with extensive management and monitoring.

Implementation is simple: the application sends reads and writes to the Memcached process (using the standard Memcached API). This in turn invokes the Memcached Driver for NDB (which is part of the same process), which in turn calls the NDB API for very fast access to the data held in MySQL Cluster's data nodes. The solution has been designed to be very flexible, allowing the application architect to find a configuration that best fits their needs. It is possible to co-locate the Memcached API in either the data nodes or application nodes, or alternatively within a dedicated Memcached layer, as shown in Figure 4.
Figure 4: Memcached API deployment options

Developers can still have some or all of the data cached within the Memcached server (and specify whether that data should also be persisted in MySQL Cluster), so it is possible to choose how to treat different pieces of data. For example:
- Storing the data purely in MySQL Cluster is best for data that is volatile, i.e. written to and read from frequently
- Storing the data both in MySQL Cluster and in Memcached memory is often the best option for data that is rarely updated but frequently read
- Data that has a short lifetime, is read frequently and does not need to be persistent could be stored only in the Memcached cache
Users can configure this behavior on a per-key-prefix basis (through tables in MySQL Cluster), and the application doesn't have to care; it just uses the memcached API and relies on the software to store data in the right place(s) and to keep everything synchronized. Deployment options are discussed in more detail in the following article:
http://www.clusterdb.com/mysql-cluster/scalabale-persistent-ha-nosql-memcache-storage-using-mysql-cluster/

Any changes made to the key-value data stored in MySQL Cluster will be recorded in the binary log, and so will be applied during replication and point-in-time recovery.
Figure 5: Schema-free storage of key-value pairs

By default, every key-value pair is written to the same table, with each pair stored in a single row, thus allowing schema-less data storage. This is illustrated in Figure 5: the key-value pair (key=town:maidenhead; value=SL6) is simply stored in the same generic table as all other data added through the Memcached API. Alternatively, the developer can define a key-prefix so that each key and value are linked to pre-defined columns in a specific table. This is shown in Figure 6; note that this has been simplified for illustration purposes. In reality the behavior is configured in a number of tables, and the next section will explain how to set up the actual configuration tables.
Figure 6: Key-value data mapped to a user-defined schema

In the above figure, the key-prefix town: is used to identify a set of configuration information to be used for any key that starts with that prefix (in this case town:maidenhead). The configuration data indicates that the remainder of the key (maidenhead) maps to the town column of the zip table in the map database, and that the value (SL6) associated with that key should be stored in the code column of that same table.
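The prefix-to-column mapping just described can be simulated in a few lines. The real configuration lives in the ndbmemcache database tables; the dictionaries and function below simply mimic that lookup for the town: example and are invented for illustration:

```python
# Toy simulation of the key-prefix routing performed by the NDB memcached
# driver. CONTAINERS and KEY_PREFIXES stand in for the real configuration
# tables; route_key is a hypothetical helper, not part of any MySQL API.

CONTAINERS = {
    "towns_cnt": {"db": "map", "table": "zip",
                  "key_column": "town", "value_column": "code"},
}
KEY_PREFIXES = {
    "town:": "towns_cnt",
}

def route_key(key):
    """Return (db, table, key_column, row_key, value_column) for a memcached key."""
    for prefix, container_name in KEY_PREFIXES.items():
        if key.startswith(prefix):
            c = CONTAINERS[container_name]
            return (c["db"], c["table"], c["key_column"],
                    key[len(prefix):], c["value_column"])
    return None  # no prefix match: would fall through to the default generic table

print(route_key("town:maidenhead"))
# -> ('map', 'zip', 'town', 'maidenhead', 'code')
```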
Of course, if the application needs to access the same data through SQL, then developers can map key-prefixes to existing table columns, enabling Memcached access to schema-structured data already stored in MySQL Cluster.

Configuring and using the Memcached API

This section assumes that you have installed and are running your MySQL Cluster database, and it follows the same steps as the original Memcached API blog post, which contains some extra steps. First, start the Memcached server (with the NDB driver activated):
[billy@ws2 ~]$ cd $INSTALL_PREFIX
[billy@ws2 mysql-memcache]$ bin/memcached -E lib/ndb_engine.so -e "connectstring=localhost:1186;role=db-only" -vv -c 20
Notice the connectstring; this allows the primary MySQL Cluster to be on a different server from the Memcached API. Note that you can actually use the same Memcached server to access multiple Clusters - you configure this within the ndbmemcache database in the primary Cluster. There are a number of connectors/clients that applications can use to access the Memcached API, but for simplicity this paper uses a standard Telnet client for an interactive session. After connecting to the Memcached port (11211 by default), we store the value SL6 against the key maidenhead and then retrieve it:
[billy@ws2 ~]$ telnet localhost 11211
set maidenhead 0 0 3
SL6
STORED
get maidenhead
VALUE maidenhead 0 3
SL6
END
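The telnet session above uses memcached's plain text protocol: `set <key> <flags> <exptime> <bytes>` followed by the payload, and a `VALUE ... END` reply to `get`. The helpers below frame and parse that exchange in pure Python; they are an illustrative sketch of the wire format only (no client library or live server is assumed):

```python
# Minimal framing/parsing for the memcached text protocol, matching the
# telnet session shown above. Hypothetical helpers for illustration only.

def frame_set(key, value, flags=0, exptime=0):
    """Build the bytes for a 'set' command: header line, payload, CRLF."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode()
    return header + value + b"\r\n"

def parse_get_reply(reply):
    """Parse a single-key reply: 'VALUE <key> <flags> <bytes>\\r\\n<data>\\r\\nEND\\r\\n'."""
    lines = reply.split(b"\r\n")
    if lines[0] == b"END":
        return None  # cache miss
    _, key, _flags, nbytes = lines[0].split(b" ")
    return key.decode(), lines[1][: int(nbytes)]

print(frame_set("maidenhead", b"SL6"))
# -> b'set maidenhead 0 0 3\r\nSL6\r\n'
print(parse_get_reply(b"VALUE maidenhead 0 3\r\nSL6\r\nEND\r\n"))
# -> ('maidenhead', b'SL6')
```

Sent over a socket to port 11211, these bytes would reproduce the interactive session exactly.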
This same data can now be accessed from the MySQL Cluster database using standard SQL:
mysql> SELECT * FROM ndbmemcache.demo_table;
+------------------+------------+-----------------+--------------+
| mkey             | math_value | cas_value       | string_value |
+------------------+------------+-----------------+--------------+
| maidenhead       |       NULL | 263827761397761 | SL6          |
+------------------+------------+-----------------+--------------+
Of course, you can also modify this data through SQL and immediately see the change through the Memcached API:
mysql> UPDATE ndbmemcache.demo_table SET string_value='sl6 4' WHERE mkey='maidenhead';

[billy@ws2 ~]$ telnet localhost 11211
get maidenhead
VALUE maidenhead 0 5
sl6 4
END
Note that this is completely schema-less; the application can keep on adding new key-value pairs and they will all get added to the default table. This may well be fine for prototyping or modest-sized databases. As you have seen, this data can be accessed through SQL, but there's a good chance that you'll want a richer schema on the SQL side, or you'll need to have the data in multiple tables for other reasons (for example, you may want to replicate just some of the data to a second Cluster for geographic redundancy, or to InnoDB for report generation).

The next step is to create your own databases and tables (assuming that you don't already have them) and then create the definitions for how the application can get at the data through the Memcached API. First let's create a table that has a couple of columns that we'll also want to make accessible through the Memcached API:
mysql> CREATE DATABASE clusterdb; USE clusterdb;
mysql> CREATE TABLE towns_tab (town VARCHAR(30) NOT NULL PRIMARY KEY,
    ->   zip VARCHAR(10), population INT, county VARCHAR(10)) ENGINE=NDB;
mysql> INSERT INTO towns_tab VALUES ('Marlow', 'SL7', 14004, 'Berkshire');
Next we need to tell the NDB driver how to access this data through the Memcached API. Two containers are created that identify the columns within our new table that will be exposed. We then define the key-prefixes that users of the Memcached API will use to indicate which piece of data (i.e. database/table/column) they are accessing:
mysql> USE ndbmemcache;
mysql> INSERT INTO containers VALUES ('towns_cnt', 'clusterdb', 'towns_tab',
    ->   'town', 'zip', 0, NULL, NULL, NULL);
mysql> INSERT INTO containers VALUES ('pop_cnt', 'clusterdb', 'towns_tab',
    ->   'town', 'population', 0, NULL, NULL, NULL);
mysql> SELECT * FROM containers;
+------------+-------------+-----------------+-------------+----------------+-------+------------------+------------+--------------------+
| name       | db_schema   | db_table        | key_columns | value_columns  | flags | increment_column | cas_column | expire_time_column |
+------------+-------------+-----------------+-------------+----------------+-------+------------------+------------+--------------------+
| demo_table | ndbmemcache | demo_table      | mkey        | string_value   |     0 | math_value       | cas_value  | NULL               |
| pop_cnt    | clusterdb   | towns_tab       | town        | population     |     0 | NULL             | NULL       | NULL               |
| demo_tabs  | ndbmemcache | demo_table_tabs | mkey        | val1,val2,val3 |     0 | NULL             | NULL       | NULL               |
| towns_cnt  | clusterdb   | towns_tab       | town        | zip            |     0 | NULL             | NULL       | NULL               |
+------------+-------------+-----------------+-------------+----------------+-------+------------------+------------+--------------------+
mysql> INSERT INTO key_prefixes VALUES (1, 'twn_pr:', 0, 'ndb-only', 'towns_cnt');
mysql> INSERT INTO key_prefixes VALUES (1, 'pop_pr:', 0, 'ndb-only', 'pop_cnt');
mysql> SELECT * FROM key_prefixes;
+----------------+------------+------------+---------------+------------+
| server_role_id | key_prefix | cluster_id | policy        | container  |
+----------------+------------+------------+---------------+------------+
|              1 | pop_pr:    |          0 | ndb-only      | pop_cnt    |
|              3 |            |          0 | caching       | demo_table |
|              0 |            |          0 | caching       | demo_table |
|              0 | db:        |          0 | ndb-only      | demo_table |
|              0 | mc:        |          0 | memcache-only | NULL       |
|              2 |            |          0 | memcache-only | NULL       |
|              1 |            |          0 | ndb-only      | demo_table |
|              1 | t:         |          0 | ndb-only      | demo_tabs  |
|              1 | twn_pr:    |          0 | ndb-only      | towns_cnt  |
+----------------+------------+------------+---------------+------------+
Now these columns (and the data already added through SQL) are accessible through the Memcached API:
[billy@ws2 ~]$ telnet localhost 11211
get twn_pr:Marlow
VALUE twn_pr:Marlow 0 3
SL7
END
set twn_pr:Maidenhead 0 0 3
SL6
STORED
set pop_pr:Maidenhead 0 0 5
42827
STORED
The new values are immediately visible through SQL:

mysql> SELECT * FROM clusterdb.towns_tab;
+------------+------+------------+-----------+
| town       | zip  | population | county    |
+------------+------+------------+-----------+
| Maidenhead | SL6  |      42827 | NULL      |
| Marlow     | SL7  |      14004 | Berkshire |
+------------+------+------------+-----------+
One final test is to start a second memcached server that will access the same data. As everything is running on the same host, we need to have the second server listen on a different port:
[billy@ws2 ~]$ cd /usr/local/mysql-memcache
[billy@ws2 mysql-memcache]$ bin/memcached -E lib/ndb_engine.so -e "connectstring=localhost:1186;role=db-only" -vv -c 20 -p 11212 -U 11212

[billy@ws2 ~]$ telnet localhost 11212
get twn_pr:Marlow
VALUE twn_pr:Marlow 0 3
SL7
END
Performance enhancements
MySQL Cluster 7.2 improves the performance of each data node by allowing it to make more effective use of the available threads. The latest benchmark figures are available from http://www.mysql.com/why-mysql/benchmarks/mysql-cluster/
Multi-site clustering
Multi-Site Clustering provides a new option for cross-data-center scalability. For the first time, splitting data nodes across data centers is a supported deployment option. With this deployment model, users can synchronously replicate updates between data centers without needing to modify their application or schema for conflict handling, and automatically fail over between those sites in the event of a node failure.

Figure 7: Data nodes split across data centers

Improvements to the heartbeating mechanism used by MySQL Cluster enable greater resilience to temporary latency spikes on a WAN, thereby maintaining operation of the cluster. A new ConnectivityCheck mechanism is introduced, which must be explicitly configured; this extra mechanism adds messaging overheads and failure-handling latency, and so is not switched on by default. Users can configure various MySQL Cluster parameters, including heartbeats, ConnectivityCheck, GCP timeouts and transaction deadlock timeouts. You can read more about these parameters in the documentation.

Recommendations for Multi-Site Clustering
- Ensure minimal, stable latency
- Provision the network with sufficient bandwidth for the expected peak load, and test with node recovery and system recovery
- Configure the heartbeat period to ensure a safe margin above latency fluctuations
- Configure the ConnectivityCheckPeriod to avoid unnecessary node failures
- Configure other timeouts accordingly, including the GCP timeout, transaction deadlock timeout, and transaction inactivity timeout

These factors are discussed in more detail here: http://blogs.oracle.com/MySQL/entry/synchronously_replicating_databases_across_data
http://dev.mysql.com/doc/refman/5.5/en/mysql-cluster-params-ndbd.html
Example

The following is a recommendation of latency and bandwidth requirements for applications with high throughput and fast failure-detection requirements:
- Latency between remote data nodes must not exceed 20 milliseconds
- Bandwidth of the network link must be more than 1 Gigabit per second
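The heartbeat sizing advice above can be turned into back-of-envelope arithmetic. MySQL Cluster's documented behavior is to suspect a data node after around four consecutive missed heartbeats; the margin factor in the sketch below is purely an assumption for illustration, and the function names are invented:

```python
# Back-of-envelope sizing for multi-site heartbeat configuration.
# Illustrative only; real tuning uses HeartbeatIntervalDbDb and related
# parameters, and must be validated under recovery-load testing.

def heartbeat_interval_ms(max_observed_latency_ms, margin=2.0):
    """Pick a heartbeat period safely above observed WAN latency spikes."""
    return max_observed_latency_ms * margin

def failure_detection_time_ms(interval_ms, missed_heartbeats=4):
    """A node is suspected after ~4 consecutive missed heartbeats."""
    return interval_ms * missed_heartbeats

interval = heartbeat_interval_ms(20.0)  # 20 ms max inter-site latency, as above
print(interval, failure_detection_time_ms(interval))  # 40.0 160.0
```

The trade-off is explicit: a larger safety margin avoids spurious node failures on a jittery WAN, but proportionally lengthens the time to detect a genuinely dead node.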
For applications that do not require this type of stringent operating environment, latency and bandwidth can be relaxed, subject to the testing recommended above. As the recommendations demonstrate, there are a number of factors that need to be considered before deploying multi-site clustering. For geo-redundancy, Oracle recommends Geographic Replication, but multi-site clustering does present an alternative deployment, subject to the considerations and constraints discussed above.
cluster site A contains the value 20 while site B contains 11 - in other words, the databases are now inconsistent.

How MySQL Cluster 7.2 implements eventual consistency

There are two phases to establishing consistency between both clusters after an inconsistency has been introduced:
1. Detect that a conflict has happened
2. Resolve the inconsistency

Detecting the conflict

While we typically consider the two clusters in an active-active replication configuration to be peers, in this case we designate one to be the primary and the other the secondary. Reads and writes can still be sent to either cluster, but it is the responsibility of the primary to identify that a conflict has arisen and then remove the inconsistency.
Figure 8: How inconsistent data can be introduced with active-active asynchronous replication
A logical clock is used to identify (in relative terms) when a change is made on the primary; for those who know something of the MySQL Cluster internals, we use the index of the Global Checkpoint that the update is contained in. For all tables that have this feature turned on, an extra, hidden column is automatically added on the primary; it holds the value of the logical clock at the time the change was made.

Once the change has been applied on the primary, there is a "window of conflict" for the affected row(s), during which a different change made to the same row(s) on the secondary will create an inconsistency. Once the slave on the secondary has applied the change from the primary, it will send a replication event back to the slave on the primary, containing the primary's clock value associated with the changes that have just been applied on the secondary. (Remember that the clock is actually the Global Checkpoint Index, and so this feature is sometimes referred to as Reflected GCI.) Once the slave on the primary has received this event, it knows that all changes tagged with a clock value no later than the reflected GCI are now safe - the window of conflict has closed.

If an application modifies this same row on the secondary before the replication event from the primary is applied, then the secondary will send an associated replication event to the slave on the primary before it reflects the new GCI. The slave on the primary will process this replication event and compare the clock value recorded with the affected rows against the latest reflected GCI; as the clock value for the conflicting row is higher, the primary recognizes that a conflict has occurred and will launch the algorithm to resolve the inconsistency.

Resolving the inconsistency

In earlier releases of MySQL Cluster (or if choosing to use the original algorithms in MySQL Cluster 7.2), you had a choice of simply flagging the primary key of the conflicting rows or backing out one of the changes to the conflicting rows. Using the new NDB$EPOCH_TRANS function, the primary will overwrite the data in the secondary for the affected row(s) and any other rows that were updated in the same transaction (even if they are in tables for which conflict detection has not been enabled). In fact, the algorithm goes a step further: if there were subsequent transactions on the secondary that wrote to the conflicting rows, then all of the changes from those dependent transactions on the secondary will be rolled back as well.

How to implement enhanced conflict resolution

This section assumes that replication has already been set up between two clusters, as shown in Figure 9.
For more details on how to set up that configuration, refer to the blog "Enhanced conflict resolution with MySQL Cluster active-active replication". To keep things simple, just two hosts are used: "black" contains all nodes for the primary cluster and "blue" contains all nodes for the secondary. As an extra simplification, a single MySQL Server in each cluster acts as both the master and the slave.
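Before walking through the configuration, the detection scheme described earlier (logical-clock stamping plus the reflected GCI) can be modeled in a few lines. This is an illustrative simulation only, not MySQL Cluster code; the class and method names are invented:

```python
# Toy model of Reflected-GCI conflict detection: each primary-side write is
# stamped with the current GCI; the secondary reflects back the highest GCI
# it has applied; a secondary write to a row whose stamp is newer than the
# reflected GCI falls inside that row's window of conflict.

class PrimaryConflictDetector:
    def __init__(self):
        self.row_gci = {}       # row key -> GCI of the primary's last write
        self.reflected_gci = 0  # highest primary GCI acknowledged by the secondary

    def primary_write(self, row, current_gci):
        self.row_gci[row] = current_gci  # hidden-column stamp for the row

    def secondary_ack(self, reflected_gci):
        self.reflected_gci = reflected_gci  # window closes for older changes

    def secondary_write_conflicts(self, row):
        """True if a secondary change to this row arrives inside its window of conflict."""
        return self.row_gci.get(row, 0) > self.reflected_gci

d = PrimaryConflictDetector()
d.primary_write("simple1:id=1", current_gci=100)
print(d.secondary_write_conflicts("simple1:id=1"))  # True: window still open
d.secondary_ack(100)
print(d.secondary_write_conflicts("simple1:id=1"))  # False: window closed
```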
The first step is to identify the tables that need conflict detection enabled. Each of those tables then has to have an entry in the mysql.ndb_replication table where they're tagged as using the new NDB$EPOCH_TRANS() function. You could also choose to use NDB$EPOCH(), in which case only the changes to conflicting rows will be rolled back, rather than the full transactions. A few things to note:
- This must be done before creating the application tables themselves
- This should only be done on the primary
- By default the table doesn't exist, and so the very first step is to create it
black-mysql> CREATE TABLE mysql.ndb_replication (
    ->   db VARBINARY(63),
    ->   table_name VARBINARY(63),
    ->   server_id INT UNSIGNED,
    ->   binlog_type INT UNSIGNED,
    ->   conflict_fn VARBINARY(128),
    ->   PRIMARY KEY USING HASH (db, table_name, server_id)
    -> ) ENGINE=NDB
    -> PARTITION BY KEY(db, table_name);
black-mysql> INSERT INTO mysql.ndb_replication VALUES ('clusterdb', 'simple1', 8, 0, 'NDB$EPOCH_TRANS()');
black-mysql> INSERT INTO mysql.ndb_replication VALUES ('clusterdb', 'simple2', 8, 0, 'NDB$EPOCH_TRANS()');
black-mysql> INSERT INTO mysql.ndb_replication VALUES ('clusterdb', 'simple3', 8, 0, 'NDB$EPOCH_TRANS()');

http://www.clusterdb.com/mysql-cluster/enhanced-conflict-resolution-with-mysql-cluster-active-active-replication/
For each of these tables, you should also create an exceptions table, which will record any conflicts that have resulted in changes being rolled back. The format of these tables is rigidly defined, so take care to copy the column types exactly; again, this only needs to be done on the primary:
black-mysql> CREATE DATABASE clusterdb; USE clusterdb;
black-mysql> CREATE TABLE simple1$EX (server_id INT UNSIGNED, master_server_id INT UNSIGNED,
    ->   master_epoch BIGINT UNSIGNED, count INT UNSIGNED, id INT NOT NULL,
    ->   PRIMARY KEY(server_id, master_server_id, master_epoch, count)) ENGINE=NDB;
black-mysql> CREATE TABLE simple2$EX (server_id INT UNSIGNED, master_server_id INT UNSIGNED,
    ->   master_epoch BIGINT UNSIGNED, count INT UNSIGNED, id INT NOT NULL,
    ->   PRIMARY KEY(server_id, master_server_id, master_epoch, count)) ENGINE=NDB;
black-mysql> CREATE TABLE simple3$EX (server_id INT UNSIGNED, master_server_id INT UNSIGNED,
    ->   master_epoch BIGINT UNSIGNED, count INT UNSIGNED, id INT NOT NULL,
    ->   PRIMARY KEY(server_id, master_server_id, master_epoch, count)) ENGINE=NDB;
Finally, the application tables themselves can be created (this only needs doing on the primary as they'll be replicated to the secondary):
black-mysql> CREATE TABLE simple1 (id INT NOT NULL PRIMARY KEY, value INT) ENGINE=ndb;
black-mysql> CREATE TABLE simple2 (id INT NOT NULL PRIMARY KEY, value INT) ENGINE=ndb;
black-mysql> CREATE TABLE simple3 (id INT NOT NULL PRIMARY KEY, value INT) ENGINE=ndb;
Everything is now set up, and the new configuration can be tested to ensure that conflicts are detected and the correct updates are rolled back.

Testing enhanced conflict detection & resolution

The first step is to add some data to our new tables (note that at this point replication is running, and so the changes only need to be made on the primary) and then update one row to make sure that it is replicated to the secondary:
black-mysql> INSERT INTO simple1 VALUES (1,10);
black-mysql> INSERT INTO simple2 VALUES (1,10);
black-mysql> INSERT INTO simple3 VALUES (1,10);
black-mysql> UPDATE simple1 SET value=12 WHERE id=1;
blue-mysql> USE clusterdb;
blue-mysql> SELECT * FROM simple1;
+----+-------+
| id | value |
+----+-------+
|  1 |    12 |
+----+-------+
It is important to verify that the NDB$EPOCH_TRANS() function rolls back any transactions on the secondary that involve a conflict (as well as subsequent, dependent transactions that modify the same rows). To test this manually, the simplest approach is to stop the slave IO thread on the secondary in order to increase the size of the window of conflict (which is otherwise very short). Once the slave IO thread has been stopped, a change is made to table simple1 on the primary, and then the secondary makes a (conflicting) change to the same row, as well as a change to table simple2 in the same transaction. A second transaction on the secondary changes a row in simple3; as it doesn't touch any rows that have been involved in a conflict, that change should stand.
blue-mysql> STOP SLAVE IO_THREAD;

black-mysql> UPDATE simple1 SET value=13 WHERE id=1;

blue-mysql> BEGIN;  # conflicting transaction
blue-mysql> UPDATE simple1 SET value=20 WHERE id=1;
blue-mysql> UPDATE simple2 SET value=20 WHERE id=1;
blue-mysql> COMMIT;
blue-mysql> UPDATE simple3 SET value=20 WHERE id=1;  # non-conflicting
blue-mysql> SELECT * FROM simple1;
+----+-------+
| id | value |
+----+-------+
|  1 |    20 |
+----+-------+
blue-mysql> SELECT * FROM simple2;
+----+-------+
| id | value |
+----+-------+
|  1 |    20 |
+----+-------+
blue-mysql> SELECT * FROM simple3;
+----+-------+
| id | value |
+----+-------+
|  1 |    20 |
+----+-------+
If you now check the exception tables, you can see that the primary (black) has received the changes from the secondary (blue), and because the first transaction updated the same row in simple1 during its window of conflict, it has recorded that the change needs to be rolled back. This will happen as soon as the replication thread is restarted on the secondary:
black-mysql> SELECT * FROM simple1$EX;
+-----------+------------------+---------------+-------+----+
| server_id | master_server_id | master_epoch  | count | id |
+-----------+------------------+---------------+-------+----+
|         8 |                9 | 1494648619009 |     3 |  1 |
+-----------+------------------+---------------+-------+----+
black-mysql> SELECT * FROM simple2$EX;
+-----------+------------------+---------------+-------+----+
| server_id | master_server_id | master_epoch  | count | id |
+-----------+------------------+---------------+-------+----+
|         8 |                9 | 1494648619009 |     1 |  1 |
+-----------+------------------+---------------+-------+----+
black-mysql> SELECT * FROM simple3$EX;
Empty set (0.05 sec)

blue-mysql> START SLAVE IO_THREAD;
blue-mysql> SELECT * FROM simple1;
+----+-------+
| id | value |
+----+-------+
|  1 |    13 |
+----+-------+
blue-mysql> SELECT * FROM simple2;
+----+-------+
| id | value |
+----+-------+
|  1 |    10 |
+----+-------+
blue-mysql> SELECT * FROM simple3;
+----+-------+
| id | value |
+----+-------+
|  1 |    20 |
+----+-------+
These are the results we expect: simple1 has the value set by the primary, with the subsequent change on the secondary rolled back; simple2 was not updated by the primary, but the change on the secondary was rolled back as it was made in the same transaction as the conflicting update to simple1. The change on the secondary to simple3 survived because it was made outside of any conflicting transaction and was not dependent on any conflicting changes. Finally, confirm that the data is identical on the primary:
black-mysql> SELECT * FROM simple1;
+----+-------+
| id | value |
+----+-------+
|  1 |    13 |
+----+-------+
black-mysql> SELECT * FROM simple2;
+----+-------+
| id | value |
+----+-------+
|  1 |    10 |
+----+-------+
black-mysql> SELECT * FROM simple3;
+----+-------+
| id | value |
+----+-------+
|  1 |    20 |
+----+-------+
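The rollback rules just demonstrated (reject any secondary transaction that touched a row in conflict, plus any transaction that depends on a rejected transaction's writes) can be sketched as follows. This is an illustration of the logic only, not MySQL Cluster code, and all names in it are hypothetical:

```python
# Illustration (not MySQL Cluster code) of transaction-level conflict
# handling in the style of NDB$EPOCH_TRANS: a secondary transaction is
# rolled back if it wrote a row changed on the primary during the window
# of conflict, and any transaction that touched rows written by a
# rejected transaction is rolled back too.

def resolve(transactions, primary_rows):
    """transactions: list of (txn_id, [(table, row_id), ...]) committed
    on the secondary; primary_rows: set of (table, row_id) changed on
    the primary during the window of conflict.  Returns the txn_ids to
    roll back on the secondary."""
    rejected = set()
    dirty = set(primary_rows)      # rows whose secondary state is invalid
    changed = True
    while changed:                 # iterate until no new dependents found
        changed = False
        for txn_id, writes in transactions:
            if txn_id not in rejected and any(w in dirty for w in writes):
                rejected.add(txn_id)
                dirty.update(writes)   # its writes taint dependent txns
                changed = True
    return rejected

# The scenario above: txn 1 updates the conflicting row in simple1 and
# also simple2; txn 2 only touches simple3.
txns = [(1, [("simple1", 1), ("simple2", 1)]),
        (2, [("simple3", 1)])]
print(resolve(txns, {("simple1", 1)}))   # {1}: txn 2 survives
```

Note how the second transaction survives because none of its rows ever become tainted, matching the simple3 result above.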
Statistics are provided on the primary recording that 1 conflict has been detected, affecting 1 transaction, and that it resulted in 2 row changes being rolled back:
black-mysql> SHOW STATUS LIKE 'ndb_conflict%';
+------------------------------------------+-------+
| Variable_name                            | Value |
+------------------------------------------+-------+
| Ndb_conflict_fn_max                      | 0     |
| Ndb_conflict_fn_old                      | 0     |
| Ndb_conflict_fn_max_del_win              | 0     |
| Ndb_conflict_fn_epoch                    | 0     |
| Ndb_conflict_fn_epoch_trans              | 1     |
| Ndb_conflict_trans_row_conflict_count    | 1     |
| Ndb_conflict_trans_row_reject_count      | 2     |
| Ndb_conflict_trans_reject_count          | 1     |
| Ndb_conflict_trans_detect_iter_count     | 1     |
| Ndb_conflict_trans_conflict_commit_count | 1     |
+------------------------------------------+-------+
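For a monitoring script it can be handy to turn this output into a dictionary of counters. A minimal sketch, assuming the standard mysql client table layout; `parse_status` and the sample text are illustrative and not part of any MySQL tooling:

```python
# Parse the ASCII table printed by the mysql client for
# SHOW STATUS LIKE 'ndb_conflict%' into a {name: value} dict.

def parse_status(output):
    counters = {}
    for line in output.splitlines():
        if not line.startswith("|"):
            continue                      # skip +---+ border lines
        cells = [c.strip() for c in line.strip("|").split("|")]
        if cells[0] == "Variable_name":
            continue                      # skip the header row
        counters[cells[0]] = int(cells[1])
    return counters

sample = """
+------------------------------------------+-------+
| Variable_name                            | Value |
+------------------------------------------+-------+
| Ndb_conflict_trans_row_conflict_count    | 1     |
| Ndb_conflict_trans_row_reject_count      | 2     |
| Ndb_conflict_trans_reject_count          | 1     |
+------------------------------------------+-------+
"""
print(parse_status(sample)["Ndb_conflict_trans_row_reject_count"])  # 2
```

A real deployment would feed this the output of `mysql -e "SHOW STATUS LIKE 'ndb_conflict%'"` and alert when the reject counters grow.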
This can become painful to manage: every time you want to create a new user or change their permissions, you need to repeat the change on every server; miss one out and the user will not be able to access that server (or will still be able to access it after you withdraw their privileges). This is illustrated in Figure 10: the user fred is created on one MySQL Server, but when fred attempts to connect through one of the other MySQL Servers they're blocked. This may be what is intended, but in many cases the DBA will want the change to be applied across all of the MySQL Servers. This section demonstrates this default behavior before showing how to change it such that the user privilege data is common to all of the MySQL Servers; for more detailed requirements, refer to the original blog post. A new user fred is created on server 1 and a test table created by root:
$ mysql -h 192.168.1.1 -P3306 -u root --prompt 'server1-root> '
server1-root> GRANT ALL ON *.* TO 'fred'@'192.168.1.1';
server1-root> CREATE DATABASE clusterdb; USE clusterdb;
server1-root> CREATE TABLE towns (id INT NOT NULL PRIMARY KEY, town VARCHAR(20)) ENGINE=NDBCLUSTER;
server1-root> INSERT INTO towns VALUES (1,'Maidenhead'),(2,'Reading');
Next confirm that the new user fred can connect through server 1 and access the data:
$ mysql -h 192.168.1.1 -P3306 -u fred --prompt 'server1-fred> '
server1-fred> SELECT * FROM clusterdb.towns;
+----+------------+
| id | town       |
+----+------------+
|  1 | Maidenhead |
|  2 | Reading    |
+----+------------+
If the same credentials are used to attempt the same steps through a second server (server 2) then it is not possible:
$ mysql -h 192.168.1.2 -P3306 -u fred --prompt 'server2-fred> '
server2-fred> SELECT * FROM clusterdb.towns;
ERROR 1142 (42000): SELECT command denied to user ''@'ws2.localdomain' for table 'towns'
The next step is to run a script (as MySQL root) and then a stored procedure to convert five tables in the mysql database (user, db, tables_priv, columns_priv & procs_priv) from the MyISAM to the ndbcluster storage engine:
server1-root> SOURCE /usr/local/mysql/share/ndb_dist_priv.sql;
server1-root> CALL mysql.mysql_cluster_move_privileges();
http://www.clusterdb.com/mysql-cluster/sharing-user-credentials-between-mysql-servers-with-cluster/
The storage engine for the user privilege tables will now have been changed to ndbcluster; this can be confirmed as follows:
server1-root> SHOW CREATE TABLE mysql.user\G
*************************** 1. row ***************************
       Table: user
Create Table: CREATE TABLE `user` (
  `Host` char(60) COLLATE utf8_bin NOT NULL DEFAULT '',
  ....
  ....
) ENGINE=ndbcluster DEFAULT CHARSET=utf8 COLLATE=utf8_bin COMMENT='Users and global privileges'
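Rather than inspecting each table individually, the engine of all five privilege tables can be read in one query from information_schema. A small sketch that just builds the query string; `check_engine_sql` is a hypothetical helper, not part of ndb_dist_priv.sql:

```python
# Build an information_schema query to verify the storage engine of the
# five privilege tables migrated by mysql_cluster_move_privileges().

PRIV_TABLES = ["user", "db", "tables_priv", "columns_priv", "procs_priv"]

def check_engine_sql():
    names = ", ".join("'%s'" % t for t in PRIV_TABLES)
    return ("SELECT table_name, engine FROM information_schema.tables "
            "WHERE table_schema = 'mysql' AND table_name IN (%s);" % names)

print(check_engine_sql())
```

Running the generated query on any server after the migration should report ndbcluster as the engine for all five rows.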
Now that these tables are stored in MySQL Cluster, they are visible from all of the MySQL Servers. Whichever MySQL Server the new user attempts to connect through, that MySQL Server will fetch the privilege data from the shared data nodes rather than using local information and so the user will get the same access rights. As our clusterdb.towns table was created using the ndbcluster storage engine as well, it is accessible from all servers and so the user should now be able to see the contents of the table from server 2 as the access rights on server 2 now allow it. Note that the data already stored in those 5 mysql tables is maintained as part of the migration from MyISAM to MySQL Cluster. The final test is to confirm that the new user really is allowed to get to this data from server 2:
$ mysql -h 192.168.1.2 -P3306 -u fred --prompt 'server2-fred> '
server2-fred> SELECT * FROM clusterdb.towns;
+----+------------+
| id | town       |
+----+------------+
|  1 | Maidenhead |
|  2 | Reading    |
+----+------------+
Note that if a user is already connected to a server and their privileges are changed through a different server, they must disconnect & reconnect for the new privileges to be applied.
6 Support for the VM environment itself must come from the relevant VM vendor or approved partner
7 http://www.oracle.com/us/technologies/virtualization/oraclevm/index.html
The alert is first raised (info level) when the hit rate falls below 97%, the warning level is raised at 90% and the critical level at 80%. Again, you can alter any of these thresholds. The new graph simply displays how the hit rate varies over time so that you can spot trends.
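The threshold logic described above amounts to a simple mapping from hit rate to alert level. A sketch using the default 97/90/80 thresholds quoted in the text; `hit_rate_alert` is a hypothetical illustration, since MySQL Enterprise Monitor implements this internally and lets you configure the thresholds:

```python
# Map a hit-rate percentage to an alert level using the default
# thresholds described above (below 97% -> info, 90% -> warning,
# 80% -> critical); all three are configurable.

def hit_rate_alert(hit_rate_pct, info=97.0, warning=90.0, critical=80.0):
    if hit_rate_pct < critical:
        return "critical"
    if hit_rate_pct < warning:
        return "warning"
    if hit_rate_pct < info:
        return "info"
    return "ok"

print(hit_rate_alert(95.0))   # "info": below 97 but at or above 90
```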
http://www.clusterdb.com/mysql-cluster/monitoring-mysql-cluster-with-mysql-enterprise-monitor/
Key components of MySQL Cluster Carrier Grade Edition include MySQL Cluster Manager and Oracle Premier Support.
Further details can be found in the original blog post.

Automated on-line add-node
Since MySQL Cluster 7.0 it has been possible to add new nodes to a Cluster while it is still in service; there are a number of steps involved and, as with on-line upgrades, a mistake by the administrator could lead to an outage.
Figure 12 Adding 2 extra data nodes to a running cluster

This process is automated when using MySQL Cluster Manager; the first step is to add any new hosts (servers) to the site and indicate where those hosts can find the Cluster software:
mcm> add hosts --hosts=192.168.0.14,192.168.0.15 mysite;
mcm> add package --basedir=/usr/local/mysql_7_2_1 --hosts=192.168.0.14,192.168.0.15 7_2_1;
The new nodes can then be added to the Cluster and then started up:
mcm> add process --processhosts=mysqld@192.168.0.10,mysqld@192.168.0.11,ndbd@192.168.0.14,ndbd@192.168.0.15,ndbd@192.168.0.14,ndbd@192.168.0.15 mycluster;
mcm> start process --added mycluster;
The Cluster has now been extended but you need to perform a final step from any of the MySQL Servers to repartition the existing Cluster tables to use the new data nodes:
mysql> ALTER ONLINE TABLE <table-name> REORGANIZE PARTITION;
mysql> OPTIMIZE TABLE <table-name>;
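Since the REORGANIZE PARTITION and OPTIMIZE steps must be repeated for every existing Cluster table, a throwaway sketch that generates the statements for a list of tables; `repartition_sql` is hypothetical (not an mcm or mysqld feature), and the table name reuses the clusterdb.towns example from earlier in the paper:

```python
# Generate the per-table SQL needed to spread existing data across
# newly added data nodes: repartition the table on-line, then reclaim
# the space freed on the original data nodes.

def repartition_sql(tables):
    stmts = []
    for t in tables:
        stmts.append("ALTER ONLINE TABLE %s REORGANIZE PARTITION;" % t)
        stmts.append("OPTIMIZE TABLE %s;" % t)
    return stmts

for stmt in repartition_sql(["clusterdb.towns"]):
    print(stmt)
```

In practice the table list could be pulled from information_schema, restricted to tables using the ndbcluster engine.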
Conclusion
In this paper we have explained in detail key enhancements to the MySQL Cluster database, including:
- Adaptive Query Localization
- NoSQL with native memcached API
- Performance enhancements
- Multi-site clustering
- Simplified active/active replication
- Consolidated user privileges
- MySQL 5.5 server integration
- Support for Virtual Machine environments
- MySQL Enterprise Monitor further enhancements for MySQL Cluster
- Automated on-line add-node
- Single-step Cluster creation (bootstrap)
For a complete listing of new features and functionality, refer to the MySQL Cluster change logs in the documentation posted at: http://dev.mysql.com/doc/refman/5.5/en/mysql-cluster-news.html

Collectively, these new features enable MySQL Cluster to serve a much broader range of use cases and applications demanding high scalability on commodity hardware, 99.999% uptime and real-time performance.
Additional Resources
Download MySQL Cluster: http://www.mysql.com/downloads/cluster/
MySQL Cluster Manager Trial (see the Resources section of the web page): http://mysql.com/products/cluster/mcm/
On-Line Demonstration - MySQL Cluster in Action: http://www.mysql.com/products/cluster/cluster_demo.html
Whitepaper - Guide to Scaling Web Databases with MySQL Cluster: http://mysql.com/why-mysql/white-papers/mysql_wp_scaling_web_databases.php
MySQL Cluster Evaluation Guide: http://www.mysql.com/why-mysql/white-papers/mysql_cluster_eval_guide.php
Best Practices Guide - Optimizing the Performance of MySQL Cluster: http://mysql.com/why-mysql/white-papers/mysql_wp_cluster_perfomance.php
MySQL Cluster Documentation: http://dev.mysql.com/doc/refman/5.5/en/mysql-cluster.html
MySQL Cluster User Forum and Mailing List: http://forums.mysql.com/list.php?25 / cluster@lists.mysql.com
Contact MySQL Sales: https://mysql.com/about/contact/sales.html