Sie sind auf Seite 1von 8

2ndQuadrant 10 reasons why MySQL Users should move to PostgreSQL

15 June 2011

2ndQuadrant: 10 reasons why MySQL Users should move to PostgreSQL

Copyright 2011, 2ndQuadrant Limited. All rights reserved. This document is provided for information purposes only and the contents hereof are subject to change without notice. This document is not warranted to be error-free, nor subject to any other warranties or conditions, whether expressed orally or implied in law, including implied warranties and conditions of merchantability or fitness for a particular purpose. We specifically disclaim any liability with respect to this document and no contractual obligations are formed either directly or indirectly by this document. This document may not be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose, without our prior written permission. The PostgreSQL elephant logo "Slonik" is a registered trademark of the PostgreSQL Global Development Group. MySQL is a registered trademark of Oracle Corporation and/or its affiliates.

Table of Contents
Introduction ................................................................................................................................ Comparing MySQL and PostgreSQL at a glance ............................................................................... Reason 1: Enterprise class query planner ........................................................................................ Reason 2: Crash safety, reliability and the Write-ahead log ................................................................. Reason 3: Streaming replication ..................................................................................................... Reason 4: Excellent third-party/satellite tools .................................................................................... Reason 5: Advanced Features ....................................................................................................... Reason 6: Quality and data-integrity first .......................................................................................... Reason 7: Licensing .................................................................................................................... Reason 8: Strong community ......................................................................................................... Reason 9: Procedural languages .................................................................................................... Reason 10: SQL compliance and integration .................................................................................... Advanced, ISO conforming SQL dialect ................................................................................... Support for popular ORMs .................................................................................................... About 2ndQuadrant ...................................................................................................................... Business and Contact Information .................................................................................................. 3 3 3 4 4 5 5 6 6 6 7 7 7 8 8 8

Copyright 2011, 2ndQuadrant Italia (Devise.IT S.r.l.)

2ndQuadrant: 10 reasons why MySQL Users should move to PostgreSQL

Introduction
Over the last decade, the two main open-source database solutions were MySQL and PostgreSQL. Among ICT professionals, there is a traditional perception that MySQL is faster and easier to use than PostgreSQL. On the other hand, PostgreSQL has always been acknowledged to be more feature-rich and robust than MySQL. Over the years, both products have probably learnt from each other, and both have certainly improved considerably. PostgreSQL, with 5 major releases in the last five years, has focused on improving performance, high availability and replication as well as making life easier for DBAs and system administrators. The intended audience of this whitepaper is MySQL users and developers who are interested in PostgreSQL. These 10 reasons cover both the product (the PostgreSQL object-relational DBMS) and the open source project (licensing, community, related tools, the PostgreSQL ecosystem, etc.).

Comparing MySQL and PostgreSQL at a glance


Product License License costs
c

MySQL Community Edition GPL


a

MySQL Classic Edition Commercial N/A

MySQL Standard Edition Commercial USD 2,000

MySQL Enterprise Edition Commercial USD 5,000

MySQL Cluster CGE Commercial USD 10,000

PostgreSQL 9 TPL
b

Free under GPL

Free

Commercial embedded systems Connectors Triggers

Contact Oracle Sales Team Row triggers only

Row & d Statement LDAP, GSSAPI, SSPI, SSL Certificate, Radius, etc.
e

Standard authentication methods

ACID transactions Online Hot Backup Master-Slave Replication Partitioning Point-In-Time Recovery Administration GUI Monitoring tools Bug fixes, security and maintenance releases Professional 24x7 support
a b c

Contact 2ndQuadrant Sales Team

GNU General Public License: http://www.gnu.org/licenses/gpl.html The PostgreSQL License is a highly permissive BSD-style license: http://www.postgresql.org/about/licence PostgreSQL 9.1 will support triggers on views PostgreSQL 9.1 will support synchronous replication Integration with standard monitoring and alerting tools such as Cacti, Munin and Nagios

MySQL license costs are per 1-4 socket server per year. Pricing on 5+ socket servers requires contacting Oracle Sales Team

d e f

Ability to recover the database situation to a particular point in time or transaction

Reason 1: Enterprise class query planner


Perhaps the single biggest factor that differentiates PostgreSQL from MySQL is its extremely sophisticated query planner.

Copyright 2011, 2ndQuadrant Italia (Devise.IT S.r.l.)

2ndQuadrant: 10 reasons why MySQL Users should move to PostgreSQL

PostgreSQL uses cost-based query optimization methods in order to get good performance for many different types of joins. Query costs are assessed based on planner statistics collected when tables are analyzed combined with adjustable planner cost constants. Advanced features such as the Genetic Query Optimizer allow optimizing even very complicated joins efficiently. MySQLs query optimizer estimates cost by heuristically calculating the number of disk blocks that will most likely need to be read. Because the query optimizer is implemented in the server, not the storage engines, not all the salient information is available to the optimizer. When the optimizer chooses badly, there are only a few tunables for Controlling Query Optimizer Performance. Developers must do things like explicitly provide index hints to make sure joins are executed correctly. To make this task smoother, MySQL provides a Query Profiler that is easier to work with than typical EXPLAIN data. Finding order in execution provides several comparisons of how the two databases handle queries differently. Because of its more robust automatic optimization, PostgreSQL usually does a better job of handling complicated joins than MySQL. Slow queries on PostgreSQL can easily have their explains automatically logged using the auto_explain 1 module . This approach is useful for DBAs who like to pro-actively tune slow-performing queries.

Reason 2: Crash safety, reliability and the Write-ahead log


A continual focus for 2ndQuadrants community work on PostgreSQL has been the write-ahead log. Write-Ahead 2 Logging (WAL) is a standard method for ensuring data integrity and crash safety of database transactions. Essentially, WALs central concept is that changes to data files (where tables and indexes reside) must be written only after those changes have been logged, that is, after log records describing the changes have been flushed to permanent storage. PostgreSQL disaster recovery features, including Continuous Archiving and Point-In-Time Recovery (PITR), 3 are based on Write-Ahead Logging . On the other hand, MyIsam lacks transactions. In the event of a crash, it must rebuild indexes and tables. This takes an amount of time that is rises at a rate that is roughly linear to the size of the database. This can have negative implications for the availability of your database. The MySQL documentation has an extensive guide to repairing MyIsam tables. While MySQL has gone someway towards moving away from MyIsam, particularly in the recent 5.5 release where it is no longer the default storage engine, many production databases still use MyIsam. The InnoDB equivalent of the write-ahead log is the binary log.

Reason 3: Streaming replication


PostgreSQLs built-in asynchronous replication (Hot standby and Streaming replication) is just a logical extension of write-ahead logging. The database is replicated with a tiny delay, typically one transaction or a few seconds. The primary database (also known as "master") and the standby databases (sometimes referred to as "slaves") are identical at the binary level. In the upcoming 9.1 release, due out in Autumn 2011, replication is further simplified. A new utility, pg_basebackup, can be used to take base backups of a running PostgreSQL database cluster. These are taken without affecting other clients to the database, and can be used both for point-in-time recovery and as the starting point for streaming replication standby servers. In PostgreSQL 9.1, 2ndQuadrant have added a new type of replication: synchronous replication. This affords an extra level of durability, by having transactions commit only when they are safely stored on one standby. It is possible to change the service level at the transaction boundary, and designate only certain transactions as needing the extra durability as appropriate.
1 2 3

http://www.postgresql.org/docs/current/static/auto-explain.html http://www.postgresql.org/docs/current/static/wal.html http://www.postgresql.org/docs/current/static/continuous-archiving.html

Copyright 2011, 2ndQuadrant Italia (Devise.IT S.r.l.)

2ndQuadrant: 10 reasons why MySQL Users should move to PostgreSQL

2ndQuadrant has developed repmgr , an open-source tool for managing PostgreSQL replication at a very high level. Features include the ability to: Create new standbys using PostgreSQL Streaming Replication; Analyse replication lag that impacts both high availability and Hot Standby queries; Perform maintenance tasks required to handle failover when a node is removed from the standby cluster, either intentionally or via failure.

Reason 4: Excellent third-party/satellite tools


There are a large number of PostgreSQL satellite projects. For example, PostGIS is considered to be the most mature open source Geographical Information System - the first version was released in 2001. It allows users to 6 perform geospatial queries, transforming PostgreSQL into a geodatabase. The OpenStreetMap project moved 7 to PostgreSQL in 2009 from MySQL. MySQLs geospatial support is limited to 2D MBR (minimum bounding rectangle) functions. PostgreSQL/PostGIS, in contrast, has 300 functions and operators and some MM support of circular strings and compound curves. The extensibility of Postgres makes this sort of development quite natural, particularly its support for user defined datatypes and indexing schemes. Postgres supports two types of generalised indexes - GiST (Generalised Search Tree) and GIN (Generalised inverted index). The main advantage of GiST and GIN is that they allow the development of custom data types with appropriate access methods, by an expert in the domain of the data type, rather than a database expert. Apart from geospatial applications, these generalised indexes have been used to define custom indexing schemes for anything from soundex support to indexing discrete elements of arrays in a table with an array column very efficiently. With PostgresQL 9.1, the addition of the K-nearest neighbour (KNN) algorithm allows for ORDER BY clauses to use GiST indexes directly. This support will make for world-beating GIS in the upcoming 2.0 release of PostGIS, as the feature is not available from any of the proprietary databases. There are also satellite projects for every conceivable supporting task - mature connection poolers, instrumentation/monitoring tools (including support for popular monitoring frameworks like Nagios and Cacti), log analysers, GUI administration tools (pgAdmin, phpPgAdmin) and more.
5

Reason 5: Advanced Features


PostgreSQL has a number of features that set it apart from even the expensive proprietary databases. One such advanced feature is transactional DDL. In PostgreSQL, when you are inside a transaction almost any operation can be undone. There are some irreversible operations (like creating or destroying a database or tablespace), but normal table modifications, including creating and dropping tables, can be backed out of by issuing a ROLLBACK, due to its Write-ahead log design. MySQL doesnt support any sort of rollback when using MyISAM, because it is not transactional. Even InnoDB and Oracle dont provide transactional DDL, because DDL operations cause the currently open transaction to implicitly commit. Experienced PostgreSQL DBAs know to take advantage of its features here to protect themselves when doing complicated work like schema upgrades. If you put all such changes into a transaction block, you can make sure they are applied atomically all changes are applied or none are. This drastically lowers the possibility that the database will be corrupted by a typo or other such error in the schema change, which is particularly important when youre modifying multiple related tables where a mistake might destroy the relational key.
4 5 6

http://projects.2ndquadrant.com/repmgr http://postgis.refractions.net/ http://www.openstreetmap.org/ 7 http://blog.cleverelephant.ca/2009/04/openstreetmap-moves-to-postgresql.html

Copyright 2011, 2ndQuadrant Italia (Devise.IT S.r.l.)

2ndQuadrant: 10 reasons why MySQL Users should move to PostgreSQL

Another advanced feature is exclusion constraints. Exclusion constraints are described as a generalisation of unique constraints across all operators. This feature is often used to solve the double-booking problem (ensure that each room, each of which can be in use for some period of time, isnt double booked), without race conditions, and without heavy, explicit table-level locks. PostgreSQL has been the subject of numerous academic papers and research projects, and is considered a good basis for cutting-edge database research. For example, it will soon have the first production implementation of Serializable snapshot isolation (SSI), which is based on a 2008 academic paper by Cahill, et al, which itself won a best paper award from ACM SIGMOD.

Reason 6: Quality and data-integrity first


PostgreSQL has traditionally focused on quality and data integrity. A simple example of this trend is the date datatype: in MySQL, it is possible to set a date to February 31st. PostgreSQL wont allow you to set the date to any day in February past the 28th - except the 29th on leap years. There has never been a PostgreSQL release with known bugs at release time. In contrast, MySQL has shipped with known bugs as a matter of course, and certain bugs are known to have been in MySQL for years. MySQLs criteria for General Availability does not require that the release contain no bugs; rather, it requires that it contains no serious bugs. The MySQL project continues to introduce new features in minor releases of MySQL, exposing their users to potential incompatibilities (See the Functionality added or changed section of many MySQL release notes, including, for example, 5.1.38, 5.1.41 and 5.5.6. The 5.5.6 release notes note an Incompatible change). The PostgreSQL project has a simple policy: point releases are limited to bug fixes. Major versions are supported for at least 5 years. We dont ship with known bugs, and require all patches to pass extensive regression tests before they are accepted.

Reason 7: Licensing
While 2ndQuadrant employees are very well represented among the developers of PostgreSQL, 2ndQuadrant do not dominate or otherwise control PostgreSQLs development, nor does any other commercial entity. Copyright of the source code is implicitly assigned to the PostgreSQL Global Development Group, a unincorporated loose association of PostgreSQL developers. Given the extremely liberal terms of the BSD-style PostgreSQL licence, it would not make any practical difference if a commercial entity were assigned the copyright. In contrast, Oracle owns the source code of MySQL , and vigorously pursues violators of its licensing terms. MySQL AB and its successors have a long history of ensuring that all available database APIs were GPL licensed, to force proprietary application developers to either make their client applications GPL open source, or to purchase commercial licences (when an application links to a library, it is generally considered to be a derivative work; when it connects over a network, it is generally not).

Reason 8: Strong community


Since Sun and their MySQL intellectual property were acquired by Oracle, the MySQL community has become more fractured than ever. There are now a number of popular forks of the MySQL project. Aside from Oracle MySQL, there is also considerable interest in Monty Widenius' MariaDB project. While that project strives for binary compatibility with Oracle MySQL, it is far from clear how viable that is in the long term. These sorts of shadow forks can be difficult to maintain; for example, Redhat, inc. have recently started to frustrate Oracles use of Redhat source code to produce a simple re-branding fork, Oracle Unbreakable Linux. There is also considerable interest in Drizzle, which is a project that seeks to return MySQL to its roots by adapting a micro kernel architecture, and concentrating on read mostly web applications. It may put considerable onus on the developer to rewrite their application. For example, Drizzle does not support stored procedures, and triggers must be written in C++.

Copyright 2011, 2ndQuadrant Italia (Devise.IT S.r.l.)

2ndQuadrant: 10 reasons why MySQL Users should move to PostgreSQL

MySQL suffers from a proliferation of storage engines, including MyISAM, Percona, OurDelta, InnoDB and Falcon (discontinued after Oracle acquisition). The PostgreSQL community revolves around community PostgreSQL. While there is a small number of proprietary forks that serve niche markets, such as EMCs leading data warehousing solution Greenplum, there is a singular focus on improving PostgreSQL as a general database that performs well with varied types of work. PostgreSQL treats documentation as part of the project proper in other words, all patches written must have accompanying documentation, and the documentation is stored as SGML in the same source repository as the source code to the server itself. Errors in the documentation are considered to be bugs, and are fixed with the same expediency as any other sort of bug. The documentation is extensive, organised, and of high quality. Finally, the PostgreSQL community is extended across the world; It has representatives and volunteers in many countries of all continents that have embraced the open-source spirit.

Reason 9: Procedural languages


PostgreSQL supports numerous loadable procedural languages PL/PgSQL is similar to Oracles PL/SQL, and can be graphically debugged from within the pgAdmin administration/development tool. Other official procedural languagees include PL/Python, PL/Perl, and PL/TCL. PL/Java and PL/R are available as separate open-source projects. In most cases, these powerful procedural languages can be safely sandboxed, so that non-superusers can securely define procedural functions and triggers. These languages can be used to define arbitrary constraints on database-level datatypes (domains) and tables. This is in rather stark contrast to MySQL, where functions and triggers are generally written in a limited procedural variant of SQL. There is support for triggers written in compiled languages like C and C++.

Reason 10: SQL compliance and integration


Advanced, ISO conforming SQL dialect
PostgreSQL supports a dialect of SQL that is generally considered one of the most ISO standard conforming. For example, it serves as a reference model of standards conformance for the SQLite developers. It has support for enterprise class SQL features that were previously only available in the proprietary databases. 8 These include Window Functions and WITH Queries . Window Functions, which are of particular use in data-warehousing applications, can summarise a part of the total result set within a single row of that result set. So, for example, they could be used to show each employee, their salary, the department of that employee, and, using a window, the average salary within that department. This is all accomplished within a single, simple query. WITH queries are a way of creating simple temporary tables that only persist for the length of a single query, and are typically used to break down complicated queries into simpler parts. They can be reused and even be recursive, which allows the user to define queries that summarise hierarchical data. 2ndQuadrant consultant Marko Tiikkaja has extended WITH queries to support containing data-modifying statements. This support will be available in the upcoming 9.1 release. Apart from supporting none of these advanced features, the SQL dialect of MySQL differs significantly from that of the SQL standard. These extensive differences are described in a section of the MySQL documentation, MySQL 9 Differences from Standard SQL .
8 9

Also known as *Common Table Expressions (CTE) http://dev.mysql.com/doc/mysql-reslimits-excerpt/5.5/en/differences-from-ansi.html

Copyright 2011, 2ndQuadrant Italia (Devise.IT S.r.l.)

2ndQuadrant: 10 reasons why MySQL Users should move to PostgreSQL

Support for popular ORMs


PostgreSQL is supported by all popular ORMs, including Hibernate, ADO.NET (the npgsql project, PostgreSQLs .NET driver, has an ADO.NET provider), Django, SQLAlchemy and Active Record.

About 2ndQuadrant
2ndQuadrant are the leading experts in PostgreSQL administration and performance tuning, specialising in critical operational databases for customers worldwide. We provide hands-on services covering the following areas: Operational DBA Operational Support Development/Consulting Custom PostgreSQL development

2ndQuadrants customers include BBC, Comodo, Enova Financial, Ericsson, F-Secure, McAfee, Skype and many others. 2ndQuadrant has more than 20 staff supporting customers in more than 16 countries from offices in US, UK, France, Italy and Germany, with staff located worldwide. 2ndQuadrants headquarters are in the UK. 2ndQuadrants differentiator is our ability to deploy top-level skills when and where they are needed. Our strategy is to employ leading solution experts people who have contributed directly to the core technologies around PostgreSQL and at the same time have extensive experience on real world solutions. 2ndQuadrant are also a close partner of EMC, having worked with Greenplum closely since 2005, as a result of our skills and experience with Data Warehousing database systems. 2ndQuadrant are the only company worldwide able to offer both PostgreSQL and Greenplum services, with a team of 6 staff experienced with Greenplum technology.

Business and Contact Information


2nd Quadrant Limited UK Company Number 4075460 7200 The Quorum Oxford Business Park North Oxford OX4 2JZ United Kingdom Phone: +44 870 766 7756 Website: http://www.2ndQuadrant.com/ E-mail: info@2ndQuadrant.com

Copyright 2011, 2ndQuadrant Italia (Devise.IT S.r.l.)