Sunday, March 20, 2016

I can't help it - I love PostgreSQL!

Recently, for a new project, a colleague asked me to justify my panning of MySQL in favor of PostgreSQL. While I wrote the response to him personally, I think it stands on its own and gives a nice breakdown of PostgreSQL vs. MySQL for anyone trying to make the same decision. I expect some flaming comments, but it would be nice if flamers included references/proofs:

With regard to MySQL - there are several reasons, perhaps not so much that MySQL is poor, but rather ways in which PostgreSQL surpasses it:
  • I depend heavily on JSON now (every app I make is a RESTful API), and JSON support in MySQL feels like a bit of an afterthought compared to PostgreSQL, where indexing is fast and the JSON-related functions are numerous and powerful 
  • PostgreSQL's JSON functions let me reach deep properties of a JSON document and use them in a computed (expression) index for lightning-fast aggregates and scans (see the sketch after this list) - as of v5.7 I understand MySQL can do this too (indexes on generated columns derived from deep JSON properties), but it has only been stable for a few months (a Johnny-come-lately), while PostgreSQL has had JSON support for a couple of years now 
  • Since v9.4, PostgreSQL has been outperforming MongoDB in benchmarks when using the new JSONB data type with the general-purpose GIN index, which relieves the need to pluck values out into computed indexes (easier) 
  • Since v9.5, PostgreSQL allows updates to (deep) properties within a JSONB value (the whole document does not have to be round-tripped through the application) - this will likely also place PostgreSQL ahead of MongoDB in the one remaining benchmark where MongoDB beats PostgreSQL: atomic updates of (deep) JSON properties. I have yet to see benchmarks on that 
  • PostgreSQL can run the JavaScript V8 engine (PL/V8), which lets me write some really cool triggers that fire in the data domain when that is the right place for the logic (it's not ideal to try to shove everything into the middle tier) - a sketch of one such trigger follows this list 
  • By comparison, MySQL's JSON benchmarks are weak, and if PostgreSQL were absolutely off the table I would recommend a NoSQL database (MongoDB) over MySQL if the intent were large scale *and* the application were a RESTful API (JSON) 
  • I have become expert in partitioning and sharding with PostgreSQL - I can't say the same for MySQL, although its story looks alright 
  • PostgreSQL outperforms MySQL, often notably and consistently, in most ad hoc benchmarks I have been able to find over the years 
  • In general, I have seen several large-scale solutions using PostgreSQL with impressive results. I hear about projects having success with MySQL at large scale, and I have no doubt about it, but I haven't worked on one myself 
  • MySQL is owned by Oracle, who offers a competing closed-source, non-free RDBMS that is the basis of their company. If it were not for PostgreSQL keeping the competition in open-source RDBMSs brisk (indeed, leading the field), I'm certain Oracle would hobble MySQL in order to distinguish and push its paid Oracle RDBMS in the marketplace. I believe it's a case of "use it or lose it"
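
To make the expression-index and JSONB/GIN points concrete, here is a minimal sketch of what I mean. The table and column names (api_event, payload) are invented for illustration; the pieces themselves - expression indexes, the #>> deep-path operator, and a GIN index with jsonb_path_ops - are stock PostgreSQL 9.4+:

    -- hypothetical table holding raw JSON payloads from a REST API
    CREATE TABLE api_event (
        id      bigserial PRIMARY KEY,
        payload jsonb NOT NULL
    );

    -- expression index on a deep property: fast scans and aggregates
    -- without extracting the value into its own column
    CREATE INDEX api_event_region_idx
        ON api_event ((payload #>> '{customer,region}'));

    -- or, since 9.4, a GIN index over the whole JSONB document, so
    -- containment queries need no per-property index at all
    CREATE INDEX api_event_payload_gin
        ON api_event USING gin (payload jsonb_path_ops);

    -- both can be used by the planner:
    SELECT count(*) FROM api_event
     WHERE payload #>> '{customer,region}' = 'EMEA';

    SELECT * FROM api_event
     WHERE payload @> '{"customer": {"region": "EMEA"}}';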
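
And here is a rough sketch of the PL/V8 style of trigger I'm describing, assuming the plv8 extension is installed; the order-total logic and field names are again hypothetical, and the point is simply that the trigger body is plain JavaScript running right next to the data:

    CREATE EXTENSION IF NOT EXISTS plv8;

    -- keep a denormalized total inside the JSON document itself,
    -- recomputed whenever the row is written
    CREATE FUNCTION recompute_order_total() RETURNS trigger AS $$
        var doc = NEW.payload;              // jsonb arrives as a JS object
        var items = doc.items || [];
        doc.total = items.reduce(function (sum, it) {
            return sum + (it.price * it.qty);
        }, 0);
        NEW.payload = doc;
        return NEW;
    $$ LANGUAGE plv8;

    CREATE TRIGGER order_total_trg
        BEFORE INSERT OR UPDATE ON api_event
        FOR EACH ROW EXECUTE PROCEDURE recompute_order_total();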

Sorry, some of my opinions are subjective, and my mastery of PostgreSQL makes it hard for me not to be biased towards it. Also, my comparisons to MongoDB may seem a little out of place, but it is notable that PostgreSQL is not only likely the best open-source RDBMS; it also seems poised to take the (performance) crown for document DBs.

Even compose.io (a MongoDB company) acknowledges this (https://www.compose.io/articles/is-postgresql-your-next-json-database/), where the main thrust of the article contends that, in the document-DB space, MongoDB is still king unless "your JSON is fairly unchanging and needs to be queried a lot":

"If you update your JSON documents in place, the answer is no. What PostgreSQL is very good at is storing and retrieving JSON documents and their fields. But even though you can individually address the various fields within the JSON document, you can't update a single field. Well, actually you can, but by extracting the entire JSON document out, appending the new values and writing it back, letting the JSON parser sort out the duplicates"


As of v9.5, PostgreSQL allows fast in-place JSON property updates, which clobbers that argument - because, really, what data doesn't need to be queried a lot?
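
For completeness, here is roughly what such an in-place deep update looks like in 9.5, again against the hypothetical api_event table from the earlier sketch, using the jsonb_set() function and the jsonb || concatenation operator introduced in that release:

    -- set a deep property without round-tripping the whole document
    -- through the application
    UPDATE api_event
       SET payload = jsonb_set(payload, '{customer,region}', '"APAC"')
     WHERE id = 42;

    -- merge/overwrite top-level keys with the || operator (also new in 9.5)
    UPDATE api_event
       SET payload = payload || '{"reviewed": true}'::jsonb
     WHERE id = 42;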
