Rich Freeman on 31 Jul 2016 15:23:41 -0700
Re: [PLUG] Article on Postres/MySQL Storage Internals / Performance
On Sun, Jul 31, 2016 at 5:28 PM, Thomas Delrue <delrue.thomas@gmail.com> wrote:
> http://use-the-index-luke.com/blog/2016-07-29/on-ubers-choice-of-databases
>
> From the article (emphasis in original):
> "In my opinion Uber’s article basically says that they found MySQL to be
> a better fit /for their environment/ as PostgreSQL."
>

Thanks, this article was a useful counterpoint and it also had a few other things to add.

It seems like they glossed over the issues with coping with large databases. It seems like database engines really need to be built to cope with errors rather than requiring restoration, and that they need to be able to do major operations like upgrades while running. When you have datasets of many TB you really don't want to ever have to restore a backup (which was probably millions of updates out of date before it was even completely saved to disk), or do any significant operation offline.

I think this article also glossed over the secondary index issue a bit. I don't know how bad it is in practice, but it seems to me that having a secondary index would at most double the time to complete a select. While that is significant, it is just a fixed penalty that does not scale with the number of records. The same is true of updating multiple indexes, but unless your selects significantly outnumber your updates I suspect that the benefits for selects are not going to matter as much, for a few reasons:

1. To the extent that the DB is in RAM, the selects will always be a RAM-only operation. Updates will always require writing to disk, and thus the multiple-index penalty affects disk IO.

2. The Uber article didn't talk about cleanup, but I imagine that all those extra tuples require more time to vacuum, possibly also on disk.

Those are just some random thoughts, and I'll confess that I'm not a database expert. I'd certainly be interested in the opinions of anybody who is knowledgeable about such things.

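As a rough illustration of the trade-off above, here is a toy cost model in Python. It is only a sketch under loud assumptions: it counts index operations and nothing else, it ignores caching, disk versus RAM, vacuum, and optimizations such as heap-only updates, and the function names and the "direct"/"indirect" labels are made up for this example. "Direct" stands for a design where every index points at the row's physical location, and "indirect" for one where a secondary index stores the primary key.

```python
# Toy cost model: counts index operations only. All names here are
# hypothetical and exist purely for illustration.

def select_cost(indirect_secondary: bool) -> int:
    """Index traversals for one select that goes through a secondary index."""
    # Indirect design: the secondary index yields a primary key, which then
    # needs a second traversal of the primary index (the "at most double").
    return 2 if indirect_secondary else 1


def update_cost(total_indexes: int, changed_indexes: int,
                indirect_secondary: bool) -> int:
    """Index entries written when one row is updated."""
    # Direct design (assumed): a new row version needs a new entry in every
    # index. Indirect design (assumed): only indexes on changed columns do.
    return changed_indexes if indirect_secondary else total_indexes


def workload_cost(selects: int, updates: int, total_indexes: int,
                  changed_indexes: int, indirect_secondary: bool) -> int:
    """Total index operations for a mix of selects and updates."""
    return (selects * select_cost(indirect_secondary)
            + updates * update_cost(total_indexes, changed_indexes,
                                    indirect_secondary))


if __name__ == "__main__":
    # 1000 selects, 100 updates, 5 indexes, 1 indexed column changed per update
    for indirect in (False, True):
        label = "indirect" if indirect else "direct"
        print(label, workload_cost(1000, 100, 5, 1, indirect))
```

Both penalties in this model are per-operation constants, so neither grows with the number of records. Which design comes out ahead depends only on the select-to-update ratio and the number of indexes, and the model says nothing about whether each operation hits RAM or disk.
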
One thing I will say is that Postgres seems to have more of a reputation for quality in general. I still remember the days when InnoDB was not the default and the MySQL guys seemed to think that transactions were overrated. Of course, that was in the past, and perhaps the two platforms are more comparable now.

The other thing the article doesn't touch on at all is Oracle. We're talking about Uber. Surely they can afford to run Oracle. Is there any reason that it would or wouldn't be a better platform for a database of their scale? I'll confess I don't know much about the inner workings of Oracle, and I'd be reluctant to go near it just because it seems so closed off.

--
Rich
___________________________________________________________________________
Philadelphia Linux Users Group -- http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion -- http://lists.phillylinux.org/mailman/listinfo/plug