Rich Freeman on 31 Jul 2016 15:23:41 -0700



Re: [PLUG] Article on Postgres/MySQL Storage Internals / Performance


On Sun, Jul 31, 2016 at 5:28 PM, Thomas Delrue <delrue.thomas@gmail.com> wrote:
> http://use-the-index-luke.com/blog/2016-07-29/on-ubers-choice-of-databases
>
> From the article (emphasis in original):
> "In my opinion Uber’s article basically says that they found MySQL to be
> a better fit /for their environment/ as PostgreSQL."
>

Thanks, this article was a useful counterpoint and it also had a few
other things to add.

It seems like they glossed over the issues with coping with large
databases.  It seems to me that database engines really need to be
built to cope with errors rather than requiring restoration, and that
they need to be able to do major operations like upgrades while
running.  When you have datasets of many TB you really don't want to
ever have to restore a backup (which was probably millions of updates
out of date before it was even completely saved to disk), or do any
significant operation offline.

I think this article also glossed over the secondary index issue a
bit.  I don't know how bad it is in practice, but it seems to me that
going through a secondary index (secondary index first, then the
primary key) would at most double the time to complete a select.
While that is significant, it is just a fixed penalty that does not
scale with the number of records.  The same is true of updating
multiple indexes, but unless your selects significantly outnumber
your updates I suspect that the faster selects are not going to
matter as much as the slower updates, for a couple of reasons:
1.  To the extent that the DB is in RAM, the selects will always be a
RAM-only operation.  Updates will always require writing to disk and
thus the multiple-index penalty is affecting disk IO.
2.  The Uber article didn't talk about cleanup, but I imagine that all
those extra tuples require more time to vacuum, possibly also on
disk.
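
The trade-off above can be sketched as a toy cost model.  This is only
an illustration under loud assumptions, not a description of how either
engine actually behaves: it counts logical index operations rather than
real I/O, it assumes the "direct" design moves the row on every update
(so it ignores optimizations such as Postgres heap-only tuple updates),
and the function names select_cost and update_cost are hypothetical.

```python
# Toy cost model: counts logical index operations, not real disk I/O.
# All names here are made up for illustration.

def select_cost(indirect: bool) -> int:
    """Index traversals needed to find one row through a secondary index."""
    # Indirect design: secondary index -> primary key -> row (two traversals).
    # Direct design: secondary index -> row location (one traversal).
    return 2 if indirect else 1


def update_cost(indirect: bool, num_indexes: int,
                indexed_columns_changed: int = 0) -> int:
    """Index entries written when a single row is updated."""
    if indirect:
        # Only the indexes whose own column changed need a new entry.
        return indexed_columns_changed
    # Direct design, assuming the row lands in a new location on every
    # update: every index needs an entry pointing at the new location.
    return num_indexes


if __name__ == "__main__":
    for n in (1, 4, 16):
        print(f"indexes={n:2d}  "
              f"select indirect/direct={select_cost(True)}/{select_cost(False)}  "
              f"update indirect/direct={update_cost(True, n)}/{update_cost(False, n)}")
```

In this model the select penalty stays at a factor of two no matter how
many rows or indexes there are, while the update penalty grows with the
number of indexes on the table, again independent of the row count.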

Those are just some random thoughts, and I'll confess that I'm not a
database expert.  I'd certainly be interested in the opinions of
anybody who is knowledgeable about such things.

One thing I will say is that Postgres seems to have more of a
reputation for quality in general.  I still remember the days when
InnoDB was not the default and the MySQL guys seemed to think that
transactions were overrated.  Of course, that was in the past, and
perhaps the two platforms are more comparable now.

The other thing the article doesn't touch on at all is Oracle.  We're
talking about Uber.  Surely they can afford to run Oracle.  Is there
any reason that it would or wouldn't be a better platform for a
database of their scale?  I'll confess I don't know much about the
inner workings of Oracle, and I'd be reluctant to go near it just
because it seems so closed off.

-- 
Rich
___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug