Thomas Delrue on 1 Aug 2016 13:25:07 -0700


Re: [PLUG] Article on Postgres/MySQL Storage Internals / Performance


Here are some more reactions from the pgsql-hackers@postgresql.org mailing
list:
https://www.postgresql.org/message-id/flat/579795DF.10502%40commandprompt.com

(note: the thread goes on for a long time :) )


On 07/31/2016 06:23 PM, Rich Freeman wrote:
> On Sun, Jul 31, 2016 at 5:28 PM, Thomas Delrue <delrue.thomas@gmail.com> wrote:
>> http://use-the-index-luke.com/blog/2016-07-29/on-ubers-choice-of-databases
>>
>> From the article (emphasis in original):
>> "In my opinion Uber’s article basically says that they found MySQL to be
>> a better fit /for their environment/ as PostgreSQL."
>>
> 
> Thanks, this article was a useful counterpoint and it also had a few
> other things to add.
> 
> It seems like they glossed over the issues with coping with large
> databases.  It seems like database engines really need to be built to
> cope with errors rather than requiring restoration, and that they need
> to be able to do major operations like upgrades while running.  When
> you have datasets of many TB you really don't want to have to ever
> restore a backup (which was probably millions of updates out of date
> before it even was completely saved to disk), or do any significant
> operation offline.
> 
> I think this article also glossed over the secondary index issue a
> bit. I don't know how bad it is in practice, but it seems to me that
> having a secondary index would at most double the time to complete a
> select.  While that is significant, it is just a fixed penalty that
> does not scale with the number of records.  The same is true of
> updating multiple indexes, but unless your selects significantly
> outnumber your updates I suspect that the benefits for selects are
> not going to matter as much, for a few reasons:
> 1.  To the extent that the DB is in RAM, the selects will always be a
> RAM-only operation.  Updates will always require writing to disk and
> thus the multiple-index penalty is affecting disk IO.
> 2.  The Uber article didn't talk about cleanup, but I imagine that all
> those extra tuples require more time to vacuum, possibly also on
> disk.
> 
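
To put rough numbers on the secondary-index trade-off described above,
here is a toy cost model in Python. It is only a sketch under stated
assumptions, not a measurement: the function names and the two
pointer_kind labels are made up for illustration. "physical" stands for
the design the Uber article attributes to Postgres (index entries point
at the row's location, and HOT updates are ignored), and "primary_key"
stands for an InnoDB-style design where secondary indexes store the
primary key.

```python
# Toy cost model, not a benchmark. All names are hypothetical.

def index_writes_per_update(num_indexes, indexes_on_changed_columns,
                            pointer_kind):
    """Count index entries written when a single row is updated.

    "physical":    index entries point at the row's physical location,
                   so a new row version needs a new entry in every
                   index (assumption: no HOT-style optimization).
    "primary_key": secondary index entries point at the primary key,
                   so only indexes on changed columns are touched.
    """
    if pointer_kind == "physical":
        return num_indexes
    if pointer_kind == "primary_key":
        return indexes_on_changed_columns
    raise ValueError("unknown pointer_kind: %r" % (pointer_kind,))


def index_lookups_per_select(pointer_kind):
    """Index traversals for one select that goes through a secondary index."""
    if pointer_kind == "physical":
        return 1  # secondary index leads straight to the row
    if pointer_kind == "primary_key":
        return 2  # secondary index, then the primary-key index
    raise ValueError("unknown pointer_kind: %r" % (pointer_kind,))


if __name__ == "__main__":
    # A table with 5 indexes; the update changes one non-indexed column.
    for kind in ("physical", "primary_key"):
        print(kind,
              "writes:", index_writes_per_update(5, 0, kind),
              "lookups:", index_lookups_per_select(kind))
```

In this model the select penalty is a constant factor of two, while the
update penalty grows with the number of indexes on the table, which is
the asymmetry the quoted paragraph is pointing at.
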
> Those are just some random thoughts, and I'll confess that I'm not a
> database expert.  I'd certainly be interested in the opinions of
> anybody who is knowledgeable about such things.
> 
> One thing I will say is that Postgres seems to have more of a
> reputation for quality in general.  I still remember the days when
> InnoDB was not the default and the MySQL guys seemed to think that
> transactions were overrated.  Of course, that was in the past, and
> perhaps the two platforms are more comparable now.
> 
> The other thing the article doesn't touch on at all is Oracle.  We're
> talking about Uber.  Surely they can afford to run Oracle.  Is there
> any reason that it would or wouldn't be a better platform for a
> database of their scale?  I'll confess I don't know much about the
> inner workings of Oracle, and I'd be reluctant to go near it just
> because it seems so closed off.
> 


___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug