K.S. Bhaskar via plug on 6 Dec 2021 07:41:23 -0800


[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]

Re: [PLUG] Thoughts on MySQL from a developer


For what it's worth, YottaDB is developing Octo, yet another SQL engine, but one which uses YottaDB as the underlying storage engine.

Slides from a recent presentation are at https://yottadb.com/wp-content/uploads/2021/11/211022-SQLAccessToVistA.pdf (video of talk with slides at https://opensourcevista.net/NancysVistAServer/39thVCMVideosAndSlides/39thVCM20211023OCTO-Habiel.mp4). The slides and talk have a VistA (the healthcare information system developed by the VA) slant, as that was the conference audience.

User documentation is at https://docs.yottadb.com/Octo/grammar.html#mapping-to-existing-yottadb-global-variables

Octo is a 100% free / open source project (AGPL v3) at https://gitlab.com/YottaDB/DBMS/YDBOcto/

Functionally, and optimization-wise, it has some way to go to catch up with PostgreSQL, MariaDB/MySQL, and others. Octo is intended to add SQL capabilities to the YottaDB hierarchical key-value database, so that our users can have their cake and eat it too. I would be happy to do a talk (or request one of the developers to do a talk) at a PLUG meeting.

Regards
– Bhaskar

On Mon, Dec 6, 2021 at 1:50 AM Keith via plug <plug@lists.phillylinux.org> wrote:
On 12/5/21 5:47 PM, brent timothy saner via plug wrote:
> On 12/5/21 17:25, Keith C. Perry wrote:
>> Not sure I was say Pg is a memory "hog" outright but relative to MySQL I get what you mean.  Scaling issues are relative too- 100M clients with 1% concurrency is less load than 100k clients with 20% concurrency.  Most people obsess about this in the application and storage systems when they really should be looking at their network.  I find concurrency issues are usually approached top down instead of bottom up.  Memory isn't the premium it used to be and if you're pooling, the necessity for having top of the line hardware for each node goes away (again, networking, networking, networking).
>>
> I'd definitely say network has a much higher influence than any other
> factor in the modern context. But keep in mind:
>
> - this memory usage per client is a *minimum* outside of any queries, and
> - this is bad news for scaling in general, regardless of actual scale size.
>
> Sure, RAM is cheap. Boards also have RAM limits. :)

LOL, true that.  Makes you wonder why people ever obsessed about scaling
up compared to scaling out.  Oh, wait- networking (again) 😈

>> That said, I'll take Pg's data resiliency over MySQL's 1 billion times over.  Last year when my Zimbra server crashed I lost a week's worth of data because the MySQL database would not open and could not be recovered.  Fortunately, I had run a closed backup of the VM a week ago which saved my ass (otherwise it would have been closer to a 30 days loss).  Apparently, MySQL still needs special care to be backed up while the system is up.  As far as I can recall I've always using InnoDB tables instead of ISAM too (which are supposed to be more "durable").  Such issues for Pg were resolved 21 years ago.  I'll admit there's a tricky balancing act no matter what you choose but personally, I'll always take database durability over performance and that skews me towards Pg out of the box.
>>
> That's fair, yeah. Clustering with hot slaves is a good way around this
> for MySQL (well... hell, any high-availability needs), but PostgreSQL is
> definitely on top with this.
>
> Now, that being said, I *personally* haven't hit any corrupt tables in
> MySQL that weren't the result of my own failure (e.g. disk filling up,
> for instance).
>
> As for special care during backups, shouldn't be necessary - though I'm
> not sure what you mean by "special care".
>
> Between the mysqldump options I have here[0] and doing a dump
> per-database[1], you should be set for any kind of disaster.
Special care = closed database, i.e. the system is not running thus
everything [should] be on disk at rest.  Putting aside some system fsync
issue that was how I used to deal with MySQL backups before using the
native online dump program.  What I didn't realize (or pay enough
attention to) is that Zimbra wasn't doing dumps or any sort of database
protection and I kept putting off confirming to myself was actually was
happening.  I took the inappropriate mindset that "I'm running InnoDB
tables so it should be just a durable as Pg now- I'll work on this
later".  That was dumb so I put this more on me than MySQL but at the
same time I wish Zimbra would make PostgreSQL an option because in my
experience it survives the typical bad things that can happen to systems
better than better than MySQL.
>
>> I'm surprised you didn't mention SQLite for basic query needs.  That running on top of XFS is pretty sweet.  BTW, I "wrote" (i.e. created a database) an inventory system for a client so I could track software and hardware assignments in MySQL.  Only thing I use to access it is phpMyAdmin and its works perfectly for my needs since it is not client facing.  Its one of the few things I don't see moving to Pg so I'm definitely not knocking MySQL 100%.
>>
> You know, I thought about it (re: SQLite) but it's not really applicable
> to the point I was getting towards - SQLite is great for small purposes!
> It flies, and it's STUPID easy to create a DB right entirely in memory
> in most libraries. BUT it doesn't have a concept of users, access
> permissions, etc. nor of a multi-client approach. It doesn't even run
> persistently via a UDS or something, and obviously networking is right
> out. So I purposely didn't mention it because it felt a little
> out-of-place in the comparison.
Gotcha.
>
> And yep, I think you understand my point. For a large amount of database
> usage out there that needs mutli-client access, MySQL is fine and
> PostgreSQL is kind of just overkill. Don't get me wrong, I think
> PostgreSQL is great and really excels at things. But there are things
> where yes, MySQL really is going to get you more bang for your buck and
> fits the project spec better and quite frankly, I'm tired of the tech
> community pretending that's not the case and PostgreSQL is a magic bullet.

I don't think its about Pg being a magic bullet but I do think that Pg
has a comfort level and loyalty that is different from the MySQL/MariaDB
community.  The way I think about it is that both are available and for
a lot of work loads its not worth getting into the weeds over non-major
gains in performance one way or the other.  Major issues, sure but most
people are going to lean into their comfort zone unless they can't.

>
> [0]
> https://git.r00t2.io/BorgExtend/tree/plugins/mysql.py?id=7ef21e8059f4d1db817a5552aaca2f702c89e20e#n35
>
> [1]
> https://git.r00t2.io/BorgExtend/tree/plugins/mysql.py?id=7ef21e8059f4d1db817a5552aaca2f702c89e20e#n60
> ___________________________________________________________________________
> Philadelphia Linux Users Group         --        http://www.phillylinux.org
> Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
> General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug

--
~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Keith C. Perry, MS E.E.
Managing Member, DAO Technologies LLC
(O) +1.215.525.4165 x2033
(M) +1.215.432.5167
www.daotechnologies.com

___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug
___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug