Barry Roomberg on Wed, 9 Apr 2003 21:18:08 -0400



Re: [PLUG] Can Open Source Replace Oracle?


Wow!

You people have WAY too much time on your hands.
So I'll throw a bit more gas on the fire.

About a month ago a friend of mine asked the same question 
on a site I hang out on.  After a bit of bantering
about PostgreSQL's crappy performance and lousy ODBC drivers,
someone asked if he had looked at SAP-DB, extolling its
virtues.

Note:  Sorry about the URLs.  The web interface this was posted on eats
the visual portion of it.

My initial response was:
----------------------------------------------------
----------------------------------------------------

Looked. Then looked away.

Dangerous choice.

Unlikely.

1st, I believe we all agree that of the various 
databases mentioned, all of them have the "kiss of death"
in one way or another, i.e. performance, lousy 
integration, iffy support, etc., at the "enterprise"
level. 

Other than SAP-DB, right?


Ok, the Google test shows:

Postgresql:
http://www.google.co...59-1&q=postgresql [*]
2.5 million. 

MySQL:
http://www.google.co...SO-8859-1&q=mysql [*]
8.19 million.

SAPDB and SAP-DB:
http://www.google.co...SO-8859-1&q=sapdb [*]
95 thousand
http://www.google.co...-1&q=%22sap-db%22 [*]
27 thousand

And lest I piss off CRC:
Interbase
http://www.google.co...859-1&q=interbase [*]
468 thousand


Not that I consider these "stats" meaningful, I merely use them
as an example to show what level of pushback I am getting on
ANY open source database that is not Postgresql or MySQL.

A database has too much integration with too many pieces of the
company to be installed in an isolated fashion. You have multiple
sets of people who have to code for it, link web screens to it, install
it, back it up, etc. And the incumbent Unix databases (Oracle and DB2)
are not LOATHED by the people who use them. The people who HATE 
them are the people paying the bills. The people who admin them
usually like them a LOT. 

Linux had multiple things going for it that allowed it to overwhelm 
the inertia of the installed base of NT.

1) Admin population pissed with NT looking for alternatives.

2) End user population mostly clueless, not caring, but not happy
when the servers went down.

3) Admin user base from Unix people liked it right away. They were
wary, and continue to use "real" Unix as needed, but it is not 
mutually exclusive. Adding a Linux box to a bunch of Unix machines
is pretty painless, and if it doesn't work out you can quickly move
to a Unix box before getting fired.

4) Seamless drop-in install. Set up a Samba box and the end users could
not tell the difference.

5) Installed at no charge by advocates. They weren't going to bring 
problems to management, so Linux only got GOOD press by the people
who set it up.

6) Ran faster on older equipment, so there was no performance drop to
explain away.

7) Was more stable than NT, users were happier.

8) When there was a problem, people sshed from home so things got
fixed faster.

Samba installs were a HUGE skunkworks project in companies across the
world. Then add project specific Apache installs. For static pages,
it was the same as IIS from an end-user perspective, but then it started
getting press and was noticed. Add a bit of network glue: DNS, DHCP,
RAS servers. Toss in an occasional Oracle install. Set up a highly 
publicised Beowulf cluster.

So for YEARS there was a march forward until one day, people woke up,
and it was EVERYWHERE. But that STILL wasn't good enough for a large
number of companies. Until IBM blessed it, it was rarely publicly 
acknowledged.

Now try to imagine it playing out with an Open Source database. There
are conflicts almost every step of the way. 

1) Admin population quite happy with the current database.

2) End users (programmers in this case) with highly specific
knowledge who would need to work twice as hard in an unfamiliar
environment. And don't bother telling me about generic SQL. SQL
(as implemented) is NOT generic. Add in the various procedural 
extensions and people are LOCKED.

3) Same problem with admin base. They like what they have, and now
need to work a lot harder. The syntax is not the same, features are 
missing, and it is a LARGE effort to move between databases in the 
event of an emergency. Fireable offence for choosing the wrong one.

4) Nothing seamless. Huge amounts of political arm twisting and 
then trying to get users to do things a different way. Everywhere.
People just waiting for it to fall apart and scream to the boss
how production is down and for what?

So the damn geek can save $17K? DAMMIT - WE ARE LOSING MILLIONS 
RIGHT NOW!!!!

Even worse, lots of current drivers just get "weird". You get
3 weeks into a project and then have to scramble, looking for 
alternatives. I JUST hit that wall. 

PHB said: No Oracle for this project, use Postgresql.

Required 3rd party software said: PUKE - HURL - DIE.

So we brought in a SQL/Server install just for this project.
I am SCREWED on every bad thing about moving databases, while
simultaneously being screwed on cost and forced to deal
with NT, all under project timeline crush.

Since I like my PHB, I'm not screaming too loudly about it,
but it still sucks.

5) Installed poorly by people who want it to fail or installed at 
consulting expense by mercenaries. They'll be happy to do it, 
and be happy to charge you to maintain it until your staff gets up 
to speed. And then, the staff brings LOTS of problems to management, 
so the alternative database only gets BAD press from the people who are 
forced to admin and use it.

6) Runs slower on current equipment. People really have no clue
of how fast Oracle is compared to the other bases. It costs me time
and money every time I use something else. But now that we've
installed a few million records on SQL/Server, it is starting to
prove itself out, so the door has opened for it.

7) Is less stable than the current Oracle or DB2. And the people running
it like it that way so they can kill it. But until then, you will only 
hear about it dying for no reason. Or certain syntax STOPS working.
Or it forgetting there are indexes. Or how snapshot backups
MIGHT not work. Lots of FUD will be spread.

8) When there was a problem, people sshed from home so things got
fixed faster. OK, this stays the same. EXCEPT WHEN I NEED TO DEAL
WITH THE NT DAMMIT!


Now assume SAPDB is so damn good, all the above technical concerns are 
dealt with. It still won't matter. You need enough "buzz" for a 
political champion to risk his job over it.

Say the words: Hey, let's run our mission critical system on SAP-DB!

Most likely, the response will be: SAP-DB? Never heard of it.

If the person has ANY technical leaning but isn't a real techie,
the response will be: Isn't that the piece of crap that SAP had to 
jettison in order to get into real accounts, and then they rewrote 
everything for Oracle and DB2? Didn't they open source it when they 
decided to stop supporting it?

Right or wrong, you need to convince some really high up people to 
risk their careers with this type of migration, and it will not
happen very often, if EVER.


----------------------------------------------------
End of original post.
----------------------------------------------------

So then the flamefest began, with the SAP-DB person deciding I was the
anti-Christ.  We went a few rounds, had a little fun (at least I did),
and then I decided to give it a try.

Here was my followup post 2 weeks later:



----------------------------------------------------
----------------------------------------------------


SAP-DB review 

So I'm talking to my boss about a week ago.

He told me he's under pressure from his boss to dump
Oracle and move EVERYTHING to SQL Server.

The failed PostgreSQL project that got rescued by SQL Server
was enough to put this evil thought into the boss at
the next level. 

So my boss, who LOVES M$ desktop tools and HATES M$
backend stuff wants to try MySQL again. He's been told
by the Open Source gnomes that the ODBC drivers are
MUCH better. I told him he was insane. Just because
the ODBC driver works does not mean the tool generated
SQL will work.

So, could we please try SAP-DB?

His response (I could have scripted it, actually I DID 
script it!): What the hell is that?

Note: This guy was a C programmer, adminned Oracle/Solaris
systems, is very good in Perl, and reads the industry rags.
And HE said: What the hell is that?

So I explained that it was SAP's database product that they used
to have for the back-end installs. They open sourced it but
they still sell it, and I'm told they have great support for 
it.

His response: So, it's another Interbase?

Me: I don't believe so. They still sell it as part of their
whole suite and they make good money supporting it.

Him: So it could quickly become another Interbase?

Me: Maybe. Dunno. Doubt it since they have lots of installations
as part of their full suite. If you really want to use an open 
source base that WORKS, as opposed to a 1/2 ass one like MySQL,
I think we should investigate this.

Him: Ok. You'll be freed up in about 3 weeks. You can spend a 
week on it then.

Rather than wait to be freed up (never happen) I started installing
immediately. 1st to my WinXP laptop, then to a dual PIII/1.4GHz 
Linux box.

Note: This is the BEST tutorial page I came across:
http://www.rst-consu...mcli_us_html.html [*]

I stumbled across a bunch of stuff. It took me DAYS to figure out
how to load data. And I DID ask our resident SAP-DB (guru|bigot),
who pointed me to web pages that did not help. All the docs refer 
to loadercli, which is renamed to repmcli on Linux. Then it took
another day to figure out how to tell it to use tab delimited.

Really.

They call it compressed, and you have to place the following line
in your loader control file to make it work:

SET COMPRESSED '/ //'

Note the embedded tab!

Dates are painful. I never did figure out how to load multiple date 
formats at once. I could not even figure out how to load null 
dates. I must be an idiot.
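
One way around the date problem, outside SAP-DB entirely, is to clean up
the file before the loader ever sees it. What follows is only a minimal
Python sketch of that idea, not anything from the SAP-DB docs. It assumes
a tab-delimited input, that you know which columns hold dates, and that
the formats listed are the ones actually present. The function names, the
format list, the YYYYMMDD output, and the null marker are all hypothetical;
how the loader wants a null date spelled is something to check against
its documentation.

```python
from datetime import datetime

# Assumed input formats; adjust to whatever the source files really contain.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y%m%d")

def normalize_date(raw, null_marker=""):
    """Return the date as YYYYMMDD, or null_marker when the field is empty.

    Raises ValueError for a non-empty value that matches none of the
    assumed formats, so bad data is not silently turned into nulls.
    """
    text = raw.strip()
    if not text:
        return null_marker
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime("%Y%m%d")
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {raw!r}")

def normalize_line(line, date_columns, null_marker=""):
    """Rewrite the given 0-based date columns of one tab-delimited line."""
    fields = line.rstrip("\n").split("\t")
    for idx in date_columns:
        fields[idx] = normalize_date(fields[idx], null_marker)
    return "\t".join(fields)
```

Run every line of the export through normalize_line() and write the result
to a new file, so the loader only ever has to deal with one date format.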

I downloaded all the PDFs. I spent about 6 hours reading them and
taking notes. I didn't expect it to make me a guru, but I did
expect it to give me enough information to know where to look.
I was wrong.

I WANT this to work. It has an Oracle SQL compatible mode. It
is native Unix. It claims really good ODBC. Its own tools
use the ODBC connection. It has a pretty Windows Admin application
while still allowing for back-end scripting of everything. So
I'm willing to work through the stupid stuff.

I finally created a decent sized base (12 GB) and loaded 3 million 
records and indexed them. The load was pretty quick. About the same 
speed as SQL Server, about 1/2 as fast as Oracle. And yes, I used
the nologging fastload option.

I did a full optimize statistics (not sample). This should at this point
give me a best case scenario on "ok" hardware. Not the best, but better
than what I am running SQL/Server on.

A simple count(*) picked it up off the index. Fine, it doesn't have the
stupid PostgreSQL implementation of forcing a table walk for that particular 
query.

A simple 2-level GROUP BY that resulted in 1,263 rows took over 6 minutes.
Uhoh. The most common thing my interactive users do is create break-down
counts across multiple fields.
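
For anyone who wants to reproduce the shape of this test, here is a minimal
sketch of a two-level break-down count with timing. It uses Python's
built-in sqlite3 module purely so that it runs anywhere. The table name,
the column names, and the generated data are made up for illustration, and
whatever timing it prints says nothing about SAP-DB, SQL/Server, or Oracle.

```python
import sqlite3
import time

def build_demo_db(n_rows=10000):
    """Create an in-memory table with two low-cardinality columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (region TEXT, product TEXT)")
    data = ((f"r{i % 4}", f"p{i % 10}") for i in range(n_rows))
    conn.executemany("INSERT INTO orders VALUES (?, ?)", data)
    conn.commit()
    return conn

def breakdown_counts(conn):
    """Run a two-level GROUP BY count; return (rows, elapsed_seconds)."""
    start = time.perf_counter()
    rows = conn.execute(
        "SELECT region, product, COUNT(*) "
        "FROM orders "
        "GROUP BY region, product "
        "ORDER BY region, product"
    ).fetchall()
    return rows, time.perf_counter() - start

if __name__ == "__main__":
    conn = build_demo_db()
    rows, elapsed = breakdown_counts(conn)
    print(f"{len(rows)} groups in {elapsed:.3f}s")
```

To compare engines fairly, the same statement would need to run against the
same data with statistics refreshed on each side, as described above.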

I do a sanity check. I log into the SQL/Server box and run the same query.
40 seconds. Note: This is a shared SQL/Server box with slower CPUs
and the same amount of memory, and it is about 9 times faster.

Uhoh. Pray I'm bottlenecking on disk. This is because I see that the CPUs
are only 30% active during the query.

I schmooze a bit and commandeer a dual 2.4 GHz Xeon box. It has about 2 weeks
before it is expected to be rolled out as an Oracle compute server. It has
5 fast SCSI disks and a high end RAID controller. I spent extra $$$ on
the RAID controller. It does NOT get any faster than this for a single 
tasking test. I am the only user on this box.

Loading goes noticeably faster, pinning the CPU. OK, no disk bottleneck here.

The GROUP BY query runs in 3 minutes, 32 seconds. Since I expect to have this same
type of box for my final SQL/Server system, I expect that to drop to 15
seconds for SQL/Server. But even if it doesn't there is NO comparison.
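
The timings quoted in this post can be put side by side with a few lines
of arithmetic. This is only a sketch: "over 6 minutes" is taken as a flat
360 seconds, so the first ratio is a lower bound, and the 15 second
SQL/Server figure is an expectation, not a measurement. The names are
made up for illustration.

```python
def speedup(slow_seconds, fast_seconds):
    """How many times faster the second timing is than the first."""
    return slow_seconds / fast_seconds

# Timings quoted in the post, in seconds.
SAPDB_PIII = 6 * 60           # "over 6 minutes", used as a lower bound
SQLSERVER_SHARED = 40         # measured on the shared SQL/Server box
SAPDB_XEON = 3 * 60 + 32      # 3 minutes, 32 seconds on the Xeon box
SQLSERVER_XEON_GUESS = 15     # expected on similar hardware, NOT measured

if __name__ == "__main__":
    print(f"SAP-DB on PIII vs shared SQL/Server: "
          f"{speedup(SAPDB_PIII, SQLSERVER_SHARED):.1f}x")
    print(f"SAP-DB on PIII vs SAP-DB on Xeon:    "
          f"{speedup(SAPDB_PIII, SAPDB_XEON):.1f}x")
    print(f"SAP-DB on Xeon vs SQL/Server guess:  "
          f"{speedup(SAPDB_XEON, SQLSERVER_XEON_GUESS):.1f}x")
```

On these figures the faster hardware buys SAP-DB less than a factor of 2,
which is nowhere near enough to close the gap to the 40 seconds already
measured on the slower shared box.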

Add up the time saved in production and the happier interactive customers, and
the dark side has just won this one.

Unless someone can point out any tuning problem I might have?

----------------------------------------------------
----------------------------------------------------

The resident SAP-DB person decided not to respond, even though he
was active in other forums and I pointed the post out to him later.

It seems that the advocates for these solutions are betting OTHER people's
time and effort.  And they do it with little or no understanding of the
decisions that got us here in the 1st place.  


_________________________________________________________________________
Philadelphia Linux Users Group        --       http://www.phillylinux.org
Announcements - http://lists.netisland.net/mailman/listinfo/plug-announce
General Discussion  --   http://lists.netisland.net/mailman/listinfo/plug