gabriel rosenkoetter on Wed, 9 Apr 2003 14:42:10 -0400
On Wed, Apr 09, 2003 at 10:46:27AM -0400, Edmund Goppelt wrote:
> I'd like Philadelphians to be able to look up land ownership records.

Is it, at present, possible for us to do this by USPS? Is there a law saying we should be able to?

> It turns out the City already has a web site that offers this
> information. For three years, a group of 50 private companies have
> been able to look at deed images and related info. at no cost, by
> going here:

At no cost? Are you sure they didn't front some cash for the city's initial layout?

Why can't they just use one of those licenses they hopefully preserved to do a SQL dump of their Oracle tablespace and hand that to you on CD? (I presume there's no IP issues... this is all public information, right?)

> 1. Doing so would overburden the City's Internet connection.

Quite possible. Colocation doesn't solve this problem; if they only pay for a certain amount of bandwidth per month and get cut off when it's exceeded, they're removing the information from the public domain again.

> 2. Their license with Oracle Corporation restricted them to 50 named
> users (i.e., not simultaneous users, but the same 50 people).

Sounds true to me.

> I asked her recently why she didn't just ditch Oracle and use
> PostgreSQL or MySQL.

Because PostgreSQL can't come close to serving large databases at the speed that Oracle can (we're talking orders of magnitude here), because MySQL fails the ACID test, because neither is capable of accessing raw disk in an even remotely sane way (Veritas Quick I/O is what they're almost definitely using with Oracle) and because switching to *anything* is a HUGE development cost. SQL may be a "standard", but a database in use within one DBMS cannot just be magically transferred to another DBMS without some significant work by a DBA skilled in both DBMSes (a rarity; no, really!) and some (probably even more) significant work by a developer on the external interfaces to the database.

> For the record, Hallwatch runs MySQL and Zope on an 800 Mhz
> Celeron, 512 MB RAM, 40 GB HD off of a shared T-1 connection.

Which would be totally insufficient to the task you'd like to ask of it.

> In your opinion, what hardware configuration does this application
> require?

Well, you're out of your mind to use commodity PC hardware. If you insist on doing this on an IA32 machine (why? No, really; why?) using Linux (again, why?), you're looking at maybe an IBM xSeries For years' data is around 1 million rows. You don't happen to know how much data's in a row, do you? How many years do you intend to keep online? What indexing do you intend to do across them? You could *very* quickly be looking at terabyte quantities of disk. I really doubt Postgres will hold up under more than a couple of years of data (MySQL isn't really an option even for one year of data; sapdb might do better than Postgres, but not by much).

Before you suggest that you'd be content to get a new system every couple of years to manage the next couple of years of data, consider the utility of being able to compare between years, over a decade, and so forth.

> Do you know of any government entities that are using one of the Open
> Source databases?

Do you know of any companies providing the level of support that companies like Oracle and Veritas do? (And I mean actually *providing* it. I've been very disappointed with Red Hat, for instance...)

> Unless I hear from you otherwise, I will assume that it is ok to show
> your comments to the Commissioner or other City officials.

I'm not sure how that could possibly make a difference. The city has no reason to believe that we're not completely figments of your imagination unless we show up and testify, do they?

On Wed, Apr 09, 2003 at 11:45:48AM -0400, Jeff Weisberg wrote:
> the largest postgres installation I know of is ".org"

How many rows is that? How many bytes per row? On what bases is it indexed?

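The row-count and bytes-per-row questions matter because disk requirements scale with their product. As a rough back-of-the-envelope sketch (not a measurement of any real system): the 1 million row figure is the one quoted in this thread, while both row sizes and the 50% index allowance below are purely hypothetical assumptions chosen for illustration.

```python
def estimate_storage_bytes(rows, bytes_per_row, index_overhead=0.5):
    """Rough on-disk size: raw row data plus a fractional allowance for indexes."""
    raw = rows * bytes_per_row
    return int(raw * (1 + index_overhead))


ROWS_IN_THREAD = 1_000_000  # row count quoted in the thread

# Hypothetical row sizes (assumptions, not known facts about the City's data):
# a text-only record versus a record that carries a scanned deed image.
scenarios = [
    ("text-only rows, ~500 bytes each", 500),
    ("rows with a ~1 MB deed image each", 1_000_000),
]

for label, bytes_per_row in scenarios:
    size = estimate_storage_bytes(ROWS_IN_THREAD, bytes_per_row)
    print(f"{label}: {size / 1e9:.2f} GB")
```

Under these assumed numbers the text-only case fits on a small disk, while the image-carrying case already exceeds a terabyte, which is why the per-row size needs to be known before anyone picks hardware.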
On Wed, Apr 09, 2003 at 12:28:43PM -0400, Michael Leone wrote:
> Hardware is secondary - you can throw bunches of hardware at it, but if
> the software doesn't offer a needed feature, more hardware won't (always)
> help.

There's software and then there's software. A big part of Oracle's being faster (above and beyond how much faster it just *is* than open source DBMSes) is using Veritas Volume Manager with Quick I/O to access backing disk as raw but through the file system, meaning that adding tablespace is very easy (as opposed to very painful with true raw partitions), and that operations like backups function along normal FS lines. In addition, Veritas permits checkpoints, snapshots, and a variety of mechanisms to assure data integrity over time, across branch analysis, and so forth. These are all things that are theoretically feasible under Linux and with a DBMS other than Oracle, but they are significantly more difficult. Difficulty of management matters.

On Wed, Apr 09, 2003 at 04:33:41PM -0000, greg@turnstep.com wrote:
> > 1. Doing so would overburden the City's Internet connection.
> The city exists to serve the people. If the bandwidth becomes an
> issue (and I seriously doubt it will), then the city should upgrade
> their connection. That's like arguing that new roads could overburden
> the city's traffic, so they should not be built.

No, it's like arguing that running SEPTA lines like the R3 every ten minutes rather than every hour would overburden the rails and the infrastructure (architectural and human) to support the trains. It's *possible* for the city to spend more on Internet service, but where do you think it's written that it's their obligation to provide this information to you at your convenience, swallowing all costs themselves? (And if your suggestion is that our taxes should pay for this, then you need to take your argument to the people who allocate tax funds.)

> > 2. Their license with Oracle Corporation restricted them to 50 named
> > users (i.e., not simultaneous users, but the same 50 people).
> Sounds like a very poor licensing decision.

It's not a decision, it's all Oracle will sell you these days.

> PostgreSQL would definitely be up to the job.

I strongly disagree. How many rows are in the largest Postgres database you've dealt with? How much data (byte-wise)? What indexes?

> Support for PostgreSQL can be purchased from many companies.

Name two. (I know of exactly one.)

> Even after buying such support, the money saved from not using Oracle would
> be quite substantial.

Could you compare the costs, including the labor time of development to convert, please?

On Wed, Apr 09, 2003 at 01:00:13PM -0400, Chris Hedemark wrote:
> Sucks. If it overburdens the city's internet connection, then so be
> it.

And what of non-public-informational uses of the Internet by the city? And what of the charge that, when their bandwidth is saturated, they're again restricting access to the information?

> I think there are licensing options on Oracle for concurrent
> connections rather than named users, so there should be some options
> there too.

Not any more. We have such a license at work, Chris, but it is IMPOSSIBLE to get those from Oracle now, and we're very careful to pay Oracle bills on time to avoid their revoking it.

> Someone made some stupid decisions and fixing it might call
> into question who & why the original bad decisions were made in the
> first place.

Which decisions, precisely, do you think were stupid?

> If they can provide you with a temporary read-only account to their
> Oracle server, and the database schema, you should be able to push that
> to your own machine with no problem if you're sitting on their LAN.

They can't do that, but I fail to see why they can't give him a SQL dump.

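To make the SQL dump idea concrete, here is a minimal sketch of the general technique: a dump is plain text (the statements that recreate schema and data), so it can be replayed somewhere else without any connection to the original server. This is an illustration only. It uses Python's bundled sqlite3 module as a stand-in, not Oracle's own export tooling, and the `deeds` table, its columns, and the sample row are all hypothetical.

```python
import sqlite3

# Stand-in for the source database. sqlite3 is used only because it ships
# with Python; the table and column names are hypothetical.
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE deeds (id INTEGER PRIMARY KEY, grantor TEXT, grantee TEXT, recorded TEXT)"
)
source.execute(
    "INSERT INTO deeds (grantor, grantee, recorded) VALUES (?, ?, ?)",
    ("A. Seller", "B. Buyer", "2003-04-09"),
)
source.commit()

# The dump is just text: SQL statements that rebuild the schema and rows.
dump_sql = "\n".join(source.iterdump())

# Replaying that text on a separate database needs no access to the original.
copy = sqlite3.connect(":memory:")
copy.executescript(dump_sql)

rows = copy.execute("SELECT grantor, grantee, recorded FROM deeds").fetchall()
print(rows)
```

Here source and copy are the same DBMS, so the replay is trivial. Loading a dump from one DBMS into a different one is where the conversion work discussed earlier in this message comes in.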
> While there are some people here who would disagree, most of the
> objections against PostgreSQL that I have heard are based off of flawed
> evaluations.

I know precisely which evaluation you're referring to, Chris, and you really are wrong, and Barry really is right. He really did examine the ways in which Postgres manages its memory usage (both on disk and in memory proper) and performs operations on it, and it really does lose. The suggestion was never that Postgres on a random Dell 1RU server should compete with Oracle on a fully-populated E450, but that operations should happen at an appropriate speed gauged by the relative processor, I/O, and memory speed of the systems. They don't. (And Oracle running on identical hardware under Linux wins too, btw. Well, it used to... with the stupid interactions we've seen between it and ext3 lately, who knows.)

> Anyway, it's moot. You just want the data, right? They can give that
> to you without burning an Oracle license and without burning their
> bandwidth.

Agreed. And I think this is the route that's most likely to get Ed the end result he wants.

-- 
gabriel rosenkoetter
gr@eclipsed.net

Attachment:
pgpF8aEN2GCDz.pgp