Cassius Rosenthal on 5 Sep 2007 19:17:45 -0000

Re: [PhillyOnRails] jruby + hadoop?

Just what Oracle wants you to think: "This is serious app. It needs
serious database."

Well . . . yeah. But I use postgresql. (^_^)

In my view pretty much anything beats having all your data locked up
in a schema'd but not versioned, highly stateful, monolithic DB

I'm not sure why you would object to the stateful aspect of a DB -- either an app needs stateful data, or it does not, right? Clearly there are solutions for versioning and clustering. I've only gone through the map/reduce slides once, but it seems to me that pgpool/pgcluster take a very similar approach when they break up queries from the log and send them out to multiple servers to run, then merge the partial results into a single result.
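A minimal sketch of that scatter/gather shape, in Python. This is not pgpool or pgcluster code; the shard data and the names `map_shard`, `reduce_partials`, and `scatter_gather` are invented for illustration, and threads stand in for separate servers.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

# Hypothetical data: each inner list stands in for the rows held by one server.
SHARDS = [
    [3, 8, 1],
    [7, 2],
    [5, 9, 4],
]

def map_shard(rows):
    """Run the same partial query on one shard: a (count, total) pair."""
    return (len(rows), sum(rows))

def reduce_partials(a, b):
    """Merge two partial results into one."""
    return (a[0] + b[0], a[1] + b[1])

def scatter_gather(shards):
    # Scatter: send the partial query to every shard in parallel.
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(map_shard, shards))
    # Gather: fold the partial results into a single answer.
    return reduce(reduce_partials, partials, (0, 0))

count, total = scatter_gather(SHARDS)
average = total / count  # the "query" here is an average over all shards
```

The average is computed from per-shard counts and totals rather than per-shard averages, because averages of unequal shards cannot simply be averaged again.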

It's only "application agnostic" because all your
applications already are committed to SQL.

This is true -- but Rails itself is a convention-based framework. We have already surrendered to that principle, and SQL is an open standard.

If you try to integrate an
RDF app it won't seem so agnostic anymore.

XUL embraces RDF as datasources for its template engine, and I'm pretty sure that I'm not alone when I opine that RDF is the biggest obstacle for XUL developers. It is as awkward as the spelling of 'awkward'. I admit that I have little experience with cases that would be best served by RDF, but wouldn't those cases be contrary to MVC anyway? When inference logic is in the data tree, is the controller on a cigarette break?

I'm sure there are many other grammars that would be non-trivial to extract from SQL as well, but I don't see that any of them would be superior for common/general use. On the flip side, I don't see any argument proving that SQL is a best-fit for common/general use either, but since it is an open standard, that argument doesn't need to be made. At the very least, I can say that it is just as awkward to extract SQL-like tables from RDF as it is to go in the opposite direction.
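To show where the awkwardness comes from in both directions, here is a toy sketch in Python. It assumes triples are plain `(subject, predicate, object)` tuples and does not use any real RDF library; the sample data and the names `triples_to_rows` and `rows_to_triples` are hypothetical.

```python
# Hypothetical triples: (subject, predicate, object).
TRIPLES = [
    ("user:1", "name", "Ann"),
    ("user:1", "city", "Philadelphia"),
    ("user:2", "name", "Bob"),
]

def triples_to_rows(triples):
    """Pivot triples into table-like rows, one column per predicate."""
    by_subject = {}
    for subject, predicate, obj in triples:
        # A second value for the same predicate would overwrite the first:
        # a fixed column cannot hold a multi-valued property.
        by_subject.setdefault(subject, {})[predicate] = obj
    columns = sorted({predicate for _, predicate, _ in triples})
    return [
        {"subject": subject, **{c: props.get(c) for c in columns}}
        for subject, props in by_subject.items()
    ]

def rows_to_triples(rows):
    """Flatten rows back into triples, dropping empty (None) cells."""
    return [
        (row["subject"], column, value)
        for row in rows
        for column, value in row.items()
        if column != "subject" and value is not None
    ]

rows = triples_to_rows(TRIPLES)
round_trip = rows_to_triples(rows)
```

Going from triples to a table leaves `user:2` with an empty `city` cell, and going back means deciding that an empty cell means "no statement at all". Neither direction is free.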

I can say this for Oracle and Postgres: on both I have seen extremely impressive number crunching, data integrity, and great flexibility in data presentation. On Oracle, the performance boosts that you can get by optimizing query statistics are fascinating, to say the least. I don't see how map/reduce could even theoretically provide the same degree of optimization, because when all of the parts are sent out to nodes to be solved, the total time depends on the slowest node, not on the intelligence of the master to preselect the order in which the conditions should be resolved. That is good for performing a general computation, but not good for retrieving data that we know something about. Of course, Google is using map/reduce -- so I'd appreciate it if someone could tell me why I am wrong.
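The point about preselecting the order of conditions can be illustrated with a toy example in Python. This is not how Oracle or Postgres work internally (a real optimizer relies on column statistics and indexes); the table, the two predicates, and `count_evaluations` are assumptions made up for the sketch.

```python
# Hypothetical table: 1000 rows with an id and a status column.
ROWS = [
    {"id": i, "status": "open" if i % 2 == 0 else "closed"}
    for i in range(1000)
]

def count_evaluations(rows, predicates):
    """Apply predicates in the given order; return (matches, evaluations)."""
    evaluations = 0
    matches = []
    for row in rows:
        keep = True
        for predicate in predicates:
            evaluations += 1
            if not predicate(row):
                keep = False
                break  # short-circuit: later predicates are skipped
        if keep:
            matches.append(row)
    return matches, evaluations

def is_open(row):
    # Passes half of the rows.
    return row["status"] == "open"

def is_id_42(row):
    # Passes one row in a thousand.
    return row["id"] == 42

# Same conditions, two different orders.
naive_matches, naive_cost = count_evaluations(ROWS, [is_open, is_id_42])
planned_matches, planned_cost = count_evaluations(ROWS, [is_id_42, is_open])
```

Both orders return the same single row, but checking the more selective condition first takes 1001 predicate evaluations instead of 1500. Knowing which condition is more selective is exactly the kind of knowledge that statistics give the planner.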

I would like to see more examples of CouchDB being used where it clearly makes more sense than an RDBMS. Right now, I think I would want to use it for quick-and-dirty Rails apps (which could be the majority of web applications), but not for complex apps.

Would CouchDB make sense as a filesystem?
