Kyle R. Burton on 10 Feb 2011 13:26:22 -0800
Re: [PLUG] Speakers needed for all chapters!

> I'd like to hear a talk on non-relational databases such as CouchDB or
> MongoDB.

Toby has given a talk in the past about Cassandra - might want to hit him up and see if he'll talk about it again. Someone from Philly Lambda gave a talk on MongoDB.

> Laying out data in terms of rows and columns seems really
> natural to me. I'd love to have someone familiar with NoSQL concepts
> tell me why I'm wrong.

I am not an experienced user of any of the popular NoSQL databases, but I used a non-relational datastore a few companies ago. Some aspects that are different (IMO) are:

- many NoSQL databases are effectively document stores
- many relax ACID (few if any support transactions beyond create/update/delete (CUD) of a single document)
  - they do this (at least in spirit) to gain performance
  - Cassandra bills itself as 'eventually consistent' for exactly this reason
- they have no fixed data model, documents are heterogeneous
  - don't have to plan a strict schema up front
  - don't have to migrate old documents to update the schema (somewhat backward/forward compatible)
- you give up being able to use SQL to access the datastore
  - 'indexes' are often just a function over the document store, stored right back into the system
- since there is no equivalent to a join, sharding across multiple systems can be easier

SalesForce.com uses a non-relational schema so that its many customers can define their own data models and their app can still have (somewhat) consistent access methods. Some access methods that I've seen are: XPath, JavaScript (I think I remember that in CouchDB, indexes are created with a map/reduce), or just a single key lookup (which may return multiple documents).
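
To make the "an index is just a function over the document store" idea a bit more concrete, here is a rough sketch in Python. It is a toy in-memory store, not how CouchDB, MongoDB, or Cassandra actually implement anything; the names (DocStore, add_index, lookup) are made up for illustration. The point is only that documents need not share a schema, that an index can be defined as a map function whose output is stored alongside the data, and that a single key lookup may return several documents.

```python
from collections import defaultdict


class DocStore:
    """Toy in-memory document store (hypothetical, for illustration only)."""

    def __init__(self):
        self.docs = {}      # doc id -> document (any dict, no fixed schema)
        self.map_fns = {}   # index name -> map function
        self.indexes = {}   # index name -> {key: set of doc ids}

    def add_index(self, name, map_fn):
        # map_fn takes a document and returns a list of keys (possibly empty).
        self.map_fns[name] = map_fn
        self.indexes[name] = defaultdict(set)
        for doc_id, doc in self.docs.items():
            self._index_doc(name, doc_id, doc)

    def _index_doc(self, name, doc_id, doc):
        for key in self.map_fns[name](doc):
            self.indexes[name][key].add(doc_id)

    def put(self, doc_id, doc):
        # When replacing a document, drop its stale index entries first.
        if doc_id in self.docs:
            for index in self.indexes.values():
                for ids in index.values():
                    ids.discard(doc_id)
        self.docs[doc_id] = doc
        for name in self.map_fns:
            self._index_doc(name, doc_id, doc)

    def lookup(self, name, key):
        # A single key lookup may return multiple documents.
        return [self.docs[i] for i in sorted(self.indexes[name].get(key, ()))]


store = DocStore()
store.put("a", {"name": "Alice", "city": "Philadelphia"})
store.put("b", {"name": "Bob", "city": "Philadelphia", "phone": "555-0100"})
store.put("c", {"title": "Meeting notes"})  # no "city" field at all

# The index is defined after the fact, with no schema migration.
store.add_index("by_city", lambda d: [d["city"]] if "city" in d else [])
names = [d["name"] for d in store.lookup("by_city", "Philadelphia")]
```

Documents without a "city" field are simply skipped by the map function, which is roughly what "don't have to migrate old documents" looks like in practice. A real system would persist the index and update it incrementally rather than scanning every key set on each write.
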

At the company where we used a non-relational store we did it for the same kind of reason: we built a data integration system that took snapshots (data and schema) of relational databases, merged them together (via matching), and could extract from that consolidated form a relational snapshot (model and data) of your choosing (even several, with different consolidation rules). At the time I felt it was a good approach because it kept us from applying non-reversible changes to input data - at least up until we needed to export it to send it back to a customer.

Today my impression is that many go after NoSQL because the barrier to getting one up and running, even across a cluster, is lower than with a relational database (e.g. getting clustering or replication set up). They're seen as simpler to administer too because, well, they're simpler than a relational database. I tend to agree with many of these sentiments (esp. the ability of something like Cassandra to scale), though I will still choose an RDBMS for most projects, if only so that business users can use the tools they're accustomed to (SQL, ODBC, etc.).

> I also second the call for an IPv6 talk. It seems like something many
> of us don't know enough about.

Is there anyone on the list who could give one on converting a corporate network to IPv6?

Kyle

--
Twitter: @kyleburton
Blog: http://asymmetrical-view.com/
Fun: http://snapclean.me/
___________________________________________________________________________
Philadelphia Linux Users Group -- http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion -- http://lists.phillylinux.org/mailman/listinfo/plug