Conferences for specific databases or solution sets.
So I put the book code out on GitHub. As nice as it is that Pragmatic makes all book code available in a zip file for download, I’m the type of person who will never actually bother to download it. But I’ll fork a project in an instant…
Feel free to fork it and send a pull request for any bugs you might encounter.
Another week, another database launch. This time, surprisingly, it’s not yet another NoSQL offering. Instead, it’s a loose MySQL fork written by Nikita Shamgunov, former SQL Server engineer and ACM wunderkind, called MemSQL.
The MemSQL magic sauce is supposedly that it compiles SQL queries. I’m not sure how that makes it different from prepared statements, though until I benchmark it I’ll give them the benefit of the doubt. It seems to gain its speed by keeping most of the database resident in memory, so again, I don’t see what makes it so different from other in-memory databases. So far, not much there.
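For anyone who wants the baseline I’m comparing against, here is a minimal sketch of a prepared (parameterized) statement against an in-memory database. It uses Python’s built-in sqlite3 module, not MemSQL, and the `users` table and its rows are made up for illustration. It says nothing about how MemSQL compiles queries internally.

```python
import sqlite3

# An in-memory SQLite database; a stand-in for illustration, NOT MemSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO users (name) VALUES (?)",
    [("ada",), ("grace",), ("edsger",)],
)

# The SQL text stays constant and only the bound parameter changes.
# sqlite3 keeps a per-connection cache of compiled statements, so the
# query is parsed and planned once and then reused on later calls.
query = "SELECT name FROM users WHERE id = ?"
names = [conn.execute(query, (user_id,)).fetchone()[0] for user_id in (1, 2, 3)]
print(names)
```

The open question is whether compiling queries buys anything beyond this kind of parse-once, run-many reuse, and only a benchmark will answer that.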
MemSQL might employ a coding prodigy, but VoltDB has an in-memory database backed by Michael Stonebraker, and only one of the two men has his own Wikipedia entry. Yeah, I know that’s an appeal to accomplishment, but I’ll change my tune when MemSQL provides something a dozen other databases don’t already (not that I want to imply it won’t… it’s just getting a bit too much of the TC-VC-Industrial Complex hype for my taste).
Relational, columnar, graph or key-value store,
document datastores too.
So much to discover, in this song we’ll cover
from each type at least one or two.
Neo4J, Postgres and HBase and Redis then
CouchDB, Mongo and Riak.
of partitions, consistency, availability:
pick two, you can’t have all three-ach.
Postgres is relational, stable, transactional.
Tables have columns and rows.
Rules, window functions and SQL for querying;
vertically is how it grows.
Riak’s key-value store implements Dynamo,
shards data out to a ring.
It’s REST-based with mapreduce link-walking functions and
vector-clocks; made in Erlang.
HBase is columnar just like BigTable:
distributed, sorted and sparse.
Hadoop’s ecosystem provides extra features but
setup’s a pain in the arse.
Oh, Mongo stores JSON; its documents speedily
replicate so it’s webscale.
Indexes and updates your deep nested attributes
in-line, so data’s not stale.
Neo4J is so ACID compliant; this
graph database really shines.
You query through edges that connect two vertices.
No ORM-based designs.
The Redis key-value holds rich data structures;
is RAM-based or writes them to disk.
Expiry’s for caching. PUB/SUB message passing,
and queueing by block reading lists.
The CouchDB doc store has partial mapreduce;
is RESTful, crash-only and stable.
Great for embedding, ad-hoc replicating,
though don’t try to join, it’s not able.
The database world is now rich with complexity;
so much to research and know.
You have many options you’ll need to consider like…
Disk read and writing and
Bloom filters, buffering
consistent hashing and more!
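
Since the song name-drops consistent hashing (and Riak sharding data out to a ring), here is a rough sketch of the idea. It is a toy, not Riak’s actual implementation: the `HashRing` class, the `vnodes` parameter, the node names and the choice of MD5 are all assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a point on the ring (MD5 is an arbitrary choice here)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring; each node owns several points (virtual nodes)."""

    def __init__(self, nodes, vnodes=64):
        self._vnodes = vnodes
        self._owner = {}   # ring point -> node name
        self._points = []  # sorted ring points
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Place the node at `vnodes` pseudo-random points on the ring.
        for i in range(self._vnodes):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def lookup(self, key):
        # Walk clockwise to the first point past the key's hash,
        # wrapping around to the start of the ring if needed.
        idx = bisect.bisect(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"key-{i}" for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}

ring.add("node-d")
moved = [k for k in keys if ring.lookup(k) != before[k]]
print(f"{len(moved)} of {len(keys)} keys moved after adding node-d")
```

The point of the ring is that adding a node only steals keys for the newcomer; every key that did not move stays exactly where it was, which a plain `hash(key) % n` scheme cannot promise.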