Upcoming Database/NoSQL Conferences

Conferences for specific databases or solution sets.

General Confs

Down the NoSQL Rabbit Hole @ Railsberry, Eric Redmond

Hashrocket Interview with Eric Redmond about Seven Databases.

Seven Databases in Seven Weeks Code

So I put the book code out on GitHub. As nice as it is that Pragmatic makes all book code available in a zip file for download, I’m the type of person who will never actually bother to download it. But I’ll fork a project in an instant…

https://github.com/sevenweeks/databases

Feel free to fork and pull request any bugs you might encounter.

MemSQL

Another week, another database launch. This time, surprisingly, it’s not yet another NoSQL offering. Instead, it’s a loose MySQL fork written by Nikita Shamgunov, former SQL Server engineer and ACM wunderkind, called MemSQL.

The MemSQL secret sauce is supposedly that it compiles SQL queries (I’m not sure how that differs from prepared statements, though until I benchmark it I’ll give them the benefit of the doubt). It seems to gain its speed by keeping most of the database resident in memory, so again, I don’t see what sets it apart from other in-memory databases. So far, not much there.
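For reference, here is roughly the baseline I have in mind when I say "prepared statements" and "in-memory database". This is a minimal sketch using Python’s built-in sqlite3 module, not MemSQL; the table, the sample rows, and the helper names build_db and find_name are made up for illustration. The SQL text stays fixed and only the bound parameter changes, which lets the driver reuse the already-parsed statement (sqlite3 keeps a per-connection statement cache).

```python
import sqlite3


def build_db():
    """Create a throwaway in-memory database with a tiny sample table."""
    conn = sqlite3.connect(":memory:")  # nothing is written to disk
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO users (id, name) VALUES (?, ?)",
        [(1, "ada"), (2, "grace"), (3, "edsger")],
    )
    conn.commit()
    return conn


def find_name(conn, user_id):
    """Look up one user with a parameterized query.

    The SQL string is identical on every call; only the bound value
    differs, so the parsed statement can be reused instead of re-parsed.
    """
    row = conn.execute(
        "SELECT name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    db = build_db()
    for uid in (1, 2, 99):
        print(uid, find_name(db, uid))
```

Whatever MemSQL’s query compilation buys would have to show up as a gain over this kind of reuse, which is exactly what a benchmark would need to measure.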

MemSQL might employ a blogging prodigy (Scott Chow? Visit his blog), but VoltDB has an in-memory database backed by Michael Stonebraker, and only one of the two men has his own Wikipedia entry. Yeah, I know that’s an appeal to accomplishment, but I’ll change my tune when MemSQL provides something a dozen other databases don’t already (not that I want to imply it won’t… it’s just getting a bit too much of the TC-VC-Industrial Complex hype for my taste).

Apache HBase on Amazon EMR

Finally! Amazon EMR is going to leverage the power of HBase. I love EMR, but I hated its restrictions. A step in the right direction, for sure…

AWS has already given you a lot of storage and processing options to choose from, and today we are adding a really important one.

You can now use Apache HBase to store and process extremely large amounts of data (think billions of rows and millions of columns per row) on AWS. HBase offers a number of powerful features including:

  • Strictly consistent reads and writes.
  • High write throughput.
  • Automatic sharding of tables.
  • Efficient storage of sparse data.
  • Low-latency data access via in-memory operations.
  • Direct input and output to Hadoop jobs.
  • Integration with Apache Hive for SQL-like queries over HBase tables, joins, and JDBC support.

HBase is formally part of the Apache Hadoop project, and runs within Amazon Elastic MapReduce. You can launch HBase jobs (version 0.92.0) from the command line or the AWS Management Console.
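To make "sparse" and "sorted" from that feature list a bit more concrete, here is a toy sketch in plain Python. It is not the HBase API and says nothing about how HBase stores data on disk; the class ToySparseTable and its put/get/scan methods are hypothetical names of my own. The idea it illustrates: each row only holds the columns that were actually written, and rows are kept in key order so a range scan is cheap.

```python
import bisect


class ToySparseTable:
    """Toy model of a sparse table with rows kept sorted by key."""

    def __init__(self):
        self._rows = {}  # row key -> {column name: value}; absent columns cost nothing
        self._keys = []  # row keys, kept in sorted order

    def put(self, row, column, value):
        """Write one cell, creating the row on first use."""
        if row not in self._rows:
            bisect.insort(self._keys, row)
            self._rows[row] = {}
        self._rows[row][column] = value

    def get(self, row, column):
        """Read one cell; returns None if the row or column was never written."""
        return self._rows.get(row, {}).get(column)

    def scan(self, start, stop):
        """Return (row key, columns) pairs for start <= row key < stop, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        return [(key, dict(self._rows[key])) for key in self._keys[lo:hi]]


if __name__ == "__main__":
    table = ToySparseTable()
    table.put("user#002", "info:name", "grace")
    table.put("user#001", "info:name", "ada")
    table.put("user#001", "info:email", "ada@example.com")
    table.put("video#001", "meta:title", "intro")
    print(table.scan("user#", "user#~"))
```

The "family:qualifier" column names mimic HBase’s naming convention, but here they are just dictionary keys.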


Seven Weeks Community

It’s hard to install some of these databases… just like it was hard to get started with a few of the seven languages. With that in mind, we’re toying around with hosting a public community where people working through the Seven Weeks books can ask and answer questions, with everything searchable.


Lyrics to Seven Databases in Song

Relational, columnar, graph or key-value store,
document datastores too.
So much to discover, in this song we’ll cover
from each type at least one or two.

Neo4J, Postgres and HBase and Redis then
CouchDB, Mongo and Riak.
Of partitions, consistency, availability:
pick two, you can’t have all three-ach.

Postgres is relational, stable, transactional.
Tables have columns and rows.
Rules, window functions and SQL for querying;
vertically is how it grows.

Riak’s key-value store implements Dynamo,
shards data out to a ring.
It’s REST-based with mapreduce link-walking functions and
vector-clocks; made in Erlang.

HBase is columnar just like BigTable:
distributed, sorted and sparse.
Hadoop’s ecosystem provides extra features but
setup’s a pain in the arse.

Oh, Mongo stores JSON; its documents speedily
replicate so it’s webscale.
Indexes and updates your deep nested attributes
in-line, so data’s not stale.

Neo4J is so ACID compliant; this
graph database really shines.
You query through edges that connect two vertices.
No ORM-based designs.

The Redis key-value holds rich data structures;
is RAM-based or writes them to disk.
Expiry’s for caching. PUB/SUB message passing,
and queueing by block reading lists.

The CouchDB doc store has partial mapreduce;
is RESTful, crash-only and stable.
Great for embedding, ad-hoc replicating,
though don’t try to join, it’s not able.

The database world is now rich with complexity;
so much to research and know.
You have many options you’ll need to consider like…

Disk read and writing and
Bloom filters, buffering
CPU
querying
TTL
caching plus
consistent hashing and more!

Seven Databases in Song! This was so much fun to work on, and Jim Wilson totally deserves all the credit. Thanks also to Todd Yard for his sexy vocals.
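Since the lyrics name-drop both "shards data out to a ring" and consistent hashing, here is a minimal sketch of a generic consistent-hash ring in Python. It is not Riak’s actual implementation; the class HashRing, the vnodes parameter, and the node names are assumptions for illustration. Each node is hashed onto the ring at several points, and a key belongs to the first node point at or after the key’s own hash, wrapping around at the end.

```python
import bisect
import hashlib


class HashRing:
    """Generic consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, vnodes=64):
        # Place each node on the ring at `vnodes` pseudo-random points.
        points = []
        for node in nodes:
            for i in range(vnodes):
                points.append((self._hash(f"{node}#{i}"), node))
        points.sort()
        self._ring = points
        self._hashes = [h for h, _ in points]

    @staticmethod
    def _hash(key):
        """Map a string to a 64-bit integer position on the ring."""
        digest = hashlib.sha1(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def node_for(self, key):
        """Return the node owning `key`: the next ring point clockwise."""
        idx = bisect.bisect_right(self._hashes, self._hash(key))
        return self._ring[idx % len(self._ring)][1]


if __name__ == "__main__":
    keys = [f"key{i}" for i in range(1000)]
    before = HashRing(["node-a", "node-b", "node-c"])
    after = HashRing(["node-a", "node-b", "node-c", "node-d"])
    moved = [k for k in keys if before.node_for(k) != after.node_for(k)]
    print(f"{len(moved)} of {len(keys)} keys moved after adding node-d")
```

The point of the design is that adding a node only steals keys for the new node; keys never shuffle between the existing nodes, which is what keeps rebalancing cheap.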

(Seven Plus Or Minus Two) Databases For Computational Journalists
