<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"><channel><title>A VC - Latest Comments in Hacker News and the NoSQL Movement</title><link>http://avc.disqus.com/</link><description></description><language>en</language><lastBuildDate>Mon, 20 Jul 2009 04:22:03 -0000</lastBuildDate><item><title>Re: Hacker News and the NoSQL Movement</title><link>http://www.avc.com/a_vc/2009/07/hacker-news-and-the-nosql-movement.html#comment-12942951</link><description>Nice summary - another candidate for leveraging existing RDBMS's is through the use of in-memory data grids. I touch on this in my post about the NoSQL movement in the context of legacy systems:&lt;br&gt;&lt;a href="http://bigdatamatters.com/bigdatamatters/2009/07/nosql-vs-rdbms.html" rel="nofollow"&gt;http://bigdatamatters.com/bigdatamatters/2009/0...&lt;/a&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">kevinglenny</dc:creator><pubDate>Mon, 20 Jul 2009 04:22:03 -0000</pubDate></item><item><title>Re: Hacker News and the NoSQL Movement</title><link>http://www.avc.com/a_vc/2009/07/hacker-news-and-the-nosql-movement.html#comment-12411067</link><description>yes, we are investors in 10gen, the company that has built the MongoDB open&lt;br&gt;source datastore</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">fredwilson</dc:creator><pubDate>Thu, 09 Jul 2009 18:11:20 -0000</pubDate></item><item><title>Re: Hacker News and the NoSQL Movement</title><link>http://www.avc.com/a_vc/2009/07/hacker-news-and-the-nosql-movement.html#comment-12410842</link><description>Are you talking about MongoDB? 
I am trying it today, because I need a proper document database, instead of a simple key-value store.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">turian</dc:creator><pubDate>Thu, 09 Jul 2009 18:04:47 -0000</pubDate></item><item><title>Re: Hacker News and the NoSQL Movement</title><link>http://www.avc.com/a_vc/2009/07/hacker-news-and-the-nosql-movement.html#comment-12299031</link><description>Not sure how much help I'd be, but sure. My email is &lt;a href="mailto:firstname@firstnamelastname.net" rel="nofollow"&gt;firstname@firstnamelastname.net&lt;/a&gt; (pdx isn't part of my last name).</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">jasonwatkinspdx</dc:creator><pubDate>Tue, 07 Jul 2009 23:53:26 -0000</pubDate></item><item><title>Re: Hacker News and the NoSQL Movement</title><link>http://www.avc.com/a_vc/2009/07/hacker-news-and-the-nosql-movement.html#comment-12254651</link><description>jason, can i email you? i want to run something by you.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">fredwilson</dc:creator><pubDate>Tue, 07 Jul 2009 10:27:44 -0000</pubDate></item><item><title>Re: Hacker News and the NoSQL Movement</title><link>http://www.avc.com/a_vc/2009/07/hacker-news-and-the-nosql-movement.html#comment-12229366</link><description>Damn you guys are smart!!!!!! I feel very fortunate to have people like you commenting here. Thank you</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">fredwilson</dc:creator><pubDate>Mon, 06 Jul 2009 17:38:21 -0000</pubDate></item><item><title>Re: Hacker News and the NoSQL Movement</title><link>http://www.avc.com/a_vc/2009/07/hacker-news-and-the-nosql-movement.html#comment-12181433</link><description>The scalable structure you describe sounds interesting, I'd love pointers to literature you think is interesting on the topic (I'm familiar with linearizing onto hilbert curves or the like). 
You're likely already familiar, but just in case, you may find some of the work descending from SDDS's like LH interesting. I can't recall the specific paper, but there was one that descended from CAN that handles partitioning a hypercube amongst peers and the routing necessary on such a network to execute queries.&lt;br&gt;&lt;br&gt;But I also think the focus on SQL or index structures actually misses the point slightly in the context of data storage for large web applications. &lt;br&gt;&lt;br&gt;The big problem is the generals.&lt;br&gt;&lt;br&gt;It's not just about moving away from SQL, it's about being forced to deal with the reality that maintaining one consistent view of data stored on multiple nodes has unacceptable performance penalties. Inktomi was the first to run into this in the setting of internet services and all large internet properties are treading onward from the same trailhead they established.&lt;br&gt;&lt;br&gt;Consensus protocols have unacceptable overhead, so in practice they're generally only used for maintaining critical lookup values for some larger system that leverages weaker consistency requirements.&lt;br&gt;&lt;br&gt;Scalable partial query result merging doesn't matter for real systems because Read All Write One or Read One Write All end up working better. &lt;br&gt;&lt;br&gt;For example, let's look at Google's inverse word index for search queries. It's too large to fit on one host, so we must partition the state in one way or another. At first glance the best way would probably seem to be to allocate the inverse lists for specific words to specific hosts, via hash, directory or some other method. This way a given search only needs to query the specific servers that hold inverse word lists for terms in the search.&lt;br&gt;&lt;br&gt;This isn't how Google actually does it. &lt;br&gt;&lt;br&gt;Instead they partition by document, and a given search is broadcast to all servers. 
At first this seems quite wasteful, but it's actually a better fit to current hardware constraints. Each server can apply the full query against each document, avoiding the intermediate merge problem entirely and only returning full matches.&lt;br&gt;&lt;br&gt;This works better because bisection bandwidth is the least scalable resource in large systems.&lt;br&gt;&lt;br&gt;This system is straightforward to scale as well. Need to store more documents? Add nodes. No need to worry about a single inverse word list growing beyond a single host and requiring more complicated behavior. Need to process more searches? Replicate the entire cluster as many times as necessary, and accept that replicas will never be exactly in sync. As long as your load balancing sends users preferentially to the same replica you'll avoid making replication lag visible for most users.&lt;br&gt;&lt;br&gt;Scaling and operating RAWO systems like this is simple. I'd love to see a SQL database that used the same architecture: INSERTS to random nodes, all other statements broadcast, and each replica cluster does async or partially async multi-master replication.&lt;br&gt;&lt;br&gt;ROWA systems don't scale, but for systems that can live within those limits they're a far simpler method of high availability (example: atomic broadcast like zookeeper vs implementing paxos).&lt;br&gt;&lt;br&gt;And of course there are many interesting approximate quorum systems that live in the region between those extremes (such as amazon S3, which I gather is R2 W4).</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">jasonwatkinspdx</dc:creator><pubDate>Sun, 05 Jul 2009 19:43:49 -0000</pubDate></item><item><title>Re: Hacker News and the NoSQL Movement</title><link>http://www.avc.com/a_vc/2009/07/hacker-news-and-the-nosql-movement.html#comment-12179716</link><description>Facebook is a poor example for several reasons:&lt;br&gt;&lt;br&gt;- their needs are only typical of a handful of the largest web 
properties&lt;br&gt;- their use of MySQL is heavily sharded, which means throwing out most of what SQL gives you&lt;br&gt;- most of their actual data access is done to memcached&lt;br&gt;&lt;br&gt;This architecture resembles a key value store much more than it resembles a traditional RDBMS installation. They're essentially using MySQL as a recovery log, memcached as their hot data store, and implementing query processing in the application.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Jason Watkins</dc:creator><pubDate>Sun, 05 Jul 2009 17:51:04 -0000</pubDate></item><item><title>Re: Hacker News and the NoSQL Movement</title><link>http://www.avc.com/a_vc/2009/07/hacker-news-and-the-nosql-movement.html#comment-12173551</link><description>Same here</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">ShanaC</dc:creator><pubDate>Sun, 05 Jul 2009 11:37:44 -0000</pubDate></item><item><title>Re: Hacker News and the NoSQL Movement</title><link>http://www.avc.com/a_vc/2009/07/hacker-news-and-the-nosql-movement.html#comment-12170147</link><description>James,&lt;br&gt;&lt;br&gt;It is clear that this is your area of expertise, so I had to think twice (well, in fact it was even more than that) before replying to your comments.&lt;br&gt;&lt;br&gt;It is my understanding that your comment implies that there is a solution that would address problems of both worlds: relational and KV. I'm not an expert in the field, but my perception is that if this solution existed it would once again be something that tries to fit every problem. And this is exactly what happened with RDBMS (what is now called the one-size-fits-all RDBMS).&lt;br&gt;&lt;br&gt;While I cannot formulate a general rule and I'm basing this hypothesis only on my experience, I'd say that special problems will always need a special solution. Not to mention that I still don't believe in the existence of a panacea. 
Starting to trust again in a single approach will just limit the solution space we are looking into.&lt;br&gt;&lt;br&gt;I think that what we should be looking for is a way to address the impedance mismatch between RDBMS and KV and make the two cooperate in a much easier and standardized way, to address the impedance mismatch between current programming paradigms and storage by aligning the access layer and figuring out the 'interaction language', etc.&lt;br&gt;&lt;br&gt;A radical new paradigm is definitely welcome, but I think that will take a long time to be proved theoretically and even more to be proved by real live apps.&lt;br&gt;&lt;br&gt;cheers.&lt;br&gt;&lt;br&gt;ps: I'd love to learn more from you about this new approach, so I'm wondering if it would be possible to move this offline somehow.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">alexpopescu</dc:creator><pubDate>Sun, 05 Jul 2009 07:28:19 -0000</pubDate></item><item><title>Re: Hacker News and the NoSQL Movement</title><link>http://www.avc.com/a_vc/2009/07/hacker-news-and-the-nosql-movement.html#comment-12169063</link><description>Quite interesting! I must confess that even if I knew about 10gen and MongoDB I haven't put the story together, so now everything makes sense.&lt;br&gt;&lt;br&gt;Thanks a ton</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">alexpopescu</dc:creator><pubDate>Sun, 05 Jul 2009 06:25:34 -0000</pubDate></item><item><title>Re: Hacker News and the NoSQL Movement</title><link>http://www.avc.com/a_vc/2009/07/hacker-news-and-the-nosql-movement.html#comment-12164999</link><description>I see that issue in my head better than I can describe it- I could draw it.&lt;br&gt;&lt;br&gt;The issue for me with a hypercube, especially once I looked it up- is its shape and its ability to rotate. 
If you solve the issue of what shape (because there are many possible n) and how you want to rotate it (there are more axes with each n added) (and those rotations reveal all of the potential shortest distances within the hypercube, or your where statements) then you resolve the problem.&lt;br&gt;&lt;br&gt;Being "above" the shape, or having the ability to re-look at the shape from a totally different perspective is what is giving me trouble.&lt;br&gt;&lt;br&gt;Ideally, the more dimensions you have, the more possible hypercubes, and imagining yourself from above the dimension(s) allows you to "play" with each block more efficiently in your head to find out what shape one needs in order to really get the shortest distance from point a to point b if one can only take the lines within the hypercube of any n dimensions.&lt;br&gt;&lt;br&gt;Hence the need for a really large and fast spaceship.  You need to be "above n hypercube,"  and then think about what happens when you add a new n about how it looks and behaves.  Spaceships move and are above the earth and if the earth was my hypercube- I could see and use my spaceship to figure out where to go once I got back down onto it by looking around it.&lt;br&gt;&lt;br&gt;Unless this is more lost than before?  
I understand the idea of a cube trying to stick itself into arbitrary space as we define it- it's more where do I look at the cube that I am trying to understand, or how should I look at the cube to see as much of it as possible.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">ShanaC</dc:creator><pubDate>Sat, 04 Jul 2009 22:07:47 -0000</pubDate></item><item><title>Re: Hacker News and the NoSQL Movement</title><link>http://www.avc.com/a_vc/2009/07/hacker-news-and-the-nosql-movement.html#comment-12164635</link><description>Never smoked a hookah in my life- I'm quite the innocent.&lt;br&gt;&lt;br&gt;I'm an art student- it's for a sculpture made up of various kinds of computer data storage materials throughout the "ages."  My language in the end is very visual.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">ShanaC</dc:creator><pubDate>Sat, 04 Jul 2009 21:37:45 -0000</pubDate></item><item><title>Re: Hacker News and the NoSQL Movement</title><link>http://www.avc.com/a_vc/2009/07/hacker-news-and-the-nosql-movement.html#comment-12163525</link><description>I totally agree. But once the same hacker grows his business, guess where he will look for storage solutions? The Oracles &amp; NetApps of the world.&lt;br&gt;&lt;br&gt;Here is a classic example: Facebook using Oracle: (note the date- April 09)&lt;br&gt;&lt;a href="http://www.oracle.com/customers/snapshots/facebook-ebs-snapshot.pdf" rel="nofollow"&gt;http://www.oracle.com/customers/snapshots/faceb...&lt;/a&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">uds</dc:creator><pubDate>Sat, 04 Jul 2009 19:55:35 -0000</pubDate></item><item><title>Re: Hacker News and the NoSQL Movement</title><link>http://www.avc.com/a_vc/2009/07/hacker-news-and-the-nosql-movement.html#comment-12158751</link><description>So true. 
But the hacker who builds the next facebook on hadoop can't get fired because he works for himself</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">fredwilson</dc:creator><pubDate>Sat, 04 Jul 2009 15:23:48 -0000</pubDate></item><item><title>Re: Hacker News and the NoSQL Movement</title><link>http://www.avc.com/a_vc/2009/07/hacker-news-and-the-nosql-movement.html#comment-12157241</link><description>Another factor in this space is people issues and this has nothing to do with technology.&lt;br&gt;&lt;br&gt;If a DBA runs Oracle/SQL Server/DB2 and loses data, he is OK.&lt;br&gt;If a DBA runs any of these unproven DBs and loses data, he will be FIRED.&lt;br&gt;&lt;br&gt;So it's a catch-22 for any new DBs. They need big customers to prove that they can scale and sustain the load, and customers want to see the real time deployment before they adopt it!</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">uds</dc:creator><pubDate>Sat, 04 Jul 2009 14:06:58 -0000</pubDate></item><item><title>Re: Hacker News and the NoSQL Movement</title><link>http://www.avc.com/a_vc/2009/07/hacker-news-and-the-nosql-movement.html#comment-12153344</link><description>Well mongo came out of an attempt to build an entire open source stack in the cloud that we backed (10gen). That turned out to be way too ambitious and as we were getting ready to call it quits, we saw some interesting activity around mongo. So we re-oriented the company around mongo and now it's starting to gain adoption&lt;br&gt;&lt;br&gt;So like most things we invest in, the opportunity changed post investment</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">fredwilson</dc:creator><pubDate>Sat, 04 Jul 2009 09:38:22 -0000</pubDate></item><item><title>Re: Hacker News and the NoSQL Movement</title><link>http://www.avc.com/a_vc/2009/07/hacker-news-and-the-nosql-movement.html#comment-12153342</link><description>Agreed. 
That is the consistent message coming out of this excellent comment thread&lt;br&gt;&lt;br&gt;Thanks!!</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">fredwilson</dc:creator><pubDate>Sat, 04 Jul 2009 09:38:18 -0000</pubDate></item><item><title>Re: Hacker News and the NoSQL Movement</title><link>http://www.avc.com/a_vc/2009/07/hacker-news-and-the-nosql-movement.html#comment-12141125</link><description>re: HN, TechMeme, Reddit, Digg: might it be the signal-to-noise ratio that made you like more HN and Techmeme? (I confess I have a supposition about these sites and their target audience and I'm just trying to validate it)&lt;br&gt;&lt;br&gt;re: MongoDB&lt;br&gt;While I'm familiar with MongoDB, I don't know much about MongoDB plans, so I'll just speculate here -- in my extremely obvious attempt :-) -- to make you say more about what convinced you to invest in MongoDB (and leaving aside the team argument).  &lt;br&gt;Do you think MongoDB will possibly become a MySQL equivalent (not saying replacement or alternative though) and start using the same model? Or is it that you see opportunity in the open source backed services (f.e. cloudera, springsource, etc)?</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">alexpopescu</dc:creator><pubDate>Sat, 04 Jul 2009 07:31:40 -0000</pubDate></item><item><title>Re: Hacker News and the NoSQL Movement</title><link>http://www.avc.com/a_vc/2009/07/hacker-news-and-the-nosql-movement.html#comment-12138360</link><description>Fred, it would be my pleasure. 
Contact away.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">James</dc:creator><pubDate>Sat, 04 Jul 2009 01:47:57 -0000</pubDate></item><item><title>Re: Hacker News and the NoSQL Movement</title><link>http://www.avc.com/a_vc/2009/07/hacker-news-and-the-nosql-movement.html#comment-12138046</link><description>First, I am NOT arguing for/against any technology here...just some comments based on my experience.&lt;br&gt;&lt;br&gt;It was OODBMS vs RDBMS in the 90's and the following CNET article says it all-&lt;br&gt;&lt;a href="http://news.cnet.com/Jasmines-time-has-come/2100-1001_3-206252.html" rel="nofollow"&gt;http://news.cnet.com/Jasmines-time-has-come/210...&lt;/a&gt;&lt;br&gt;(This article dated Dec'97 says time has come. That time never came for Jasmine!)&lt;br&gt;&lt;br&gt;Now it's key-value vs RDBMS and I think RDBMS will win again. &lt;br&gt;&lt;br&gt;I agree with the comment by a key-value developer:&lt;br&gt;"A few specialized applications can and have been built on a plain DHT, but most applications built on DHTs have ended up having to customize the DHT's internals to achieve their functional or performance goals."&lt;br&gt;&lt;a href="http://spyced.blogspot.com/2009/05/why-you-wont-be-building-your-killer.html#" rel="nofollow"&gt;http://spyced.blogspot.com/2009/05/why-you-wont...&lt;/a&gt;&lt;br&gt;&lt;br&gt;Then the computer world article is wrong. FB uses how many thousands of MySQL instances? Why did FB hire a MySQL expert recently? Are they going to throw out all the MySQL instances in the near future? no way...&lt;br&gt;&lt;br&gt;Why are Facebook/Yahoo building SQL on top of Hadoop? 
Why is the Cassandra alias discussing something SQL-like for Cassandra?&lt;br&gt;&lt;br&gt;Even I myself said, Good bye SQL: (i may be also wrong; it might have come out from my past OODBMS experience)&lt;br&gt;&lt;a href="http://uds-web.blogspot.com/2008/12/good-bye-sql.html" rel="nofollow"&gt;http://uds-web.blogspot.com/2008/12/good-bye-sq...&lt;/a&gt;&lt;br&gt;&lt;br&gt;So, the question is NOT about SQL vs NoSQL but about what type of database technologies are needed for the next gen web apps?</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Uday Subbarayan</dc:creator><pubDate>Sat, 04 Jul 2009 01:22:05 -0000</pubDate></item><item><title>Re: Hacker News and the NoSQL Movement</title><link>http://www.avc.com/a_vc/2009/07/hacker-news-and-the-nosql-movement.html#comment-12132775</link><description>It is simpler than it sounds, a hypercube is just a cube in a logical space of arbitrary dimensionality i.e. a rectangle is a "cube" in 2-dimensional space.  &lt;br&gt;&lt;br&gt;Consider an "employee" table with three attributes "name", "title", "employment date".  The three attributes are 'dimensions', and every record in the "employee" table can be located in the space of all possible employee records using those three attributes as coordinates.  In this way, the employee table describes a 3-dimensional logical space in which all possible records will exist. A query on the employee table is searching for a region of that 3-dimensional space that contains the coordinates described in the WHERE clause of your SQL.&lt;br&gt;&lt;br&gt;&lt;br&gt;The complication is that in traditional databases we would use a collection of 1-dimensional indexes that are merged to simulate a true 3-dimensional index, and that merge operation does not scale.  
This problem becomes much worse if you have true spatial data types, like geospatial or constraints (such as what one might use for real-time search).&lt;br&gt;&lt;br&gt;&lt;br&gt;Ideally, you want a data structure that can directly represent several dimensions such that any subset of the space those dimensions describe can be very efficiently accessed in a distributed system *and* the data structure can efficiently store data types that can be described as a subset of that space.  It is a very tricky problem that has generated thousands of pages of research over decades.  We've been using workarounds for years but the rapid growth of data sets has exposed the lack of scalability intrinsic to those workarounds, forcing us to deal with the underlying problem directly.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">James</dc:creator><pubDate>Fri, 03 Jul 2009 19:33:11 -0000</pubDate></item><item><title>Re: Hacker News and the NoSQL Movement</title><link>http://www.avc.com/a_vc/2009/07/hacker-news-and-the-nosql-movement.html#comment-12103488</link><description>Steady, dude-ess, there's a difference between old and ancient!&lt;br&gt;&lt;br&gt;And whilst I am curious to know how you managed to connect databases and spaceships, what I would really like is for you to pass the bong over...&lt;br&gt;&lt;br&gt;As for the cards, all I can suggest is eBay (they're quite boring, when you've seen one you've seen 'em all).&lt;br&gt;&lt;br&gt;Take care now.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">hymanroth</dc:creator><pubDate>Fri, 03 Jul 2009 18:54:14 -0000</pubDate></item><item><title>Re: Hacker News and the NoSQL Movement</title><link>http://www.avc.com/a_vc/2009/07/hacker-news-and-the-nosql-movement.html#comment-12093311</link><description>I was hoping you would say- when computer programs were still on cards (actually does anyone know where to get those cards?)&lt;br&gt;&lt;br&gt;Have you ever read Flatland by Edwin 
Abbott? I'm thinking my data and I live in Flatland and I need some spaceships from an intruder from the next dimension to get to my data.  (If not- &lt;a href="http://xahlee.org/flatland/index.html" rel="nofollow"&gt;http://xahlee.org/flatland/index.html&lt;/a&gt;)&lt;br&gt;&lt;br&gt;A spaceship that can go "through" the earth?  Because Flatland only ends in the third dimension- whereas the spaceship can deal with others since it is in a mysterious "above" plane.&lt;br&gt;&lt;br&gt;(and I was hoping you would say you programmed on those cards, I really could use some of those cards...)</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">ShanaC</dc:creator><pubDate>Fri, 03 Jul 2009 18:30:09 -0000</pubDate></item><item><title>Re: Hacker News and the NoSQL Movement</title><link>http://www.avc.com/a_vc/2009/07/hacker-news-and-the-nosql-movement.html#comment-12090309</link><description>Thanks for trying Shana, but if I'm honest you're just making things worse...&lt;br&gt;&lt;br&gt;When I was a nipper we just stuffed n-dimensional data into an n-dimensional array, but that was when computer screens came in any color as long as it was green.&lt;br&gt;&lt;br&gt;But I loved the 'really fast big spaceship' analogy (whatever it means)&lt;br&gt;&lt;br&gt;+1</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">hymanroth</dc:creator><pubDate>Fri, 03 Jul 2009 17:04:54 -0000</pubDate></item></channel></rss>