Cloud World

Thursday, 19 February 2009

Back to the Future for Data Storage

Posted on 11:07 by Unknown

Building a massive, distributed datastore that can service requests at extremely high throughput is something we've focused on at Google. We created Bigtable, which underlies the datastore in App Engine. Bigtable's design focused on scalability across a distributed system, so it may operate a bit differently from databases you've worked with before; for example, it does not support joins. This isn't an accident: when you build a system that scales to the size Bigtable can, there's no way to run a general-purpose join over data sets of that size and still get acceptable performance.
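To make the "no joins" point concrete, here is a minimal sketch in plain Python of two common ways an application can work without a join: looking up related records by key at read time, or copying the needed field onto each record at write time. It uses in-memory dicts only and is not the App Engine datastore API; every name in it (`authors`, `posts`, `author_key`, `author_name`) is a hypothetical chosen for illustration.

```python
# Toy in-memory data standing in for two kinds of entities.
# All names here are hypothetical; this is not the App Engine API.
authors = {
    "a1": {"name": "Ada"},
    "a2": {"name": "Grace"},
}
posts = [
    {"title": "Scaling reads", "author_key": "a1"},
    {"title": "Sharding notes", "author_key": "a2"},
    {"title": "Counting at scale", "author_key": "a1"},
]


def titles_with_author_lookup(posts, authors):
    """Application-side 'join': one key lookup per post at read time."""
    return [(p["title"], authors[p["author_key"]]["name"]) for p in posts]


def denormalize(posts, authors):
    """Copy the author name onto each post so reads need no lookup.

    Returns new dicts; the original posts are left unchanged.
    """
    return [dict(p, author_name=authors[p["author_key"]]["name"]) for p in posts]


joined = titles_with_author_lookup(posts, authors)
flat = denormalize(posts, authors)

print(joined)
print([(p["title"], p["author_name"]) for p in flat])
```

Both approaches return the same pairs. The trade-off is that the denormalized copy makes reads cheap but has to be rewritten whenever an author's name changes.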

Google isn't alone in offering a non-relational datastore to enable scaling. For example, Amazon has SimpleDB:

A traditional, clustered relational database requires a sizable upfront capital outlay, is complex to design, and often requires a DBA to maintain and administer. Amazon SimpleDB is dramatically simpler, requiring no schema, automatically indexing your data and providing a simple API for storage and access.
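As a rough illustration of what "no schema" plus automatic indexing can mean in practice, the sketch below is a toy in-memory store in plain Python. It is not SimpleDB's API: the class `TinyStore` and its `put` and `find` methods are hypothetical names, and the sketch assumes attribute values are hashable.

```python
from collections import defaultdict


class TinyStore:
    """Hypothetical schemaless store that indexes every attribute on write."""

    def __init__(self):
        self.items = {}                # item name -> attribute dict
        self.index = defaultdict(set)  # (attribute, value) -> item names

    def put(self, name, **attrs):
        # Drop stale index entries if this item is being overwritten.
        for pair in self.items.get(name, {}).items():
            self.index[pair].discard(name)
        self.items[name] = attrs
        # Index every attribute; no schema declares them in advance.
        for pair in attrs.items():
            self.index[pair].add(name)

    def find(self, attr, value):
        """Return the names of items whose attribute equals value, sorted."""
        return sorted(self.index.get((attr, value), set()))


store = TinyStore()
store.put("item1", color="red", size="L")
store.put("item2", color="red", weight_kg=2)  # different attributes are fine
store.put("item3", color="blue")

red_before = store.find("color", "red")
store.put("item1", color="blue")              # overwrite item1
red_after = store.find("color", "red")
print(red_before, red_after)
```

Items in the same store can carry entirely different attributes, and any attribute can be queried without declaring an index first.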

A range of non-relational open source datastores is also now available, such as CouchDB and Hypertable. Those are just two examples; there are many more.

While you might think this is all new, it's actually a bit of a return to the past. You see, there was a time when "RDBMS" wasn't the answer regardless of what the question was. When Codd published his paper, "A Relational Model of Data for Large Shared Data Banks," in 1970, there were many different approaches to datastores. It was only in the '80s that relational databases won the majority of the mindshare. Having settled on a single metaphor, the industry has developed many tools and techniques to make developing on a relational database easier.

Unfortunately, that majority mindshare is also a problem: while RDBMSs are useful in many situations, they are not useful in all of them. Their dominance means that useful alternatives go unused, and huge amounts of time and money can be wasted trying to force non-relational problems into a relational model.

We are in the middle of a renaissance in data storage, with many new ideas and techniques being applied; there's huge potential for breaking out of thinking about data storage in just one way. Michael Stonebraker pointed out in his paper, "One Size Fits All": An Idea Whose Time Has Come and Gone, that there are common datastore use cases, such as data warehousing and stream processing, that are not well served by a general-purpose RDBMS, and that abandoning the general-purpose RDBMS for those cases can give you a performance increase of one or two orders of magnitude.

It's an exciting time, and the takeaway here isn't to abandon the relational database, which is a very mature technology that works great in its domain, but instead to be willing to look outside the RDBMS box when looking for storage solutions.



Posted by Joe Gregorio, Google App Engine Team
