Cloud World


Monday, 9 December 2013

DataStax Enterprise feels right at home in Google Compute Engine

Posted on 13:00 by Unknown
Today’s guest post comes from Martin Van Ryswyk, Vice President of Engineering at DataStax.



The cloud promises many things for database users: transparent elasticity and scalability, high availability, lower cost and much more. As customers evaluate their cloud options -- from porting a legacy RDBMS to the cloud to adopting solutions born in the cloud -- we would like to share our experience from running the live systems of more than 300 customers in a cloud-native way.



At DataStax, we drive Apache Cassandra™. Cassandra is a massively scalable, open-source NoSQL database designed from the ground up for the cloud and built to excel at serving modern online applications. It easily manages the distribution of data across multiple data centers and cloud availability zones, can add capacity to live systems without impacting your application’s availability, and provides extremely fast read/write operations.



One of the advantages of Google Compute Engine is its use of Persistent Disks. When an instance is terminated, the data on its Persistent Disk is retained, and the disk can be reconnected to a new instance. This gives Cassandra users great flexibility. For example, you can move a node to an instance with more CPU or memory without re-replicating the data, or recover from the loss of a node without having to stream all of its data from other nodes in the cluster.



DataStax and Google engineers recently collaborated on running DataStax Enterprise (DSE) 3.2 on Google Compute Engine. The goal was to understand the performance customers can expect on Google’s Persistent Disk, for which Google recently announced new performance and pricing tiers. DataStax Enterprise supports a purely cloud-native deployment and can also span on-premises and cloud instances for customers who want a hybrid solution.



Tests and results of DataStax Enterprise on Google Compute Engine

We were very interested to see how consistent latency would be on Persistent Disks, which are positioned as highly consistent storage with predictable and highly competitive pricing. Our tests started at the operational level and then moved on to the robustness of our cluster (the Cassandra ring) during failure and to I/O under heavy load. All tests were run by DataStax, with Google providing configuration guidance. The resulting configuration file and methodology can be found here.



The key to consistent latency in Google Compute Engine is sizing your cluster so that each node stays within the disk throughput limits. If you follow that guidance together with our recommended configuration, we believe the results are readily replicable and applicable to your application. We tested three scenarios, all with positive outcomes:


  1. Operational stability of 100 nodes spread across two physical zones.


    • Objective: longevity test at 6,000 records per second (60 records/sec/node) for 72 hours.

    • Results: operation was trouble-free and the data tests completed without issue. Replication also completed, with data streaming across both zones without problems.


  2. Robustness during a reboot/failure through reconnecting Persistent Disks to an instance.


    • Objective: measure impact of terminating a node and re-connecting its disk to a new node.

    • Results: new nodes joined the Cassandra ring without having to be repaired and with no data loss (no streaming required). We did need to manage IP address changes for the new node.


  3. Push the limits of disk performance for a three-node cluster.


    • Objective: measure response under load when approaching the disk throughput limit.

    • Results: our tests showed a good distribution of latency, and 90% of the I/O write times were less than 8ms (see figures below depicting the median latency and latency distribution). These results held as long as our load did not exceed the published throughput (I/O) thresholds (see caps for thresholds).
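
The sizing guidance and the load figures in scenario 1 can be checked with some simple arithmetic. The sketch below is illustrative only: the per-node limit of 100 writes/sec and the 70% headroom factor are hypothetical values chosen for the example, not published Persistent Disk limits, and the function names are ours.

```python
import math


def nodes_needed(total_ops_per_sec, per_node_limit, headroom=0.7):
    """Smallest node count that keeps each node at or below
    headroom * per_node_limit operations per second."""
    if total_ops_per_sec < 0 or per_node_limit <= 0:
        raise ValueError("rates must be positive")
    if not 0 < headroom <= 1:
        raise ValueError("headroom must be in (0, 1]")
    usable_per_node = per_node_limit * headroom
    return max(1, math.ceil(total_ops_per_sec / usable_per_node))


def total_records(rate_per_sec, hours):
    """Records written at a constant rate over the given number of hours."""
    return rate_per_sec * hours * 3600


# Figures taken from scenario 1: 6,000 records/sec across 100 nodes for 72 hours.
print(6000 / 100)               # 60.0 records/sec/node
print(total_records(6000, 72))  # 1555200000 records over the whole run

# Hypothetical per-node limit (an assumption, not a published threshold).
print(nodes_needed(6000, per_node_limit=100, headroom=0.7))  # 86
```

Leaving headroom below the limit is a design choice in this sketch: it keeps nodes away from the throughput cap even when load is unevenly distributed across the ring.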



What’s next

We find Google Compute Engine and the implementation of Persistent Disks to be very promising as a platform for Cassandra. The next step in our partnership will be more extensive performance benchmarking of DataStax Enterprise. We look forward to publishing the results in a future blog post.



Figures for reference

The graph below shows median latency, a figure of merit indicating how much time it takes to satisfy a write request (in milliseconds):





The figure below depicts the distribution of write latencies (in milliseconds). As noted above, 90% of write latencies were below 8ms, indicating the consistency of performance. The tight clustering within 1-4ms speaks to the predictability of performance.
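
For readers who want to compute the same statistics on their own measurements, here is a minimal sketch of how a median, a 90th percentile and the share of writes under a threshold can be derived from raw latency samples. The sample values are made up for illustration and are not data from our tests; the nearest-rank percentile method is one common choice among several.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def fraction_below(samples, threshold_ms):
    """Share of samples strictly below the threshold."""
    return sum(1 for s in samples if s < threshold_ms) / len(samples)


# Hypothetical write latencies in milliseconds (illustrative only).
latencies_ms = [1.2, 1.8, 2.1, 2.4, 2.9, 3.3, 3.7, 4.0, 6.5, 11.0]

print("median:", statistics.median(latencies_ms))
print("p90:", percentile(latencies_ms, 90))
print("share below 8ms:", fraction_below(latencies_ms, 8.0))
```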





-Contributed by Martin Van Ryswyk, Vice President of Engineering, DataStax
Posted in Compute Engine, partner