Cloud World

  • Subscribe to our RSS feed.
  • Twitter
  • StumbleUpon
  • Reddit
  • Facebook
  • Digg

Wednesday, 18 September 2013

Google BigQuery goes real-time with streaming inserts, time-based queries, and more

Posted on 08:00 by Unknown
Google BigQuery is designed to make it easy to analyze large amounts of data quickly. This year we've seen great updates: big scale JOINs and GROUP BYs, unlimited result sizes, smarter functions, bigger quotas, as well as multiple improvements to the web UI. Today we've gone even further, announcing several updates that give BigQuery the ability to work in real-time, query subsets of the latest data, more functions and browser tool improvements.



Real-time analytics

BigQuery is now able to load and analyze data in real time through a simple API call, the new tabledata().insertAll() endpoint. This enables you to store data as it comes in, rather than building and maintaining systems just to cache and upload in batches. The best part? The new data is available for querying instantaneously. This feature is great for time sensitive use cases like log analysis and alerts generation.



Using it is as easy as calling the new endpoint with your data in a JSON object (with a single row or multiple rows of data).



Here's a Python example:

body = {"rows":[
{"json": {"column_name":7.7,}}
]}
response = bigquery.tabledata().insertAll(
projectId=PROJECT_ID,
datasetId=DATASET_ID,
tableId=TABLE_ID,
body=body).execute()



Streaming data into BigQuery is free for an introductory period until January 1st, 2014. After that it will be billed at a flat rate of 1 cent per 10,000 rows inserted. The traditional jobs().insert() method will continue to be free. When choosing which import method to use, check for the one that best matches your use case. Keep using the jobs().insert() endpoint for bulk and big data loading. Switch to the new tabledata().insertAll() endpoint if your use case calls for a constantly and instantly updating stream of data.



Table decorators for recent imported data

This release introduces a new syntax for "table decorators". Now you can define queries that only scan a range or spot in the previous 24 hours. Traditionally BigQuery has always done a "full column scan" when querying data, while the new syntax will allow you to focus only on a specific subset of the latest data. Querying a partial subset of data with these decorators will result in lower querying costs — proportional to the size of the subset of the queried data.



With the new capabilities you can query only the last hour of inserted data, or query only what was inserted before that hour, or get a snapshot of the table at a specific time. The table decorators also work in all table related operations (list, export, copy, and so on), giving you the ability to do operations like copying a table as it was before a bad import job.



New window and statistical functions

To the previously announced window functions, we've added SUM(), COUNT(), AVG(), MIN(), MAX(), FIRST_VALUE, and LAST_VALUE(). Refer to the previous release announcement for a run through of how window functions work.



We also recently introduced the new Pearson correlation function, and showed how to find interesting and anomalous patterns in ambient sensors, or how to predict tomorrow's flight delays using 70 million flights dataset. Today's release adds new statistical functions, for even richer capabilities: COVAR_POP(), COVAR_SAMP(), STDDEV_POP(), STDDEV_SAMP(), VAR_POP() and VAR_SAMP().



BigQuery browser tool: Improved history retrieval

You can now query your history faster in the browser tool using the Query History panel: More information about the queries is being surfaced and convenient buttons have been added for common tasks.





You can try out the new UI features, table decorators, and new functions at https://bigquery.cloud.google.com/. You can post questions and get quick answers about BigQuery usage and development on Stack Overflow. We love your feedback and comments. Join the discussion on +Google Cloud Platform using the hashtag #BigQuery.



- Posted by Felipe Hoffa, Developer Programs Engineer
Email ThisBlogThis!Share to XShare to Facebook
Posted in Announcements, bigquery | No comments
Newer Post Older Post Home

0 comments:

Post a Comment

Subscribe to: Post Comments (Atom)

Popular Posts

  • Bridging Mobile Backend as a Service to Enterprise Systems with Google App Engine and Kinvey
    The following post was contributed by Ivan Stoyanov , VP of Engineering for Kinvey, a mobile Backend as a Service provider and Google Cloud ...
  • Tutorial: Adding a cloud backend to your application with Android Studio
    Android Studio lets you easily add a cloud backend to your application, right from your IDE. A backend allows you to implement functionality...
  • JPA/JDO Java Persistence Tips - The Year In Review
    If you’re developing a Java application on App Engine you probably already know that you can use JPA and JDO, both standard Java persistence...
  • Google App Engine + Google Data APIs: A Match Made on the Web
    Posted by Marzia Niccolai, Google App Engine Team Don't you think it'd be cool if you could blog about a party, add it to your calen...
  • 2013 Year in review: topping 100,000 requests-per-second
    2013 was a busy year for Google Cloud Platform. Watch this space: each day, a different Googler who works on Cloud Platform will be sharing ...
  • Easy Performance Profiling with Appstats
    Since App Engine debuted 2 years ago, we’ve written extensively about best practices for writing scalable apps on App Engine. We make writ...
  • New in Google Cloud Storage: auto-delete, regional buckets and faster uploads
    We’ve launched new features in Google Cloud Storage that make it easier to manage objects, and faster to access and upload data. With a tiny...
  • TweetDeck and Google App Engine: A Match Made in the Cloud
    I'm Reza and work in London, UK for a startup called TweetDeck . Our vision is to develop the best tools to manage and filter real time ...
  • Python SDK version 1.2.4 released.
    We're psyched to release version 1.2.4 of the App Engine SDK for Python. Some highlights of what you'll find in this release: Remote...
  • Scaling with the Kindle Fire
    Today’s blog post comes to us from Greg Bayer of Pulse , a popular news reading application for iPhone, iPad and Android devices. Pulse has ...

Categories

  • 1.1.2
  • agile
  • android
  • Announcements
  • api
  • app engine
  • appengine
  • batch
  • bicycle
  • bigquery
  • canoe
  • casestudy
  • cloud
  • Cloud Datastore
  • cloud endpoints
  • cloud sql
  • cloud storage
  • cloud-storage
  • community
  • Compute Engine
  • conferences
  • customer
  • datastore
  • delete
  • developer days
  • developer-insights
  • devfests
  • django
  • email
  • entity group
  • events
  • getting started
  • google
  • googlenew
  • gps
  • green
  • Guest Blog
  • hadoop
  • html5
  • index
  • io2010
  • IO2013
  • java
  • kaazing
  • location
  • mapreduce
  • norex
  • open source
  • partner
  • payment
  • paypal
  • pipeline
  • put
  • python
  • rental
  • research project
  • solutions
  • support
  • sustainability
  • taskqueue
  • technical
  • toolkit
  • twilio
  • video
  • websockets
  • workflows

Blog Archive

  • ▼  2013 (143)
    • ►  December (33)
    • ►  November (15)
    • ►  October (17)
    • ▼  September (13)
      • We're building a community for the cloud-minded
      • How to mix and match web services, batch processin...
      • App Engine 1.8.5 released – featuring Search API a...
      • AppScale brings choice to App Engine developers
      • How Streak makes debugging easy in App Engine Data...
      • Sendgrid now supports Google Compute Engine for se...
      • Google BigQuery goes real-time with streaming inse...
      • Check out technical solutions and sample apps to g...
      • Where the cloud meets the road
      • Running Apache Hadoop on Google Cloud Platform
      • Cloud Platform updates: App Engine 1.8.4 and new C...
      • Introducing CORR() to Google BigQuery
      • Time to show what you can do with Google Cloud Pla...
    • ►  August (4)
    • ►  July (15)
    • ►  June (12)
    • ►  May (15)
    • ►  April (4)
    • ►  March (4)
    • ►  February (9)
    • ►  January (2)
  • ►  2012 (43)
    • ►  December (2)
    • ►  November (2)
    • ►  October (8)
    • ►  September (2)
    • ►  August (3)
    • ►  July (4)
    • ►  June (2)
    • ►  May (3)
    • ►  April (4)
    • ►  March (5)
    • ►  February (3)
    • ►  January (5)
  • ►  2011 (46)
    • ►  December (3)
    • ►  November (4)
    • ►  October (4)
    • ►  September (5)
    • ►  August (3)
    • ►  July (4)
    • ►  June (3)
    • ►  May (8)
    • ►  April (2)
    • ►  March (5)
    • ►  February (3)
    • ►  January (2)
  • ►  2010 (38)
    • ►  December (2)
    • ►  October (2)
    • ►  September (1)
    • ►  August (5)
    • ►  July (5)
    • ►  June (6)
    • ►  May (3)
    • ►  April (5)
    • ►  March (5)
    • ►  February (2)
    • ►  January (2)
  • ►  2009 (47)
    • ►  December (4)
    • ►  November (3)
    • ►  October (6)
    • ►  September (5)
    • ►  August (3)
    • ►  July (3)
    • ►  June (4)
    • ►  May (3)
    • ►  April (5)
    • ►  March (3)
    • ►  February (7)
    • ►  January (1)
  • ►  2008 (46)
    • ►  December (4)
    • ►  November (3)
    • ►  October (10)
    • ►  September (5)
    • ►  August (6)
    • ►  July (4)
    • ►  June (2)
    • ►  May (5)
    • ►  April (7)
Powered by Blogger.

About Me

Unknown
View my complete profile