Cloud World

Tuesday, 1 October 2013

Jump-start your data pipelining into Google BigQuery

Posted on 10:16 by Unknown
Once you get your data into Google BigQuery, you don’t have to worry about running out of machine capacity, because you use Google’s machines as if they were your own. But what if you want to transform your source data before putting it into BigQuery and you don’t have the server capacity to handle the transformation? In that case, you can use Google Compute Engine to run your Extract, Transform and Load (ETL) processing. Our paper Getting Started With Google BigQuery explains how. To try it yourself, download the sample ETL tool for Google Compute Engine from GitHub.



The sample ETL tool is an application that automates getting a Google Compute Engine instance up and running and installing the software you need to rapidly design, create and execute an ETL workflow. The application includes a sample workflow that uses KNIME to help you understand the entire process.

If you already have an established ETL process that prepares the data and loads it into Google Cloud Storage, but need a reliable way to load the data from there into BigQuery, we have another sample application to help you. The Automated File Loader for BigQuery sample app demonstrates how to automate data loading from Google Cloud Storage to BigQuery.



This application uses the Cloud Storage Object Change Notification API to receive notifications that files have been uploaded to a bucket in Google Cloud Storage, then uses the BigQuery API to load the data from the bucket into BigQuery. Download it now from GitHub.
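
To make the flow concrete, here is a minimal sketch in Python of the step between receiving a notification and starting a load. It is not the code of the Automated File Loader sample app. The function name build_load_job, the DESTINATION values, and the choice of CSV input with append semantics are all assumptions made for illustration. The sketch only builds the request body for a BigQuery load job and makes no network calls.

```python
import json

# Hypothetical destination; replace with your own project, dataset, and table.
DESTINATION = {
    "projectId": "my-project",
    "datasetId": "my_dataset",
    "tableId": "my_table",
}


def build_load_job(headers, body, destination=DESTINATION):
    """Turn one Object Change Notification into a BigQuery load job body.

    `headers` is a plain dict of the notification's HTTP headers (assumed to
    use the casing shown below) and `body` is the raw JSON payload.
    Returns None when the notification does not describe an existing object.
    """
    # "sync" only confirms the watch channel; "not_exists" reports a deletion.
    if headers.get("X-Goog-Resource-State") != "exists":
        return None

    obj = json.loads(body)  # the payload carries the object's metadata
    source_uri = "gs://{}/{}".format(obj["bucket"], obj["name"])

    return {
        "configuration": {
            "load": {
                "sourceUris": [source_uri],
                "destinationTable": dict(destination),
                "sourceFormat": "CSV",               # assumption: CSV input
                "writeDisposition": "WRITE_APPEND",  # assumption: append rows
            }
        }
    }
```

The returned dictionary has the shape of the job resource that the BigQuery API's jobs.insert method accepts. A real handler would also need to authenticate, submit the job, and check whether it succeeded.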









Both these sample applications accompany the article Getting Started With Google BigQuery, which provides an overview of the end-to-end process from loading data into BigQuery to visualization, and design practices that should be considered when using BigQuery.



-Posted by Wally Yau, Solutions Architect
Posted in bigquery, solutions