Eurovision is a song contest where each European country sends one singer to compete in a televised competition (similar to American Idol for our American readers). It is one of the most watched non-sporting TV events in the world, with an estimated 125 million live viewers every year!
This year, Eurovision created a second-screen application that included singer biographies, real-time updates, contest voting, and results. The “smartmrs” backend for the Eurovision companion app, developed by grandcentrix, was powered by Google Cloud Platform. grandcentrix used Google Compute Engine for VMs and our product at Scalr for orchestration.
Capacity planning without a target
Initially, Eurovision didn’t know how much traffic its companion app would receive, so they decided to work with Scalr and Compute Engine because of their flexibility. grandcentrix needed infrastructure that could scale up and down quickly, with instances that would start serving user requests as soon as they came up. Without knowing expected traffic levels, the objective was to take the backend service to a point where it could scale horizontally, that is, where doubling the capacity would double the throughput.
We had the following components running on Google Compute Engine:
- Nginx as a load balancer
- Apache running the app’s PHP code
- Redis as a datastore for most queries
- MySQL as a datastore for relationally heavy queries
Scalr was used as a control panel to launch instances and orchestrate the pieces together through automated configuration and DNS management.
How Compute Engine helped us get there
The network
Google Compute Engine has a high-performance network: packets move consistently and quickly. To take full advantage of this, we went for Compute Engine’s largest compute offering and tuned our kernel network settings to accommodate more connections (think net.ipv4.tcp_tw_reuse, net.ipv4.netfilter.ip_conntrack_tcp_timeout_time_wait, and net.nf_conntrack_max, among others).
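If you want to see where your own instances stand before tuning, a minimal Python sketch like the one below reads the current values of those settings. It assumes a Linux host where sysctl keys are exposed under /proc/sys (dots in the key become path separators); the helper names are ours, and it deliberately does not suggest values, since the ones used for the show are not listed here. A key comes back as None when the corresponding kernel module (for example, connection tracking) is not loaded.

```python
from pathlib import Path

# Kernel settings named in the text above; the values used are not given.
SETTINGS = [
    "net.ipv4.tcp_tw_reuse",
    "net.ipv4.netfilter.ip_conntrack_tcp_timeout_time_wait",
    "net.nf_conntrack_max",
]


def sysctl_path(name, root="/proc/sys"):
    """Map a dotted sysctl name to its file under /proc/sys."""
    return Path(root).joinpath(*name.split("."))


def read_sysctl(name, root="/proc/sys"):
    """Return the current value as a string, or None if the key is absent."""
    try:
        return sysctl_path(name, root).read_text().strip()
    except OSError:
        # Missing key, unloaded module, or a non-Linux host.
        return None


if __name__ == "__main__":
    for name in SETTINGS:
        print(f"{name} = {read_sysctl(name)}")
```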
The elasticity, provisioning times, and billing
During the first Eurovision semifinal voting phase, traffic went up by a factor of 5. We were able to quickly spin up extra capacity in just a few minutes and handle the traffic that we were receiving.
During the finals, we were extra careful and decided to spin up 2x capacity just before the voting. We kept those instances up for 30 minutes and shut them down as soon as the voting phase ended. Compute Engine’s sub-hour billing was greatly appreciated by the grandcentrix team and saved them approximately 50% of what it would have cost on other providers.
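The arithmetic behind that saving is easy to sketch. The Python below compares a 30-minute burst billed per minute against the same burst rounded up to a full hour. The instance count, the hourly rate, and the 10-minute minimum charge are illustrative assumptions, not figures from the event, and the comparison assumes both providers charge the same hourly rate.

```python
import math


def billed_minutes(runtime_min, granularity_min, minimum_min=0):
    """Round runtime up to the billing granularity, honoring a minimum charge."""
    rounded = math.ceil(runtime_min / granularity_min) * granularity_min
    return max(rounded, minimum_min)


def burst_cost(instances, runtime_min, hourly_rate, granularity_min, minimum_min=0):
    """Cost of running `instances` machines for `runtime_min` minutes."""
    minutes = billed_minutes(runtime_min, granularity_min, minimum_min)
    return instances * minutes * hourly_rate / 60


# Hypothetical numbers, chosen only to make the comparison concrete.
INSTANCES = 100
HOURLY_RATE = 1.0  # currency units per instance-hour (assumed)
RUNTIME_MIN = 30   # the extra capacity stayed up for 30 minutes

per_minute = burst_cost(INSTANCES, RUNTIME_MIN, HOURLY_RATE,
                        granularity_min=1, minimum_min=10)
per_hour = burst_cost(INSTANCES, RUNTIME_MIN, HOURLY_RATE, granularity_min=60)
saving = 1 - per_minute / per_hour

if __name__ == "__main__":
    print(f"per-minute billing: {per_minute:.2f}")
    print(f"hourly billing:     {per_hour:.2f}")
    print(f"saving:             {saving:.0%}")
```

With these inputs the burst is billed for 30 minutes instead of 60, which is where a saving of roughly half comes from; the shorter the burst relative to an hour, the larger the gap.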
The (complete) flexibility
Google Compute Engine gives us full access to the instances, so we can understand what’s happening under the hood and optimize it. Here’s an example: DNS resolution.
Here, we connected to the DB instances by pointing the app to a Scalr-managed hostname that lists their IP addresses and gets updated when we add or remove DB servers.
Having low-level (socket) access let us see that traffic was not being spread evenly across those addresses, so we implemented randomization logic to distribute it evenly across our database servers and get consistent performance throughout the show.
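To make the idea concrete, here is a minimal sketch of client-side randomization in Python. It is not the code that ran during the show (the app itself was written in PHP), and the hostname in the usage note is hypothetical. The sketch resolves a hostname to every address it lists and then picks one uniformly at random, rather than always taking the first address the resolver returns.

```python
import random
import socket


def resolve_all(hostname, port):
    """Return the unique IPv4 addresses the hostname currently resolves to."""
    infos = socket.getaddrinfo(
        hostname, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is an (address, port) pair.
    return sorted({info[4][0] for info in infos})


def pick_address(addresses, rng=random):
    """Choose one address uniformly at random to spread load across servers."""
    if not addresses:
        raise ValueError("hostname resolved to no addresses")
    return rng.choice(addresses)


def connect_to_random_db(hostname, port, timeout=1.0):
    """Open a TCP connection to a randomly chosen server behind the hostname."""
    address = pick_address(resolve_all(hostname, port))
    return socket.create_connection((address, port), timeout=timeout)
```

A caller would use something like connect_to_random_db("db.example.internal", 3306), re-resolving on each new connection so that servers added to or removed from the DNS record are picked up.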
Ready for showtime!
In the end, the infrastructure was ready for the Eurovision finals on Saturday. Google Cloud Platform, grandcentrix and Scalr were able to deliver 50,000 requests per second (RPS), with 99% of the requests completed within 35 ms at the app server layer.
The traffic was higher than expected when voting started, but significantly lower than expected during the results phase (turns out people watch a TV show on TV!), and grandcentrix was able to shut down a large part of the cluster to save on cost and take advantage of Compute Engine’s sub-hour billing!
Overall, Google Cloud Platform provided the technology, pricing, and robustness that grandcentrix and Scalr needed to deliver a high-performance solution for Eurovision.