AWS Blog
New – CloudWatch Events for EBS Snapshots
Cloud computing can improve on traditional IT operations by giving you the power to automate complex, high-level operations that were formerly kept in a runbook or passed along as tribal knowledge. Far too many of these procedures involve backup and recovery, especially in smaller and less mature organizations.
Many AWS customers make great use of Amazon Elastic Block Store (EBS) volumes, especially given the ease with which they can generate and manage snapshot backups. They are also copying snapshots between regions on a regular basis for disaster recovery and other operational reasons.
Today we are bringing the benefits of automation to EBS with the addition of new CloudWatch Events for EBS snapshots. You can use these events to further automate your cloud-based backup environment. Here are the new events:
- createSnapshot – Fired after the status of a newly created EBS snapshot changes to Complete.
- copySnapshot – Fired after the status of a snapshot copy changes to Complete.
- shareSnapshot – Fired after a snapshot is shared with your AWS account.
A lot of AWS customers monitor the status of their snapshots by making repeated calls to the DescribeSnapshots function and then stepping through the paginated output in order to locate a specific snapshot. These new events open the door to all sorts of event-driven automation, including the cross-region copy that I mentioned earlier.
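For comparison, here is a minimal boto3 sketch of that polling pattern (the snapshot ID is a placeholder); the new events let you replace this loop with a Lambda function that reacts to createSnapshot instead:

import boto3

ec2 = boto3.client('ec2', region_name='us-east-1')

# Old-style polling: page through DescribeSnapshots until the snapshot
# we care about reports a 'completed' state.
snapshot_id = 'snap-0123456789abcdef0'   # placeholder
paginator = ec2.get_paginator('describe_snapshots')
for page in paginator.paginate(OwnerIds=['self']):
    for snapshot in page['Snapshots']:
        if snapshot['SnapshotId'] == snapshot_id and snapshot['State'] == 'completed':
            print('Snapshot is complete:', snapshot_id)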
Using Snapshot Events
In order to get a better understanding of how this feature helps to automate data backup workflows, I’ll create a workflow that copies a completed snapshot to another region. First, I’ll create an IAM policy that grants appropriate permissions. Then I will incorporate an AWS Lambda function (created by my colleagues) that takes action on the createSnapshot event. Finally, I’ll create a CloudWatch Events rule to capture the event and route it to the Lambda function.
I started out by creating an IAM role (CopySnapshotToRegion) with this policy:
Then I created a new Lambda function (you can find the code at Amazon CloudWatch Events for EBS):
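The function itself is linked above; as a rough illustration only (not my colleagues' code), a minimal Python handler for the createSnapshot event might look something like this, with the target region and event fields shown here as assumptions:

import boto3

TARGET_REGION = 'us-west-2'   # illustrative target region

def lambda_handler(event, context):
    # The event detail carries an identifier for the completed snapshot;
    # here I assume it arrives as an ARN and extract the snapshot ID.
    snapshot_arn = event['detail']['snapshot_id']
    source_region = event['region']
    snapshot_id = snapshot_arn.split('/')[-1]

    ec2 = boto3.client('ec2', region_name=TARGET_REGION)
    response = ec2.copy_snapshot(
        SourceRegion=source_region,
        SourceSnapshotId=snapshot_id,
        Description='Automated copy of ' + snapshot_id
    )
    return response['SnapshotId']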
Next, I hopped over to the CloudWatch Events Console, clicked on Create rule, and set it up to handle successful createSnapshot events:
And gave it a name:
To test it out, I created a new EBS snapshot in my source region:
The function was invoked as expected and the snapshot was copied to the target region within seconds (in practice, the copy time will depend on the size of the snapshot):
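If you prefer to script the rule rather than use the console, here is a hedged boto3 sketch (the event pattern follows the EBS snapshot notification format described in the documentation, and the function ARN is a placeholder):

import json
import boto3

events = boto3.client('events')

# Match successful createSnapshot events.
pattern = {
    "source": ["aws.ec2"],
    "detail-type": ["EBS Snapshot Notification"],
    "detail": {
        "event": ["createSnapshot"],
        "result": ["succeeded"]
    }
}

events.put_rule(Name='CopySnapshotToRegion', EventPattern=json.dumps(pattern))
events.put_targets(
    Rule='CopySnapshotToRegion',
    Targets=[{'Id': '1', 'Arn': 'arn:aws:lambda:us-east-1:123456789012:function:CopySnapshotToRegion'}]
)
# The Lambda function also needs a permission that allows events.amazonaws.com to invoke it.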
You can also use these events to make copies of snapshots that are shared with you from other accounts. Many AWS customers partition their usage across multiple accounts for various organizational and security reasons; take a look at our AWS Multiple Account Security Strategy to see our in-depth recommendations in this area. Here are two of the five models included therein:
Available Now
The new events are available in the US East (Northern Virginia), US East (Ohio), US West (Northern California), US West (Oregon), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), EU (Frankfurt), EU (Ireland), and South America (São Paulo) Regions and you can start using them today! Take a look and let me know what you come up with.
— Jeff
PS – If you are a developer, development manager, or a product manager and would like to build systems like this, check out the EBS Jobs page.
AWS Week in Review – November 7, 2016
Let’s take a quick look at what happened in AWS-land last week. Thanks are due to the 17 internal and external contributors who submitted pull requests!
New & Notable Open Source
- Sippy Cup is a Python nanoframework for AWS Lambda and API Gateway.
- Yesterdaytabase is a Python tool for constantly refreshing data in your staging and test environments with Lambda and CloudFormation.
- ebs-snapshot-lambda is a Lambda function to snapshot EBS volumes and purge old snapshots.
- examples is a collection of boilerplates and examples of serverless architectures built with the Serverless Framework and Lambda.
- ecs-deploy-cli is a simple and easy way to deploy tasks and update services in AWS ECS.
- Comments-Showcase is a serverless comment webapp that uses API Gateway, Lambda, DynamoDB, and IoT.
- serverless-offline emulates Lambda and API Gateway locally for development of Serverless projects.
- aws-sign-web is a JavaScript implementation of AWS Signature v4 for use within web browsers.
- Zappa implements serverless Django on Lambda and API Gateway.
- awsping is a console tool to check latency to AWS regions.
New Customer Success Stories
- Cambia Health Solutions – Cambia Health Solutions creates and invests in innovations designed to serve the changing needs of individuals and families, including a wide range of companies in its Direct Health Solutions Network. Within the portfolio of Direct Health Solutions companies, Wildflower uses AWS to deliver its pregnancy app to more than 50,000 women, and HealthSparq delivers its healthcare price transparency app to more than 70 health plans covering 70 million members.
- Europol – Using AWS, Europol deployed its anti-ransomware website in three days, supported 2.6 million visitors on the first day, and has supported 12 million visitors since the website’s launch. Europol, the European Union’s law-enforcement agency, assists European Union member states in their fight against international crime and terrorism.
- Goodwill Industries – Goodwill Industries of Southern New Jersey and Philadelphia reduces downtime at its stores, schools, and offices, backs up servers hourly, and can restore servers within moments of failure using AWS. The non-profit organization, based in Maple Shade, New Jersey, helps put people to work so they can realize their economic potential.
- Infor – Infor saves 75 percent on monthly database backup costs, completes application backups 30 percent faster, and keeps pace with global business growth by going all in on AWS. The organization provides enterprise resource planning and other software solutions to a range of enterprises worldwide. Infor runs more than 30 customer-facing applications on the AWS Cloud.
- Olympusat – Olympusat uses AWS to support its microservices architecture, saving $25,000 a month by eliminating the use of similar, more expensive services. Olympusat is a large independent media company specializing in Spanish-language movie, music, and entertainment television channels.
- PayFort – PayFort delivers trusted, highly secure services to its customers by using AWS’s Payment Card Industry Data Security Standard (PCI DSS) compliance credentials. The startup provides its payment solutions to organizations across the Middle East, giving them an easy way to conduct online transactions.
New SlideShare Presentations
- Cloud Storage State of the Union.
- AWS in Media: Cloud and Serverless Architectures.
- Application Migrations at Scale.
- Building Your Cloud Strategy.
- Real Time Analytics on AWS: Optimized Architectures.
- Financial Services in the Cloud.
- AWS Security for Financial Services.
- AWS Services for Content Production.
- Sony MCS Cloud.
- Moving Your Media Supply Chain to the AWS Cloud.
- Insights from Amazon Studios.
- AWS Elemental Services for Video Processing and Delivery.
- DDoS Resiliency.
- Towards Full Stack Security.
- Get Started and Migrate Your Data to AWS.
Upcoming Events
- November 14 (Seoul, Korea) – AWSKRUG & JAWS Kobe IoT Online Seminar (Korean | Japanese).
- November 16 (Seoul, Korea) – AWSomeDay Online.
- November 14 (New Delhi, India) – AWS DevDay India.
- November 16 (Pune, India) – AWS DevDay India.
- November 17 (Bangalore, India) – AWS DevDay India.
- November 22 (Oslo, Norway) – AWS User Group Norway: Deployment pipelines in Spinnaker.
- November 23 (Cardiff, Wales) – Meetup #4 of the AWS South Wales User Group.
Help Wanted
Stay tuned for next week! In the meantime, follow me on Twitter and subscribe to the RSS feed.
— Jeff
EC2 Price Reduction (C4, M4, and T2 Instances)
I am happy to announce that an EC2 price reduction will go into effect on December 1, 2016, just in time to make your holiday season a little bit more cheerful! Our engineering investments, coupled with our scale and our time-tested ability to manage our capacity, allow us to identify and pass on the cost savings to you.
We are reducing the On-Demand, Reserved Instance (Standard and Convertible) and Dedicated Host prices for C4, M4, and T2 instances by up to 25%, depending on region and platform (Linux, RHEL, SUSE, Windows, and so forth). Price cuts apply across all AWS Regions. For example:
- C4 – Reductions of up to 5% in US East (Northern Virginia) and EU (Ireland) and 20% in Asia Pacific (Mumbai) and Asia Pacific (Singapore).
- M4 – Reductions of up to 10% in US East (Northern Virginia), EU (Ireland), and EU (Frankfurt) and 25% in Asia Pacific (Singapore).
- T2 – Reductions of up to 10% in US East (Northern Virginia) and 25% in Asia Pacific (Singapore).
As always, you do not need to take any action in order to benefit from the reduction in On-Demand prices. If you are using billing alerts or our newly revised budget feature, you may want to consider revising your thresholds downward as appropriate.
— Jeff
PS – By my count, this is our 53rd price reduction.
New – HIPAA Eligibility for AWS Snowball
Many of the tools and technologies now in use at your local doctor, dentist, hospital, or other healthcare provider generate massive amounts of sensitive digital data. Other prolific data generators include genomic sequencers and any number of activity and fitness trackers. We all want to benefit from the insights that can be produced by this “data tsunami,” but we also want to be confident that it will be stored in a protected fashion and processed in a responsible manner.
In the United States, protection of healthcare data is governed by HIPAA (the Health Insurance Portability and Accountability Act). Because many AWS customers would like to store and process sensitive healthcare data in the cloud, we have worked to make multiple AWS services HIPAA-eligible; this means that the services can be used to process Protected Health Information (PHI) and to build applications that are HIPAA-compliant (read HIPAA in the Cloud to learn more about what Cleveland Clinic, Orion Health, Eliza, Philips, and other AWS customers are doing).
Last year I introduced you to AWS Snowball. This is an AWS-owned storage appliance that you can use to move large amounts of data (generally 10 terabytes or more) to AWS on a one-time or recurring basis. You simply request a Snowball from the AWS Management Console, connect it to your network when it arrives, copy your data to it, and then send it back to us so that we can copy the data to the AWS storage service of your choice. Snowball encrypts your data using keys that you specify and control.
Today, I am happy to announce that we are adding Snowball to the list of HIPAA-eligible services, joining Amazon DynamoDB, Amazon Elastic Compute Cloud (EC2), Amazon Elastic Block Store (EBS), Elastic Load Balancing, Amazon EMR, Amazon Glacier, Amazon Relational Database Service (RDS) (MySQL and Oracle), Amazon Redshift, and Amazon Simple Storage Service (S3). This brings the total number of eligible services to 10 and represents our commitment to make the AWS Cloud a safe, secure, and reliable destination for PHI and many other types of sensitive data. If you already have a Business Associate Agreement (BAA) with AWS, you can begin using Snowball to transfer data into your HIPAA accounts immediately.
With Snowball now on the list of HIPAA-eligible services, AWS customers in the Healthcare and Life Sciences space can quickly move on-premises data to Snowball and then process it using any of the services that I just mentioned. For example, they can use the new HDFS Import feature to migrate an existing on-premises Hadoop cluster to the cloud and analyze it using a scalable EMR cluster. They can also move existing petabyte-scale data (medical images, patient records, and the like) to AWS and store it in S3 or Glacier, both already HIPAA-eligible. These services are proven, easy to use, and offer high data durability at low cost.
— Jeff
New – Burst Balance Metric for EC2’s General Purpose SSD (gp2) Volumes
Many AWS customers are getting great results with the General Purpose SSD (gp2) EBS volumes that we launched in mid-2014 (see New SSD-Backed Elastic Block Storage for more information). If you’re unsure of which volume type to use for your workload, gp2 volumes are the best default choice because they offer balanced price/performance for a wide variety of database, dev and test, and boot volume workloads. One of the more interesting aspects of this volume type is the burst feature.
We designed gp2's burst feature to suit the I/O patterns of real-world workloads we observed across our customer base. Our data scientists found that volume I/O is extremely bursty, spiking for short periods, with plenty of idle time between bursts. This unpredictable and bursty nature of traffic is why we designed the gp2 burst-bucket to allow even the smallest of volumes to burst up to 3,000 IOPS and to replenish their burst bucket during idle times or when performing low levels of I/O. The burst-bucket design allows us to provide consistent and predictable performance for all gp2 users. In practice, very few gp2 volumes ever completely deplete their burst-bucket, and now customers can track their usage patterns and adjust accordingly.
We’ve written extensively about performance optimization across different volume types and the differences between benchmarking and real-world workloads (see I/O Characteristics for more information). As I described in my original post, burst credits accumulate at a rate of 3 per configured GB per second, and each one pays for one read or one write. Each volume can accumulate up to 5.4 million credits, and they can be spent at up to 3,000 per second per volume. To get started, you simply create gp2 volumes of the desired size, launch your application, and your I/O to the volume will proceed as rapidly and efficiently as possible.
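To make those numbers concrete, here's a small worked example in Python using the figures above for a 100 GB volume (the same size I use below):

# Burst-bucket math for a 100 GB gp2 volume, using the figures above.
volume_size_gb = 100
baseline_iops = 3 * volume_size_gb          # credits accrue at 3 per GB per second -> 300 IOPS baseline
bucket_capacity = 5400000                   # maximum accumulated credits
burst_iops = 3000                           # maximum spend rate per volume

# How long can the volume sustain a full 3,000 IOPS burst from a full bucket?
# Credits drain at (burst - baseline) per second.
burst_seconds = bucket_capacity / (burst_iops - baseline_iops)
print(round(burst_seconds / 60, 1), 'minutes of full-speed burst')    # ~33.3 minutes

# How long does an idle volume take to refill an empty bucket?
refill_seconds = bucket_capacity / baseline_iops
print(round(refill_seconds / 3600, 1), 'hours to refill when idle')   # 5.0 hours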
New Metric
Effective today, we are making the Burst Balance metric available for each General Purpose (SSD) volume. You can observe this metric in the CloudWatch Console and you can set up an alarm that will be triggered if the balance becomes too low. The metric is expressed as a percentage; 100% means that the volume has accumulated the maximum number of credits.
I launched a c4.8xlarge instance and attached a 100 GB volume to it:
Then I created an alarm to let me know if the volume’s burst balance went below 40% (in a real-world scenario you might want to set this considerably lower, but I was impatient and it takes a fair amount of time to drain the balance):
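Here's a hedged boto3 sketch of creating a similar alarm programmatically (the volume ID and SNS topic ARN are placeholders):

import boto3

cloudwatch = boto3.client('cloudwatch')

cloudwatch.put_metric_alarm(
    AlarmName='gp2-burst-balance-low',
    Namespace='AWS/EBS',
    MetricName='BurstBalance',
    Dimensions=[{'Name': 'VolumeId', 'Value': 'vol-0123456789abcdef0'}],   # placeholder volume ID
    Statistic='Average',
    Period=300,
    EvaluationPeriods=1,
    Threshold=40.0,
    ComparisonOperator='LessThanThreshold',
    AlarmActions=['arn:aws:sns:us-east-1:123456789012:burst-balance-alerts']  # placeholder topic ARN
)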
I confirmed my SNS subscription, and then ran fio to generate a load:
$ sudo fio --filename=/dev/sdb --rw=randread --bs=16k --runtime=2400 --time_based=1 \
--iodepth=32 --ioengine=libaio --direct=1 --name=gp2-16kb-burst-bucket-test
Then I watched as the balance declined:
As expected, I received a notification email:
In a production scenario, I could choose to increase the size of the volume, fine-tune my application’s I/O behavior, or simply note for the record that I was making good use of the burst-bucket.
After the end of the test, I had lunch and watched the burst balance increase (I used the updated CloudWatch Console this time):
If you’re one of the few customers whose burst-bucket depletes more often than you’d like, you can either increase the size of your gp2 volume for more performance or transition to a Provisioned IOPS SSD (io1) volume, which delivers consistent, provisioned performance 99.9% of the time.
Available Now
This feature is available now and you can start using it today in all AWS Commercial Regions at no charge. The usual charges for CloudWatch Alarms will apply.
— Jeff
PS – If you are a developer, development manager, or a product manager and would like to build systems like this, check out the EBS Jobs page.
Genome Engineering Applications: Early Adopters of the Cloud
Our friends at the Commonwealth Scientific and Industrial Research Organization (CSIRO) in Australia sent along the guest post below to tell us about how AWS powers an important new genome editing technique.
— Jeff
Recent developments in molecular engineering technology now enable the accurate editing of genomes. The new technology, called CRISPR-Cas9, can be programmed to recognize and edit specific locations in the genome by pattern-matching unique sequences of DNA. While this is a powerful new tool for researchers, the ability to scan and identify targets across the entire genome has created unprecedented demand for large-scale computation. Earlier this year, the US National Institutes of Health (NIH) approved the use of these technologies for human health. This has the potential to revolutionize cancer treatments and also adds a new time-critical dimension to the compute requirements.
A New Approach to Cancer Treatments
Approximately two in five people will be diagnosed with cancer at some point during their lifetime, and while overall cancer survival has doubled, there are still cancer types with very low survival rates, for example just 1% for pancreatic cancer. This is mainly due to the difficulty of finding therapeutic interventions that kill cancer cells without harming the healthy tissue in the body.
The new NIH-approved trial will leverage breakthroughs in the genome editing technology CRISPR-Cas9 to develop a different treatment approach, in which the patient’s own immune system is boosted through specific modifications of the cells that natively fight cancer. This has the potential of being effective for a wide range of different tumors, with the current trial including patients with specific blood and solid cancers, as well as melanoma.
Cloud Services for Computationally Guided Genome Engineering
This new application in human health requires an increase in the robustness and efficiency of CRISPR-Cas9 design in order to meet the time constraints of clinical care. To address this issue, researchers in the eHealth program of the Commonwealth Scientific and Industrial Research Organization (CSIRO) in Australia developed GT-Scan2, a novel software tool built on AWS cloud services.
“Compared to other available methods, GT-Scan2 identifies genomic locations with higher sensitivity and specificity,” says Dr. Denis Bauer, who is leading the transformational bioinformatics team.
GT-Scan2 shows the identified CRISPR target sites at the genomic position and annotates them with high or low activity as well as their off-target potential.
GT-Scan2 improves the effectiveness of the system by finding sites that are unique in the genome. This avoids diluting the effect through “off-target” sites, which are other locations in the genome with high sequence similarity. It also optimizes robustness by finding sites that are easier to modify.
“While it was known that the three-dimensional genome organization plays a role in CRISPR binding, GT-Scan2 is the first tool to also leverage other components that are crucial for Cas9 activity,” says Dr. Laurence Wilson whose research focuses on computational genome engineering.
The off-target search, in particular, is a compute-intensive task traditionally reserved for researchers at large institutes with high-performance computing infrastructure, because every location in the 3-billion-letter genomic sequence needs to be investigated. GT-Scan2 democratizes the ability to find optimal sites by offering this complex computation as a cloud service using AWS Lambda functions.
Scaling Instantaneously for Personalized Treatments
GT-Scan2 leverages the instantaneous scalability that the event-driven AWS Lambda service offers. This is crucial for personalized treatment, as complexity of the targeted gene can vary dramatically.
“The off-target search, as well as the robustness analysis, can be subdivided into independent, modular tasks that can run in parallel,” says Aidan O’Brien, who designed and implemented the system within weeks of the service’s official Asia-Pacific launch at the AWS Summit 2016 in April this year, attesting to the intuitive nature of the service. A typical job takes less than a minute, and run times range from 1 second to 5 minutes. This fast fluctuation in load over minutes rather than hours ruled out an EC2-based solution, as new instances would come online too slowly to keep the runtime stable.
GT-Scan2 is served directly from S3, making it a static web app with no server-side processing. Dynamic content (such as job results and parameters) is retrieved from a DynamoDB database through API Gateway calls made by a JavaScript framework.
When a user submits a job, GT-Scan2 inserts the job parameters as an item into a DynamoDB table via an API call. This allows the solution to be freely scalable without creating a bottleneck. The database entry triggers the first Lambda function, which finds all putative CRISPR targets in the user-specified DNA sequence (fetched automatically upon user submission). Potential CRISPR target sites follow fixed rules and can easily be found with a regular expression; the search completes in seconds, and the candidate sites are inserted into a second DynamoDB table.
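This is not the GT-Scan2 code itself, just a minimal Python sketch of that first step under simplified assumptions (the table name, event fields, and the 20-base-plus-NGG-PAM pattern are illustrative):

import re
import boto3

# Illustrative table name; the real schema belongs to GT-Scan2.
targets_table = boto3.resource('dynamodb').Table('gtscan2-candidate-targets')

# A canonical Cas9 target: 20 bases followed by an NGG PAM (lookahead allows overlapping matches).
TARGET_PATTERN = re.compile(r'(?=([ACGT]{20}[ACGT]GG))')

def lambda_handler(event, context):
    # Simplified: assume the triggering record carries the job ID and the fetched sequence.
    job_id = event['job_id']
    sequence = event['sequence'].upper()

    candidates = 0
    for match in TARGET_PATTERN.finditer(sequence):
        candidates += 1
        targets_table.put_item(Item={
            'job_id': job_id,
            'position': match.start(),
            'site': match.group(1)
        })
    return {'job_id': job_id, 'candidates_found': candidates}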
Adapting to leverage the power of Lambda-based microservices
All potential targets need to be evaluated for their off-target risk using the efficient string-matching tool Bowtie. Though Bowtie only requires a reduced representation of the 3-billion-letter genomic sequence, the sizes of these index files exceed the storage limit of each Lambda instance. “GT-Scan2 divides the genome into smaller blocks to fit the Lambda specifications,” explains Adrian White (Research & Technical Computing, APAC), who supported the CSIRO team during development. For an average run, GT-Scan2 hence triggers 500-1,000 individual Lambda functions, which simultaneously update the scores for the different putative targets in DynamoDB. During this process, the frontend is polling this table via API Gateway and updating the webpage as results come in, eliminating the need for server-side compute.
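As a rough illustration of that fan-out (again, not the actual GT-Scan2 implementation; the function name and payload fields are assumptions), each genome block can be handed to its own asynchronous Lambda invocation:

import json
import boto3

lambda_client = boto3.client('lambda')

def fan_out_off_target_search(job_id, genome_blocks):
    # One asynchronous invocation per pre-built genome block; each worker scores
    # the candidate targets against its block and updates DynamoDB.
    for block_id in genome_blocks:
        lambda_client.invoke(
            FunctionName='gtscan2-off-target-worker',   # illustrative name
            InvocationType='Event',                     # fire-and-forget, so blocks run in parallel
            Payload=json.dumps({'job_id': job_id, 'block_id': block_id})
        )

# Example: 500 blocks -> 500 parallel Lambda invocations
# fan_out_off_target_search('job-123', ['block-%03d' % i for i in range(500)])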
“AWS Lambda has given us a great framework to develop a future-ready software package able to support medical genome engineering applications,” says Dr. Bauer. “We are specifically impressed with the ability to instantaneously scale at run time by spawning more Lambda functions to cope with the varying complexity of the different genes.” Other benefits Dr. Bauer cites include paying only for storage during periods of no use, jobs that do not compete with web server resources (the website is a static page whose dynamic content is updated through Angular 2 and API Gateway), and not having to maintain compute instances or apply OS security patches.
“One of the best things about Lambda is that users will be able to easily swap in different machine learning algorithms that are better suited for specific CRISPR applications,” says Dr. Wilson.
The GT-Scan2 Team, from left, Denis Bauer, Laurence Wilson, Aidan O’Brien
“The computational genome engineering community is one of the early adopters of our AWS Lambda technology,” explains Dr. Mia Champion (Technical Business Development Manager, Scientific Computing). “GT-Scan2’s use of API Gateway and DynamoDB is a very neat solution to ensure scalability, and their clever use of epigenomics really sets them apart from other recent applications using Lambda to perform CRISPR searches. I am looking forward to seeing GT-Scan2 adopted in medical applications.”
CloudWatch Update – Jump From Metrics to Associated Logs
A few years ago I showed you how to Store and Monitor OS & Application Log Files with Amazon CloudWatch. Many AWS customers now create filters for their logs, publish the results as CloudWatch metrics, and then raise alarms when something is amiss. For example, they watch their web server logs for 404 errors that denote bad inbound links and 503 errors that can indicate an overload condition.
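For example, here's a hedged boto3 sketch of turning 404 entries in an Apache-style access log into a CloudWatch metric (the log group name, filter pattern, and metric names are illustrative):

import boto3

logs = boto3.client('logs')

logs.put_metric_filter(
    logGroupName='/var/log/httpd/access_log',     # illustrative log group
    filterName='http-404-errors',
    filterPattern='[ip, identity, user, timestamp, request, status=404, size]',
    metricTransformations=[{
        'metricName': 'ERROR-404',
        'metricNamespace': 'WebServer',
        'metricValue': '1'                         # count one per matching log line
    }]
)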
While monitoring and alarming are both wonderful tools for summarizing vast amounts of log data, sometimes you need to head in the opposite direction. Starting from an overview, you need to be able to quickly locate the log file entries that were identified by the filters and caused the alarms to fire. If you, like many of our customers, are running hundreds or thousands of instances and monitoring multiple types of log files, this can be even more difficult than finding a needle in a haystack.
Today we are launching a new CloudWatch option that will (so to speak) reduce the size of the haystack and make it a lot easier for you to find the needle!
Let’s say that I am looking at this graph and want to understand the spike in the ERROR metric at around 17:00:
I narrow down the time range with a click and a drag:
Then I click on the logs icon (it, along with the other icons, appears only when the mouse is over the graph), and select the log file of interest (ERROR):
CloudWatch opens in a second tab, with a view that shows the desired log files, pre-filtered to the desired time span. I can then expand an entry to see what’s going on (these particular errors were manufactured for demo purposes; they are not very exciting or detailed):
This feature works great for situations where filters on log files are used to publish metrics. However, what if I am looking at some CloudWatch system metrics that are not associated with a particular log file? I can follow the steps above, but select View logs in this time range from the menu:
I can see all of my CloudWatch Log Groups, filtered for the time range in the graph:
At this point I can use my knowledge of my application’s architecture to guide my decision-making and to help me to choose a Log Group to investigate. Once again, the events in the log group will be filtered so that only those in the time frame of interest will be visible. If a chart contains metrics in the Lambda namespace, links to the log group will be displayed even if no metric filters are in effect.
This new feature is available now and you can start using it today!
— Jeff
AWS Week in Review – October 31, 2016
Over 25 internal and external contributors helped out with pull requests and fresh content this week! Thank you all for your help and your support.
New & Notable Open Source
- awsapigatewayupsert enables upserting and export of API Gateway instances from Swagger 2.0 JSON definitions, with support for managing Lambda function integration permissions.
- AWS IoT Embedded C ESP32 Port is a port of the AWS IoT embedded C SDK to the ESP32 platform.
- aws-lambda-go-net lets you run existing Go web applications on AWS Lambda without modification.
- pipeline-aws-plugin adds Jenkins Pipeline steps to interact with the AWS API.
- elastic-ci-stack-for-aws is a simple, flexible, auto-scaling cluster of build agents running in your own AWS VPC.
- docker-aws-tools is a Docker image with AWS CLI and common tools used to interact with AWS APIs.
- js-aws is a full JavaScriptification of AWS.
- aws-team-cost-reporter reports on AWS Team costs using data from CloudCheckr.
- lambda-cd is a proof of concept implementation of continuous delivery for Lambda functions.
- cloud-search-query is an ORM-like wrapper for building AWS CloudSearch structured queries.
New Customer Success Stories
- Apposphere – Using AWS and bitfusion.io from the AWS Marketplace, Apposphere can scale 50 to 60 percent month-over-month while keeping customer satisfaction high. Based in Austin, Texas, the Apposphere mobile app delivers real-time leads from social media channels.
- CADFEM – CADFEM uses AWS to make complex simulation software more accessible to smaller engineering firms, helping them compete with much larger ones. The firm specializes in simulation software and services for the engineering industry.
- Mambu – Using AWS, Mambu helped one of its customers launch the United Kingdom’s first cloud-based bank, and the company is now on track for tenfold growth, giving it a competitive edge in the fast-growing fintech sector. Mambu is an all-in-one SaaS banking platform for managing credit and deposit products quickly, simply, and affordably.
- Okta – Okta uses AWS to get new services into production in days instead of weeks. Okta creates products that use identity information to grant people access to applications on multiple devices at any time, while still enforcing strong security protections.
- PayPlug – PayPlug is a startup created in 2013 that developed an online payment solution. It differentiates itself by the simplicity of its services and its ease of integration on e-commerce websites.
- Rent-A-Center – Rent-A-Center is a leading renter of furniture, appliances, and electronics to customers in the United States, Canada, Puerto Rico, and Mexico. Rent-A-Center uses AWS to manage its new e-commerce website, scale to support a 1,000 percent spike in site traffic, and enable a DevOps approach.
- UK Ministry of Justice – By going all in on the AWS Cloud, the UK Ministry of Justice (MoJ) can use technology to enhance the effectiveness and fairness of the services it provides to British citizens. The MoJ is a ministerial department of the UK government. MoJ had its own on-premises data center, but lacked the ability to change and adapt rapidly to the needs of its citizens. As it created more digital services, MoJ turned to AWS to automate, consolidate, and deliver constituent services.
New SlideShare Presentations
- IoT End-to-End Security.
- Building a Data Lake on AWS.
- AWSome Day Bucharest:
- Towards Full Stack Security.
- Secure Content Delivery with AWS.
- Architecting for Resiliency.
- Getting Started with Amazon ElastiCache.
- DynamoDB Deep Dive.
New YouTube Videos
- How AWS and Statcast Keep David Ortiz Sharp in Retirement.
- MLB Statcast – Year in Review.
- re:Invent:
- Building a Data Lake on AWS.
- AWS Knowledge Center:
- I accidentally disabled network connectivity to my EC2 instance running Windows. How do I fix it?
- Amazon RDS Multi-AZ: Converting an Amazon RDS database from Single-AZ to Multi-AZ.
- How do I enable the slow query log in an RDS MySQL DB instance to view or download them?
- How do I enable the general log in an RDS MySQL DB instance to view and download them?
- AWS Knowledge Center Videos: How do I manage logs in Amazon RDS MySQL?
- AWS Knowledge Center Videos: How do I create a snapshot for an EBS volume?
Upcoming Events
- November 3 (Doral, FL) – AWS User Group – Doral Kickoff Meetup.
- November 4 (Toronto, Canada) – Canadian Executive Cloud & DevOps Summit.
- November 8 (London, England) – AWS Enterprise Summit London.
- November 9 (New York, New York) – Security | AWS Loft Architecture Week | NY AWS Pop-up Loft.
- November 10 (Webinar) – Introduction to Three AWS Security Services.
- November 10 (Webinar) – Tune Your Cloud: Optimizing AWS Reserved Instances (Cloud Cruiser).
- November 10 (Stuttgart, Germany) – Meetup of the AWS User Group in Stuttgart.
- November 10 (Webinar) – Process & Deliver Video Content at Scale on AWS Using Amazon Elastic Transcoder.
- November 11 (Webinar) – Building Server-less Applications Using API Gateway and Lambda.
- November 14 (Seoul, Korea) – AWSKRUG & JAWS Kobe IoT Online Seminar (Korean | Japanese).
- November 16 (Seoul, Korea) – AWSomeDay Online.
- November 14 (New Delhi, India) – AWS DevDay India.
- November 16 (Pune, India) – AWS DevDay India.
- November 17 (Bangalore, India) – AWS DevDay India.
- November 23 (Cardiff, Wales) – Meetup #4 of the AWS South Wales User Group.
Help Wanted
Stay tuned for next week! In the meantime, follow me on Twitter and subscribe to the RSS feed.
MLB Statcast – Featuring David Ortiz, Joe Torre, and Dave O’Brien
One of my colleagues has been rooting for the Red Sox since she was in her teens. She recently had the opportunity to work with Red Sox legend and Hank Aaron award winner David Ortiz (aka “Big Papi”), Hall of Famer Joe Torre, and announcer Dave O’Brien (ESPN and NESN) to produce a fun video to commemorate Big Papi’s retirement from Major League Baseball. As you can see, he takes the idea of the quantified self very seriously, and is now measuring just about every aspect of his post-baseball life using AWS-powered Statcast in order to keep his competitive edge:
MLB Statcast uses high-resolution cameras and radar equipment to track the position of the ball (20,000 metrics per second) and the players (30 metrics per second) on the field, generating 7 terabytes of data per game, all stored in and processed by the AWS Cloud. The application uses multiple AWS services including Amazon CloudFront, Amazon DynamoDB, Amazon Elastic Compute Cloud (EC2), Amazon ElastiCache, Amazon Simple Storage Service (S3), AWS Direct Connect, and AWS Lambda; here’s a big-picture look at the architecture:
To learn more, read our case study: Major League Baseball Fields Big Data, and Excitement, with AWS.
If you are coming to AWS re:Invent, get your shoulders into shape and come visit our Statcast-equipped batting cage! You will be able to measure your metrics and see how you’d do in the big leagues.
— Jeff
CodePipeline Update – Build Continuous Delivery Workflows for CloudFormation Stacks
When I begin to write about a new way for you to become more productive by using two AWS services together, I think about a 1980s TV commercial for Reese’s Peanut Butter Cups! The intersection of two useful services or two delicious flavors creates a new one that is even better.
Today’s chocolate / peanut butter intersection takes place where AWS CodePipeline meets AWS CloudFormation. You can now use CodePipeline to build a continuous delivery pipeline for CloudFormation stacks. When you practice continuous delivery, each code change is automatically built, tested, and prepared for release to production. In most cases, the continuous delivery release process includes a combination of manual and automatic approval steps. For example, code that successfully passes through a series of automated tests can be routed to a development or product manager for final review and approval before it is pushed to production.
This important combination of features allows you to use the infrastructure as code model while gaining all of the advantages of continuous delivery. Each time you change a CloudFormation template, CodePipeline can initiate a workflow that will build a test stack, test it, await manual approval, and then push the changes to production. The workflow can create and manipulate stacks in many different ways:
As you will soon see, the workflow can take advantage of advanced CloudFormation features such as the ability to generate and then apply change sets (read New – Change Sets for AWS CloudFormation to learn more) to an operational stack.
The Setup
In order to learn more about this feature, I used a CloudFormation template to set up my continuous delivery pipeline (this is yet another example of infrastructure as code). This template (available here and described in detail here) sets up a full-featured pipeline. When I use the template to create my pipeline, I specify the name of an S3 bucket and the name of a source file:
The SourceS3Key points to a ZIP file stored in a versioning-enabled S3 bucket. This file contains the CloudFormation template (I am using the WordPress Single Instance example) that will be deployed via the pipeline that I am about to create. It can also contain other deployment artifacts such as configuration or parameter files; here’s what mine looks like:
The entire continuous delivery pipeline is ready just a few seconds after I click on Create Stack. Here’s what it looks like:
The Action
At this point I have used CloudFormation to set up my pipeline. With the stage set (so to speak), now I can show you how this pipeline makes use of the new CloudFormation actions.
Let’s focus on the second stage, TestStage. Triggered by the first stage, this stage uses CloudFormation to create a test stack:
The stack is created using parameter values from the test-stack-configuration.json file in my ZIP. Since you can use different configuration files for each CloudFormation action, you can use the same template for testing and production.
After the stack is up and running, the ApproveTestStack step is used to await manual approval (it says “Waiting for approval above.”). Playing the role of the dev manager, I verify that the test stack behaves and performs as expected, and then approve it:
After approval, the DeleteTestStack step deletes the test stack.
Now we are just about ready to deploy to production. ProdStage creates a CloudFormation change set and then submits it for manual approval. This stage uses the parameter values from the prod-stack-configuration.json file in my ZIP. I can use the parameters to launch a modestly sized test environment on a small EC2 instance and a large production environment from the same template.
Now I’m playing the role of the big boss, responsible for keeping the production site up and running. I review the change set in order to make sure that I understand what will happen when I deploy to production. This is the first time that I am running the pipeline, so the change set indicates that an EC2 instance and a security group will be created:
And then I approve it:
With the change set approved, it is applied to the existing production stack in the ExecuteChangeSet step. Applying the change to an existing stack keeps existing resources in play where possible and avoids a wholesale restart of the application. This is generally more efficient and less disruptive than replacing the entire stack. It keeps in-memory caches warmed up and avoids possible bursts of cold-start activity.
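CodePipeline drives all of this for you; for context, here is a hedged boto3 sketch of the equivalent change set calls (the stack name, change set name, template URL, and parameter are placeholders):

import boto3

cfn = boto3.client('cloudformation')

# Create a change set against the running production stack.
cfn.create_change_set(
    StackName='prod-wordpress-stack',                  # placeholder stack name
    ChangeSetName='add-https-ingress',
    TemplateURL='https://s3.amazonaws.com/my-bucket/wordpress-template.yaml',   # placeholder
    Parameters=[{'ParameterKey': 'InstanceType', 'ParameterValue': 'm4.large'}],
    Capabilities=['CAPABILITY_IAM']
)

# Review what will change (this is the step a human approves in the pipeline).
cfn.get_waiter('change_set_create_complete').wait(
    StackName='prod-wordpress-stack', ChangeSetName='add-https-ingress')
changes = cfn.describe_change_set(
    StackName='prod-wordpress-stack', ChangeSetName='add-https-ingress')
for change in changes['Changes']:
    print(change['ResourceChange']['Action'], change['ResourceChange']['LogicalResourceId'])

# Apply the approved change set to the existing stack.
cfn.execute_change_set(
    StackName='prod-wordpress-stack', ChangeSetName='add-https-ingress')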
Implementing a Change
Let’s say that I decide to support HTTPS. In order to do this, I need to add port 443 to my application’s security group. I simply edit the CloudFormation template, put it into a fresh ZIP, and upload it to S3. Here’s what I added to my template:
- CidrIp: 0.0.0.0/0
  FromPort: '443'
  IpProtocol: tcp
  ToPort: '443'
Then I return to the Console and see that CodePipeline has already detected the change and set the pipeline into motion:
The pipeline runs again, I approve the test stack, and then inspect the change set, confirming that it will simply modify an existing security group:
One quick note before I go. The CloudFormation template for the pipeline creates an IAM role and uses it to create the test and deployment stacks (this is a new feature; read about the AWS CloudFormation Service Role to learn more). For best results, you should delete the stacks before you delete the pipeline. Otherwise, you’ll need to re-create the role in order to delete the stacks.
There’s More
I’m just about out of space and time, but I’ll briefly summarize a couple of other aspects of this new capability.
Parameter Overrides – When you define a CloudFormation action, you may need to exercise additional control over the parameter values that are defined for the template. You can do this by opening up the Advanced pane and entering any desired parameter overrides:
Artifact References – In some situations you may find that you need to reference an attribute of an artifact that was produced by an earlier stage of the pipeline. For example, suppose that an early stage of your pipeline copies a Lambda function to an S3 bucket and calls the resulting artifact LambdaFunctionSource. Here’s how you would retrieve the bucket name and the object key from the attribute using a parameter override:
{
"BucketName" : { "Fn::GetArtifactAtt" : ["LambdaFunctionSource", "BucketName"]},
"ObjectKey" : { "Fn::GetArtifactAtt" : ["LambdaFunctionSource", "ObjectKey"]}
}
Access to JSON Parameter – You can use the new Fn::GetParam function to retrieve a value from a JSON-formatted file that is included in an artifact.
Note that Fn::GetArtifactAtt and Fn::GetParam are designed to be used within the parameter overrides.
S3 Bucket Versioning – The first step of my pipeline (the Source action) refers to an object in an S3 bucket. By enabling S3 versioning on the bucket, I can simply upload a new version of my template after each change:
If I am using S3 as my source, I must use versioning (uploading a new object over the existing one is not supported). I can also use AWS CodeCommit or a GitHub repo as my source.
Create Pipeline Wizard
I started out this blog post by using a CloudFormation template to create my pipeline. I can also click on Create pipeline in the console and build my initial pipeline (with source, build, and beta deployment stages) using a wizard. The wizard now allows me to select CloudFormation as my deployment provider. I can create or update a stack or a change set in the beta deployment stage:
Available Now
This new feature is available now and you can start using it today. To learn more, check out Continuous Delivery with AWS CodePipeline in the CodePipeline documentation.
— Jeff