Amazon Web Services provides a broad range of services to help you build and deploy big data analytics applications quickly and easily. AWS gives you fast access to flexible and low-cost IT resources, so you can rapidly scale virtually any big data application, including data warehousing, clickstream analytics, fraud detection, recommendation engines, event-driven ETL, serverless computing, and internet-of-things processing. With AWS, you don’t need to make large upfront investments in time and money to build and maintain infrastructure. Instead, you can provision exactly the right type and size of resources you need to power big data analytics applications. You can access as many resources as you need, almost instantly, and only pay for what you use.
Build virtually any big data analytics application; support any workload regardless of volume, velocity, and variety of data. With 50+ services and hundreds of features added every year, AWS provides everything you need to collect, store, process, analyze, and visualize big data on the cloud.
Managed, distributed computing for big data
Amazon EMR
Easily provision a fully managed Hadoop framework in minutes. Scale your Hadoop cluster dynamically and pay only for what you use. Run popular frameworks such as Apache Spark, Apache Tez, and Presto.
Amazon Elasticsearch Service
Set up and deploy an Elasticsearch cluster in minutes, using a web-based console. Seamlessly run your existing Elasticsearch applications using the Elasticsearch open-source API.
Amazon Athena
Easily analyze petabytes of data in Amazon S3 using ANSI SQL. With Amazon Athena, there are no clusters or data warehouses to manage, so you can start analyzing data immediately. You don’t even need to load your data into Athena; it works directly with data stored in S3.
Yelp runs hundreds of Amazon EMR jobs to process over 30 terabytes of data every day. Using Amazon EMR, Yelp was able to save $55,000 in upfront hardware costs and get up and running in a matter of days, not months.
Powerful services to load and analyze streaming data
Easily load massive volumes of streaming data into AWS. Enable near real-time big data analytics with existing BI tools and dashboards you’re using today.
Build your own custom applications that process or analyze streaming data. Continuously capture and store terabytes of data per hour.
Easily analyze streaming data with standard SQL. Kinesis Analytics takes care of everything required to run your queries continuously and scales automatically to match your requirements.
With over 250 digital properties worldwide, including television stations and popular publications such as Cosmopolitan and Car & Driver, Hearst Corporation uses Amazon Kinesis to deliver real-time insights to data scientists and business stakeholders.
Secure, durable, highly scalable storage with a broad set of engines
Amazon S3
Amazon S3 provides developers and IT teams with highly reliable, secure, and scalable object storage for all their data, big or small.
Amazon DynamoDB
A fully managed, fast, and flexible NoSQL database service for all applications – mobile, web, gaming, ad tech, IoT, and more – that need consistent, single-digit millisecond latency at any scale.
Amazon DynamoDB for Titan
Easily manipulate graphs at massive scale in AWS. Build your graph database using Titan and let DynamoDB handle the performance, scalability, and operational management of storing big data.
Apache HBase is a petabyte-scale, strictly consistent, open-source NoSQL database. Tight integration with the Apache Hadoop ecosystem allows you to combine big data analytics with fast data access. Easily create managed HBase clusters with Amazon EMR.
Amazon Aurora is a relational database engine that combines the speed and reliability of high-end commercial databases with the simplicity and cost-effectiveness of open source databases. Deliver up to 5x the throughput of standard MySQL running on the same hardware.
Amazon RDS
Easily set up, operate, and scale a relational database in the cloud. Choose from six familiar database engines, including Oracle, Microsoft SQL Server, PostgreSQL, MySQL, and MariaDB.
Airbnb is a community marketplace that allows property owners and travelers to connect with each other for the purpose of renting unique vacation spaces. Airbnb uses Amazon S3 to house backups and static files, including 10TB of pictures. Airbnb also moved its main MySQL database to Amazon RDS, minimizing the time spent on database administrative tasks.
Fully managed, petabyte-scale data warehouse
Easily provision, configure, and deploy a data warehouse within minutes. Amazon Redshift handles all the work needed to manage, monitor, and scale it. Query and analyze big data for less than $1,000 per TB per year.
Nasdaq achieved faster, richer analytics and data warehousing capabilities while reducing costs by 57% by shifting to Amazon Redshift.
Fast, cloud-powered BI for 1/10th the cost of traditional solutions
Deliver rich BI functionality to everyone in your organization. Enable employees to easily build visualizations, perform ad-hoc analysis, and quickly get business insights from big data. Perform advanced calculations and render visualizations rapidly.
Get suggestions for the best possible visualizations, optimized for your data, to help you get quick, actionable business insights.
Cloud-native machine learning and deep learning technologies to address a broad set of use cases and needs
Amazon Lex provides the advanced deep learning functionalities of automatic speech recognition (ASR) and natural language understanding (NLU) to enable you to build applications with lifelike conversational interactions.
Amazon Polly is a service that turns text into lifelike speech. Polly lets you create applications that talk, enabling you to build entirely new categories of speech-enabled products.
Amazon Rekognition is a service that makes it easy to add image analysis to your applications. With Rekognition, you can detect objects, scenes, and faces in images. You can also search and compare faces.
Amazon Machine Learning is a managed service that provides visualization tools and wizards that guide you through the process of creating machine learning (ML) models without having to learn complex ML algorithms and technology.
“We are excited about utilizing evolving speech recognition and natural language processing technology to enhance the lives of our customers.”
– Michael Krouse, Senior VP Operational Support and CIO, OhioHealth
Easily and securely connect devices to the cloud. Scale to billions of devices and trillions of messages.
Easily and securely connect devices to the cloud. Enable applications to interact with devices even when they are offline. Use AWS services to gather, process, and act on data, without having to manage any infrastructure.
Run local compute, messaging, and data caching for connected devices in a secure way. With AWS Greengrass, connected devices can run AWS Lambda functions, keep device data in sync, and communicate with other devices securely, even when not connected to the Internet.
How AWS IoT Works
Run code without thinking about servers. Pay for only the compute time you consume.
Run code without provisioning or managing servers. Pay only for the compute time you consume. Just upload your code and Lambda takes care of everything required to run and scale your code with high availability.
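As a rough illustration of what "just upload your code" can look like, the sketch below shows a minimal Python handler that counts metrics arriving in a batch of Kinesis records. It assumes the function is subscribed to a Kinesis stream, so each record carries a base64-encoded payload under `record["kinesis"]["data"]`. The JSON payload layout, the `metric` field name, and the `make_record` helper are hypothetical and exist only for this example.

```python
import base64
import json


def handler(event, context):
    """Count records per metric name in one batch of Kinesis records."""
    counts = {}
    for record in event.get("Records", []):
        # Kinesis payloads arrive base64-encoded; this sketch assumes JSON inside.
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
        # "metric" is a hypothetical field name chosen for illustration.
        name = payload.get("metric", "unknown")
        counts[name] = counts.get(name, 0) + 1
    return counts


def make_record(payload):
    """Hypothetical helper: wrap a dict the way a Kinesis record would, for local runs."""
    data = base64.b64encode(json.dumps(payload).encode("utf-8")).decode("ascii")
    return {"kinesis": {"data": data}}


if __name__ == "__main__":
    sample_event = {
        "Records": [
            make_record({"metric": "page_view"}),
            make_record({"metric": "page_view"}),
            make_record({"metric": "search"}),
        ]
    }
    print(handler(sample_event, None))
```

Because the handler is a plain function, it can be exercised locally with a hand-built event before it is deployed; a real function would typically write its results to another service rather than return them.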
Zillow uses AWS Lambda and Amazon Kinesis to track a subset of mobile metrics in real time. With Kinesis and Lambda, Zillow was able to develop and deploy a cost-effective solution in two weeks.
Powerful Compute Instances for Big Data Analytics
Compute-optimized instances, such as C4 instances, feature the highest performing processors and the lowest price/compute performance in EC2. With support for clustering, C4 instances are ideal for batch processing, distributed analytics, high-performance science and engineering applications, ad serving, MMO gaming, and video encoding.
Featuring up to 48 TB of HDD-based local storage, dense storage instances deliver high throughput and offer the lowest price per disk throughput performance on EC2. They are ideal for Massively Parallel Processing (MPP), Hadoop, distributed computing, distributed file systems, network file systems, and big data processing applications.
Memory-optimized instances have the lowest cost per GiB of RAM among Amazon EC2 instance types. These instances are ideal for high-performance databases, distributed memory caches, in-memory analytics, genome assembly and analysis, and other large enterprise applications.
GPU instances are ideal to power graphics-intensive applications such as 3D streaming, machine learning, and video encoding. Each instance features high-performance NVIDIA GPUs with an on-board hardware video encoder designed to support up to eight real-time HD video streams (720p@30fps) or up to four real-time full HD video streams (1080p@30fps).
Simple, fast, secure data migration services to and from AWS
AWS Direct Connect
Reduce your bandwidth costs by transferring data to and from AWS directly. Establish private connectivity between AWS and your datacenter, office, or colocation environment.
AWS Snowball
Avoid high network costs, long transfer times, and security concerns with Snowball, a petabyte-scale data transport appliance that securely transfers large amounts of data at 1/5 the cost of high-speed internet.
AWS Database Migration Service
Migrate databases to AWS easily and securely. Start with just a few clicks in the AWS Management Console, then let AWS manage all the complexities of the migration process.
More Resources: Big Data Blog | Self-paced Labs | AWS Public Datasets | AWS Marketplace