Articles & Tutorials

The Articles and Tutorials section features in-depth documents designed to give practical help to developers working with AWS. They have been created by members of the AWS developer community or the Amazon Team and give structured examples, analysis, tips, tricks and guidelines based on real usage of AWS services.

This article describes some helpful tips and tricks for developing applications on the AWS SDK for Java.
Last Modified: Aug 31, 2015 20:10 PM GMT

This tutorial is now deprecated. To learn more about Spark on Amazon EMR, see the Amazon EMR documentation.

This tutorial walks you through installing and operating Spark, a fast and general engine for large-scale data processing, on an Amazon EMR cluster. You will also create and query a dataset in Amazon S3 using Spark SQL, and learn how to monitor Spark on an Amazon EMR cluster with Amazon CloudWatch.
Last Modified: Jun 16, 2015 23:25 PM GMT
This guide introduces the AWS Elastic Beanstalk deployment feature of the AWS Toolkit for Eclipse and provides a walkthrough for getting started with AWS Elastic Beanstalk deployment for developers using the Eclipse IDE.
Last Modified: Jun 1, 2015 22:00 PM GMT
This article explains how to optimize performance for an Amazon Redshift data warehouse that uses a star schema design.
Last Modified: May 27, 2015 19:01 PM GMT
This article provides details on how to establish multicast traffic within Amazon Elastic Compute Cloud (Amazon EC2) instances. The article describes how to install and configure the software and the tools needed for this solution. The solution described in this article is just one of many ways to implement multicast on EC2.
Last Modified: Apr 29, 2015 21:16 PM GMT

An internet advertising company operates a data warehouse using Hive and Amazon Elastic MapReduce. This company runs machines in Amazon EC2 that serve advertising impressions and redirect clicks to the advertised sites. The machines running in Amazon EC2 store each impression and click in log files pushed to Amazon S3.

Last Modified: Apr 29, 2015 17:47 PM GMT
Optional Windows Server operating system components are typically added or configured using installation media. This tutorial describes how to add or configure optional Windows components within the Amazon EC2 environment.
Last Modified: Apr 27, 2015 19:59 PM GMT
Any application making requests directly to AWS APIs should take care in managing the AWS security credentials used to sign requests to AWS, including mobile applications. This article walks through current solutions for credential management in applications based on the mobile AWS SDKs.
Last Modified: Mar 21, 2015 0:14 AM GMT
Learn how to build and deploy a Node.js application on Amazon EMR.
Last Modified: Feb 24, 2015 22:30 PM GMT
This article will show you how to install and use Apache Accumulo with Amazon Elastic MapReduce.
Last Modified: Feb 24, 2015 22:27 PM GMT
ItemSimilarity is a simple Hadoop streaming Python application that attempts to find similar items for each item in the input dataset. This example application finds similar artists using the Audioscrobbler user playlist dataset and Amazon Elastic MapReduce.
Last Modified: Feb 24, 2015 21:38 PM GMT
This tutorial shows you how to develop a simple log-parsing application using Pig and Amazon Elastic MapReduce. The tutorial walks you through using Pig interactively (via SSH) on a subset of your data, which enables you to prototype your script quickly. The tutorial then takes you through uploading the script to Amazon S3 and running it on a larger set of input data.
Last Modified: Feb 24, 2015 21:37 PM GMT
Analyze your Apache logs using Pig and Amazon Elastic MapReduce.
Last Modified: Feb 24, 2015 21:37 PM GMT
This page lists documentation resources specific to using Hive on Amazon Elastic MapReduce.
Last Modified: Feb 24, 2015 21:36 PM GMT
This tutorial shows you how to use Karmasphere Studio to develop, debug, and deploy Hadoop jobs for Amazon Elastic MapReduce.
Last Modified: Feb 24, 2015 21:34 PM GMT
This document provides a quick guide on how to use Elastic MapReduce to develop, debug, and run job flows that have multiple steps.
Last Modified: Feb 24, 2015 21:32 PM GMT
This video provides an introduction to the use of Apache Hive to operate a data warehouse with Amazon Elastic MapReduce. It takes you through the development of a Hive script using an interactive job flow and shows you how to deploy this script in Amazon S3 and how to run job flows to execute the script in batch mode.
Last Modified: Feb 24, 2015 21:15 PM GMT
This video walks you through using the AWS Console to start an interactive job flow for developing a simple log parsing application using Apache Pig, then uploading the finished application to S3 ready to be run through the Console on a regular basis.
Last Modified: Feb 24, 2015 21:09 PM GMT
This article describes the Hive extensions that make Hive work more easily with Amazon Elastic MapReduce.
Last Modified: Feb 24, 2015 20:09 PM GMT
This article describes the differences between the AWS SDK for Java and previous Java libraries from Amazon Web Services.
Last Modified: Feb 24, 2015 20:06 PM GMT
This article describes the Hive extensions that make Hive work more easily with Amazon Elastic MapReduce.
Last Modified: Feb 24, 2015 20:05 PM GMT
AWS infrastructure services are hosted in a number of regions, including locations in the US, Europe, and Asia Pacific. This article lists the web service API endpoints needed to make API requests and manage infrastructure in each region.
Last Modified: Feb 24, 2015 20:03 PM GMT
This article sets up an example scenario showing how to use the open source Squid proxy to control access to Amazon Simple Storage Service (S3) from within an Amazon Virtual Private Cloud (VPC). First, you will configure Squid to allow access to Linux Yum repositories. Next, you will configure Squid to restrict access to a list of approved Amazon S3 buckets. Then, you will configure Squid to direct traffic based on the URL, sending some requests to an Internet gateway (IGW) and other traffic to a virtual private gateway (VGW). Finally, you will explore options for making Squid highly available.
Last Modified: Feb 19, 2015 22:16 PM GMT
Communication on the Internet is susceptible to eavesdropping and malicious tampering. Amazon Web Services recommends you take action to protect the API requests you send.
Last Modified: Jan 29, 2015 22:40 PM GMT
Analyze your Amazon CloudFront Logs using Amazon Elastic MapReduce.
Last Modified: Dec 8, 2014 21:07 PM GMT
©2015, Amazon Web Services, Inc. or its affiliates. All rights reserved.