Apache Beam
Apache Beam provides an advanced unified programming model, allowing you to implement batch and streaming data processing jobs that can run on any supported execution engine.
Apache Beam is:
- UNIFIED - Use a single programming model for both batch and streaming use cases.
- PORTABLE - Execute pipelines on multiple execution environments, including Apache Apex, Apache Flink, Apache Spark, and Google Cloud Dataflow.
- EXTENSIBLE - Write and share new SDKs, IO connectors, and transformation libraries.
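As a rough illustration of what "unified" means in the list above, the sketch below applies one transformation, written once, to both a bounded input (a list) and an unbounded-style input (a generator). It is plain Python and does not use the Beam SDK; the names `word_lengths` and `endless_words` are hypothetical and chosen only for this example.

```python
from itertools import islice
from typing import Iterable, Iterator, Tuple


def word_lengths(words: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """One transformation, written once, for any iterable source."""
    for word in words:
        yield word, len(word)


def endless_words() -> Iterator[str]:
    """Hypothetical unbounded source: cycles through a few words forever."""
    while True:
        yield from ("beam", "flink", "spark")


# Batch-style: the whole bounded input is processed in one pass.
batch_result = list(word_lengths(["beam", "flink", "spark"]))

# Streaming-style: the same function consumes an unbounded source lazily;
# islice takes only the first four results so the example terminates.
stream_result = list(islice(word_lengths(endless_words()), 4))

print(batch_result)
print(stream_result)
```

The point of the sketch is only that the processing logic does not change between the two cases. A real Beam pipeline expresses the same idea through the SDK's own abstractions, which are described in the Documentation section.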
Get Started
To use Beam for your data processing tasks, start by reading the Beam Overview and performing the steps in the Quickstart. Then dive into the Documentation section for in-depth concepts and reference materials for the Beam model, SDKs, and runners.
Contribute
Beam is an Apache Software Foundation project, available under the Apache License, Version 2.0. Beam is developed by an open source community, and contributions are greatly appreciated! If you’d like to contribute, please see the Contribute section.
Blog
Feb 13, 2017 - Stateful processing with Apache Beam
Feb 1, 2017 - Media recap of the Apache Beam graduation
Jan 10, 2017 - Apache Beam established as a new top-level project
Jan 9, 2017 - Release 0.4.0 adds a runner for Apache Apex
Oct 20, 2016 - Testing Unbounded Pipelines in Apache Beam
Oct 11, 2016 - Strata+Hadoop World and Beam
Aug 3, 2016 - Apache Beam: Six Months in Incubation