Apache Kylin™ Overview
Apache Kylin™ is an open source distributed analytics engine designed to provide a SQL interface and multi-dimensional analysis (OLAP) on Hadoop for extremely large datasets. It was originally contributed by eBay Inc.
Apache Kylin™ lets you query massive data sets with sub-second latency in three steps:
- Identify a Star Schema on Hadoop.
- Build a Cube from the identified tables.
- Query with ANSI SQL and get results with sub-second latency via ODBC, JDBC, or the RESTful API.
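The third step can be sketched in Python against the RESTful API. This is a minimal sketch rather than a definitive client: it assumes a Kylin release that exposes the `POST /kylin/api/query` endpoint with HTTP Basic authentication, and the helper names (`build_kylin_query_request`, `run_query`) are hypothetical. Check the endpoint path and payload fields against the REST API documentation for your Kylin version.

```python
import base64
import json
import urllib.request


def build_kylin_query_request(base_url, project, sql, user, password, limit=100):
    """Build (but do not send) an HTTP request for Kylin's REST query endpoint.

    Assumption: the server exposes POST /kylin/api/query and accepts
    HTTP Basic authentication.
    """
    payload = {
        "sql": sql,              # ANSI SQL text to run against the cube
        "project": project,      # Kylin project that owns the cube
        "offset": 0,
        "limit": limit,
        "acceptPartial": False,
    }
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/kylin/api/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_query(request, timeout=30):
    """Send the prepared request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

To use it, build a request with your own server address, project name, and credentials, then pass it to `run_query`. Keeping request construction separate from sending makes the payload easy to inspect before anything touches the network.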
What is Kylin?
- Extremely Fast OLAP Engine at Scale
- ANSI SQL Interface on Hadoop
- Interactive Query Capability
- MOLAP Cube
- Seamless Integration with BI Tools
- Other Highlights:
  - Compression and Encoding Support
  - Incremental Refresh of Cubes
  - Leverages HBase Coprocessors to reduce query latency
  - Both approximate and precise Query Capabilities for Distinct Count
  - Approximate Top-N Query Capability
  - Easy Web interface to manage, build, monitor and query cubes
  - Security capability to set ACLs at the Cube/Project Level
  - Support for LDAP and SAML Integration
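The MOLAP cube idea in the list above can be illustrated with a toy Python sketch. It is not Kylin's implementation. It only shows the general principle of pre-aggregating a measure for every combination of dimensions (each combination is a "cuboid"), so that a later GROUP BY query becomes a lookup instead of a scan. The function names and sample rows are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows, as they might look after joining a star schema.
SALES = [
    {"region": "US", "year": 2014, "price": 10},
    {"region": "US", "year": 2015, "price": 20},
    {"region": "EU", "year": 2015, "price": 5},
]


def build_cube(rows, dimensions, measure):
    """Pre-aggregate SUM(measure) for every subset of dimensions.

    Returns a dict mapping frozenset(dims) to {key_tuple: total}, where
    key_tuple holds the dimension values in the order given by `dimensions`.
    """
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            cuboid = defaultdict(int)
            for row in rows:
                key = tuple(row[d] for d in dims)
                cuboid[key] += row[measure]
            cube[frozenset(dims)] = dict(cuboid)
    return cube


def query(cube, group_by):
    """Answer SUM(measure) GROUP BY group_by from the pre-built cuboid."""
    return cube[frozenset(group_by)]
```

For example, `query(build_cube(SALES, ["region", "year"], "price"), ["region"])` reads the totals per region directly from the pre-built cuboid. The number of cuboids grows as 2^n with the number of dimensions, which is why the build step is done ahead of time rather than at query time.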
Kylin Ecosystem
Kylin Core: The fundamental framework of the Kylin OLAP engine, comprising the Metadata Engine, Query Engine, Job Engine and Storage Engine that run the entire stack. It also includes a REST Server to service client requests
Extensions: Plugins to support additional functions and features
Integration: Lifecycle management support for integrating with job schedulers, ETL, and monitoring and alerting systems
User Interface: Allows third-party users to build customized user interfaces atop the Kylin core
Drivers: ODBC and JDBC drivers to support different tools and products, such as Tableau