The Apache Hive™ data warehouse software facilitates querying and managing large datasets residing in distributed storage. Hive provides a mechanism to project structure onto this data and to query it using a SQL-like language called HiveQL. The language also lets traditional map/reduce programmers plug in their custom mappers and reducers when expressing the logic in HiveQL would be inconvenient or inefficient.
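As a rough illustration of plugging in a custom mapper, here is a minimal sketch of a streaming script that HiveQL's TRANSFORM clause could invoke. It assumes Hive's default streaming format (one row per line, columns separated by tabs). The script name `weekday_mapper.py`, the table `u_data`, and the columns `userid` and `unixtime` are hypothetical names chosen for this example, not part of any particular deployment.

```python
"""Sketch of a custom mapper for HiveQL's TRANSFORM clause.

Hypothetical invocation from Hive (table and column names are assumptions):

    ADD FILE weekday_mapper.py;
    SELECT TRANSFORM (userid, unixtime)
      USING 'python weekday_mapper.py'
      AS (userid, weekday)
    FROM u_data;
"""
import sys
from datetime import datetime, timezone


def map_row(line):
    """Convert one tab-separated row (userid, unixtime) into (userid, weekday)."""
    user_id, unix_ts = line.rstrip("\n").split("\t")
    # ISO weekday in UTC: Monday is 1, Sunday is 7.
    weekday = datetime.fromtimestamp(int(unix_ts), tz=timezone.utc).isoweekday()
    return f"{user_id}\t{weekday}"


def main(stdin=sys.stdin, stdout=sys.stdout):
    # Rows arrive on stdin, and each line written to stdout becomes an output row.
    for line in stdin:
        stdout.write(map_row(line) + "\n")


if __name__ == "__main__":
    main()
```

Keeping the per-row logic in `map_row` makes it possible to check the mapper locally before running it against a cluster.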
Check out the Getting Started Guide on the Hive wiki.
Hive is an open source volunteer project under the Apache Software Foundation. It was previously a subproject of Apache Hadoop but has since graduated to become a top-level project of its own. We encourage you to learn about the project and contribute your expertise. Here are some starter links: