Welcome to Apache Lucene

The Apache LuceneTM project develops open-source search software, including:

  • Lucene Core, our flagship sub-project, provides Java-based indexing and search technology, as well as spellchecking, hit highlighting and advanced analysis/tokenization capabilities.
  • SolrTM is a high performance search server built using Lucene Core, with XML/HTTP and JSON/Python/Ruby APIs, hit highlighting, faceted search, caching, replication, and a web admin interface.
  • Open Relevance Project is a subproject with the aim of collecting and distributing free materials for relevance testing and performance.
  • PyLucene is a Python port of the Core project.

LuceneTM News

29 December 2014 - Apache Lucene 4.10.3 and Apache Solr 4.10.3 Available

The Lucene PMC is pleased to announce the availability of Apache Lucene 4.10.3 and Apache Solr 4.10.3.

Lucene can be downloaded from http://lucene.apache.org/core/mirrors-core-latest-redir.html and Solr can be downloaded from http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

Both releases contain a number of bug fixes. The Solr release includes a security vulnerability fix that you can read about on the Solr news page: http://lucene.apache.org/solr/news.html

See the Lucene CHANGES.txt and Solr CHANGES.txt files included with the release for a full list of details, and Happy Holidays!

31 October 2014 - Apache Lucene 4.10.2 and Apache Solr 4.10.2 Available

The Lucene PMC is pleased to announce the availability of Apache Lucene 4.10.2 and Apache Solr 4.10.2.

Lucene can be downloaded from http://lucene.apache.org/core/mirrors-core-latest-redir.html and Solr can be downloaded from http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

Both releases contain a number of bug fixes.

See the Lucene CHANGES.txt and Solr CHANGES.txt files included with the release for a full list of details, and Happy Halloween!

29 September 2014 - Apache Lucene 4.10.1 and Apache Solr 4.10.1 Available

The Lucene PMC is pleased to announce the availability of Apache Lucene 4.10.1 and Apache Solr 4.10.1.

Lucene can be downloaded from http://lucene.apache.org/core/mirrors-core-latest-redir.html and Solr can be downloaded from http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

Both releases contain a number of bug fixes.

See the Lucene CHANGES.txt and Solr CHANGES.txt files included with the release for a full list of details.

22 September 2014 - Apache Lucene 4.9.1 and Apache Solr 4.9.1 Available

The Lucene PMC is pleased to announce the availability of Apache Lucene 4.9.1 and Apache Solr 4.9.1.

Lucene can be downloaded from http://lucene.apache.org/core/mirrors-core-latest-redir.html and Solr can be downloaded from http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

Both releases contain a number of bug fixes.

See the Lucene CHANGES.txt and Solr CHANGES.txt files included with the release for a full list of details.

03 September 2014 - Apache Lucene 4.10.0 and Apache Solr 4.10.0 Available

The Lucene PMC is pleased to announce the availability of Apache Lucene 4.10.0 and Apache Solr 4.10.0.

Lucene can be downloaded from http://lucene.apache.org/core/mirrors-core-latest-redir.html and Solr can be downloaded from http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

See the Lucene CHANGES.txt and Solr CHANGES.txt files included with the release for a full list of details.

Highlights of the Lucene release include:

  • New TermAutomatonQuery using an automaton for proximity queries. http://blog.mikemccandless.com/2014/08/a-new-proximity-query-for-lucene-using.html

  • New OrdsBlockTree terms dictionary supporting ord lookup.

  • Simplified matchVersion handling for Analyzers with new setVersion method, as well as Analyzer constructors not requiring Version.

  • Fixed possible corruption when opening a 3.x index with NRT reader.

  • Fixed edge case in StandardTokenizer that caused extremely slow parsing times with long text which partially matched grammar rules.

Highlights of the Solr release include:

  • This release upgrades Solr Cell's (contrib/extraction) dependency on Apache POI to mitigate 2 security vulnerabilities.

  • Scripts for starting, stopping, and running Solr examples

  • Distributed query support for facet.pivot

  • Interval Faceting for Doc Values fields

  • New "terms" QParser for efficiently filtering documents by a list of values

18 August 2014 - Recommendation to update Apache POI in Apache Solr 4.8.0, 4.8.1, and 4.9.0 installations

Apache Solr versions 4.8.0, 4.8.1, 4.9.0 bundle Apache POI 3.10-beta2 with its binary release tarball. This version (and all previous ones) of Apache POI are vulnerable to the following issues: CVE-2014-3529 (XML External Entity (XXE) problem in Apache POI's OpenXML parser), CVE-2014-3574 (XML Entity Expansion (XEE) problem in Apache POI's OpenXML parser).

The Apache POI PMC released a bugfix version (3.10.1) today.

Solr users are affected by these issues, if they enable the "Apache Solr Content Extraction Library (Solr Cell)" contrib module from the folder "contrib/extraction" of the release tarball.

Users of Apache Solr are strongly advised to keep the module disabled if they don't use it. Alternatively, users of Apache Solr 4.8.0, 4.8.1, or 4.9.0 can update the affected libraries by replacing the vulnerable JAR files in the distribution folder. Users of previous versions have to update their Solr release first, patching older versions is impossible.

For detailed instructions, see Solr's News

25 June 2014 - Apache Lucene 4.9.0 and Apache Solr 4.9.0 Available

The Lucene PMC is pleased to announce the availability of Apache Lucene 4.9.0 and Apache Solr 4.9.0.

Lucene can be downloaded from http://lucene.apache.org/core/mirrors-core-latest-redir.html and Solr can be downloaded from http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

See the Lucene CHANGES.txt and Solr CHANGES.txt files included with the release for a full list of details.

Highlights of the Lucene release include:

  • New Terms.getMin/Max methods to retrieve the lowest and highest terms per field.

  • New IDVersionPostingsFormat, optimized for ID lookups that associate a monotonically increasing version per ID.

  • Atomic update of a set of doc values fields.

  • Numerous optimizations for doc values search-time performance.

  • New (default) Lucene49NormsFormat to better compress certain cases such as very short fields.

  • New SORTED_NUMERIC docvalues type for efficient processing of multi-valued numeric fields.

  • Indexer passes previous token stream for easier reuse.

  • MoreLikeThis accepts multiple values per field.

  • All classes that estimate their RAM usage now implement a new Accountable interface.

  • Lucene files are now written by (File)OutputStream on all platforms, completely disallowing seeking with simplified IO APIs.

  • Improve the confusing error message when MMapDirectory cannot create a new map.

Highlights of the Solr release include:

  • Numerous optimizations for doc values search-time performance

  • Allow a client application to request the minium achieved replication factor for an update request (single or batch) by sending an optional parameter "min_rf".

  • Query re-ranking support with the new ReRankingQParserPlugin.

  • A new [child ...] DocTransformer for optionally including Block-Join decendent documents inline in the results of a search.

  • A new (default) Lucene49NormsFormat to better compress certain cases such as very short fields.

11 June 2014 - Open Relevance sub-project closed

The Apache Lucene Project Management Committee decided in a vote, that the Apache Lucene sub-project "Open Relevance" will be discontinued. There was only modest activity during the last years and the project made no releases. Thank you to all committers for their support in this project!

20 May 2014 - Apache Lucene 4.8.1 and Apache Solr 4.8.1 Available

The Lucene PMC is pleased to announce the availability of Apache Lucene 4.8.1 and Apache Solr 4.8.1.

Lucene can be downloaded from http://lucene.apache.org/core/mirrors-core-latest-redir.html and Solr can be downloaded from http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

Both releases contain a number of bug fixes.

See the Lucene CHANGES.txt and Solr CHANGES.txt files included with the release for a full list of details.

28 April 2014 - Apache Lucene 4.8.0 and Apache Solr 4.8.0 Available

The Lucene PMC is pleased to announce the availability of Apache Lucene 4.8.0 and Apache Solr 4.8.0.

Lucene can be downloaded from http://lucene.apache.org/core/mirrors-core-latest-redir.html and Solr can be downloaded from http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

Both releases now require Java 7 or greater (recommended is Oracle Java 7 or OpenJDK 7, minimum update 55; earlier versions have known JVM bugs affecting Lucene and Solr). In addition, both are fully compatible with Java 8.

See the Lucene CHANGES.txt and Solr CHANGES.txt files included with the release for a full list of details.

Highlights of the Lucene release include:

  • All index files now store end-to-end checksums, which are now validated during merging and reading. This ensures that corruptions caused by any bit-flipping hardware problems or bugs in the JVM can be detected earlier. For full detection be sure to enable all checksums during merging (it's disabled by default).

  • Lucene has a new Rescorer/QueryRescorer API to perform second-pass rescoring or reranking of search results using more expensive scoring functions after first-pass hit collection.

  • AnalyzingInfixSuggester now supports near-real-time autosuggest.

  • Simplified impact-sorted postings (using SortingMergePolicy and EarlyTerminatingCollector) to use Lucene's Sort class to express the sort order.

  • Bulk scoring and normal iterator-based scoring were separated, so some queries can do bulk scoring more effectively.

  • Switched to MurmurHash3 to hash terms during indexing.

  • IndexWriter now supports updating of binary doc value fields.

  • HunspellStemFilter now uses 10 to 100x less RAM. It also loads all known OpenOffice dictionaries without error.

  • Lucene now also fsyncs the directory metadata on commits, if the operating system and file system allow it (Linux, MacOSX are known to work).

  • Lucene now uses Java 7 file system functions under the hood, so index files can be deleted on Windows, even when readers are still open.

  • A serious bug in NativeFSLockFactory was fixed, which could allow multiple IndexWriters to acquire the same lock. The lock file is no longer deleted from the index directory even when the lock is not held.

Highlights of the Solr release include:

  • <fields> and <types> tags have been deprecated from schema.xml. There is no longer any reason to keep them in the schema file, they may be safely removed. This allows intermixing of <fieldType>, <field> and <copyField> definitions if desired.

  • The new {!complexphrase} query parser supports wildcards, ORs etc. inside Phrase Queries.

  • New Collections API CLUSTERSTATUS action reports the status of collections, shards, and replicas, and also lists collection aliases and cluster properties.

  • Added managed synonym and stopword filter factories, which enable synonym and stopword lists to be dynamically managed via REST API.

  • JSON updates now support nested child documents, enabling {!child} and {!parent} block join queries.

  • Added ExpandComponent to expand results collapsed by the CollapsingQParserPlugin, as well as the parent/child relationship of nested child documents.

  • Long-running Collections API tasks can now be executed asynchronously; the new REQUESTSTATUS action provides status.

  • Added a hl.qparser parameter to allow you to define a query parser for hl.q highlight queries.

  • In Solr single-node mode, cores can now be created using named configsets.

  • New DocExpirationUpdateProcessorFactory supports computing an expiration date for documents from the "TTL" expression, as well as automatically deleting expired documents on a periodic basis.

15 April 2014 - Apache Lucene 4.7.2 and Apache Solr 4.7.2 Available

The Lucene PMC is pleased to announce the availability of Apache Lucene 4.7.2 and Apache Solr 4.7.2.

Lucene can be downloaded from http://lucene.apache.org/core/mirrors-core-latest-redir.html and Solr can be downloaded from http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

Both releases contain a number of bug fixes.

See the Lucene CHANGES.txt and Solr CHANGES.txt files included with the release for a full list of details.

02 April 2014 - Apache Lucene 4.7.1 and Apache Solr 4.7.1 Available

The Lucene PMC is pleased to announce the availability of Apache Lucene 4.7.1 and Apache Solr 4.7.1.

Lucene can be downloaded from http://lucene.apache.org/core/mirrors-core-latest-redir.html and Solr can be downloaded from http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

Both releases contain a number of bug fixes.

See the Lucene CHANGES.txt and Solr CHANGES.txt files included with the release for a full list of details.

12 March 2014 - Apache Lucene 4.8 and Apache Solr 4.8 will require Java 7

The Apache Lucene/Solr committers decided with a large majority on the vote to require Java 7 for the next minor release of Apache Lucene and Apache Solr (version 4.8)!

The next release will also contain some improvements for Java 7:

  • Better file handling (especially on Windows) in the directory implementations. Files can now be deleted on windows, although the index is still open - like it was always possible on Unix environments (delete on last close semantics).

  • Speed improvements in sorting comparators: Sorting now uses Java 7's own comparators for integer and long sorts, which are highly optimized by the Hotspot VM.

If you want to stay up-to-date with Lucene and Solr, you should upgrade your infrastructure to Java 7. Please be aware that you must use at least use Java 7u1. The recommended version at the moment is Java 7u25. Later versions like 7u40, 7u45,... have a bug causing index corrumption. Ideally use the Java 7u60 prerelease, which has fixed this bug. Once 7u60 is out, this will be the recommended version. In addition, there is no more Oracle/BEA JRockit available for Java 7, use the official Oracle Java 7. JRockit was never working correctly with Lucene/Solr (causing index corrumption), so this should not be an issue. Please also review our list of JVM bugs: http://wiki.apache.org/lucene-java/JavaBugs

EDIT (as of 15 April 2014): The recently released Java 7u55 fixes the above bug causing index corrumption. This version is now the recommended version for running Apache Lucene and Solr.

26 February 2014 - Apache Lucene 4.7.0 and Apache SolrTM 4.7.0 available

The Lucene PMC is pleased to announce the availability of Apache Lucene 4.7.0 and Apache Solr 4.7.0.

Lucene can be downloaded from http://lucene.apache.org/core/mirrors-core-latest-redir.html and Solr can be downloaded from http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

Both releases contain a number of bug fixes.

See the Lucene CHANGES.txt and Solr CHANGES.txt files included with the release for a full list of details.

28 January 2014 - Apache Lucene 4.6.1 and Apache SolrTM 4.6.1 available

The Lucene PMC is pleased to announce the availability of Apache Lucene 4.6.1 and Apache Solr 4.6.1.

Lucene can be downloaded from http://lucene.apache.org/core/mirrors-core-latest-redir.html and Solr can be downloaded from http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

Both releases contain a number of bug fixes.

See the Lucene CHANGES.txt and Solr CHANGES.txt files included with the release for a full list of details.

24 November 2013 - Apache Lucene 4.6.0 and Apache SolrTM 4.6.0 available

The Lucene PMC is pleased to announce the availability of Apache Lucene 4.6.0 and Apache Solr 4.6.0.

Lucene can be downloaded from http://lucene.apache.org/core/mirrors-core-latest-redir.html and Solr can be downloaded from http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

Both releases contain a number of bug fixes.

See the Lucene CHANGES.txt and Solr CHANGES.txt files included with the release for a full list of details.

24 October 2013 - Apache Lucene 4.5.1 and Apache SolrTM 4.5.1 available

The Lucene PMC is pleased to announce the availability of Apache Lucene 4.5.1 and Apache Solr 4.5.1.

Lucene can be downloaded from http://lucene.apache.org/core/mirrors-core-latest-redir.html and Solr can be downloaded from http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

Both releases contain a number of bug fixes.

See the Lucene CHANGES.txt and Solr CHANGES.txt files included with the release for a full list of details.

5 October 2013 - Apache Lucene 4.5 and Apache SolrTM 4.5 available

The Lucene PMC is pleased to announce the availability of Apache Lucene 4.5 and Apache Solr 4.5.

Lucene can be downloaded from http://lucene.apache.org/core/mirrors-core-latest-redir.html and Solr can be downloaded from http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

See the Lucene CHANGES.txt and Solr CHANGES.txt files included with the release for a full list of details.

Highlights of the Lucene release include:

  • Added support for missing values to DocValues fields through AtomicReader.getDocsWithField.

  • Lucene 4.5 has a new Lucene45Codec with Lucene45DocValues, supporting missing values and with most datastructures residing off-heap.

  • New in-memory DocIdSet implementations which are especially better than FixedBitSet on small sets: WAH8DocIdSet, PFORDeltaDocIdSet and EliasFanoDocIdSet.

  • CachingWrapperFilter now caches filters with WAH8DocIdSet by default, which has the same memory usage as FixedBitSet in the worst case but is smaller and faster on small sets.

  • TokenStreams now set the position increment in end(), so we can handle trailing holes.

  • IndexWriter no longer clones the given IndexWriterConfig.

Lucene 4.5 also includes numerous optimizations and bugfixes.

Highlights of the Solr release include:

  • Custom sharding support, including the ability to shard by field.

  • DocValue improvements: single valued fields no longer require a default value, allowiing dynamicFields to contain doc values, as well as sortMissingFirst and sortMissingLast on docValue fields.

  • Ability to store solr.xml in ZooKeeper.

  • Multithreaded faceting.

  • CloudSolrServer can now route updates directly to the appropriate shard leader.

Solr 4.5 also includes numerous optimizations and bugfixes.

The Apache Software Foundation

The Apache Software Foundation provides support for the Apache community of open-source software projects. The Apache projects are defined by collaborative consensus based processes, an open, pragmatic software license and a desire to create high quality software that leads the way in its field. Apache Lucene, Apache Solr, Apache PyLucene, Apache Open Relevance Project and their respective logos are trademarks of The Apache Software Foundation. All other marks mentioned may be trademarks or registered trademarks of their respective owners.