Welcome to the Apache UIMA project

Welcome to the Apache UIMA™ project. Our goal is to support a thriving community of users and developers of UIMA frameworks, tools, and annotators, facilitating the analysis of unstructured content such as text, audio and video.

What is UIMA?

Unstructured Information Management applications are software systems that analyze large volumes of unstructured information in order to discover knowledge that is relevant to an end user. An example UIM application might ingest plain text and identify entities, such as persons, places, organizations; or relations, such as works-for or located-at.

UIMA enables applications to be decomposed into components, for example "language identification" => "language specific segmentation" => "sentence boundary detection" => "entity detection (person/place names etc.)". Each component implements interfaces defined by the framework and provides self-describing metadata via XML descriptor files. The framework manages these components and the data flow between them. Components are written in Java or C++; the data that flows between components is designed for efficient mapping between these languages.
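The decomposition described above can be sketched in plain Java. This is a minimal illustration only, not the UIMA API: the names `PipelineSketch`, `Document`, `Component`, `LanguageIdentifier`, and `SentenceDetector` are hypothetical, the shared `Document` object merely stands in for the data that flows between components, and the XML descriptor metadata is omitted.

```java
import java.util.ArrayList;
import java.util.List;

public class PipelineSketch {

    /** A labeled span of the document text (hypothetical stand-in for an annotation). */
    static final class Annotation {
        final String type;
        final int begin;
        final int end;
        Annotation(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }
    }

    /** Shared analysis state handed from component to component (not the real UIMA data structure). */
    static final class Document {
        final String text;
        String language = "unknown";
        final List<Annotation> annotations = new ArrayList<>();
        Document(String text) { this.text = text; }
    }

    /** Every component implements the same interface so that a framework can chain them. */
    interface Component {
        void process(Document doc);
    }

    /** Toy language identification: looks for a common English function word. */
    static final class LanguageIdentifier implements Component {
        public void process(Document doc) {
            if (doc.text.toLowerCase().contains(" the ")) {
                doc.language = "en";
            }
        }
    }

    /** Toy sentence boundary detection: a sentence ends at '.', '!' or '?'. */
    static final class SentenceDetector implements Component {
        public void process(Document doc) {
            int start = 0;
            for (int i = 0; i < doc.text.length(); i++) {
                char c = doc.text.charAt(i);
                if (c == '.' || c == '!' || c == '?') {
                    doc.annotations.add(new Annotation("Sentence", start, i + 1));
                    start = i + 1;
                    // Skip whitespace so the next sentence starts at its first character.
                    while (start < doc.text.length()
                            && Character.isWhitespace(doc.text.charAt(start))) {
                        start++;
                    }
                }
            }
            // Trailing text without a terminator is left unannotated in this sketch.
        }
    }

    /** Plays the role of the framework: runs each component in order over one document. */
    static Document run(List<Component> pipeline, String text) {
        Document doc = new Document(text);
        for (Component component : pipeline) {
            component.process(doc);
        }
        return doc;
    }

    public static void main(String[] args) {
        Document doc = run(
                List.of(new LanguageIdentifier(), new SentenceDetector()),
                "UIMA analyzes the text. It finds entities.");
        System.out.println("language: " + doc.language);
        for (Annotation a : doc.annotations) {
            System.out.println(a.type + ": " + doc.text.substring(a.begin, a.end));
        }
    }
}
```

Because each stage only reads and writes the shared document, a later stage such as entity detection could be added to the list without changing the earlier ones.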

UIMA additionally provides capabilities to wrap components as network services, and can scale to very large volumes by replicating processing pipelines over a cluster of networked nodes.
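The idea of replicating a pipeline to handle larger volumes can be illustrated on a single machine. The sketch below is an assumption-laden analogy, not how UIMA distributes work: it uses threads from the standard `java.util.concurrent` library in place of networked nodes, and `ScaleoutSketch`, `analyze`, and `processAll` are hypothetical names, with `analyze` standing in for a complete pipeline.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ScaleoutSketch {

    /** Stand-in for a full analysis pipeline: counts whitespace-separated tokens. */
    static int analyze(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    /**
     * Processes the documents on `replicas` worker threads.
     * Results are returned in the same order as the input documents.
     */
    static List<Integer> processAll(List<String> documents, int replicas) {
        ExecutorService pool = Executors.newFixedThreadPool(replicas);
        try {
            // Submit one task per document; the pool spreads them over the workers.
            List<Future<Integer>> futures = new ArrayList<>();
            for (String document : documents) {
                futures.add(pool.submit(() -> analyze(document)));
            }
            // Collect results in submission order.
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("document processing failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> documents = List.of("a b c", "", "one two");
        System.out.println(processAll(documents, 2));
    }
}
```

Each document is analyzed independently, which is what makes this kind of replication straightforward: adding workers raises throughput without changing the analysis logic.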

Apache UIMA is an Apache-licensed open source implementation of the UIMA specification, which is in turn being developed concurrently by a technical committee within OASIS, a standards organization. We invite and encourage you to participate in both the implementation and specification efforts.

Here you can find Frameworks, Components, and Infrastructure, all licensed under the Apache license. The dashed-line boxes above are placeholders for possible future additions.

The frameworks run the components and are available for both Java and C++. The Java framework supports running both Java and non-Java components (using the C++ framework). The C++ framework, besides supporting annotators written in C/C++, also supports Perl, Python, and Tcl annotators. UIMA-AS and UIMA-DUCC are both scaleout frameworks, provided as add-ons to the base Java framework. UIMA-AS supports very flexible scaleout based on JMS (Java Message Service) and ActiveMQ. UIMA-DUCC extends UIMA-AS with cluster management services that automate the scaleout of UIMA pipelines over computing clusters. Visit the UIMA-DUCC live demo description and the UIMA-DUCC live demo itself.

The frameworks support configuring and running pipelines of Annotator components. These components do the actual work of analyzing the unstructured information. Users can write their own annotators, or configure and use pre-existing annotators. Some annotators are available as part of this project; others are contained in various repositories on the internet.

Additional infrastructure support components include a simple server that can receive REST requests and return annotation results, for use by other web services.

The Addons and Sandbox area holds add-ons (annotators and other things) for UIMA, and is a place where new ideas are developed for potential incorporation into the project.

UIMA News

26 Apr 2019: Apache UIMA-DUCC 3.0.0 released

16 Apr 2019: Apache uimaFIT 3.0.0 released

10 Apr 2019: Apache UIMA Java SDK 3.0.2 released

24 Feb 2019: Apache UIMA Ruta 2.7.0 released

17 Apr 2018: Apache UIMA-AS 2.10.3 released

19 Mar 2018: Apache UIMA-DUCC 2.2.2 released

07 Feb 2018: Apache UIMA-AS 2.10.2 released

14 Nov 2017: Apache uimaFIT 2.4.0 released

30 Aug 2017: Apache UIMA-DUCC 2.2.1 released

24 July 2017: Apache UIMA Ruta 2.6.1 released

4 April 2017: Apache UIMA Java SDK 2.10.0 released

29 Mar 2017: Apache uimaFIT 2.3.0 released

10 Mar 2017: Apache UIMA Ruta 2.6.0 released

23 Feb 2017: Apache UIMA-DUCC 2.2.0 released

15 Dec 2016: Apache UIMA-AS 2.9.0 released

28 Sep 2016: Apache UIMA Ruta 2.5.0 released

11 August 2016: Apache UIMA Java SDK 2.9.0 released

08 Aug 2016: Apache UIMA-DUCC 2.1.0 released

20 May 2016: Apache UIMA-AS 2.8.1 released

06 April 2016: Apache uimaFIT 2.2.0 released

15 February 2016: Apache UIMA Ruta 2.4.0 released

26 October 2015: Apache UIMA DUCC 2.0.1 released

25 August 2015: Apache UIMA Ruta 2.3.1 released

11 August 2015: Apache UIMA DUCC 2.0.0 released

11 August 2015: Apache UIMA Java SDK 2.8.1 released

22 July 2015: Apache UIMA Java SDK 2.8.0 released

08 June 2015: Apache UIMA Ruta 2.3.0 released

06 March 2015: Apache UIMA SDK 2.7.0 released

23 October 2014: Apache UIMA DUCC 1.1.0 released

23 September 2014: Apache UIMA Ruta 2.2.1 released

16 June 2014: Apache UIMA-AS 2.6.0 released

12 June 2014: Apache uimaFIT 2.1.0 released

12 May 2014: Apache UIMA SDK 2.6.0 released

15 April 2014: Apache UIMA Ruta 2.2.0 released

10 April 2014: Upcoming UIMA Workshop at COLING 2014

30 January 2014: Apache UIMA DUCC 1.0.0 released

14 January 2014: Apache UIMA SDK 2.5.0 released

20 September 2013: Apache UIMA Ruta 2.1.0 released

31 August 2013: Apache uimaFIT 2.0.0 released

09 August 2013: Apache UIMA SDK 2.4.2 released

26 July 2013: Apache UIMA SDK 2.4.1 released

29 May 2013: Apache UIMA Ruta 2.0.1 released

05 March 2013: Apache UIMA Ruta (was formerly named TextMarker) 2.0.0 released

15 November 2012: Apache UIMA-CPP 2.4.0 released

15 November 2012: Apache UIMA-AS 2.4.0 released

07 December 2011: Apache UIMA Java SDK 2.4.0 released

29 August 2011: Apache UIMA Addons 2.3.1 released

22 March 2011: Apache UIMA AS 2.3.1 released

14 February 2011: UIMA underlies Watson Jeopardy challenger

10 December 2010: UIMA Java SDK 2.3.1 released

18 March 2010: Apache UIMA graduates from the incubator

26 January 2010: UIMA 2.3.0-incubating released

19 March 2009: UIMA approved as an OASIS Standard

29 December 2008: UIMA C++ Framework Released

12 August 2008: UIMA Cas Editor Released

24 July 2008: UIMA Asynchronous Scaleout Released

16 July 2008: Hotfix-1 for UIMA core 2.2.2-incubating Released

07 May 2008: Version 2.2.2-incubating Released

19 December 2007: Version 2.2.1-incubating Released

23 August 2007: Version 2.2.0-incubating Released

7 April 2007: Hotfix-1 for 2.1.0 Component Description Editor (CDE) Released

14 March 2007: Version 2.1.0-incubating Released

20 December 2006: GLDV conference workshop (Germany)

13 December 2006: ApacheCon 2007 Europe coming up

07 December 2006: UIMA sandbox established

05 October 2006: UIMA has been accepted by the Incubator PMC