Rich Bowen | 25 Nov 18:32 2015

[ANNOUNCE] CFP open for ApacheCon North America 2016

Community growth starts by talking with those interested in your
project. ApacheCon North America is coming. Are you?

We are delighted to announce that the Call For Presentations (CFP) is
now open for ApacheCon North America. You can submit your proposed
sessions at
for big data talks and
for all other topics.

ApacheCon North America will be held in Vancouver, Canada, May 9-13,
2016. ApacheCon has been running every year since 2000, and is the place
to build your project communities.

While we will consider individual talks, we prefer to see related
sessions that are likely to draw users and community members. When
submitting your talk, work with your project community and with related
communities to come up with a full program that will walk attendees
through the basics and on into mastery of your project in example use
cases. Content that introduces what's new in your latest release is also
of particular interest, especially when it builds upon existing
well-known application models. The goal should be to showcase your
project in ways that will attract participants and encourage engagement
in your community. Please remember to involve your whole project
community (user and dev lists) when building content. This is your
chance to create a project-specific event within the broader ApacheCon
conference.

Content at ApacheCon North America will be cross-promoted as
mini-conferences, such as ApacheCon Big Data, and ApacheCon Mobile, so

d.heidarpour | 24 Nov 17:45 2015

Ruta and Morphology Analyzing


I'm trying to implement a morphological analyzer (AE) for Farsi in UIMA. I
need a way to compile my word list and rules so that the AE can query
them for both bottom-up and top-down morphological analysis of Farsi
words. There are a few FST libraries in Java for this task, but my
question is: can I use UIMA Ruta for this directly? Or can I use it to
compile the words and rules into a structure such as a trie?
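
For readers unfamiliar with the trie structure mentioned above, here is a minimal sketch in plain Java. It is not part of UIMA or Ruta, and it does not answer whether Ruta can build such a structure. The class name WordTrie, its methods, and the transliterated sample words are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Minimal character trie for word-list lookup (illustrative; not a UIMA or Ruta class). */
class WordTrie {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean isWord;
    }

    private final Node root = new Node();

    /** Adds one word to the trie, creating nodes as needed. */
    void add(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            node = node.children.computeIfAbsent(word.charAt(i), c -> new Node());
        }
        node.isWord = true;
    }

    /** Returns true only if exactly this word was added. */
    boolean contains(String word) {
        Node node = root;
        for (int i = 0; i < word.length() && node != null; i++) {
            node = node.children.get(word.charAt(i));
        }
        return node != null && node.isWord;
    }

    /** Length of the longest stored word that is a prefix of the input, or 0 if none. */
    int longestPrefixLength(String input) {
        Node node = root;
        int best = 0;
        for (int i = 0; i < input.length(); i++) {
            node = node.children.get(input.charAt(i));
            if (node == null) {
                break;
            }
            if (node.isWord) {
                best = i + 1;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        WordTrie stems = new WordTrie();
        stems.add("ketab"); // hypothetical transliterated stem
        // Split a surface form into a known stem and the remaining suffix.
        String surface = "ketabha";
        int cut = stems.longestPrefixLength(surface);
        System.out.println(surface.substring(0, cut) + " + " + surface.substring(cut));
    }
}
```

A longest-prefix lookup like this covers only the simplest stem-plus-suffix case; rule application and top-down generation would need more than a word trie.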


~Davood Heidarpour

Matthias Koch | 23 Nov 13:41 2015

Bug in ExternalResourceFactory ???


We want to create an ExternalResourceDescription via

public static ExternalResourceDescription 
createExternalResourceDescription(Class<? extends Resource> aInterface, 
Object... aParams)

This resource has a couple of ConfigurationParameters and some of them 
are booleans.

If the Object[] aParams contains those booleans, an exception is thrown
at line 177 (param.setValue((String) aParams[i * 2 + 1]);) because a
Boolean cannot be cast to String.

If we put them as Strings into the object array it seems to work.
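
The cast failure described above can be reproduced in plain Java without any UIMA classes. The sketch below is not the uimaFIT source; the class BooleanCastDemo and the helper toStringValue are hypothetical names used only to contrast a direct (String) cast with a conversion via String.valueOf.

```java
/** Plain-Java illustration of why casting a Boolean inside an Object[] to String fails. */
class BooleanCastDemo {

    /** Hypothetical helper: converts any parameter value to its string form instead of casting. */
    static String toStringValue(Object value) {
        return String.valueOf(value);
    }

    /** Returns true if a direct (String) cast of the value throws ClassCastException. */
    static boolean castFails(Object value) {
        try {
            String s = (String) value; // same kind of cast as in the quoted line
            return s == null && value != null; // never true; keeps the cast result in use
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Name/value pairs, as they would arrive through an Object... parameter.
        Object[] params = {"enabled", true};
        System.out.println("cast fails: " + castFails(params[1]));
        System.out.println("converted:  " + toStringValue(params[1]));
    }
}
```

Passing "true" as a String, as described in the post, avoids the cast problem for the same reason: the value is already a String when the cast happens.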

This seems inconsistent, because we have to pass booleans to
the factory for an AnalysisEngineDescription but Strings for an 

Is this a bug or the expected behavior?

Best regards,

Wolf-Dietrich Materna | 17 Nov 09:05 2015

How to use string functions from ruta-core-ext in Ruta


I'd like to use the Ruta string functions referenced in the user guide here:

The problem is, they don't work out of the box in the Ruta Workbench 2.3.1 with Eclipse 4.4.2. I've set up a new
Ruta project with a script containing the (slightly modified) code from the substring example:

DECLARE Test; // I've added this line to make the example work.
CW{-> MATCHEDTEXT(s), ADD(sl, substring(s,0,8))};
CW{INLIST(sl) -> Test}; // Changed SW to CW to make the rule work.

The input file only contains two words, "Alexanderplatz" and "Alexander". What I expected was that the
first rule finds "Alexanderplatz" and stores the first nine letters of it in the string list sl so that the
last line can annotate "Alexander" with "Test". In reality, however, nothing happens. I get no error
message, but after execution, there is no "Test" annotation.
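
One thing worth checking alongside the configuration is the index arithmetic. The sketch below uses plain Java only. It is an assumption, not something confirmed here, that Ruta's substring follows the same convention as Java's String.substring, where the end index is exclusive. Under that assumption, substring(s,0,8) would produce eight characters rather than nine.

```java
/** Plain-Java check of substring index semantics (end index is exclusive in Java). */
class SubstringDemo {

    /** Returns the first endExclusive characters of the text. */
    static String prefix(String text, int endExclusive) {
        return text.substring(0, endExclusive);
    }

    public static void main(String[] args) {
        System.out.println(prefix("Alexanderplatz", 8)); // prints "Alexande"
        System.out.println(prefix("Alexanderplatz", 9)); // prints "Alexander"
    }
}
```

If Ruta behaves the same way, the list sl would contain "Alexande", which would not match the word "Alexander" in the second rule even once the string functions are available.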

Is there some additional configuration needed to access these string functions?

Any help would be appreciated.

Best Regards,
            Wolf-Dietrich Materna

Sean Crist | 13 Nov 19:36 2015

Annotator class name is required for a primitive Analysis Engine


I’m working through the UIMA tutorial at .  Before
writing to this list, I searched Google at length.  I also made a new project and did the whole tutorial
again, in case I missed something.

To test the annotator, I ran CVD as instructed.  Within CVD, I chose Run -> Load AE, and chose
RoomNumberAnnotatorDescriptor.xml.  At that point, I get this alert:

org.apache.uima.resource.ResourceInitializationException: Annotator class name is required for a
primitive Analysis Engine.  (Descriptor: file: /Users/scrist/Documents/workspace/RoomNumberAnnotator/desc/RoomNumberAnnotatorDescriptor.xml)

This would seem to suggest that my annotator class isn’t in the class path.  I did follow the earlier
instruction to add the RoomNumberAnnotator project to the classpath.

There are very few matches in Google for this particular error, so I’m guessing that something quirky and
unusual has gone wrong here.  I’m using Eclipse for Mac,  Mars.1, Release 4.5.1.

--Sean Crist

Olivier Austina | 12 Nov 14:12 2015

how to call UIMA Ruta from uimaFIT


I am trying to call a simple UIMA Ruta script from uimaFIT in Java. It works
in plain UIMA, but it doesn't work in uimaFIT for me.  Here is the script:

PACKAGE tutorial.entity;
WORDLIST MonthsList = 'month.txt';
Document{-> MARKFAST(Month, MonthsList )};
NUM{REGEXP("19..|20..") -> MARK(Year,1,2)};
Month Year {-> MARK(Date,1,2)};

Here is the Java code (based on the Ruta doc example):

//System.out.println( "Hello World!" );
        File specFile = new
            XMLInputSource in = new XMLInputSource(specFile);
            ResourceSpecifier specifier = UIMAFramework.getXMLParser().
            // for import by name... set the datapath in the ResourceManager
            AnalysisEngine ae =
            CAS cas = ae.newCAS();
            cas.setDocumentText("This is my document. March  June


Christopher Baechle | 6 Nov 16:12 2015

How to annotate based on document collection

I am working with an existing project that is built with UIMA. I am trying
to create a tf-idf style score that looks at the set of documents as a

Since the rest of the project uses UIMA heavily, I would like to implement
this as an annotator if possible, rather than a separate program. Is it
possible within UIMA to do this?
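
To make the collection-level requirement concrete, here is a minimal tf-idf sketch in plain Java. It does not use any UIMA API and does not show how to wire this into an annotator. The class TfIdfSketch, its method names, and the choice of raw term count times log(N/df) as the weighting are assumptions for illustration. The point it shows is that document frequencies need a pass over the whole collection before any single document can be scored.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

/** Two-pass tf-idf over an in-memory collection of tokenized documents (illustrative only). */
class TfIdfSketch {

    /** First pass: counts in how many documents each term occurs. */
    static Map<String, Integer> documentFrequencies(List<List<String>> docs) {
        Map<String, Integer> df = new HashMap<>();
        for (List<String> doc : docs) {
            // A set ensures each term is counted once per document.
            for (String term : new HashSet<>(doc)) {
                df.merge(term, 1, Integer::sum);
            }
        }
        return df;
    }

    /** Second pass: raw term count in the document times log(numDocs / documentFrequency). */
    static double tfIdf(String term, List<String> doc, Map<String, Integer> df, int numDocs) {
        long tf = doc.stream().filter(term::equals).count();
        int docFreq = df.getOrDefault(term, 0);
        if (tf == 0 || docFreq == 0) {
            return 0.0;
        }
        return tf * Math.log((double) numDocs / docFreq);
    }

    public static void main(String[] args) {
        List<List<String>> docs = Arrays.asList(
                Arrays.asList("a", "b", "a"),
                Arrays.asList("b", "c"));
        Map<String, Integer> df = documentFrequencies(docs);
        for (String term : Arrays.asList("a", "b", "c")) {
            System.out.println(term + " in doc 0: " + tfIdf(term, docs.get(0), df, docs.size()));
        }
    }
}
```

A term that occurs in every document scores zero under this weighting, which is why the frequencies cannot be computed from one document in isolation.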

Balaji Vijayan | 4 Nov 03:08 2015

ClassNotFound Error when running the UIMA tutorial

Windows 8.1, Java 1.8.60, Eclipse 4.4 Luna, Maven 3.3

This same issue was observed on a Mac running Yosemite 10.10.5.

Following the tutorial <> for
UIMA I am able to get the document analyzer window to launch. When I
attempt to run the app I get a ClassNotFound error. Here's the stack trace:

Caused by: java.lang.ClassNotFoundException:
    at Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Unknown Source)

Someone else had the same issue in 2011 but I was unable to find a
resolution in that issue:

Wahed Hem | 2 Nov 13:40 2015

UIMA-AS dynamically switch remote delegates in Analysis Engine

Is it possible to dynamically switch remote delegates in a UIMA-AS
analysis engine?
Let's say I have a part-of-speech tagger. This tagger needs token
information before running, and I have multiple tokenizers. How can I let
the client choose between those tokenizers when running the part-of-speech

Thanks in advance.

Mario Gazzo | 27 Oct 20:52 2015

Ruta MARKLAST and punctuations

Hi Peter,

It appears that the MARKLAST action doesn’t annotate anything if the last token is a COLON. Is this
behaviour you recognise? I assume, then, that this action behaves differently whenever the last token is
a punctuation mark.


Jaroslaw Cwiklik | 26 Oct 20:05 2015

[ANNOUNCE] Apache UIMA DUCC 2.0.1 released

The Apache UIMA team is pleased to announce the release of UIMA DUCC,
version 2.0.1.

DUCC stands for Distributed UIMA Cluster Computing. DUCC is a cluster
management system providing tooling, management, and scheduling facilities
to automate the scale-out of applications written to the UIMA framework.
Core UIMA provides a generalized framework for applications that process
unstructured information such as human language, but does not provide a
scale-out mechanism. UIMA-AS provides a scale-out mechanism to distribute
UIMA pipelines over a cluster of computing resources, but does not provide
job or cluster management of the resources. DUCC defines a formal job model
that closely maps to a standard UIMA pipeline. Around this job model DUCC
provides cluster management services to automate the scale-out of UIMA
pipelines over computing clusters.

This is a bug-fix release, addressing bugs found since the DUCC 2.0.0
release. For a full list of the changes, please refer to the Jira report:

More information about DUCC can be found here:

-Jerry Cwiklik, for the Apache UIMA community