James Baker | 30 Oct 15:08 2014

UIMA DUCC - Multi-machine Installation

I've been working through the installation of UIMA DUCC, and have
successfully got it set up and running on a single machine. I'd now like to
move to running it on a cluster of machines, but it isn't clear to me from
the installation guide whether I need to install DUCC on each node, or
whether ducc_ling is the only thing that needs installing on the non-head
nodes.

Could anyone shed some light on the process please?

Kameron Cole | 28 Oct 21:21 2014

incorporate logic into UIMA

I would like to integrate logic processing into my UIMA code.  I have
looked into JLogic, but it doesn't offer just a simple API and a few jars,
and it has no examples.  Are there any other Java implementations of Prolog?

I'm not stuck on Prolog, although I do have experience in it.  Is there
something "easier" - or cleaner, maybe?  What about iLOG JRules?  I think
that's a lot less like symbolic logic, and more just business rules.

Any recommendations welcome.  Thanks.

Kameron Cole | 28 Oct 21:14 2014

looking for lots of example UIMA code

I have not been able to find much example UIMA code online.  Does
anyone have links to collections of code examples?

Jaroslaw Cwiklik | 23 Oct 20:48 2014

[ANNOUNCE] Apache UIMA DUCC 1.1.0 released

The Apache UIMA team is pleased to announce the release of
Apache UIMA DUCC, version 1.1.0.

DUCC stands for Distributed UIMA Cluster Computing. DUCC is a cluster
management system providing tooling, management, and scheduling facilities
to automate the scale-out of applications written to the UIMA framework.
Core UIMA provides a generalized framework for applications that process
unstructured information such as human language, but does not provide
a scale-out mechanism. UIMA-AS provides a scale-out mechanism to distribute
UIMA pipelines over a cluster of computing resources, but does not provide
job or cluster management of the resources. DUCC defines a formal job model
that closely maps to a standard UIMA pipeline. Around this job model DUCC
provides cluster management services to automate the scale-out of UIMA
pipelines over computing clusters.

This is a maintenance release that contains fixes and improvements over
UIMA DUCC 1.0.0.

For a full list of changes, please refer to Jira:

More information about UIMA DUCC can be found here:

 - Jaroslaw Cwiklik, for the Apache UIMA development team

Armin.Wegner | 23 Oct 13:05 2014

PearPackagingMavenPlugin and CVS


The PearPackagingMavenPlugin copies the CVS subdirectories into the PEAR. Can this be changed, and if so, how?

Piyush Paliwal | 22 Oct 13:35 2014

UIMA Ruta into jar?


We are developing a Ruta project and want to use it from a Java project.
Currently we add the descriptor (generated from the Ruta script) to a UIMA
pipeline that lives in the Java project.

The pipeline only runs inside the workspace; we are not able to build a
single jar of the Java project and run it from the command line, because it
cannot access the Ruta project as a dependency.

There is also a direct way to read a Ruta script from Java, but the script
cannot import annotations from type systems if we put it in the Java project
(i.e. it needs the Ruta editor).

Is there any way to add the Ruta project as a dependency of the Java project?




Piyush Paliwal

Amit Gupta | 16 Oct 01:11 2014

Scale out tuning for jobs

I've been trying to find the options related to configuring the scaleout
of a DUCC job.

Thus far the only ones I've found are:

which limits the maximum number of processes spawned by a ducc job.

At what point does DUCC decide to spawn a new process or spread processing
out to a new node? Is there a tuning parameter for an optimal number of
work items per process spawned? Can the user control this behavior?

For example, I have a job large enough that DUCC natively spreads it across
2 nodes. I haven't been able to force this job, via a config parameter, to
spread across 4 nodes (or "X" nodes) for faster processing times.

Does anyone know if there's a parameter that can directly control scaleout
in this manner?



Amit Gupta

Amit Gupta | 15 Oct 00:40 2014

query about PEAR installation for DUCC


I have a question about PEAR installation on a "headless" system via the
command line. I don't see any instructions in the documentation on how to
proceed.

Specifically, I'm attempting to run the Raw Text Processing example
documented in the DUCC Book,

The problematic step is "Installing the OpenNLP Pear".

Almost everywhere I have looked, the instructions call for the GUI installer.

The DUCC Book points to the runPearInstaller script (which, strangely, is not
shipped with the DUCC binaries). I managed to find it in the shipped binaries
of UIMA. I unpacked the binaries and set up the environment as instructed in
the UIMA SDK (the script resides in $UIMA_HOME/bin), but running it fails with
the following error.

runPearInstaller.sh --help

Exception in thread "AWT-EventQueue-0" java.awt.HeadlessException:

No X11 DISPLAY variable was set, but this program performed an operation
which requires it.

at java.awt.GraphicsEnvironment.checkHeadless(GraphicsEnvironment.java:207)
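
[Editor's note] The exception above is what AWT raises when GUI code runs on a machine without a display. As a minimal sketch using only the JDK, a launcher could test for this condition up front and fall back to a non-GUI path instead of failing. The class name HeadlessCheck and the method chooseMode are hypothetical illustrations and are not part of UIMA or DUCC.

```java
import java.awt.GraphicsEnvironment;

// Hypothetical helper, not part of UIMA: decides whether a GUI can be used.
public class HeadlessCheck {

    // Maps the headless flag to the kind of installer that could run.
    static String chooseMode(boolean headless) {
        return headless ? "command-line" : "gui";
    }

    public static void main(String[] args) {
        // True when no display, keyboard, or mouse is available
        // (for example, no X11 DISPLAY on Linux, or -Djava.awt.headless=true).
        boolean headless = GraphicsEnvironment.isHeadless();
        System.out.println("headless=" + headless
                + ", mode=" + chooseMode(headless));
    }
}
```

Running this on the affected node should show whether the JVM considers the environment headless, which is the condition the stack trace reports.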

Debbie Zhang | 9 Oct 12:58 2014

RE: Error in running UIMA Ruta sample file

Hi Peter,

It seems that after I clean the project, run "debug", and then double-click the .xmi file in the output folder,
the rules appear.



> -----Original Message-----
> From: Debbie Zhang [mailto:debbie.d.zhang@...]
> Sent: Thursday, 9 October 2014 9:49 PM
> To: user@...
> Subject: RE: Error in running UIMA Ruta sample file
> Thanks Peter for your reply. I cleaned the project after I received your email.
> However, I still got the error when I tried to debug.
> Just now, I double clicked the output file Example1.txt.xmi. I got the rules
> displayed on the rule views. Does it mean that I don't need to run the debug
> to get rules on the rule views (The Reference document said debug need to be
> run)? It seems I only need to run the Main.ruta from "Annotation Test" to get
> the .xmi files.
> Regards,
> Debbie
> > -----Original Message-----
> > From: Peter Klügl [mailto:pkluegl@...]

Debbie Zhang | 9 Oct 11:33 2014

Error in running UIMA Ruta sample file


I am new to UIMA Ruta and am trying to learn it by following the Ruta Guide
and Reference:

So far, I have been able to follow the guide up to section 3.5, "UIMA Ruta
Explain Perspective". Following the guide, I imported the UIMA Ruta example
project and opened the main Ruta script file 'Main.ruta'. I right-clicked the
"Main.ruta" file and selected “Debug As” > “1 UIMA Ruta”. However, I get
the following error:

Source not found for URLClassPath$JarLoader.getJarFile(URL) line: 644

For all of the “Applied Rules”, “Failed Rules”, and “Matched Rules” views,
the following message is displayed:
The instance view is currently not available

Could someone tell me what I did wrong so I can see Rules in those views?

Thank you in advance.


Debbie Zhang

Peter Klügl | 9 Oct 11:06 2014

Publication about UIMA Ruta


It has been about a year and a half since we renamed the system to its
current and nice name. However, since then there has been no main
publication to cite when referring to UIMA Ruta.

I can proudly announce that this has now changed.

The journal Natural Language Engineering just published the new main
article for UIMA Ruta with the title "UIMA Ruta: Rapid development of
rule-based information extraction applications". It provides a nice
overview of the language and tooling, and additionally a comparison to
related systems and descriptions of some case studies. If you are
interested in UIMA Ruta or in rule-based approaches in general, this
article could be of interest.

The direct link to the FirstView version:
I temporarily added the pdf of the accepted manuscript to my personal

If you use UIMA Ruta in an academic context, please consider citing this
paper. A BibTeX entry will be added to the Ruta page.