Katrin Tomanek | 1 Jun 08:33

Re: Human annotation tool for UIMA

Dear Andrew,

> I am new to UIMA and am trying to find the best tool for doing human
> document annotation.  For instance, if I am building a machine-learning
> based named entity tagger and I want to tag some text with named
> entities to train my recognizers, what would be the best way to do that?
I think that's a matter of human/manual annotation. Generating training
material for ML is a laborious task that is outside the scope of UIMA (as
far as I understand). Depending on the entities, the domain, and the
language you are interested in, you might find corpora that are already
annotated (check http://torvald.aksis.uib.no/corpora/ for existing corpora).

regards,
Katrin

-- 
Katrin Tomanek
Jena University Language and Information Engineering (JULIE) Lab
Phone: +49-3641-944307
Fax:   +49-3641-944321
email: tomanek@...
URL:   http://www.coling.uni-jena.de

Thilo Goetz | 1 Jun 09:21

Re: Human annotation tool for UIMA

Katrin Tomanek wrote:
> Dear Andrew,
> 
>> I am new to UIMA and am trying to find the best tool for doing human
>> document annotation.  For instance, if I am building a machine-learning
>> based named entity tagger and I want to tag some text with named
>> entities to train my recognizers, what would be the best way to do that?
> I think that's a matter of human/manual annotation. Generating training
> material for ML is a laborious task which is not an issue of UIMA (as 
> far as I understand). Depending on the entities and the domain and 
> language you are interested in you might find annotated corpora (you 
> might check http://torvald.aksis.uib.no/corpora/ for existing corpora).
> 
> regards,
> Katrin

Also check http://registry.dfki.de/ for software tools to manually
annotate text.  I have no personal experience with any of the tools
there, but I have heard Alembic being favorably mentioned.  It looks
like it is freely available.  It should be relatively easy to transform
the resulting XML to UIMA, either via XSLT, or with a custom XML
parser that reads the annotated data and feeds it into UIMA APIs.
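As a rough illustration of the "custom XML parser" route, here is a minimal sketch that turns inline XML markup into plain text plus standoff spans. UIMA annotations carry begin and end character offsets, so spans of this shape are roughly what one would hand to UIMA. The function name inline_to_standoff, the tag names PERSON and LOCATION, and the sample sentence are all made up for the example; the real export format of any given annotation tool will differ and would need its own handling.

```python
import xml.etree.ElementTree as ET


def inline_to_standoff(xml_text):
    """Return (plain_text, spans); spans are (begin, end, tag) tuples."""
    root = ET.fromstring(xml_text)
    pieces = []  # text fragments in document order
    spans = []   # character offsets into the joined plain text
    offset = 0

    def walk(elem, is_root):
        nonlocal offset
        begin = offset
        if elem.text:
            pieces.append(elem.text)
            offset += len(elem.text)
        for child in elem:
            walk(child, False)
            # Text that follows a child element belongs to the parent.
            if child.tail:
                pieces.append(child.tail)
                offset += len(child.tail)
        # The root is only a wrapper here, so it produces no span.
        if not is_root:
            spans.append((begin, offset, elem.tag))

    walk(root, True)
    return "".join(pieces), spans


# Hypothetical input: a wrapper element with two inline entity tags.
sample = ("<doc><PERSON>Ada Lovelace</PERSON> lived in "
          "<LOCATION>London</LOCATION>.</doc>")
text, spans = inline_to_standoff(sample)
for begin, end, tag in spans:
    print(tag, begin, end, repr(text[begin:end]))
```

Nested tags come out innermost first, since a span is recorded when its element closes. An XSLT-based conversion would be an alternative to this kind of hand-written walk.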

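BTW, I have recently hacked UIMA's CAS Visual Debugger for a colleague
to allow creating manual annotations.  That was a one-off, though, and
I haven't fed it back into the main code base.  If people are interested
in that kind of functionality, let me know.  We wouldn't want to compete
with a dedicated annotation tool, though.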

Kirk True | 1 Jun 09:26

Re: UIMA internals memory footprint

Hi Marshall,

> This reduces 4.6 MB down to 1 MB overhead for 100K annotations.

That's awesome - thanks so much for looking into this! 

Just to double-check - will this make it into the 2.2 release?

Thanks again,
Kirk 

Joe Andrieu | 1 Jun 09:40

RE: Human annotation tool for UIMA

Thilo Goetz wrote:
> BTW, I have recently hacked UIMA's CAS Visual Debugger for a 
> colleague to allow creating manual annotations.  That was a 
> one-off, though, and I haven't fed it back into the main code 
> base.  If people are interested in that kind of 
> functionality, let me know.  We wouldn't want to compete with 
> a dedicated annotation tool, though.

I would like to second Andrew Borthwick's original request for a UIMA-savvy annotation tool. 

Adding it to a full-featured annotator would probably be great, but having an open source option would offer the most potential upside for UIMA. Alembic and its replacement Callisto are free, but not open source, so I believe MITRE would have to add support for UIMA themselves.

Are there any open source annotators people would recommend for integrating with UIMA?

-j

--
Joe Andrieu
SwitchBook Software
http://www.switchbook.com
joe@...
+1 (805) 705-8651 

Julien Nioche | 1 Jun 12:00

Re: Human annotation tool for UIMA

GATE (http://gate.ac.uk) is open source and lets you create annotations manually. The interface is tightly bound to the GATE API, so porting it to UIMA would be relatively costly; it would certainly be easier to write a new annotation tool from scratch. In the meantime, however, GATE could be used to annotate documents and save them as XML, which UIMA could load at a later stage.

There is also a UIMA plugin for GATE which lets you call UIMA processes from GATE and vice versa, but I am not sure it works with the Apache version of UIMA. That could help in using existing UIMA resources to pre-annotate documents.

Hope that helps

Julien

Thilo Goetz wrote:
> BTW, I have recently hacked UIMA's CAS Visual Debugger for a colleague to allow creating manual annotations. That was a one-off, though, and I haven't fed it back into the main code base. If people are interested in that kind of functionality, let me know. We wouldn't want to compete with a dedicated annotation tool, though.

Joe Andrieu wrote:
> I would like to second Andrew Borthwick's original request for a UIMA-savvy annotation tool.
>
> Adding it to a full-featured annotator would probably be great, but having an open source option would offer the most potential upside for UIMA. Alembic and its replacement Callisto are free, but not open source, so I believe MITRE would have to add support for UIMA themselves.
>
> Are there any open source annotators people would recommend for integrating with UIMA?
>
> -j
>
> --
> Joe Andrieu
> SwitchBook Software
> http://www.switchbook.com
> joe@...
> +1 (805) 705-8651

Thilo Goetz | 1 Jun 13:33

Re: Human annotation tool for UIMA

Julien Nioche wrote:
> GATE (http://gate.ac.uk) is open source and lets you create annotations
> manually. The interface is tightly bound to the GATE API so porting it 
> to UIMA would be a relatively costly operation. It would certainly be 
> easier to write a new annotation tool from scratch. However GATE could 
> be used in the meantime to annotate documents and save them as XML, 
> which could be loaded by UIMA at a later stage.
> 
> There is also a UIMA plugin for GATE which lets you call UIMA processes
> from GATE and vice versa; but I am not sure it works with the Apache 
> version of UIMA. That could help using existing UIMA resources for 
> pre-annotating documents.
> 
> Hope that helps
> 
> Julien

Hi Julien,

What open source license is GATE under?  I looked at the documentation,
but couldn't find it.

--Thilo

Jukka Zitting | 1 Jun 13:52

Using UIMA for EEG analysis?

Hi,

I'm trying to develop a system for automatically detecting various
types of brain activity based on raw EEG data. I have gigabytes of raw
data that I want to analyze, and I'm wondering if I could use the UIMA
framework in this task.

The high-level requirement is that, given the raw EEG data, the
analysis system should produce a set of annotations that indicate
which parts of the EEG data indicate certain kinds of brain activity
like wake/sleep, REM/non-REM, etc. The typical approach is to use
relative strengths of selected frequency bands for the classification,
but I've also experimented with self-organizing maps and other
auto-adapting mechanisms in an attempt to increase the accuracy of the
annotations.
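To make the frequency-band approach concrete, here is a small sketch in plain Python. It splits a signal into fixed windows, computes the relative power in a few named bands with a naive DFT, and emits one (begin, end, label) annotation per window, which is about the shape a UIMA-style annotation over the raw data would take. Everything specific here is an assumption for illustration: the band limits, the 64 Hz sample rate, the one-second window, the synthetic sine-wave input, and the function names band_powers and annotate. Mapping the dominant band to an actual sleep stage is not attempted.

```python
import cmath
import math


def band_powers(window, sample_rate, bands):
    """Relative power per named band for one window, via a naive DFT."""
    n = len(window)
    powers = {name: 0.0 for name in bands}
    for k in range(1, n // 2):  # skip the DC bin, stay below Nyquist
        freq = k * sample_rate / n
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(window))
        for name, (low, high) in bands.items():
            if low <= freq < high:
                powers[name] += abs(coeff) ** 2
    total = sum(powers.values()) or 1.0  # avoid dividing by zero
    return {name: p / total for name, p in powers.items()}


def annotate(signal, sample_rate, window_size, bands):
    """Return (begin, end, label) in sample offsets, one per full window."""
    out = []
    for begin in range(0, len(signal) - window_size + 1, window_size):
        window = signal[begin:begin + window_size]
        rel = band_powers(window, sample_rate, bands)
        # Label the window with whichever band holds the most power.
        out.append((begin, begin + window_size, max(rel, key=rel.get)))
    return out


# Illustrative band limits in Hz (assumed, not taken from the thread).
BANDS = {"delta": (0.5, 4.0), "alpha": (8.0, 13.0)}
RATE = 64  # samples per second (assumed)

# Synthetic input: two seconds of a 2 Hz sine, then two seconds at 10 Hz.
signal = ([math.sin(2 * math.pi * 2 * t / RATE) for t in range(2 * RATE)]
          + [math.sin(2 * math.pi * 10 * t / RATE) for t in range(2 * RATE)])

annotations = annotate(signal, RATE, RATE, BANDS)
for begin, end, label in annotations:
    print(begin, end, label)
```

The naive DFT is quadratic in the window size and is only there to keep the sketch free of dependencies; for gigabytes of data an FFT library would be the realistic choice.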

So far I've used custom code (both standalone applications and Matlab
plugins) to manage things, but it seems like UIMA would be a nice
framework for handling such operations. I guess I could implement both
the frequency band and more advanced analyzers as UIMA analysis
engines.

Do you think UIMA would be a good match for my needs? Are there any
(public) examples of doing something similar? Good pointers on where I
should start?

BR,

Jukka Zitting

Julien Nioche | 1 Jun 15:00

Re: Human annotation tool for UIMA

Hi Thilo

GATE is LGPL. Besides the page I mentioned earlier, 
http://sourceforge.net/projects/gate also contains a lot of information 
about GATE (mailing lists, forums, feature requests, etc.).

J.
>
>> GATE (http://gate.ac.uk) is open source and lets you create
>> annotations manually. The interface is tightly bound to the GATE API 
>> so porting it to UIMA would be a relatively costly operation. It 
>> would certainly be easier to write a new annotation tool from 
>> scratch. However GATE could be used in the meantime to annotate 
>> documents and save them as XML, which could be loaded by UIMA at a 
>> later stage.
>>
>> There is also a UIMA plugin for GATE which lets you call UIMA
>> processes from GATE and vice versa; but I am not sure it works with 
>> the Apache version of UIMA. That could help using existing UIMA 
>> resources for pre-annotating documents.
>>
>> Hope that helps
>>
>> Julien
>
> Hi Julien,
>
> what open source license is GATE under?  I looked at the documentation,
> but couldn't find it.
>
> --Thilo

J. William Murdock | 1 Jun 17:05

Re: Human annotation tool for UIMA

Joe Andrieu wrote:
> Are there any open source annotators people would recommend for integrating with UIMA?
>   

One manual annotation tool that is open source is Knowtator (which is 
licensed under MPL 1.1).  As I understand it, Knowtator is intended for 
manual annotation of entities and relationships in text.  It is a layer on
top of the Protégé open source ontology editor.  I'm not really familiar 
enough with Knowtator to explicitly recommend it.  Considering its 
stated goals and the framework that it was developed on, it seems like 
it might be particularly well suited to enabling manual annotations for 
relatively elaborate type systems that have a lot of structure and many 
common relation annotation types.  The flip side is that it may be 
overkill for the (more common) task of marking up instances of a flat 
list of named-entity types.  In any event, my point here is just that 
anyone who is thinking of building a mapping from an open source manual 
annotation tool to UIMA may want to consider Knowtator, especially if 
they are interested in a lot of expressive power.

Andrew Borthwick | 1 Jun 19:41

Re: Human annotation tool for UIMA

Thanks for your help, everyone.  I think that I will first explore using
GATE as Julien suggests below.  However, if anyone has a native UIMA tool
for doing manual annotations, a pointer would be much appreciated.  Thilo,
would your tool work as a temporary solution?

It seems to me that having some sort of solution here would be an important
part of offering a complete UIMA toolset.  In our organization, we are
planning on working with both existing corpora and corpora which are more
specific to the domain on which we are working.  There is also the problem
of testing our NLP solution on documents of interest to us.  So there will
be many scenarios in which it won't be sufficient to simply use standard
corpora and we will need to do some annotation ourselves.

Thanks again,
Andrew Borthwick

On 6/1/07, Julien Nioche <J.Nioche@...> wrote:
>
> GATE (http://gate.ac.uk) is open source and lets you create annotations
> manually. The interface is tightly bound to the GATE API so porting it to
> UIMA would be a relatively costly operation. It would certainly be easier to
> write a new annotation tool from scratch. However GATE could be used in the
> meantime to annotate documents and save them as XML, which could be loaded
> by UIMA at a later stage.
>
> There is also a UIMA plugin for GATE which lets you call UIMA processes
> from GATE and vice versa; but I am not sure it works with the Apache version
> of UIMA. That could help using existing UIMA resources for pre-annotating
> documents.
>
> Hope that helps
>
> Julien
>
>  Thilo Goetz wrote:
>
>  BTW, I have recently hacked UIMA's CAS Visual Debugger for a
> colleague to allow creating manual annotations.  That was a
> one-off, though, and I haven't fed it back into the main code
> base.  If people are interested in that kind of
> functionality, let me know.  We wouldn't want to compete with
> a dedicated annotation tool, though.
>
>  I would like to second Andrew Borthwick's original request for a UIMA-savvy annotation tool.
>
> Adding it to a full-featured annotator would probably be great, but
> having an open source option would offer the most potential upside for
> UIMA. Alembic and its replacement Callisto are free, but not open
> source, so I believe MITRE would have to add support for UIMA themselves.
>
> Are there any open source annotators people would recommend for integrating with UIMA?
>
> -j
>
> --
> Joe Andrieu
> SwitchBook Software
> http://www.switchbook.com
> joe@...
> +1 (805) 705-8651
