Bart Mellebeek | 3 Nov 10:39

Question on Capabilities of AE descriptor

Hello,

I have a question on the exact role of the output types in the 
Capabilities of an AE descriptor that I couldn't find in the documentation.
A strange thing happens when I try to manipulate the descriptors of ex4/ 
of the tutorial in uimaj-examples. I am running 
ex4/MeetingDetectorTAE.xml with UIMA Document Analyzer. When I delete 
the output type RoomNumber in the Capabilities of 
ex2/RoomNumberAnnotator.xml and I run ex4/MeetingDetectorTAE.xml, the 
RoomNumber type is still visible in the analysis results. Likewise, when 
I delete the output types TimeAnnot and DateAnnot in the capabilities of 
ex3/TutorialDateTime.xml, these types are still visible in the analysis 
results. Only deleting the output type DateTimeAnnot in the capabilities 
of ex3/TutorialDateTime.xml seems to have an impact on the analysis results.

Why is it that deleting some output types has no impact on the analysis 
results, while deleting others does? Aren't all 
output types supposed to have this impact?

Any help appreciated.
Thanks,

Bart

Peter Klügl | 4 Nov 14:55

CAS Viewer/Editor for HTML

Hello,

You may remember that one of my students, Marco Nehmeier,
developed a CAS viewer and editor especially for HTML documents. The
work is now complete; it was done as part of a practical course, more precisely
as a "Studienarbeit". The report is written in German, and I will send
you the PDF on request. If there is more interest in this work, I
may be able to provide an English version of the handbook part of the report.

The ability to view or render annotations directly in interpreted
HTML is extremely important for us, since in most cases we process
converted word processing documents. The work started
before the discussion about donating the UIMA CAS viewer code began.

If you are interested in what we are doing here in Würzburg and how we are
using UIMA, just ask me, or take a look at a short paper of mine:
Integrating the Rule-Based IE Component TextMarker into UIMA (LWA08 FGIR)

or, in the near future, at the TextMarker SourceForge project itself (the
homepage will follow this month, the source code early next year).

best regards

Peter

PS: Sorry for possible multiple posts.

-- 
Peter Klügl
University of Würzburg

Peter Klügl | 4 Nov 13:11

CAS Viewer/Editor for HTML

Hello,

You may remember that one of my students, Marco Nehmeier,
developed a CAS viewer and editor especially for HTML documents. The
work is now complete; it was done as part of a practical course, more precisely
as a "Studienarbeit". The report is written in German and has been uploaded
(temporarily) to
http://ki.informatik.uni-wuerzburg.de/~pkluegl/misc/CEV-Ausarbeitung. If
there is more interest in this work, I may be able to provide an English
version of the handbook part of the report.

The ability to view or render annotations directly in interpreted
HTML is extremely important for us, since in most cases we process
converted word processing documents. The work started
before the discussion about donating the UIMA CAS viewer code began.

If you are interested in what we are doing here in Würzburg and how we are
using UIMA, just ask me, or take a look at a short paper of mine:
Integrating the Rule-Based IE Component TextMarker into UIMA (LWA08 FGIR)

or, in the near future, at the project itself (the homepage will follow this
month, the source code early next year):
https://sourceforge.net/projects/textmarker/

best regards

Peter

-- 
Peter Klügl

Marshall Schor | 4 Nov 17:37

Re: Question on Capabilities of AE descriptor

Bart Mellebeek wrote:

> > Hello,
> >
> > I have a question on the exact role of the output types in the
> > Capabilities of an AE descriptor that I couldn't find in the
> > documentation.
> > A strange thing happens when I try to manipulate the descriptors of
> > ex4/ of the tutorial in uimaj-examples. I am running
> > ex4/MeetingDetectorTAE.xml with UIMA Document Analyzer. When I delete
> > the output type RoomNumber in the Capabilities of
> > ex2/RoomNumberAnnotator.xml and I run ex4/MeetingDetectorTAE.xml, the
> > RoomNumber type is still visible in the analysis results.
>   

I think this is because ex4/MeetingDetectorTAE.xml itself declares it
outputs the RoomNumber type.  The DocumentAnalyzer is just a sample application
that shows *selected* feature structure types - selected by looking at the
output capabilities of the top-most analysis engine (in the case of an aggregate
having "nested" components - such as you have in your example).  This means that
the DocumentAnalyzer may not be showing all the feature structures in the CAS,
but that doesn't mean that those feature structures are not there.

See the code in uimaj-tools project: in
src/main/org/apache/uima/tools/docanalyzer/DocumentAnalyzer.java, lines 1185 - 1207.
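
The filtering described above can be illustrated with a small, self-contained
sketch. It does not use the UIMA API and is not the DocumentAnalyzer code; the
classes SketchAnnotation and CapabilityViewerSketch and the sample texts are
hypothetical and exist only for illustration. It assumes the viewer shows a
type only if the top-level descriptor lists it as an output, while the CAS
contents themselves stay unchanged.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a feature structure in the CAS (not a UIMA class).
final class SketchAnnotation {
    final String typeName;
    final String coveredText;

    SketchAnnotation(String typeName, String coveredText) {
        this.typeName = typeName;
        this.coveredText = coveredText;
    }
}

// Hypothetical model of a viewer that filters on top-level output capabilities.
final class CapabilityViewerSketch {

    // Annotations the delegate annotators added to the CAS (made-up sample data).
    static List<SketchAnnotation> sampleCas() {
        List<SketchAnnotation> cas = new ArrayList<>();
        cas.add(new SketchAnnotation("RoomNumber", "GN-K35"));
        cas.add(new SketchAnnotation("DateAnnot", "3 Nov"));
        cas.add(new SketchAnnotation("TimeAnnot", "10:39"));
        return cas;
    }

    // Returns the type names the viewer would display: types that occur in the
    // CAS and are also declared as outputs of the top-level descriptor.
    static Set<String> visibleTypes(List<SketchAnnotation> cas, Set<String> topLevelOutputs) {
        Set<String> visible = new LinkedHashSet<>();
        for (SketchAnnotation annotation : cas) {
            if (topLevelOutputs.contains(annotation.typeName)) {
                visible.add(annotation.typeName);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        List<SketchAnnotation> cas = sampleCas();
        Set<String> aggregateOutputs =
                new HashSet<>(Arrays.asList("RoomNumber", "DateAnnot", "TimeAnnot"));

        // With all three outputs declared, all three types are displayed.
        System.out.println(visibleTypes(cas, aggregateOutputs));

        // Removing a type from the top-level outputs hides it in the viewer,
        // but the annotation is still in the CAS.
        aggregateOutputs.remove("RoomNumber");
        System.out.println(visibleTypes(cas, aggregateOutputs));
        System.out.println("annotations in CAS: " + cas.size());
    }
}
```

In this model, editing the capabilities of a delegate descriptor changes
nothing, because only the top-level output list is consulted when deciding what
to display.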

> > Likewise, when I delete the output types TimeAnnot and DateAnnot in
> > the capabilities of ex3/TutorialDateTime.xml, these types are still
> > visible in the analysis results. 
>   

Bart Mellebeek | 4 Nov 20:14

Re: Question on Capabilities of AE descriptor

Marshall Schor wrote:
> Bart Mellebeek wrote:
>
>   
>>> Hello,
>>>
>>> I have a question on the exact role of the output types in the
>>> Capabilities of an AE descriptor that I couldn't find in the
>>> documentation.
>>> A strange thing happens when I try to manipulate the descriptors of
>>> ex4/ of the tutorial in uimaj-examples. I am running
>>> ex4/MeetingDetectorTAE.xml with UIMA Document Analyzer. When I delete
>>> the output type RoomNumber in the Capabilities of
>>> ex2/RoomNumberAnnotator.xml and I run ex4/MeetingDetectorTAE.xml, the
>>> RoomNumber type is still visible in the analysis results.
>>>       
>>   
>>     
>
> I think this is because ex4/MeetingDetectorTAE.xml itself declares it
> outputs the RoomNumber type.  The DocumentAnalyzer is just a sample application
> that shows *selected* feature structure types - selected by looking at the
> output capabilities of the top-most analysis engine (in the case of an aggregate
> having "nested" components - such as you have in your example).  This means that
> the DocumentAnalyzer may not be showing all the feature structures in the CAS,
> but that doesn't mean that those feature structures are not there.
>
> See the code in uimaj-tools project: in
> src/main/org/apache/uima/tools/docanalyzer/DocumentAnalyzer.java, lines 1185 - 1207.
>

Eddie Epstein | 5 Nov 00:04

Re: Question on Capabilities of AE descriptor

UIMA component descriptors play several roles: one is to enable the
framework to deploy and run an analytic; another is to help describe
the functionality of the analytic. The simple version of the story is
that declaring Sofa capabilities will change how an AE is called, but
otherwise capabilities are mostly descriptive.

There are some other potential uses of capability declarations. For
distributed UIMA deployments where the CAS must be serialized from one
machine to another, the input capabilities may be used to limit the
amount of the CAS transmitted. Another use is when searching a
component repository for analytics with particular functionality.
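
As a rough illustration of the serialization point, the following
self-contained sketch selects only the items whose type a remote component
declares as input. It is not UIMA code: TypedItem,
InputCapabilityFilterSketch, the "Token" type, and the sample payloads are
hypothetical, and whether a given deployment filters this way is an
assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a typed feature structure (not a UIMA class).
final class TypedItem {
    final String typeName;
    final String payload;

    TypedItem(String typeName, String payload) {
        this.typeName = typeName;
        this.payload = payload;
    }
}

// Hypothetical model of trimming the CAS before sending it to a remote component.
final class InputCapabilityFilterSketch {

    // Returns only the items whose type is among the declared input types.
    static List<TypedItem> selectForTransmission(List<TypedItem> cas, Set<String> declaredInputs) {
        List<TypedItem> toSend = new ArrayList<>();
        for (TypedItem item : cas) {
            if (declaredInputs.contains(item.typeName)) {
                toSend.add(item);
            }
        }
        return toSend;
    }

    public static void main(String[] args) {
        // Made-up CAS contents; "Token" is a type the remote component does not need.
        List<TypedItem> cas = Arrays.asList(
                new TypedItem("RoomNumber", "GN-K35"),
                new TypedItem("DateAnnot", "3 Nov"),
                new TypedItem("Token", "meeting"));
        Set<String> declaredInputs = new HashSet<>(Arrays.asList("RoomNumber", "DateAnnot"));

        // Only the RoomNumber and DateAnnot items are selected for sending.
        for (TypedItem item : selectForTransmission(cas, declaredInputs)) {
            System.out.println(item.typeName + ": " + item.payload);
        }
    }
}
```

Under this assumption, an input type missing from the declaration would simply
not reach the remote component, which is one case where a capability
declaration is more than descriptive.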

Hope this helps,
Eddie

>
> Thanks for your input.
>
> I asked this question because I am trying to build a UIMA pipeline and the
> role of the AE capabilities in the intermediate annotators is not entirely
> clear to me. I was under the impression that for each annotator in the
> pipeline, the capabilities specify which are its input/output types.
> However, apparently once an annotation is inside the CAS, the specifications
> in the capabilities of the AEs do not seem to be relevant anymore.
>
> For example, take the aggregate ex4/MeetingDetectorTAE.xml.
>  MeetingAnnotator.java uses the types RoomNumber, DateAnnot and TimeAnnot to
> detect meetings. What surprises me is that deleting the output type
> RoomNumber in ex2/RoomNumberAnnotator.xml and deleting all the input types
> in ex4/MeetingAnnotator.xml (RoomNumber, DateAnnot and TimeAnnot) has no

Marshall Schor | 6 Nov 04:18

Re: Question on Capabilities of AE descriptor


Bart Mellebeek wrote:
> Marshall Schor wrote:
>> Bart Mellebeek wrote:
>>
>>  
>>>> Hello,
>>>>
>>>> I have a question on the exact role of the output types in the
>>>> Capabilities of an AE descriptor that I couldn't find in the
>>>> documentation.
>>>> A strange thing happens when I try to manipulate the descriptors of
>>>> ex4/ of the tutorial in uimaj-examples. I am running
>>>> ex4/MeetingDetectorTAE.xml with UIMA Document Analyzer. When I delete
>>>> the output type RoomNumber in the Capabilities of
>>>> ex2/RoomNumberAnnotator.xml and I run ex4/MeetingDetectorTAE.xml, the
>>>> RoomNumber type is still visible in the analysis results.
>>>>       
>>>       
>>
>> I think this is because ex4/MeetingDetectorTAE.xml itself declares it
>> outputs the RoomNumber type.  The DocumentAnalyzer is just a sample
>> application
>> that shows *selected* feature structure types - selected by looking
>> at the
>> output capabilities of the top-most analysis engine (in the case of
>> an aggregate
>> having "nested" components - such as you have in your example).  This
>> means that
>> the DocumentAnalyzer may not be showing all the feature structures in

Aaron Kaplan | 6 Nov 18:41

Re: Imports with '_pear.xml' in aggregate prevent annotators to work on the right Sofa

Eddie,

(Baptiste's problem and mine are the same--we're working together.)

I tried your first example and I can't even get that to work using the 
pear descriptor and the trunk version of uima-core.  No need to go to 
the second, more complicated version with two annotators and three 
views.  The aggregate descriptor I wrote when following your test case 
instructions is below.  Do you see anything wrong?

I put a breakpoint at the line in PearAnalysisEngineWrapper.java  where 
produceAnalysisEngine() is called.  The breakpoint is reached twice, and 
the argument clonedAdditionalParameters has the following values:

First time:

{PARAM_AGGREGATE_ANALYSIS_ENGINE_NAME=Aggregate,

CONFIG_PARAM_SETTINGS=org.apache.uima.resource.metadata.impl.ConfigurationParameterSettings_impl: 

parameterSettings = Array{}

settingsForGroups = {}
}

Second time:

{PARAM_AGGREGATE_ANALYSIS_ENGINE_NAME=Aggregate, 
RESOURCE_MANAGER=org.apache.uima.resource.impl.ResourceManager_impl@a3ce3f,


Howard G. | 7 Nov 18:41

OpenCalais Annotator -- Contact developer?

I am trying to use the OpenCalais Annotator from the UIMA Sandbox and it is not behaving as I expected.  Is
there any way I can contact the developer(s)?
I have not seen any activity for it since July; is it still being worked on?

A UIMA user

