[jira] Commented: (UIMA-1193) Tagger throws occasional NPE


    [
https://issues.apache.org/jira/browse/UIMA-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635900#action_12635900
] 

Eugenie Giesbrecht commented on UIMA-1193:
------------------------------------------

Could you please send me an example of text with which this error happens, so that I have a starting point?

> Tagger throws occasional NPE
> ----------------------------
>
>                 Key: UIMA-1193
>                 URL: https://issues.apache.org/jira/browse/UIMA-1193
>             Project: UIMA
>          Issue Type: Bug
>          Components: Sandbox-Tagger
>    Affects Versions: 2.2.2
>            Reporter: Thilo Goetz
>            Assignee: Thilo Goetz
>             Fix For: 2.3S
>
>
> Tagger throws occasional NPE in Viterbi estimation.

--

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Thilo Goetz (JIRA) | 1 Oct 10:39

[jira] Commented: (UIMA-1193) Tagger throws occasional NPE


    [
https://issues.apache.org/jira/browse/UIMA-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635975#action_12635975
] 

Thilo Goetz commented on UIMA-1193:
-----------------------------------

That's part of the problem.  We see the error intermittently in an interactive environment, and it's not
easily possible to figure out what text it fails on.  Also, at the place where the NPE is thrown, the document
text is not available.  So I can't easily dump it to disk either.

I've prepared another patch that we're currently testing.  If that doesn't work either, I'll create a debug
version that can dump the text and we can create a test case that way.

> Tagger throws occasional NPE
> ----------------------------
>
>                 Key: UIMA-1193
>                 URL: https://issues.apache.org/jira/browse/UIMA-1193
>             Project: UIMA
>          Issue Type: Bug
>          Components: Sandbox-Tagger
>    Affects Versions: 2.2.2
>            Reporter: Thilo Goetz
>            Assignee: Thilo Goetz
>             Fix For: 2.3S
>
>
> Tagger throws occasional NPE in Viterbi estimation.


[jira] Commented: (UIMA-1140) Embedded broker should be eliminated


    [
https://issues.apache.org/jira/browse/UIMA-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12636053#action_12636053
] 

Jerry Cwiklik commented on UIMA-1140:
-------------------------------------

Found another problem with dd2spring. Inner aggregate bean with remote delegate is missing an
OutputChannel Bean. This bean is required to send messages to a remote delegate.

   <!--======================================-->
   <!-- Async Aggregate: MeetingDetector_1.2 -->
   <!--======================================-->
   <bean id="asAggr_ctlr_MeetingDetector_1.2"
         class="org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl"
         init-method="initialize">
      <constructor-arg index="0" ref="asAggr_ctlr_TopLevelTaeQueue_1"/>
      <constructor-arg index="1" value="MeetingDetector"/>
      <constructor-arg index="2"
                       value="file:/C:/dev/workspace/apache-uima/uimaj-as-activemq/src/test/resources/descriptors/analysis_engine/../tutorial/ex4/MeetingDetectorAndNoOpTAE.xml"/>
      <constructor-arg index="3" ref="casManager"/>
      <constructor-arg index="4" ref="inProcessCache"/>
      <constructor-arg index="5" ref="delegate_map_MeetingDetector_1.2"/>
      <property name="serviceEndpointName" value="inQ_MeetingDetector_1.2"/>
      <property name="controllerBeanName" value="asAggr_ctlr_MeetingDetector_1.2"/>
      <property name="errorHandlerChain" ref="err_hdlr_chn_MeetingDetector_1.2"/>
      <property name="flowControllerDescriptor"
                value="*importByName:org.apache.uima.flow.FixedFlowController"/>
   </bean>

Marshall Schor | 1 Oct 17:23

A possible core speedup

I'm not sure this would speed anything up - but just in case:

While doing some profiling, it appears that some time is spent (during
deserializing) in CasImpl's ll_getTypeClass.  This code is a big switch
statement written as a series of "if" statements, testing the current
typeSystemImpl instance's set of values for primitive type codes.

Switch can't be used here because the "case" values must be constants. 
I think all the built-in type codes are constant - but the
implementation seems to be allowing them to change (they're referenced
from the TypeSystemImpl instance, as non-final values). 

Using constants would probably speed up this function quite a bit.  Any
reasons anyone sees we couldn't/shouldn't change this to use constants?

-Marshall
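
A minimal sketch of the difference being discussed, assuming hypothetical names and made-up numeric codes (none of this is the actual CASImpl or TypeSystemImpl code). It contrasts a chain of "if" tests against non-final instance fields with a switch over compile-time constants:

```java
// Hypothetical illustration only: class names, field names, and numeric
// codes are invented for this sketch and are not taken from UIMA.
final class TypeClassLookupSketch {

    // Per-instance, non-final codes. Because they are not compile-time
    // constants, they cannot be used as "case" labels.
    static final class InstanceCodes {
        int intTypeCode = 1;
        int floatTypeCode = 2;
        int stringTypeCode = 3;
    }

    // Compile-time constants: legal "case" labels.
    static final int INT_TYPE_CODE = 1;
    static final int FLOAT_TYPE_CODE = 2;
    static final int STRING_TYPE_CODE = 3;

    // Result values (also invented).
    static final int CLASS_INT = 10;
    static final int CLASS_FLOAT = 11;
    static final int CLASS_STRING = 12;
    static final int CLASS_FS = 13;

    // Style described in the message: sequential comparisons against
    // values read from an instance.
    static int typeClassWithIfChain(InstanceCodes ts, int typeCode) {
        if (typeCode == ts.intTypeCode) return CLASS_INT;
        if (typeCode == ts.floatTypeCode) return CLASS_FLOAT;
        if (typeCode == ts.stringTypeCode) return CLASS_STRING;
        return CLASS_FS;
    }

    // Proposed style: a switch over constants.
    static int typeClassWithSwitch(int typeCode) {
        switch (typeCode) {
            case INT_TYPE_CODE:    return CLASS_INT;
            case FLOAT_TYPE_CODE:  return CLASS_FLOAT;
            case STRING_TYPE_CODE: return CLASS_STRING;
            default:               return CLASS_FS;
        }
    }
}
```

The two methods agree only as long as the constants match the codes the instance actually holds, so any such change would need to verify that correspondence. Whether the switch is measurably faster would have to be profiled.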

Thilo Goetz | 1 Oct 18:23

Re: A possible core speedup

I was half way through writing a reply when my email client
crashed.  Oh well.  Here's the abbreviated version.  There's
nothing wrong with this in principle, just check the comments
on CASImpl.setupTSDefault().  Make sure that the constants
actually correspond to the order in which types are created.

Also check the TYPE_CLASS constants.  They may actually have
the correct values for the built-in types, not sure.

I doubt you'll get a big speed-up out of this, though.  Let
us know what you find.

--Thilo

Marshall Schor wrote:
> I'm not sure this would speed anything up - but just in case:
> 
> While doing some profiling, it appears that some time is spent (during
> deserializing) in CasImpl's ll_getTypeClass.  This code is a big switch
> statement written as a series of "if" statements, testing the current
> typeSystemImpl instance's set of values for primitive type codes.
> 
> Switch can't be used here because the "case" values must be constants. 
> I think all the built-in type codes are constant - but the
> implementation seems to be allowing them to change (they're referenced
> from the TypeSystemImpl instance, as non-final values). 
> 
> Using constants would probably speed up this function quite a bit.  Any
> reasons anyone sees we couldn't/shouldn't change this to use constants?
> 

Marshall Schor | 1 Oct 21:21

Another interesting potential speedup

Profiling certainly shows unusual places you'd never think to look :-)

This may be a bit of an anomaly - but we have a scaleout test for
uima-as, sending large numbers of CASes over the wire (but the test is
running in multiple JVMs on one machine - so there's no network
delays).  We're running this with essentially empty CASes - just to see
where other overhead is.

We expected that things like deserialization would not show up - because
the CASes were empty.  However, deserialization was the biggest time
consumer.  Looking into this, it turns out that (in our particular case)
90% of the time in deserialization was due to creating a new XML Reader
(the call: XMLReaderFactory.createXMLReader).  A quick search on the
internet turned up this link:
http://www.ibm.com/developerworks/xml/library/x-perfap2.html which
suggested this could indeed be a bottleneck, which could be avoided by
reusing the same XMLReader object, instead of throwing it away and
getting a new one on every call.

This would take some work (pooling, etc.) to make things thread-safe,
but might be a good thing to do -- unless small but non-empty CASes turn
out to bottleneck in some other way that swamps this measurement.

This only applies to transports that use XML-style of
serialization/deserialization, of course.

-Marshall
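
A rough sketch of the reuse idea, under stated assumptions: it keeps one XMLReader per thread in a ThreadLocal rather than a shared pool, and it obtains the reader through the JAXP SAXParserFactory instead of XMLReaderFactory.createXMLReader. The class and method names are hypothetical and this is not the uima-as deserialization code.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: one XMLReader per thread, created lazily and reused.
final class ReusableXmlReaderSketch {

    private static final SAXParserFactory FACTORY = SAXParserFactory.newInstance();

    static {
        FACTORY.setNamespaceAware(true);
    }

    // Each thread creates its reader once; later calls on that thread
    // get the same instance back, so no locking is needed while parsing.
    private static final ThreadLocal<XMLReader> READER = ThreadLocal.withInitial(() -> {
        try {
            // The factory itself is not guaranteed to be thread-safe.
            synchronized (FACTORY) {
                return FACTORY.newSAXParser().getXMLReader();
            }
        } catch (Exception e) {
            throw new IllegalStateException("cannot create XMLReader", e);
        }
    });

    static XMLReader reader() {
        return READER.get();
    }

    // Example use: count start-element events with the reused reader.
    static int countElements(String xml) {
        final int[] count = {0};
        XMLReader r = reader();
        // A fresh handler per parse; only the reader is reused.
        r.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                count[0]++;
            }
        });
        try {
            r.parse(new InputSource(new StringReader(xml)));
        } catch (IOException | SAXException e) {
            throw new IllegalStateException("parse failed", e);
        }
        return count[0];
    }
}
```

A per-thread reader avoids pooling logic at the cost of one reader per live thread. An explicit pool may fit better where threads are short-lived.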


[jira] Commented: (UIMA-1140) Embedded broker should be eliminated


    [
https://issues.apache.org/jira/browse/UIMA-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12636142#action_12636142
] 

Jerry Cwiklik commented on UIMA-1140:
-------------------------------------

Another problem with dd2spring. A message handler chain for an aggregate does not contain a handler for
processing requests coming from a remote delegate Cas Multiplier. When a CM receives an input CAS, it
generates new CASes from it. These new CASes are sent to the client (aggregate) in a *Request* message. An
InputChannel for handling messages from a remote Cas Multiplier must include the following handlers:

1) ProcessResponse Handler - processing Process reply msgs from the remote ( Needed for both Remote CM and
Non-CM). This handler handles Input CASes returned from the delegate
2) ProcessRequest Handler - processing request msgs from the remote (Only for Remote Cas Multipliers).
This handler handles *New* CASes sent by the remote delegate CM.
3) GetMetaResponse Handler - processing GetMeta reply msgs from the remote ( Needed for both Remote CM and Non-CM)

Currently only 1 and 3 are added by dd2spring. Need #2 as well.

> Embedded broker should be eliminated
> ------------------------------------
>
>                 Key: UIMA-1140
>                 URL: https://issues.apache.org/jira/browse/UIMA-1140
>             Project: UIMA
>          Issue Type: Improvement
>          Components: Async Scaleout
>            Reporter: Eddie Epstein


[jira] Created: (UIMA-1194) JMX stats for UIMA AS seem inconsistent

JMX stats for UIMA AS seem inconsistent 
----------------------------------------

                 Key: UIMA-1194
                 URL: https://issues.apache.org/jira/browse/UIMA-1194
             Project: UIMA
          Issue Type: Bug
          Components: Async Scaleout
            Reporter: Jerry Cwiklik

The aggregate's JMX stats for remote delegate seem different from those shown by the delegate's JMX stats.
Specifically, analysis times are different. These numbers should be the same in both. It appears that the
numbers shown in the delegate's stats are always larger. 


Marshall Schor | 2 Oct 06:29

source of some unusual profiling measurements

While measuring / profiling uima-as - I've noticed some unusually high
time being spent in methods that are practically empty - e.g. "delegate"
method of "HandlerBase" class.

It turns out that the uima-as code has lots of tracing to the log,
normally disabled.  We had previously discovered that it paid to avoid
calling "logrb" with all of its arguments, if logging wasn't enabled for
that level, so in the base uima code all the logger calls are wrapped
with an if statement testing first if that logging level is enabled,
thus avoiding computing the arguments of logrb. 

This wrapping is missing in the newer uima-as code - I'm thinking that
this might account for the unusually high % of time being observed.  I
haven't been able to test this though - my profiling experiments started
to hang after I made this change.

-Marshall
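
A small sketch of the guard pattern described above, assuming java.util.logging as a stand-in for the UIMA logger and a hypothetical expensiveDescription helper to represent costly argument computation. The counter exists only to make the skipped work visible.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical illustration; not taken from the uima or uima-as code base.
final class GuardedLoggingSketch {

    private static final Logger LOG =
            Logger.getLogger(GuardedLoggingSketch.class.getName());

    static {
        // Level pinned for the demo so that FINEST tracing is disabled.
        LOG.setLevel(Level.INFO);
    }

    // Counts how often the "expensive" argument is actually computed.
    static int describeCalls = 0;

    // Stand-in for costly argument construction (string building, lookups).
    static String expensiveDescription(Object o) {
        describeCalls++;
        return "state=" + o;
    }

    // Unguarded: the argument is computed even though the record is dropped.
    static void unguarded(Object msg) {
        LOG.log(Level.FINEST, "handling {0}", expensiveDescription(msg));
    }

    // Guarded: the level test comes first, so the argument is never
    // computed while FINEST is disabled.
    static void guarded(Object msg) {
        if (LOG.isLoggable(Level.FINEST)) {
            LOG.log(Level.FINEST, "handling {0}", expensiveDescription(msg));
        }
    }
}
```

With tracing disabled, unguarded still pays for expensiveDescription on every call, while guarded only pays for the level check.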

Aaron Kaplan (JIRA) | 2 Oct 15:15

[jira] Created: (UIMA-1195) ConcurrentModificationException in CasCopier.copyCas()

ConcurrentModificationException in CasCopier.copyCas()
------------------------------------------------------

                 Key: UIMA-1195
                 URL: https://issues.apache.org/jira/browse/UIMA-1195
             Project: UIMA
          Issue Type: Bug
          Components: Core Java Framework
            Reporter: Aaron Kaplan
            Priority: Minor

I get a ConcurrentModificationException in the last line of the following snippet:

public class MovingImageMerger extends JCasMultiplier_ImplBase{
	public void process(JCas jcas) throws AnalysisEngineProcessException {
			if (resultJCas != null) throw new AnalysisEngineProcessException();
			resultJCas = getEmptyJCas();
			CasCopier.copyCas(jcas.getCas(), resultJCas.getCas(), true);

Here is the stack trace:

Caused by: java.util.ConcurrentModificationException
	at org.apache.uima.cas.impl.FSIndexRepositoryImpl$PointerIterator.checkConcurrentModification(FSIndexRepositoryImpl.java:264)
	at org.apache.uima.cas.impl.FSIndexRepositoryImpl$PointerIterator.checkConcurrentModificationAll(FSIndexRepositoryImpl.java:275)
	at org.apache.uima.cas.impl.FSIndexRepositoryImpl$PointerIterator.moveToNext(FSIndexRepositoryImpl.java:311)
	at org.apache.uima.cas.impl.FSIndexRepositoryImpl$PointerIterator.inc(FSIndexRepositoryImpl.java:541)
	at org.apache.uima.cas.impl.FSIteratorWrapper.moveToNext(FSIteratorWrapper.java:67)
	at org.apache.uima.cas.impl.FSIteratorImplBase.next(FSIteratorImplBase.java:48)
	at org.apache.uima.util.CasCopier.copyCasView(CasCopier.java:140)
	at org.apache.uima.util.CasCopier.copyCas(CasCopier.java:101)

