
[jira] [Work started] (UIMA-4119) jcasgen-maven-plugin generates no files on Windows


     [
https://issues.apache.org/jira/browse/UIMA-4119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on UIMA-4119 started by Richard Eckart de Castilho.
--------------------------------------------------------
> jcasgen-maven-plugin generates no files on Windows
> --------------------------------------------------
>
>                 Key: UIMA-4119
>                 URL: https://issues.apache.org/jira/browse/UIMA-4119
>             Project: UIMA
>          Issue Type: Bug
>          Components: jcasgen-maven-plugin
>    Affects Versions: 2.6.0SDK
>            Reporter: Richard Eckart de Castilho
>            Assignee: Richard Eckart de Castilho
>             Fix For: 2.7.0SDK
>
>
> jcasgen-maven-plugin generates a temporary type system file importing all type system files for which
> JCas classes should be generated. On Windows, this file contains invalid imports, e.g.:
> {noformat}<?xml version="1.0" encoding="UTF-8"?>
> <typeSystemDescription xmlns="http://uima.apache.org/resourceSpecifier">
>     <imports>
>         <import location="file:/C:/de.tudarmstadt.ukp.dkpro.core-asl/de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/desc/type/Morpheme.xml"/>
>         <import location="file:/C:/de.tudarmstadt.ukp.dkpro.core-asl/de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/desc/type/POS.xml"/>
>     </imports>
> </typeSystemDescription>
> {noformat}


[jira] [Comment Edited] (UIMA-4119) jcasgen-maven-plugin generates no files on Windows


    [
https://issues.apache.org/jira/browse/UIMA-4119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222433#comment-14222433
] 

Richard Eckart de Castilho edited comment on UIMA-4119 at 11/23/14 5:51 PM:
----------------------------------------------------------------------------

On Windows, the default path representation and the URI path representation differ: "C:\..." vs.
"/C:/...". For this reason, Jg.isOutOfScope() fails to detect that types are within scope on Windows, and
nothing is generated.

That doesn't explain why the user who originally reported this problem had strange imports in the temporary
type system descriptor, but it was the reason why type generation failed when I tried to reproduce it.
Let's see if that also fixes the issue for the user.
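
The mismatch described above can be illustrated with a small sketch. This is not the actual Jg.isOutOfScope() code or its fix; the class ScopeCheckSketch, its methods, and the example paths are hypothetical. It assumes only what the comment states: one side of the comparison is a URI-style path ("/C:/...") and the other a native Windows path ("C:\..."), so a plain prefix test fails until both are brought to a common form.

```java
/** Hypothetical illustration; not the actual jcasgen code. */
final class ScopeCheckSketch {

    /** Brings "C:\a\b" and "/C:/a/b" to the common form "C:/a/b". */
    static String normalize(String path) {
        String p = path.replace('\\', '/');
        // A URI path for a Windows file carries a slash before the drive letter.
        if (p.matches("/[A-Za-z]:.*")) {
            p = p.substring(1);
        }
        return p;
    }

    /** True if candidate lies below baseDir once both are normalized. */
    static boolean isWithin(String candidate, String baseDir) {
        String c = normalize(candidate);
        String b = normalize(baseDir);
        if (!b.endsWith("/")) {
            b = b + "/";
        }
        return c.startsWith(b);
    }

    public static void main(String[] args) {
        // Made-up paths, for illustration only.
        String uriPath = "/C:/work/project/src/main/resources/desc/type/POS.xml";
        String baseDir = "C:\\work\\project";
        System.out.println(uriPath.startsWith(baseDir)); // naive comparison of the two forms
        System.out.println(isWithin(uriPath, baseDir));  // comparison after normalization
    }
}
```

The sketch works on plain strings so that it behaves the same on any platform; it ignores drive-letter case and relative paths.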

was (Author: rec):
On Windows, the default path representation and the URI path representation differ: "C:\..." vs.
"/C:/...". For this reason, Jg.isOutOfScope() fails to detect that types are within scope on Windows and
generates nothing.

> jcasgen-maven-plugin generates no files on Windows
> --------------------------------------------------
>
>                 Key: UIMA-4119
>                 URL: https://issues.apache.org/jira/browse/UIMA-4119
>             Project: UIMA
>          Issue Type: Bug
>          Components: jcasgen-maven-plugin
>    Affects Versions: 2.6.0SDK


[jira] [Commented] (UIMA-4119) jcasgen-maven-plugin generates no files on Windows


    [
https://issues.apache.org/jira/browse/UIMA-4119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222433#comment-14222433
] 

Richard Eckart de Castilho commented on UIMA-4119:
--------------------------------------------------

On Windows, the default path representation and the URI path representation differ: "C:\..." vs.
"/C:/...". For this reason, Jg.isOutOfScope() fails to detect that types are within scope on Windows and
generates nothing.

> jcasgen-maven-plugin generates no files on Windows
> --------------------------------------------------
>
>                 Key: UIMA-4119
>                 URL: https://issues.apache.org/jira/browse/UIMA-4119
>             Project: UIMA
>          Issue Type: Bug
>          Components: jcasgen-maven-plugin
>    Affects Versions: 2.6.0SDK
>            Reporter: Richard Eckart de Castilho
>            Assignee: Richard Eckart de Castilho
>             Fix For: 2.7.0SDK
>
>
> jcasgen-maven-plugin generates a temporary type system file importing all type system files for which
> JCas classes should be generated. On Windows, this file contains invalid imports, e.g.:
> {noformat}<?xml version="1.0" encoding="UTF-8"?>
> <typeSystemDescription xmlns="http://uima.apache.org/resourceSpecifier">

Srinivas Yerram | 22 Nov 18:33 2014

UIMA framework annotators multiple languages support clarifications


Dear Sir / Madam,

My core use cases involve parsing email data that arrives in different templates and in different
languages. I need to extract useful information from it using UIMA annotators or other plug-in
components. Scalability and clustering are high priorities for my use case.

I would like clarification on the following points about the Apache UIMA framework:

Do UIMA annotators or other plug-in components support multiple languages (such as
English, French, Arabic, and Chinese) for parsing email content?

Can the Stanford NLP libraries be integrated as a plug-in for Apache UIMA framework components?

I would appreciate a quick response on this. Thanks

Regards,
Srinivas Yerram

[jira] [Created] (UIMA-4119) jcasgen-maven-plugin generates no files on Windows

Richard Eckart de Castilho created UIMA-4119:
------------------------------------------------

             Summary: jcasgen-maven-plugin generates no files on Windows
                 Key: UIMA-4119
                 URL: https://issues.apache.org/jira/browse/UIMA-4119
             Project: UIMA
          Issue Type: Bug
          Components: jcasgen-maven-plugin
    Affects Versions: 2.6.0SDK
            Reporter: Richard Eckart de Castilho
            Assignee: Richard Eckart de Castilho
             Fix For: 2.7.0SDK

jcasgen-maven-plugin generates a temporary type system file importing all type system files for which
JCas classes should be generated. On Windows, this file contains invalid imports, e.g.:

{noformat}<?xml version="1.0" encoding="UTF-8"?>
<typeSystemDescription xmlns="http://uima.apache.org/resourceSpecifier">
    <imports>
        <import location="file:/C:/de.tudarmstadt.ukp.dkpro.core-asl/de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/desc/type/Morpheme.xml"/>
        <import location="file:/C:/de.tudarmstadt.ukp.dkpro.core-asl/de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/desc/type/POS.xml"/>
    </imports>
</typeSystemDescription>
{noformat}

The paths in this file appear to be absolute paths starting with C:\, but actually they are relative to the
Eclipse workspace root.


Marshall Schor (JIRA) | 21 Nov 19:35 2014

[jira] [Resolved] (UIMA-4003) add alternative int - int maps, and int sets for better space/time performance


     [
https://issues.apache.org/jira/browse/UIMA-4003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marshall Schor resolved UIMA-4003.
----------------------------------
    Resolution: Fixed

Switched ListUtils to use PositiveIntSet

> add alternative int - int maps, and int sets for better space/time performance
> ------------------------------------------------------------------------------
>
>                 Key: UIMA-4003
>                 URL: https://issues.apache.org/jira/browse/UIMA-4003
>             Project: UIMA
>          Issue Type: Improvement
>          Components: Core Java Framework
>    Affects Versions: 2.6.0SDK
>            Reporter: Marshall Schor
>            Assignee: Marshall Schor
>            Priority: Minor
>             Fix For: 2.7.0SDK
>
>
> int - int maps and int sets are implemented in UIMA either as red-black trees (size = 5 words (4 words for
> sets) + 1 bit per item; search time = log2(size) (binary search); insertion/removal can cause tree
> rebalancing), or as intVectors (like ArrayList<Integer>, but without wrapping ints as Integers).
> For int - int maps, add a hash version (loses key "ordering"), which takes 3 - 6 words per item (avg 4.5 words -
> slightly smaller), and has O(1) performance (based on existing JCasHashMap impl, but without

Eddie Epstein (JIRA) | 21 Nov 17:07 2014

[jira] [Created] (UIMA-4118) viaducc: add ability to suppress cancel_on_interrupt

Eddie Epstein created UIMA-4118:
-----------------------------------

             Summary: viaducc: add ability to suppress cancel_on_interrupt
                 Key: UIMA-4118
                 URL: https://issues.apache.org/jira/browse/UIMA-4118
             Project: UIMA
          Issue Type: Bug
          Components: DUCC
            Reporter: Eddie Epstein
            Assignee: Eddie Epstein
            Priority: Minor

By default, viaducc launches remote processes with --cancel_on_interrupt. Allow users to
disconnect from remote launches without terminating them.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Marshall Schor | 21 Nov 17:03 2014

Help test UIMA SDK 2.7.0

Hi,

If you can, please help test the new 2.7.0-SNAPSHOT, by building from trunk. 
When you do test runs, please add the JVM param

-Duima.check_invalid_fs_updates

to activate the new check to see if code is potentially corrupting any UIMA index.
This activates a new kind of check that examines every modification to a feature to
see whether that feature is being used as a key in a Sort or Set index while the
Feature Structure being modified is currently in one or more indices (Jira issue
UIMA-4059). Doing such a modification can cause the index to become corrupt.
(The correct way to do an update in this case is to
    first remove the Feature Structure from the indices,
    do the modification,
    and then add it back to the indices.)
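
The remove / modify / add-back order can be sketched with an analogy in plain Java. This does not use the UIMA API: a java.util.TreeSet stands in for a Sort index, and the class SortedIndexSketch, the type Item, and the field begin are made up for illustration. The same hazard applies there, because a sorted collection places an element according to its key at insertion time and does not notice later changes to that key.

```java
import java.util.Comparator;
import java.util.TreeSet;

/** Analogy only: a TreeSet standing in for a sorted index. */
final class SortedIndexSketch {

    /** Stand-in for a Feature Structure with one feature used as the sort key. */
    static final class Item {
        int begin;
        Item(int begin) { this.begin = begin; }
    }

    /** Stand-in for a sorted index keyed on "begin". */
    static TreeSet<Item> newIndex() {
        return new TreeSet<>(Comparator.comparingInt((Item i) -> i.begin));
    }

    /** Remove from the index, modify the key, then add back. */
    static void updateBegin(TreeSet<Item> index, Item item, int newBegin) {
        boolean wasIndexed = index.remove(item); // found via its old key
        item.begin = newBegin;                   // safe: item is no longer in the index
        if (wasIndexed) {
            index.add(item);                     // re-inserted at its new position
        }
    }

    public static void main(String[] args) {
        TreeSet<Item> index = newIndex();
        Item a = new Item(10);
        Item b = new Item(20);
        Item c = new Item(30);
        index.add(a);
        index.add(b);
        index.add(c);

        updateBegin(index, a, 40);
        for (Item i : index) {
            System.out.println(i.begin); // prints 20, 30, 40 on separate lines
        }
    }
}
```

Changing item.begin while the item is still in the set would leave it at its old position, so later lookups and removals for it may fail; that is the kind of corruption the new check is meant to catch.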

It is somewhat likely that old code may start failing, due to this test, and due
to the stricter enforcement of correct Sofa references for adding subtypes of
AnnotationBase to the right view.

Currently, the JUnit tests pass, but several of these needed fixing due to this
increased checking.

I plan to write a few more unit tests to test the new deserialization of delta
CASes updating existing FSs (to ensure indexed item updates don't corrupt indices).

-M

Marshall Schor (JIRA) | 21 Nov 16:15 2014

[jira] [Resolved] (UIMA-4117) Change JSON format defaulting to include 0 values


     [
https://issues.apache.org/jira/browse/UIMA-4117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marshall Schor resolved UIMA-4117.
----------------------------------
    Resolution: Fixed

> Change JSON format defaulting to include 0 values
> -------------------------------------------------
>
>                 Key: UIMA-4117
>                 URL: https://issues.apache.org/jira/browse/UIMA-4117
>             Project: UIMA
>          Issue Type: Improvement
>          Components: Core Java Framework
>            Reporter: Marshall Schor
>            Assignee: Marshall Schor
>             Fix For: 2.7.0SDK
>
>
> When serializing out CASes, begin features with value 0 are omitted by default.  This has confused users,
> who expected to see these.  Change the default for UIMA numeric feature types so that they are not omitted.
> Update the method name controlling this to setOmit0Values, since it only controls serialization of
> numeric 0 values.
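
The effect of such a switch can be sketched in plain Java. This is not the UIMA JSON serializer: the class ZeroOmissionSketch, the method toJson, and the flag omitZeroValues are hypothetical stand-ins, and keys are written without escaping to keep the example short.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration of omitting numeric 0 values during serialization. */
final class ZeroOmissionSketch {

    /** Writes integer features as a JSON object, optionally skipping those equal to 0. */
    static String toJson(Map<String, Integer> features, boolean omitZeroValues) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, Integer> e : features.entrySet()) {
            if (omitZeroValues && e.getValue() == 0) {
                continue; // the behavior that surprised users when it was the default
            }
            if (!first) {
                sb.append(',');
            }
            sb.append('"').append(e.getKey()).append("\":").append(e.getValue());
            first = false;
        }
        return sb.append('}').toString();
    }

    public static void main(String[] args) {
        Map<String, Integer> annotation = new LinkedHashMap<>();
        annotation.put("begin", 0);
        annotation.put("end", 5);
        System.out.println(toJson(annotation, true));  // prints {"end":5}
        System.out.println(toJson(annotation, false)); // prints {"begin":0,"end":5}
    }
}
```

With omission enabled, an annotation starting at offset 0 loses its begin feature in the output, which is the case the issue describes.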


Marshall Schor (JIRA) | 21 Nov 16:15 2014

[jira] [Resolved] (UIMA-4116) change format of JSON map names so they can be JavaScript identifiers


     [
https://issues.apache.org/jira/browse/UIMA-4116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marshall Schor resolved UIMA-4116.
----------------------------------
    Resolution: Fixed

> change format of JSON map names so they can be JavaScript identifiers
> ---------------------------------------------------------------------
>
>                 Key: UIMA-4116
>                 URL: https://issues.apache.org/jira/browse/UIMA-4116
>             Project: UIMA
>          Issue Type: Improvement
>          Components: Core Java Framework
>            Reporter: Marshall Schor
>            Assignee: Marshall Schor
>             Fix For: 2.7.0SDK
>
>
> In JavaScript, if you have a hash (map) object in a variable, e.g. "cas", you reference the value of a key
> using {{ cas["keyname"] }} or, better: {{ cas.keyname }}.  For the 2nd form to work, keyname must be a valid
> JavaScript name.  Names like @views are not valid.  So change these to forms such as _views, which are valid.
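
A rough check for whether a key can be used in the dotted form might look like the sketch below. It is written in Java and is only an approximation: the class JsNameSketch and the method isSimpleJsIdentifier are hypothetical, and the pattern covers ASCII names only, ignoring Unicode letters and JavaScript reserved words.

```java
import java.util.regex.Pattern;

/** Hypothetical helper approximating the JavaScript identifier rule for ASCII names. */
final class JsNameSketch {

    // First character: letter, underscore, or dollar sign; then the same plus digits.
    private static final Pattern SIMPLE_IDENTIFIER =
            Pattern.compile("[A-Za-z_$][A-Za-z0-9_$]*");

    static boolean isSimpleJsIdentifier(String name) {
        return SIMPLE_IDENTIFIER.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isSimpleJsIdentifier("@views")); // prints false
        System.out.println(isSimpleJsIdentifier("_views")); // prints true
    }
}
```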


Marshall Schor (JIRA) | 21 Nov 14:31 2014

[jira] [Updated] (UIMA-4117) Change JSON format defaulting to include 0 values


     [
https://issues.apache.org/jira/browse/UIMA-4117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marshall Schor updated UIMA-4117:
---------------------------------
    Description: When serializing out CASes, begin features with value 0 are omitted by default.  This has
confused users, who expected to see these.  Change the default for UIMA numeric feature types so that they
are not omitted.  Update the method name controlling this to setOmit0Values, since it only controls
serialization of numeric 0 values.  (was: When serializing out CASes, begin features with value 0 are
omitted by default.  This has confused users, who expected to see these.  Change the default for UIMA
numeric feature types to be to not omit them. )

> Change JSON format defaulting to include 0 values
> -------------------------------------------------
>
>                 Key: UIMA-4117
>                 URL: https://issues.apache.org/jira/browse/UIMA-4117
>             Project: UIMA
>          Issue Type: Improvement
>          Components: Core Java Framework
>            Reporter: Marshall Schor
>            Assignee: Marshall Schor
>             Fix For: 2.7.0SDK
>
>
> When serializing out CASes, begin features with value 0 are omitted by default.  This has confused users,
> who expected to see these.  Change the default for UIMA numeric feature types so that they are not omitted.
> Update the method name controlling this to setOmit0Values, since it only controls serialization of
> numeric 0 values.