Rohini Palaniswamy (JIRA) | 31 May 21:52 2016

[Commented] (PIG-4908) JythonFunction refers to Oozie launcher script absolute path


    [
https://issues.apache.org/jira/browse/PIG-4908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15308472#comment-15308472
] 

Rohini Palaniswamy commented on PIG-4908:
-----------------------------------------

bq. Why path contain . make a difference.
  A path containing . does not work for files inside a jar that are looked up with getResourceAsStream.
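
As a rough illustration of the point above: archive entries are stored under literal names, so ./udfs.py and udfs.py are different keys. The sketch below uses Python's zipfile as a stand-in for a jar (a jar is a zip archive). It is not Pig's code; udfs.py is just the example name from this issue, and has_entry is a hypothetical helper.

```python
import io
import zipfile

# Build an in-memory archive standing in for the job jar (hypothetical content).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as jar:
    jar.writestr("udfs.py", "def f(x):\n    return x\n")


def has_entry(archive_bytes, name):
    """Return True if the archive has an entry stored under exactly this name."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        try:
            archive.getinfo(name)  # raises KeyError when the name is absent
            return True
        except KeyError:
            return False


jar_bytes = buf.getvalue()
plain_found = has_entry(jar_bytes, "udfs.py")     # entry stored under this name
dotted_found = has_entry(jar_bytes, "./udfs.py")  # same file, different spelling
```

The lookup is an exact name match, so a path that still carries a ./ prefix has to be normalized before it can be used as an entry name.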

> JythonFunction refers to Oozie launcher script absolute path
> ------------------------------------------------------------
>
>                 Key: PIG-4908
>                 URL: https://issues.apache.org/jira/browse/PIG-4908
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Rohini Palaniswamy
>            Assignee: Rohini Palaniswamy
>             Fix For: 0.16.0
>
>         Attachments: PIG-4908-1.patch, PIG-4908-2-fixtest.patch, PIG-4908-3-fixtest.patch
>
>
>   We had a scenario where a user had multiple udfs all named udfs.py. JythonFunction was referring to the
> absolute localized path of udfs.py in the Oozie launcher. Tasks which ran on a node that had a
> different version of udfs.py localized to the same path (hashcode conflict) as the Oozie launcher failed. We
> should be referring to the relative path of the files. The current code checks whether the canonical path starts with
> cwd, but that does not work as the files are downloaded to a different location and symlinked to the current

Daniel Dai (JIRA) | 31 May 21:50 2016

[Updated] (PIG-4734) TOMAP schema inferring breaks some scripts in type checking for bincond


     [
https://issues.apache.org/jira/browse/PIG-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-4734:
----------------------------
       Resolution: Fixed
     Hadoop Flags: Reviewed
    Fix Version/s:     (was: 0.15.1)
                   0.17.0
           Status: Resolved  (was: Patch Available)

Patch committed to both trunk and the 0.16 branch. Thanks, Rohini, for the review!

> TOMAP schema inferring breaks some scripts in type checking for bincond
> -----------------------------------------------------------------------
>
>                 Key: PIG-4734
>                 URL: https://issues.apache.org/jira/browse/PIG-4734
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Rohini Palaniswamy
>            Assignee: Daniel Dai
>             Fix For: 0.16.0, 0.17.0
>
>         Attachments: PIG-4734-1.patch, PIG-4734-2.patch, PIG-4734-3.patch
>
>
> PIG-4674 added schema inferring for TOMAP.
> {code}

Daniel Dai (JIRA) | 31 May 21:48 2016

[Updated] (PIG-4786) CROSS will not work correctly with Grace Parallelism


     [
https://issues.apache.org/jira/browse/PIG-4786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-4786:
----------------------------
       Resolution: Fixed
     Hadoop Flags: Reviewed
    Fix Version/s: 0.17.0
           Status: Resolved  (was: Patch Available)

Patch committed to both trunk and the 0.16 branch. Thanks, Rohini, for the review!

> CROSS will not work correctly with Grace Parallelism
> ----------------------------------------------------
>
>                 Key: PIG-4786
>                 URL: https://issues.apache.org/jira/browse/PIG-4786
>             Project: Pig
>          Issue Type: Bug
>          Components: tez
>            Reporter: Rohini Palaniswamy
>            Assignee: Daniel Dai
>             Fix For: 0.16.0, 0.17.0
>
>         Attachments: PIG-4786-1.patch
>
>
> PigImplConstants.PIG_CROSS_PARALLELISM accessed in GFCross UDF will refer to the old parallelism.


Daniel Dai (JIRA) | 31 May 21:45 2016

[Commented] (PIG-4908) JythonFunction refers to Oozie launcher script absolute path


    [
https://issues.apache.org/jira/browse/PIG-4908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15308454#comment-15308454
] 

Daniel Dai commented on PIG-4908:
---------------------------------

+1.

Further, I don't see a reason to treat relative paths differently. We should remove the relative path
handling code, which is already very fragile, in a separate ticket. For example:
path.indexOf("." + File.separator)
Why should a path containing . make a difference?
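
To make the fragility concrete, here is a small sketch of the same kind of substring test in Python. It only mimics the quoted indexOf call; what Pig does with the resulting index is not shown in this thread, so that part is left out. The helper name and sample paths are hypothetical, and "/" is assumed as the separator.

```python
def contains_dot_separator(path, sep="/"):
    """Mimic the quoted check: is "." + separator found anywhere in the path?"""
    return path.find("." + sep) != -1


# Hypothetical sample paths; only the first is a plain "current directory" path.
samples = [
    "./udfs.py",          # relative to cwd: matches, as intended
    "../udfs.py",         # parent directory: also matches, at index 1
    "/data/v1./udfs.py",  # absolute path with a directory ending in ".": matches
    "udfs.py",            # relative, no prefix: does not match
    "/tmp/udfs.py",       # absolute: does not match
]
results = {p: contains_dot_separator(p) for p in samples}
```

A substring match cannot tell a ./ prefix apart from ../ or from a directory name that happens to end in a dot, which is why the check is hard to rely on.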

> JythonFunction refers to Oozie launcher script absolute path
> ------------------------------------------------------------
>
>                 Key: PIG-4908
>                 URL: https://issues.apache.org/jira/browse/PIG-4908
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Rohini Palaniswamy
>            Assignee: Rohini Palaniswamy
>             Fix For: 0.16.0
>
>         Attachments: PIG-4908-1.patch, PIG-4908-2-fixtest.patch, PIG-4908-3-fixtest.patch
>
>
>   We had a scenario where a user had multiple udfs all named udfs.py. JythonFunction was referring to the

Apache Jenkins Server | 31 May 21:31 2016

Jenkins build became unstable: Pig-trunk-commit #2337

See <https://builds.apache.org/job/Pig-trunk-commit/2337/changes>

Rohini Palaniswamy (JIRA) | 31 May 19:43 2016

[Created] (PIG-4913) Reduce jython function initiation during compilation

Rohini Palaniswamy created PIG-4913:
---------------------------------------

             Summary: Reduce jython function initiation during compilation
                 Key: PIG-4913
                 URL: https://issues.apache.org/jira/browse/PIG-4913
             Project: Pig
          Issue Type: Improvement
            Reporter: Rohini Palaniswamy

While investigating PIG-4908, I saw that ScriptEngine.getScriptAsStream was invoked far too many times
during the compilation phase for a simple script.

{code:title=sleep.py}
#!/usr/bin/python

import time;

@outputSchema("sltime:int")
def sleep(num):
    if num == 1:
        print "Sleeping for %d minutes" % num;
        time.sleep(num * 60);
    return num;
{code}
{code:title=sleep.pig}
register 'sleep.py' using jython;

A = LOAD '/tmp/sleepdata' as (f1:int);
B = FOREACH A generate $0, sleep($0);

Rohini Palaniswamy (JIRA) | 31 May 19:37 2016

[Updated] (PIG-4908) JythonFunction refers to Oozie launcher script absolute path


     [
https://issues.apache.org/jira/browse/PIG-4908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-4908:
------------------------------------
    Attachment: PIG-4908-3-fixtest.patch

  When an absolute path is used, nameInJar ends up with the leading / removed before the file is added to the
jar. For example, /tmp/sleep.py is added to the jar as tmp/sleep.py so that it is accessible through
ScriptEngine.class.getResourceAsStream(scriptPath) when the jar is on the classpath.

The problem was that we were looking for the absolute path first before trying the jar. So
PIG-4908-3-fixtest.patch additionally fixes that code to look for the local file first if on the frontend,
and otherwise to look for it last.
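
A minimal sketch of that lookup order, written in Python rather than taken from the patch. It assumes POSIX-style paths and models the jar as a plain dict; name_in_jar, open_script, is_frontend and jar_entries are hypothetical names, and "jar first, local file last" on the backend is my reading of the sentence above.

```python
import os
import tempfile


def name_in_jar(script_path):
    """Drop leading slashes so /tmp/sleep.py becomes the jar entry tmp/sleep.py."""
    return script_path.lstrip("/")


def open_script(script_path, is_frontend, jar_entries):
    """Return (source, content), trying the local file first only on the frontend.

    jar_entries is a dict standing in for resources reachable on the classpath.
    """
    def from_local():
        if os.path.isfile(script_path):
            with open(script_path) as f:
                return ("local", f.read())
        return None

    def from_jar():
        entry = name_in_jar(script_path)
        if entry in jar_entries:
            return ("jar", jar_entries[entry])
        return None

    # Frontend: local file wins. Backend: jar wins, local file is the fallback.
    order = [from_local, from_jar] if is_frontend else [from_jar, from_local]
    for lookup in order:
        found = lookup()
        if found is not None:
            return found
    raise FileNotFoundError(script_path)


# Demo: a temporary file stands in for a script that exists both locally and in the jar.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sleep.py")
    with open(path, "w") as f:
        f.write("# local copy\n")
    jar = {name_in_jar(path): "# jar copy\n"}
    frontend_source, _ = open_script(path, True, jar)
    backend_source, _ = open_script(path, False, jar)
```

With this order, a task never picks up a different file that happens to be localized at the launcher's absolute path, as long as the script is present in the jar.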

I saw that during compilation we were initializing the function too many times with the different visitors.
I will file a separate jira to cut down on that.

> JythonFunction refers to Oozie launcher script absolute path
> ------------------------------------------------------------
>
>                 Key: PIG-4908
>                 URL: https://issues.apache.org/jira/browse/PIG-4908
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Rohini Palaniswamy
>            Assignee: Rohini Palaniswamy
>             Fix For: 0.16.0
>
>         Attachments: PIG-4908-1.patch, PIG-4908-2-fixtest.patch, PIG-4908-3-fixtest.patch

Rohini Palaniswamy (JIRA) | 31 May 19:19 2016

[Updated] (PIG-4373) Implement PIG-3861 in Tez


     [
https://issues.apache.org/jira/browse/PIG-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-4373:
------------------------------------
    Fix Version/s:     (was: 0.16.0)
                   0.17.0

> Implement PIG-3861 in Tez
> -------------------------
>
>                 Key: PIG-4373
>                 URL: https://issues.apache.org/jira/browse/PIG-4373
>             Project: Pig
>          Issue Type: Improvement
>          Components: tez
>    Affects Versions: 0.14.0
>            Reporter: Rohini Palaniswamy
>            Assignee: Rohini Palaniswamy
>             Fix For: 0.17.0
>
>

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Rohini Palaniswamy (JIRA) | 31 May 19:19 2016

[Updated] (PIG-4911) Provide option to disable DAG recovery


     [
https://issues.apache.org/jira/browse/PIG-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-4911:
------------------------------------
    Fix Version/s:     (was: 0.16.0)
                   0.17.0

> Provide option to disable DAG recovery
> --------------------------------------
>
>                 Key: PIG-4911
>                 URL: https://issues.apache.org/jira/browse/PIG-4911
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Rohini Palaniswamy
>            Assignee: Rohini Palaniswamy
>             Fix For: 0.17.0
>
>
>   Tez 0.7 has a lot of issues with DAG recovery when auto parallelism is used, causing hung DAGs in many
> cases because it did not write auto parallelism decisions to the recovery history. A rewrite was done in
> Tez 0.8 to handle that.
>   Code was added to Tez to automatically disable recovery if there was auto parallelism so that it would
> benefit both Pig and Tez. It works fine: the second AM attempt fails with a "DAG cannot be recovered" error
> when it sees that there are vertices with auto parallelism. The problem is that it is hard for users to see
> what the actual failure was, and hard to debug as well, because the whole UI state is rewritten with the
> partial recovery information.
>     Disabling recovery in Pig itself by setting tez.dag.recovery.enabled=false means the second attempt,
> which would eventually fail anyway, is not made at all. It also makes the original failure easier to debug.

Rohini Palaniswamy (JIRA) | 31 May 19:19 2016

[Updated] (PIG-4617) XML loader is not working fine with pig 0.14 version


     [
https://issues.apache.org/jira/browse/PIG-4617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-4617:
------------------------------------
    Fix Version/s:     (was: 0.16.0)
                   0.17.0

> XML loader is not working fine with pig 0.14 version
> ----------------------------------------------------
>
>                 Key: PIG-4617
>                 URL: https://issues.apache.org/jira/browse/PIG-4617
>             Project: Pig
>          Issue Type: Bug
>          Components: piggybank, UI
>            Reporter: vijayalakshmi karasani
>            Assignee: Rohini Palaniswamy
>            Priority: Blocker
>             Fix For: 0.17.0
>
>
> My old pig script (to load and parse xml files), which ran successfully with pig 0.13, does not
> run with pig 0.14 and throws java.lang.IndexOutOfBoundsException: start 4, end 2, s.length() 2.
> Out of my 10 xml files, 2 run fine and the other 8 do not. All of these xml files ran successfully with
> pig 0.13. Maybe the new version added more validation of the well-formedness of xml files.
> My Code:
> REGISTER '/usr/hdp/current/pig-client/lib/piggybank.jar';
> C =  LOAD '/common/data/dia/stepxml/*' using

Rohini Palaniswamy (JIRA) | 31 May 19:18 2016

[Commented] (PIG-4734) TOMAP schema inferring breaks some scripts in type checking for bincond


    [
https://issues.apache.org/jira/browse/PIG-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15308156#comment-15308156
] 

Rohini Palaniswamy commented on PIG-4734:
-----------------------------------------

+1

> TOMAP schema inferring breaks some scripts in type checking for bincond
> -----------------------------------------------------------------------
>
>                 Key: PIG-4734
>                 URL: https://issues.apache.org/jira/browse/PIG-4734
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Rohini Palaniswamy
>            Assignee: Daniel Dai
>             Fix For: 0.16.0, 0.15.1
>
>         Attachments: PIG-4734-1.patch, PIG-4734-2.patch, PIG-4734-3.patch
>
>
> PIG-4674 added schema inferring for TOMAP.
> {code}
> FOREACH A GENERATE (val == 'x' ? TOMAP('key', floatfield1) : (val == 'y' ? GenerateFloatMap('key', floatfield2) : NULL)) as floatmap:map[float],
> {code}
> The following line fails with

