Rohini Palaniswamy (JIRA) | 31 Oct 17:56 2014

[Updated] (PIG-4259) Fix few issues related to Union, CROSS and auto parallelism in Tez


     [
https://issues.apache.org/jira/browse/PIG-4259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-4259:
------------------------------------
    Summary: Fix few issues related to Union, CROSS and auto parallelism in Tez  (was: Fix few issues with Union
and CROSS in Tez)

> Fix few issues related to Union, CROSS and auto parallelism in Tez
> ------------------------------------------------------------------
>
>                 Key: PIG-4259
>                 URL: https://issues.apache.org/jira/browse/PIG-4259
>             Project: Pig
>          Issue Type: Sub-task
>          Components: tez
>            Reporter: Rohini Palaniswamy
>            Assignee: Rohini Palaniswamy
>             Fix For: 0.14.0
>
>         Attachments: PIG-4259-1.patch
>
>

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Rohini Palaniswamy (JIRA) | 31 Oct 17:56 2014

[Updated] (PIG-4259) Fix few issues with Union and CROSS in Tez


     [
https://issues.apache.org/jira/browse/PIG-4259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-4259:
------------------------------------
    Attachment: PIG-4259-1.patch

Review board link - https://reviews.apache.org/r/27429/ 

Patch addresses different issues encountered while trying to debug wrong results for a production script.

Issues addressed:

- Optimized Union followed directly by Limit. This also fixes possible incorrect results when the Limit
could be removed entirely by UnionOptimizer if the parallelism of the union was also 1.
- Fixed wrong results for a group-by with a secondary key followed by Union (Union_14).
- Fixed CROSS for Union and multiquery.
- Fixed/optimized POLimit so it does not redundantly process the next input in the bag once the limit has been reached.
- Fixed some issues in auto parallelism, and changed the overriding of intermediate reducer parallelism
(PIG-4162) to apply only in the required cases.
- Adjust the AM size based on the total number of tasks. It is a pain to keep adjusting the memory size after a task
runs for a long time and then fails with OOM.
- Fixed an NPE in the logs while fetching counters when a job fails.
- Avoid printing counters every time dagStatus is printed; print only tasks and diagnostics.
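
The POLimit item in the list above is about not pulling further input once the limit has been reached. A minimal Python sketch of that idea follows. It is not Pig's POLimit code; `limit`, `counting_source`, and `pulled` are hypothetical names used only for illustration.

```python
def limit(rows, n):
    """Yield at most n rows, then stop pulling from the upstream iterator."""
    if n <= 0:
        return
    emitted = 0
    for row in rows:
        yield row
        emitted += 1
        if emitted >= n:
            # Limit reached: return before requesting the next input.
            return


def counting_source(items, pulled):
    """Hypothetical upstream stand-in that records every item actually read."""
    for item in items:
        pulled.append(item)
        yield item


pulled = []
result = list(limit(counting_source(range(10), pulled), 3))
```

Here `pulled` ends up holding only the three rows that were emitted, so the upstream is never asked for a fourth row.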

> Fix few issues with Union and CROSS in Tez
> ------------------------------------------
>
>                 Key: PIG-4259

Rohini Palaniswamy | 31 Oct 17:54 2014

Review Request 27429: [PIG-4259] Fix few issues related to Union, CROSS and auto parallelism in Tez


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27429/
-----------------------------------------------------------

Review request for pig and Daniel Dai.

Bugs: PIG-4259
    https://issues.apache.org/jira/browse/PIG-4259

Repository: pig

Description
-------

Patch addresses different issues encountered while trying to debug wrong results for a production script.

Issues addressed:
    - Optimized Union followed directly by Limit. This also fixes possible incorrect results when the Limit
could be removed entirely by UnionOptimizer if the parallelism of the union was also 1.
    - Fixed wrong results for a group-by with a secondary key followed by Union (Union_14).
    - Fixed CROSS for Union and multiquery.
    - Fixed/optimized POLimit so it does not redundantly process the next input in the bag once the limit has been reached.
    - Fixed some issues in auto parallelism, and changed the overriding of intermediate reducer parallelism
(PIG-4162) to apply only in the required cases.
    - Adjust the AM size based on the total number of tasks. It is a pain to keep adjusting the memory size after a task
runs for a long time and then fails with OOM.
    - Fixed an NPE in the logs while fetching counters when a job fails.
    - Avoid printing counters every time dagStatus is printed; print only tasks and diagnostics.
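
The description does not say how the AM size is derived from the task count. As a rough sketch only, one way to scale memory with the total number of tasks is a stepped, capped increase. The function name and every number below (base size, step size, tasks per step, cap) are assumptions made for illustration and are not taken from the patch.

```python
def am_memory_mb(total_tasks, base_mb=1024, tasks_per_step=5000,
                 step_mb=512, max_mb=4096):
    """Hypothetical sizing rule: add step_mb for every tasks_per_step tasks,
    never exceeding max_mb. All defaults are illustrative assumptions."""
    if total_tasks < 0:
        raise ValueError("total_tasks must be non-negative")
    steps = total_tasks // tasks_per_step
    return min(base_mb + steps * step_mb, max_mb)


small_dag = am_memory_mb(100)
medium_dag = am_memory_mb(12000)
huge_dag = am_memory_mb(1000000)
```

Sizing up front like this avoids the cycle described above, where a long-running job fails with OOM and has to be rerun with a manually increased setting.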

Rohini Palaniswamy (JIRA) | 31 Oct 17:23 2014

[Created] (PIG-4259) Fix few issues with Union and CROSS in Tez

Rohini Palaniswamy created PIG-4259:
---------------------------------------

             Summary: Fix few issues with Union and CROSS in Tez
                 Key: PIG-4259
                 URL: https://issues.apache.org/jira/browse/PIG-4259
             Project: Pig
          Issue Type: Sub-task
            Reporter: Rohini Palaniswamy
            Assignee: Rohini Palaniswamy
             Fix For: 0.14.0


Daniel Dai (JIRA) | 31 Oct 08:32 2014

[Updated] (PIG-4258) Fix several e2e tests on Windows


     [
https://issues.apache.org/jira/browse/PIG-4258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-4258:
----------------------------
      Resolution: Fixed
    Hadoop Flags: Reviewed
          Status: Resolved  (was: Patch Available)

Patch committed to both trunk and the 0.14 branch. Thanks, Rohini, for the review!

> Fix several e2e tests on Windows
> --------------------------------
>
>                 Key: PIG-4258
>                 URL: https://issues.apache.org/jira/browse/PIG-4258
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.14.0
>
>         Attachments: PIG-4258-1.patch, PIG-4258-2.patch
>
>
> Several issues found in Windows e2e tests:
> 1. Inconsistency in the existing conf files: tmpPath is tmp/pigtest in default.conf but /tmp/pigtest in
> the others. We do Perl cleanup using "$me =~ s/[^a-zA-Z0-9]*//g" in default.conf but "chomp $me" in rpm.conf
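
The two Perl cleanups quoted in item 1 behave quite differently: the substitution deletes every non-alphanumeric character, while `chomp` (with the default record separator) only drops a trailing newline. The Python sketch below mirrors both. The sample value of `me` is a hypothetical user-name string chosen for illustration, not something taken from the conf files.

```python
import re


def strip_non_alnum(me):
    # Mirrors the Perl substitution s/[^a-zA-Z0-9]*//g:
    # every run of non-alphanumeric characters is deleted.
    return re.sub(r"[^a-zA-Z0-9]*", "", me)


def chomp(me):
    # Mirrors Perl's chomp with the default record separator:
    # only a single trailing newline is removed.
    return me[:-1] if me.endswith("\n") else me


me = "DOMAIN\\test-user\n"  # hypothetical input, assumed for illustration
cleaned_default_conf = strip_non_alnum(me)
cleaned_rpm_conf = chomp(me)
```

With this input the first cleanup yields `DOMAINtestuser`, while the second keeps the backslash and hyphen, so any value derived from `$me` can differ between the two conf files.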

Rohini Palaniswamy (JIRA) | 31 Oct 08:20 2014

[Commented] (PIG-4258) Fix several e2e tests on Windows


    [
https://issues.apache.org/jira/browse/PIG-4258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191531#comment-14191531
] 

Rohini Palaniswamy commented on PIG-4258:
-----------------------------------------

+1

> Fix several e2e tests on Windows
> --------------------------------
>
>                 Key: PIG-4258
>                 URL: https://issues.apache.org/jira/browse/PIG-4258
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.14.0
>
>         Attachments: PIG-4258-1.patch, PIG-4258-2.patch
>
>
> Several issues found in Windows e2e tests:
> 1. Inconsistency in the existing conf files: tmpPath is tmp/pigtest in default.conf but /tmp/pigtest in
> the others. We do Perl cleanup using "$me =~ s/[^a-zA-Z0-9]*//g" in default.conf but "chomp $me" in rpm.conf
> 2. build.xml only tests the tarball install; for an rpm install, which does not have the whole tarball,
> compiling the test UDFs fails

Daniel Dai (JIRA) | 31 Oct 08:18 2014

[Updated] (PIG-4258) Fix several e2e tests on Windows


     [
https://issues.apache.org/jira/browse/PIG-4258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-4258:
----------------------------
    Attachment: PIG-4258-2.patch

Addressed Rohini's review comment. Removed pig.jar.dir. "cat touch" is actually "cat nonexist"; changed
the patch to make that clear.

> Fix several e2e tests on Windows
> --------------------------------
>
>                 Key: PIG-4258
>                 URL: https://issues.apache.org/jira/browse/PIG-4258
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.14.0
>
>         Attachments: PIG-4258-1.patch, PIG-4258-2.patch
>
>
> Several issues found in Windows e2e tests:
> 1. Inconsistency in the existing conf files: tmpPath is tmp/pigtest in default.conf but /tmp/pigtest in
> the others. We do Perl cleanup using "$me =~ s/[^a-zA-Z0-9]*//g" in default.conf but "chomp $me" in rpm.conf
> 2. build.xml only tests the tarball install; for an rpm install, which does not have the whole tarball, compiling

jira | 31 Oct 08:00 2014

Subscription: PIG patch available

Issue Subscription
Filter: PIG patch available (22 issues)

Subscriber: pigdaily

Key         Summary
PIG-4258    Fix several e2e tests on Windows
            https://issues.apache.org/jira/browse/PIG-4258
PIG-4257    Fix several e2e tests on secure cluster
            https://issues.apache.org/jira/browse/PIG-4257
PIG-4251    Pig on Storm
            https://issues.apache.org/jira/browse/PIG-4251
PIG-4239    "pig.output.lazy" not works in spark mode
            https://issues.apache.org/jira/browse/PIG-4239
PIG-4232    UDFContext is not initialized in executors when running on Spark cluster
            https://issues.apache.org/jira/browse/PIG-4232
PIG-4224    Upload Tez payload history string to timeline server
            https://issues.apache.org/jira/browse/PIG-4224
PIG-4207    Make python udfs work with Spark
            https://issues.apache.org/jira/browse/PIG-4207
PIG-4111    Make Pig compiles with avro-1.7.7
            https://issues.apache.org/jira/browse/PIG-4111
PIG-4103    Fix TestRegisteredJarVisibility(after PIG-4083)
            https://issues.apache.org/jira/browse/PIG-4103
PIG-4084    Port TestPigRunner to Tez
            https://issues.apache.org/jira/browse/PIG-4084
PIG-4066    An optimization for ROLLUP operation in Pig
            https://issues.apache.org/jira/browse/PIG-4066
PIG-4004    Upgrade the Pigmix queries from the (old) mapred API to mapreduce
            https://issues.apache.org/jira/browse/PIG-4004

Daniel Dai (JIRA) | 31 Oct 07:52 2014

[Commented] (PIG-4257) Fix several e2e tests on secure cluster


    [
https://issues.apache.org/jira/browse/PIG-4257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191498#comment-14191498
] 

Daniel Dai commented on PIG-4257:
---------------------------------

I only see it in our AD-MIT (Active Directory + MIT Kerberos) deployment. I have no idea why it makes a difference.

> Fix several e2e tests on secure cluster
> ---------------------------------------
>
>                 Key: PIG-4257
>                 URL: https://issues.apache.org/jira/browse/PIG-4257
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.14.0
>
>         Attachments: PIG-4257-1.patch
>
>
> Several tests fail on some secure cluster settings, for example Bloom_3, Union_[7,8,13], and
> Join_[6-8]. Here is one stack:
> {code}
> Error: org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: Local Rearrange[tuple]{chararray}(false) - scope-78 Operator Key: scope-78):

Daniel Dai (JIRA) | 31 Oct 07:49 2014

[Updated] (PIG-4256) Fix StreamingPythonUDFs e2e test failure on Windows


     [
https://issues.apache.org/jira/browse/PIG-4256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-4256:
----------------------------
      Resolution: Fixed
    Hadoop Flags: Reviewed
          Status: Resolved  (was: Patch Available)

Patch committed to both trunk and the 0.14 branch. Thanks, Rohini, for the review!

> Fix StreamingPythonUDFs e2e test failure on Windows
> ---------------------------------------------------
>
>                 Key: PIG-4256
>                 URL: https://issues.apache.org/jira/browse/PIG-4256
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.14.0
>
>         Attachments: PIG-4256-1.patch
>
>


Daniel Dai (JIRA) | 31 Oct 07:47 2014

[Updated] (PIG-4253) Add a UniqueID UDF


     [
https://issues.apache.org/jira/browse/PIG-4253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-4253:
----------------------------
    Summary: Add a UniqueID UDF  (was: Add a SequenceID UDF)

> Add a UniqueID UDF
> ------------------
>
>                 Key: PIG-4253
>                 URL: https://issues.apache.org/jira/browse/PIG-4253
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.14.0
>
>         Attachments: PIG-4253-1.patch, PIG-4253-2.patch
>
>
