Daniel Dai (JIRA) | 3 Sep 02:57 2014

[Updated] (PIG-4146) Create a target to run mr and tez unit test in one shot


     [
https://issues.apache.org/jira/browse/PIG-4146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-4146:
----------------------------
    Attachment: PIG-4146-2.patch

PIG-4146-1.patch has one problem: if test-mr fails, ant does not run test-tez. PIG-4146-2.patch fixes the issue.
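
The behaviour described above (keep running the second suite even when the first one fails, and report the failure only at the end) can be sketched as follows. This is only an illustration in Python, not the contents of either patch: the patches change the Ant build, which is not shown in this message, and the names `run_all`, `fake_test_mr` and `fake_test_tez` are hypothetical stand-ins.

```python
def run_all(suites):
    """Run every suite, even if an earlier one fails.

    `suites` is a list of (name, callable) pairs; each callable returns
    True on success. Returns the names of the suites that failed.
    """
    failed = []
    for name, run in suites:
        try:
            ok = run()
        except Exception:
            # Treat a crash the same as a failed run, but keep going.
            ok = False
        if not ok:
            failed.append(name)
    return failed


# Hypothetical stand-ins for the real test targets.
executed = []


def fake_test_mr():
    executed.append("test-mr")
    return False  # simulate a failing MR run


def fake_test_tez():
    executed.append("test-tez")
    return True


failed = run_all([("test-mr", fake_test_mr), ("test-tez", fake_test_tez)])
# executed -> ['test-mr', 'test-tez']  (the second suite still ran)
# failed   -> ['test-mr']              (the failure is reported at the end)
```

The point of collecting failures instead of stopping at the first one is that a single invocation still reports an overall failure while producing results for both execution types.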

> Create a target to run mr and tez unit test in one shot
> -------------------------------------------------------
>
>                 Key: PIG-4146
>                 URL: https://issues.apache.org/jira/browse/PIG-4146
>             Project: Pig
>          Issue Type: Sub-task
>          Components: tez
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.14.0
>
>         Attachments: PIG-4146-1.patch, PIG-4146-2.patch
>
>
> Currently we need to run "ant test" twice (with different test.exec.type settings) to do Pig QE. It would be good to have a single target that does both.

--
This message was sent by Atlassian JIRA

jira | 3 Sep 02:01 2014

Subscription: PIG patch available

Issue Subscription
Filter: PIG patch available (17 issues)

Subscriber: pigdaily

Key         Summary
PIG-4131    Pig - kerberos error
            https://issues.apache.org/jira/browse/PIG-4131
PIG-4111    Make Pig compiles with avro-1.7.7
            https://issues.apache.org/jira/browse/PIG-4111
PIG-4103    Fix TestRegisteredJarVisibility(after PIG-4083)
            https://issues.apache.org/jira/browse/PIG-4103
PIG-4066    An optimization for ROLLUP operation in Pig
            https://issues.apache.org/jira/browse/PIG-4066
PIG-4004    Upgrade the Pigmix queries from the (old) mapred API to mapreduce
            https://issues.apache.org/jira/browse/PIG-4004
PIG-4002    Disable combiner when map-side aggregation is used
            https://issues.apache.org/jira/browse/PIG-4002
PIG-3952    PigStorage accepts '-tagSplit' to return full split information
            https://issues.apache.org/jira/browse/PIG-3952
PIG-3911    Define unique fields with @OutputSchema
            https://issues.apache.org/jira/browse/PIG-3911
PIG-3877    Getting Geo Latitude/Longitude from Address Lines
            https://issues.apache.org/jira/browse/PIG-3877
PIG-3873    Geo distance calculation using Haversine
            https://issues.apache.org/jira/browse/PIG-3873
PIG-3866    Create ThreadLocal classloader per PigContext
            https://issues.apache.org/jira/browse/PIG-3866
PIG-3861    duplicate jars get added to distributed cache
            https://issues.apache.org/jira/browse/PIG-3861

Daniel Dai (JIRA) | 3 Sep 01:00 2014

[Updated] (PIG-4143) Port more mini cluster tests to Tez - part 7


     [
https://issues.apache.org/jira/browse/PIG-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-4143:
----------------------------
    Attachment: PIG-4143-3.patch

Another patch to address Cheolsoo's review comment.

> Port more mini cluster tests to Tez - part 7
> --------------------------------------------
>
>                 Key: PIG-4143
>                 URL: https://issues.apache.org/jira/browse/PIG-4143
>             Project: Pig
>          Issue Type: Sub-task
>          Components: tez
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.14.0
>
>         Attachments: PIG-4143-1.patch, PIG-4143-2.patch, PIG-4143-3.patch
>
>
> Enable TestGroupConstParallel, TestJobSubmission, TestMergeJoin, TestNativeMapReduce, TestPigProgressReporting.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Daniel Dai (JIRA) | 3 Sep 00:52 2014

[Updated] (PIG-4149) Rounding issue in FindQuantiles


     [
https://issues.apache.org/jira/browse/PIG-4149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-4149:
----------------------------
    Attachment: PIG-4149-1.patch

> Rounding issue in FindQuantiles
> -------------------------------
>
>                 Key: PIG-4149
>                 URL: https://issues.apache.org/jira/browse/PIG-4149
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.14.0
>
>         Attachments: PIG-4149-0.patch, PIG-4149-1.patch
>
>
> In FindQuantiles, Pig calculates an integer toSkip within the sample and skips "toSkip" sample records to find the next boundary. However, toSkip should not be an integer: truncating it causes a rounding issue, and all of the remainder goes to the last partition.


Juan Manuel Becerra (JIRA) | 2 Sep 21:55 2014

[Commented] (PIG-4059) Pig on Spark


    [
https://issues.apache.org/jira/browse/PIG-4059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118624#comment-14118624
] 

Juan Manuel Becerra commented on PIG-4059:
------------------------------------------

When is Spork planned to be included in a release? And is this true?

http://www.sigmoidanalytics.com/faster-etl-spark-apache-pig/

I tested Spark (programming in Scala) for ETL jobs and it is really fast, if you have enough RAM (as I do).

> Pig on Spark
> ------------
>
>                 Key: PIG-4059
>                 URL: https://issues.apache.org/jira/browse/PIG-4059
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Rohini Palaniswamy
>            Assignee: Praveen Rachabattuni
>         Attachments: Pig-on-Spark-Design-Doc.pdf
>
>
> There is a lot of interest in adding Spark as a backend execution engine for Pig.


Ameya Karve (JIRA) | 2 Sep 11:19 2014

[Created] (PIG-4150) Error in Pig parser with a nested FLATTEN and TOKENIZE

Ameya Karve created PIG-4150:
--------------------------------

             Summary: Error in Pig parser with a nested FLATTEN and TOKENIZE
                 Key: PIG-4150
                 URL: https://issues.apache.org/jira/browse/PIG-4150
             Project: Pig
          Issue Type: Bug
            Reporter: Ameya Karve

I get a parsing error if I try to execute 

FOREACH a GENERATE FLATTEN(TOKENIZE(b, ';'));

The error goes away if I use

FOREACH a {
  c = TOKENIZE(b, ';');
  GENERATE c;
}

This looks to me like an error in how the Pig parser handles the semicolon.


Daniel Dai (JIRA) | 2 Sep 08:23 2014

[Updated] (PIG-4149) Rounding issue in FindQuantiles


     [
https://issues.apache.org/jira/browse/PIG-4149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-4149:
----------------------------
    Attachment: PIG-4149-0.patch

Attached a draft patch to illustrate the idea. It still needs a new test and a run through the existing tests.

> Rounding issue in FindQuantiles
> -------------------------------
>
>                 Key: PIG-4149
>                 URL: https://issues.apache.org/jira/browse/PIG-4149
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.14.0
>
>         Attachments: PIG-4149-0.patch
>
>
> In FindQuantiles, Pig calculates an integer toSkip within the sample and skips "toSkip" sample records to find the next boundary. However, toSkip should not be an integer: truncating it causes a rounding issue, and all of the remainder goes to the last partition.


Daniel Dai (JIRA) | 2 Sep 08:19 2014

[Created] (PIG-4149) Rounding issue in FindQuantiles

Daniel Dai created PIG-4149:
-------------------------------

             Summary: Rounding issue in FindQuantiles
                 Key: PIG-4149
                 URL: https://issues.apache.org/jira/browse/PIG-4149
             Project: Pig
          Issue Type: Bug
          Components: impl
            Reporter: Daniel Dai
            Assignee: Daniel Dai
             Fix For: 0.14.0

In FindQuantiles, Pig calculates an integer toSkip within the sample and skips "toSkip" sample records to find the next boundary. However, toSkip should not be an integer: truncating it causes a rounding issue, and all of the remainder goes to the last partition.
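
To make the rounding effect concrete, here is a minimal sketch in Python. It is not Pig's FindQuantiles code: the function names, the sample count (19) and the partition count (10) are made up for illustration, and the "fractional" variant is just one possible way to avoid the truncation, not necessarily what the attached patch does.

```python
def boundaries_integer_skip(num_samples, num_partitions):
    """Boundary indices when the skip is truncated to an integer up front."""
    to_skip = num_samples // num_partitions
    # One boundary between each pair of adjacent partitions.
    return [to_skip * i for i in range(1, num_partitions)]


def boundaries_fractional_skip(num_samples, num_partitions):
    """Boundary indices when the skip stays fractional.

    (num_samples * i) // num_partitions is the exact value of
    floor(i * num_samples / num_partitions), so rounding happens once per
    boundary instead of being baked into the skip itself.
    """
    return [(num_samples * i) // num_partitions for i in range(1, num_partitions)]


def partition_sizes(num_samples, boundaries):
    """Number of sample records that land in each partition."""
    edges = [0] + boundaries + [num_samples]
    return [hi - lo for lo, hi in zip(edges, edges[1:])]


int_sizes = partition_sizes(19, boundaries_integer_skip(19, 10))
frac_sizes = partition_sizes(19, boundaries_fractional_skip(19, 10))
# int_sizes  -> [1, 1, 1, 1, 1, 1, 1, 1, 1, 10]
# frac_sizes -> [1, 2, 2, 2, 2, 2, 2, 2, 2, 2]
```

With the truncated skip, the nine leftover sample records all end up in the last partition; with the fractional skip, no partition differs from another by more than one record.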


Daniel Dai (JIRA) | 2 Sep 08:16 2014

[Resolved] (PIG-4144) Make pigunit.PigTest work in tez mode


     [
https://issues.apache.org/jira/browse/PIG-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai resolved PIG-4144.
-----------------------------
      Resolution: Fixed
    Hadoop Flags: Reviewed

Patch committed to trunk. Thanks Cheolsoo for review!

> Make pigunit.PigTest work in tez mode
> -------------------------------------
>
>                 Key: PIG-4144
>                 URL: https://issues.apache.org/jira/browse/PIG-4144
>             Project: Pig
>          Issue Type: Sub-task
>          Components: tez
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.14.0
>
>         Attachments: PIG-4144-1.patch, PIG-4144-2.patch, PIG-4144-3.patch
>
>
> pigunit.PigTest does not work in both tez_local and tez mode. Need to make it work.


Daniel Dai (JIRA) | 2 Sep 08:15 2014

[Updated] (PIG-4144) Make pigunit.PigTest work in tez mode


     [
https://issues.apache.org/jira/browse/PIG-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-4144:
----------------------------
    Attachment: PIG-4144-3.patch

New patch includes Cheolsoo's suggestion.

> Make pigunit.PigTest work in tez mode
> -------------------------------------
>
>                 Key: PIG-4144
>                 URL: https://issues.apache.org/jira/browse/PIG-4144
>             Project: Pig
>          Issue Type: Sub-task
>          Components: tez
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.14.0
>
>         Attachments: PIG-4144-1.patch, PIG-4144-2.patch, PIG-4144-3.patch
>
>
> pigunit.PigTest does not work in both tez_local and tez mode. Need to make it work.


Daniel Dai (JIRA) | 2 Sep 08:06 2014

[Resolved] (PIG-4145) Port local mode tests to Tez - part1


     [
https://issues.apache.org/jira/browse/PIG-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai resolved PIG-4145.
-----------------------------
      Resolution: Fixed
    Hadoop Flags: Reviewed

Patch committed to trunk. Thanks Cheolsoo for review!

> Port local mode tests to Tez - part1
> ------------------------------------
>
>                 Key: PIG-4145
>                 URL: https://issues.apache.org/jira/browse/PIG-4145
>             Project: Pig
>          Issue Type: Sub-task
>          Components: tez
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.14.0
>
>         Attachments: PIG-4145-1.patch, PIG-4145-2.patch
>
>
> Migrate a small number of tests into tez local mode. Let's get this reviewed, and then we can follow the same pattern to migrate more tests.
> Note that tez local mode does not work in all scenarios; I found several scripts that work on a tez cluster but not in tez local mode.