jira | 4 Jul 08:00 2015

Subscription: PIG patch available

Issue Subscription
Filter: PIG patch available (25 issues)

Subscriber: pigdaily

Key         Summary
PIG-4618    When use tez as the engine , set pig.user.cache.enabled=true  do  not take effect  
            https://issues.apache.org/jira/browse/PIG-4618
PIG-4598    Allow user defined plan optimizer rules
            https://issues.apache.org/jira/browse/PIG-4598
PIG-4581    thread safe issue in NodeIdGenerator
            https://issues.apache.org/jira/browse/PIG-4581
PIG-4539    New PigUnit
            https://issues.apache.org/jira/browse/PIG-4539
PIG-4526    Make setting up the build environment easier
            https://issues.apache.org/jira/browse/PIG-4526
PIG-4468    Pig's jackson version conflicts with that of hadoop 2.6.0
            https://issues.apache.org/jira/browse/PIG-4468
PIG-4455    Should use DependencyOrderWalker instead of DepthFirstWalker in MRPrinter
            https://issues.apache.org/jira/browse/PIG-4455
PIG-4417    Pig's register command should support automatic fetching of jars from repo.
            https://issues.apache.org/jira/browse/PIG-4417
PIG-4373    Implement Optimize the use of DistributedCache(PIG-2672) and PIG-3861 in Tez
            https://issues.apache.org/jira/browse/PIG-4373
PIG-4341    Add CMX support to pig.tmpfilecompression.codec
            https://issues.apache.org/jira/browse/PIG-4341
PIG-4323    PackageConverter hanging in Spark
            https://issues.apache.org/jira/browse/PIG-4323
PIG-4313    StackOverflowError in LIMIT operation on Spark
            https://issues.apache.org/jira/browse/PIG-4313

Xuefu Zhang (JIRA | 3 Jul 14:52 2015

[Resolved] (PIG-4613) Fix unit test failures about TestAssert


     [
https://issues.apache.org/jira/browse/PIG-4613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang resolved PIG-4613.
------------------------------
    Resolution: Fixed

Committed to Spark branch. Thanks, Xianda.

> Fix unit test failures about TestAssert
> ---------------------------------------
>
>                 Key: PIG-4613
>                 URL: https://issues.apache.org/jira/browse/PIG-4613
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: kexianda
>            Assignee: kexianda
>             Fix For: spark-branch
>
>         Attachments: PIG-4613.patch
>
>
> Unit tests failed in the following cases:
> org.apache.pig.test.TestAssert.testNegativeWithoutFetch
> org.apache.pig.test.TestAssert.testNegative


Xuefu Zhang (JIRA | 3 Jul 14:49 2015

[Updated] (PIG-4619) Cleanup: change the indent size of some files of pig on spark project from 2 to 4 space


     [
https://issues.apache.org/jira/browse/PIG-4619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated PIG-4619:
-----------------------------
    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Committed to Spark branch. Thanks, Liyun.

> Cleanup: change the indent size of some files of pig on spark project from 2 to 4 space
> ---------------------------------------------------------------------------------------
>
>                 Key: PIG-4619
>                 URL: https://issues.apache.org/jira/browse/PIG-4619
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: liyunzhang_intel
>            Assignee: liyunzhang_intel
>             Fix For: spark-branch
>
>         Attachments: PIG-4619.patch, indentSize.png
>
>
> The following files in the Pig on Spark project use 2-space indentation:
> org.apache.pig.backend.hadoop.executionengine.spark.converter.CollectedGroupConverter
> org.apache.pig.backend.hadoop.executionengine.spark.JobMetricsListener
> org.apache.pig.backend.hadoop.executionengine.spark.SparkLocalExecType

liyunzhang_intel (JIRA | 3 Jul 14:39 2015

[Updated] (PIG-4594) Enable "TestMultiQuery" in spark mode


     [
https://issues.apache.org/jira/browse/PIG-4594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

liyunzhang_intel updated PIG-4594:
----------------------------------
    Attachment: PIG-4594_2.patch

PIG-4594_2.patch is based on a0bea12 - (origin/spark) PIG-4614: Enable TestLocationInPhysicalPlan in
spark mode (Liyun via Xuefu)

> Enable "TestMultiQuery" in spark mode
> -------------------------------------
>
>                 Key: PIG-4594
>                 URL: https://issues.apache.org/jira/browse/PIG-4594
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: liyunzhang_intel
>            Assignee: liyunzhang_intel
>             Fix For: spark-branch
>
>         Attachments: PIG-4594.patch, PIG-4594_1.patch, PIG-4594_2.patch
>
>
> https://builds.apache.org/job/Pig-spark/211/#showFailuresLink shows that the
> following unit tests fail:
> org.apache.pig.test.TestMultiQuery.testMultiQueryJiraPig1068
> org.apache.pig.test.TestMultiQuery.testMultiQueryJiraPig1157

liyunzhang_intel (JIRA | 3 Jul 14:26 2015

[Commented] (PIG-4594) Enable "TestMultiQuery" in spark mode


    [
https://issues.apache.org/jira/browse/PIG-4594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14613163#comment-14613163
] 

liyunzhang_intel commented on PIG-4594:
---------------------------------------

[~mohitsabharwal]:
Here is an example that explains why we need to add PhysicalPlan#forceConnect and OperatorPlan#forceConnect.
 cat bin/testMultiQueryJiraPig983_2.pig 
{code}
a = load './passwd' using PigStorage(':') as (uname:chararray, passwd:chararray, uid:int, gid:int);
b = filter a by uid < 5;
c = filter a by uid >= 5;
d = join b by uname, c by uname;
{code}

{code}
#--------------------------------------------------
# Spark Plan                                  
#--------------------------------------------------

Spark node
scope-67
Store(hdfs://zly1.sh.intel.com:8020/tmp/temp-1052928641/tmp1820070054:org.apache.pig.impl.io.InterStorage)
- scope-68
|
|---a: New For Each(false,false,false,false)[bag] - scope-13
    |   |

Puneet Sareen | 3 Jul 13:59 2015

SOS: Need urgent help with Pig units

We are executing a Pig script from Java code (PigUnit) in local mode, but we get the following error when running PigUnit:

org.apache.pig.PigException: ERROR 1002: Unable to store alias B
        at org.apache.pig.PigServer.storeEx(PigServer.java:1038)
        at org.apache.pig.PigServer.store(PigServer.java:997)
        at org.apache.pig.PigServer.store(PigServer.java:965)
        at main.Pig.runIdQuery(Pig.java:23)
        at main.Pig.main(Pig.java:10)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
        at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:286)
        at org.apache.pig.PigServer.launchPlan(PigServer.java:1390)
        at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1375)
        at org.apache.pig.PigServer.storeEx(PigServer.java:1034)
        ... 4 more
Caused by: java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
        at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:120)
        at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:82)
        at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:75)
        at org.apache.hadoop.mapred.JobClient.init(JobClient.java:470)
        at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:449)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:163)
        at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:280)

Please also find attached the PigUnit Java file that we are trying to execute.

We have tried a couple of solutions to resolve the issue, but we are still stuck on this error and cannot proceed.

Environment details:

1. Hadoop: 2.2.6
2. Pig: 0.14.0

Regards,
Puneet
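
A note on the trace above: Hadoop's Cluster.initialize throws this IOException when none of the ClientProtocolProvider implementations it discovers on the classpath accepts the configured mapreduce.framework.name. One thing worth ruling out, therefore, is a missing Hadoop client jar. The sketch below is only a diagnostic under stated assumptions: the class ClasspathCheck and the helper isOnClasspath are hypothetical names made up for this illustration, and the two provider class names are the ones I believe Hadoop 2.x ships (in hadoop-mapreduce-client-common for local mode and hadoop-mapreduce-client-jobclient for YARN), so please verify them against your Hadoop version.

```java
public class ClasspathCheck {

    // Returns true if the named class can be located by this class's loader.
    // The class is looked up without being initialized.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed provider class names for Hadoop 2.x; check your distribution.
        String[] providers = {
            "org.apache.hadoop.mapred.LocalClientProtocolProvider",
            "org.apache.hadoop.mapred.YarnClientProtocolProvider"
        };
        for (String provider : providers) {
            String status = isOnClasspath(provider) ? "found" : "MISSING";
            System.out.println(provider + " -> " + status);
        }
    }
}
```

Run it with the same classpath as the failing program. If the local provider is reported as missing, adding the corresponding Hadoop jar is a reasonable first thing to try; if both are found, the cause is more likely in the configuration itself.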

 

Mohit Sabharwal (JIRA | 3 Jul 12:37 2015

[Commented] (PIG-4619) Cleanup: change the indent size of some files of pig on spark project from 2 to 4 space


    [
https://issues.apache.org/jira/browse/PIG-4619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14613115#comment-14613115
] 

Mohit Sabharwal commented on PIG-4619:
--------------------------------------

Thanks, [~kellyzly]

+1 (non-binding).

> Cleanup: change the indent size of some files of pig on spark project from 2 to 4 space
> ---------------------------------------------------------------------------------------
>
>                 Key: PIG-4619
>                 URL: https://issues.apache.org/jira/browse/PIG-4619
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: liyunzhang_intel
>            Assignee: liyunzhang_intel
>             Fix For: spark-branch
>
>         Attachments: PIG-4619.patch, indentSize.png
>
>
> The following files in the Pig on Spark project use 2-space indentation:
> org.apache.pig.backend.hadoop.executionengine.spark.converter.CollectedGroupConverter
> org.apache.pig.backend.hadoop.executionengine.spark.JobMetricsListener
> org.apache.pig.backend.hadoop.executionengine.spark.SparkLocalExecType
> All files in this project should now use 4-space indentation.
> In addition, SparkLauncher.java uses tabs instead of spaces. We do not use tabs in place of spaces
> anywhere else in this project, so this file needs to be changed as well.
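
To illustrate the kind of change the issue describes, here is a minimal sketch that rewrites only the leading whitespace of a line: each leading tab becomes four spaces and each leading space is doubled, so a 2-space level becomes a 4-space level. It is not taken from the attached patch; the class Reindent and the method reindent are hypothetical names, and the sketch assumes a file is indented consistently with either tabs or 2-space levels. A real cleanup would more likely rely on an IDE formatter, which also handles continuation lines and alignment.

```java
public class Reindent {

    // Rewrites the leading whitespace of one line:
    // every leading tab becomes four spaces, every leading space is doubled.
    // Characters after the first non-whitespace character are left untouched.
    static String reindent(String line) {
        StringBuilder indent = new StringBuilder();
        int i = 0;
        while (i < line.length()) {
            char c = line.charAt(i);
            if (c == '\t') {
                indent.append("    ");
            } else if (c == ' ') {
                indent.append("  ");
            } else {
                break;
            }
            i++;
        }
        return indent + line.substring(i);
    }

    public static void main(String[] args) {
        System.out.println("[" + reindent("  int x = 1;") + "]");
        System.out.println("[" + reindent("\tint y = 2;") + "]");
    }
}
```

Because only the leading run of whitespace is touched, spaces inside string literals or between tokens are preserved.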

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mohit Sabharwal (JIRA | 3 Jul 12:34 2015

[Commented] (PIG-4613) Fix unit test failures about TestAssert


    [
https://issues.apache.org/jira/browse/PIG-4613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14613113#comment-14613113
] 

Mohit Sabharwal commented on PIG-4613:
--------------------------------------

Thanks, [~kexianda], [~kellyzly], LGTM

+1 (non-binding)

> Fix unit test failures about TestAssert
> ---------------------------------------
>
>                 Key: PIG-4613
>                 URL: https://issues.apache.org/jira/browse/PIG-4613
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: kexianda
>            Assignee: kexianda
>             Fix For: spark-branch
>
>         Attachments: PIG-4613.patch
>
>
> Unit tests failed in the following cases:
> org.apache.pig.test.TestAssert.testNegativeWithoutFetch
> org.apache.pig.test.TestAssert.testNegative


Mikko Kupsu (JIRA | 3 Jul 08:51 2015

[Commented] (PIG-4515) org.apache.pig.builtin.Distinct throws ClassCastException


    [
https://issues.apache.org/jira/browse/PIG-4515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612914#comment-14612914
] 

Mikko Kupsu commented on PIG-4515:
----------------------------------

[~rohini]: What is the status of this bug? It is still present in 0.15.0!

> org.apache.pig.builtin.Distinct throws ClassCastException
> ---------------------------------------------------------
>
>                 Key: PIG-4515
>                 URL: https://issues.apache.org/jira/browse/PIG-4515
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>         Environment: 2015-04-23 08:37:49,117 [main] INFO  org.apache.pig.Main - Apache Pig version 0.14.0 (r1640057) compiled Nov 16 2014, 18:02:05
>            Reporter: Mikko Kupsu
>         Attachments: fix_singletuplebag_classcast_exception.patch, fix_singletuplebag_classcast_exception_2.patch
>
>
> Running the script below causes a *ClassCastException*.
> {code}
> A = LOAD 'A' AS (a:int, b:int);
> B = GROUP A BY a;
> C = FOREACH B GENERATE Distinct(A);
> DUMP C;
> {code}
> Content of A:
> {code}
> 1	1
> 2	1
> 3	1
> 4	1
> 5	2
> 6	2
> 7	2
> 8	2
> 9	2
> {code}
> {code}
> Caused by: java.lang.ClassCastException: org.apache.pig.data.SingleTupleBag cannot be cast to org.apache.pig.data.Tuple
> 	at org.apache.pig.builtin.Distinct$Initial.exec(Distinct.java:86)
> 	at org.apache.pig.builtin.Distinct$Initial.exec(Distinct.java:78)
> 	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:323)
> 	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNextTuple(POUserFunc.java:362)
> 	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:361)
> {code}


jira | 3 Jul 08:00 2015

Subscription: PIG patch available

Issue Subscription
Filter: PIG patch available (26 issues)

Subscriber: pigdaily

Key         Summary
PIG-4619    Cleanup: change the indent size of some files of pig on spark project from 2 to 4 space
            https://issues.apache.org/jira/browse/PIG-4619
PIG-4618    When use tez as the engine , set pig.user.cache.enabled=true  do  not take effect  
            https://issues.apache.org/jira/browse/PIG-4618
PIG-4598    Allow user defined plan optimizer rules
            https://issues.apache.org/jira/browse/PIG-4598
PIG-4581    thread safe issue in NodeIdGenerator
            https://issues.apache.org/jira/browse/PIG-4581
PIG-4539    New PigUnit
            https://issues.apache.org/jira/browse/PIG-4539
PIG-4526    Make setting up the build environment easier
            https://issues.apache.org/jira/browse/PIG-4526
PIG-4468    Pig's jackson version conflicts with that of hadoop 2.6.0
            https://issues.apache.org/jira/browse/PIG-4468
PIG-4455    Should use DependencyOrderWalker instead of DepthFirstWalker in MRPrinter
            https://issues.apache.org/jira/browse/PIG-4455
PIG-4417    Pig's register command should support automatic fetching of jars from repo.
            https://issues.apache.org/jira/browse/PIG-4417
PIG-4373    Implement Optimize the use of DistributedCache(PIG-2672) and PIG-3861 in Tez
            https://issues.apache.org/jira/browse/PIG-4373
PIG-4341    Add CMX support to pig.tmpfilecompression.codec
            https://issues.apache.org/jira/browse/PIG-4341
PIG-4323    PackageConverter hanging in Spark
            https://issues.apache.org/jira/browse/PIG-4323
PIG-4313    StackOverflowError in LIMIT operation on Spark
            https://issues.apache.org/jira/browse/PIG-4313
PIG-4251    Pig on Storm
            https://issues.apache.org/jira/browse/PIG-4251
PIG-4111    Make Pig compiles with avro-1.7.7
            https://issues.apache.org/jira/browse/PIG-4111
PIG-4002    Disable combiner when map-side aggregation is used
            https://issues.apache.org/jira/browse/PIG-4002
PIG-3952    PigStorage accepts '-tagSplit' to return full split information
            https://issues.apache.org/jira/browse/PIG-3952
PIG-3911    Define unique fields with @OutputSchema
            https://issues.apache.org/jira/browse/PIG-3911
PIG-3877    Getting Geo Latitude/Longitude from Address Lines
            https://issues.apache.org/jira/browse/PIG-3877
PIG-3873    Geo distance calculation using Haversine
            https://issues.apache.org/jira/browse/PIG-3873
PIG-3866    Create ThreadLocal classloader per PigContext
            https://issues.apache.org/jira/browse/PIG-3866
PIG-3864    ToDate(userstring, format, timezone) computes DateTime with strange handling of Daylight Saving Time with location based timezones
            https://issues.apache.org/jira/browse/PIG-3864
PIG-3851    Upgrade jline to 2.11
            https://issues.apache.org/jira/browse/PIG-3851
PIG-3668    COR built-in function when atleast one of the coefficient values is NaN
            https://issues.apache.org/jira/browse/PIG-3668
PIG-3635    Fix e2e tests for Hadoop 2.X on Windows
            https://issues.apache.org/jira/browse/PIG-3635
PIG-3587    add functionality for rolling over dates
            https://issues.apache.org/jira/browse/PIG-3587

You may edit this subscription at:
https://issues.apache.org/jira/secure/FilterSubscription!default.jspa?subId=16328&filterId=12322384

liyunzhang_intel (JIRA | 3 Jul 03:53 2015

[Updated] (PIG-4619) Cleanup: change the indent size of some files of pig on spark project from 2 to 4 space


     [
https://issues.apache.org/jira/browse/PIG-4619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

liyunzhang_intel updated PIG-4619:
----------------------------------
    Status: Patch Available  (was: Open)

> Cleanup: change the indent size of some files of pig on spark project from 2 to 4 space
> ---------------------------------------------------------------------------------------
>
>                 Key: PIG-4619
>                 URL: https://issues.apache.org/jira/browse/PIG-4619
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: liyunzhang_intel
>            Assignee: liyunzhang_intel
>             Fix For: spark-branch
>
>         Attachments: PIG-4619.patch, indentSize.png
>
>
> The following files in the Pig on Spark project use 2-space indentation:
> org.apache.pig.backend.hadoop.executionengine.spark.converter.CollectedGroupConverter
> org.apache.pig.backend.hadoop.executionengine.spark.JobMetricsListener
> org.apache.pig.backend.hadoop.executionengine.spark.SparkLocalExecType
> All files in this project should now use 4-space indentation.
> In addition, SparkLauncher.java uses tabs instead of spaces. We do not use tabs in place of spaces
> anywhere else in this project, so this file needs to be changed as well.
