jira | 30 May 08:00 2015

Subscription: PIG patch available

Issue Subscription
Filter: PIG patch available (27 issues)

Subscriber: pigdaily

Key         Summary
PIG-4578    ToDateISO should support optional ' ' space variant used by JDBC
            https://issues.apache.org/jira/browse/PIG-4578
PIG-4570    Allow AvroStorage to use a class for the schema
            https://issues.apache.org/jira/browse/PIG-4570
PIG-4539    New PigUnit
            https://issues.apache.org/jira/browse/PIG-4539
PIG-4526    Make setting up the build environment easier
            https://issues.apache.org/jira/browse/PIG-4526
PIG-4468    Pig's jackson version conflicts with that of hadoop 2.6.0
            https://issues.apache.org/jira/browse/PIG-4468
PIG-4455    Should use DependencyOrderWalker instead of DepthFirstWalker in MRPrinter
            https://issues.apache.org/jira/browse/PIG-4455
PIG-4417    Pig's register command should support automatic fetching of jars from repo.
            https://issues.apache.org/jira/browse/PIG-4417
PIG-4373    Implement Optimize the use of DistributedCache(PIG-2672) and PIG-3861 in Tez
            https://issues.apache.org/jira/browse/PIG-4373
PIG-4365    TOP udf should implement Accumulator interface
            https://issues.apache.org/jira/browse/PIG-4365
PIG-4341    Add CMX support to pig.tmpfilecompression.codec
            https://issues.apache.org/jira/browse/PIG-4341
PIG-4323    PackageConverter hanging in Spark
            https://issues.apache.org/jira/browse/PIG-4323
PIG-4313    StackOverflowError in LIMIT operation on Spark
            https://issues.apache.org/jira/browse/PIG-4313

Mohit Sabharwal (JIRA | 29 May 22:10 2015

[Updated] (PIG-4243) Fix "TestStore" for Spark engine


     [
https://issues.apache.org/jira/browse/PIG-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mohit Sabharwal updated PIG-4243:
---------------------------------
    Summary: Fix "TestStore" for Spark engine  (was: Enable unit test "TestStore" for spark)

> Fix "TestStore" for Spark engine
> --------------------------------
>
>                 Key: PIG-4243
>                 URL: https://issues.apache.org/jira/browse/PIG-4243
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: liyunzhang_intel
>            Assignee: Mohit Sabharwal
>             Fix For: spark-branch
>
>         Attachments: TEST-org.apache.pig.test.TestStore.txt
>
>
> 1. Build spark and pig env according to PIG-4168
> 2. add TestStore to $PIG_HOME/test/spark-tests
> cat  $PIG_HOME/test/spark-tests
> **/TestStore
> 3. run unit test TestStore
> ant test-spark
> 4. the unit test fails

Mohit Sabharwal (JIRA | 29 May 22:08 2015

[Assigned] (PIG-4243) Enable unit test "TestStore" for spark


     [
https://issues.apache.org/jira/browse/PIG-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mohit Sabharwal reassigned PIG-4243:
------------------------------------

    Assignee: Mohit Sabharwal  (was: Ranjana Rajendran)

> Enable unit test "TestStore" for spark
> --------------------------------------
>
>                 Key: PIG-4243
>                 URL: https://issues.apache.org/jira/browse/PIG-4243
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: liyunzhang_intel
>            Assignee: Mohit Sabharwal
>             Fix For: spark-branch
>
>         Attachments: TEST-org.apache.pig.test.TestStore.txt
>
>
> 1. Build spark and pig env according to PIG-4168
> 2. add TestStore to $PIG_HOME/test/spark-tests
> cat  $PIG_HOME/test/spark-tests
> **/TestStore
> 3. run unit test TestStore
> ant test-spark

Michael Howard (JIRA | 29 May 19:17 2015

[Created] (PIG-4579) casting of primitive datetime data type should work

Michael Howard created PIG-4579:
-----------------------------------

             Summary: casting of primitive datetime data type should work
                 Key: PIG-4579
                 URL: https://issues.apache.org/jira/browse/PIG-4579
             Project: Pig
          Issue Type: Improvement
            Reporter: Michael Howard
            Priority: Minor

datetime is a primitive data type.
One should be able to cast a chararray or a long to a datetime.
Currently, this does not work.

Casting from a chararray should call the built-in UDF ToDateISO(chararray).
Casting from a long should call the built-in UDF ToDate(long).
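
For illustration only, the two requested conversions can be sketched in plain Java. This is not Pig code: the class and method names are hypothetical, java.time stands in for whatever date library Pig uses internally, and the sketch assumes the long is a count of milliseconds since the Unix epoch.

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;

// Hypothetical stand-in for the proposed casts; none of these names exist in Pig.
final class DatetimeCastSketch {

    // Cast from a long: ASSUMES the value is milliseconds since 1970-01-01T00:00:00Z.
    static OffsetDateTime fromLong(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis).atOffset(ZoneOffset.UTC);
    }

    // Cast from a chararray: ASSUMES a strict ISO-8601 string with an offset,
    // for example "2015-05-29T19:17:00Z".
    static OffsetDateTime fromChararray(String iso) {
        return OffsetDateTime.parse(iso);
    }

    public static void main(String[] args) {
        System.out.println(fromLong(0L));
        System.out.println(fromChararray("2015-05-29T19:17:00Z"));
    }
}
```

In a real implementation the cast would presumably delegate to the existing UDFs rather than reimplement the parsing; the sketch only shows that both inputs map onto the same kind of value.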

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Michael Howard (JIRA | 29 May 18:32 2015

[Updated] (PIG-4578) ToDateISO should support optional ' ' space variant used by JDBC


     [
https://issues.apache.org/jira/browse/PIG-4578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Howard updated PIG-4578:
--------------------------------
    Attachment: iso8601AllowSpace.patch

> ToDateISO should support optional ' ' space variant used by JDBC
> ----------------------------------------------------------------
>
>                 Key: PIG-4578
>                 URL: https://issues.apache.org/jira/browse/PIG-4578
>             Project: Pig
>          Issue Type: Improvement
>          Components: internal-udfs
>            Reporter: Michael Howard
>            Assignee: Michael Howard
>            Priority: Minor
>         Attachments: iso8601AllowSpace.patch
>
>
> ISO-8601 standardizes text representation of dates and times. 
> Strict ISO-8601 requires a 'T' between the date and time portion. 
> ISO-8601 allows a ' ' space as a variant. 
> JDBC uses a ' ' space between the date and time portion. 
> Hive (& Impala) adopt the JDBC ' ' space variant. 
> The pig built-in UDF ToDateISO(chararray) currently accepts only the strict 'T' format. This makes it
> difficult to integrate with data from JDBC sources, including Hive. 
> ToDateISO(chararray) should allow either the 'T' or ' ' space variant when parsing string

Michael Howard (JIRA | 29 May 18:30 2015

[Updated] (PIG-4578) ToDateISO should support optional ' ' space variant used by JDBC


     [
https://issues.apache.org/jira/browse/PIG-4578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Howard updated PIG-4578:
--------------------------------
    Release Note: Built-in UDF ToDateISO(chararray) now allows a space character instead of requiring a 'T'
between date and time in an ISO-8601 timestamp. Facilitates parsing of JDBC timestamp format. 
          Status: Patch Available  (was: Open)

This is a minor change confined to src/org/apache/pig/builtin/ToDate.java.

The test case in test/org/apache/pig/test/TestBuiltin.java was also extended.

> ToDateISO should support optional ' ' space variant used by JDBC
> ----------------------------------------------------------------
>
>                 Key: PIG-4578
>                 URL: https://issues.apache.org/jira/browse/PIG-4578
>             Project: Pig
>          Issue Type: Improvement
>          Components: internal-udfs
>            Reporter: Michael Howard
>            Assignee: Michael Howard
>            Priority: Minor
>
> ISO-8601 standardizes text representation of dates and times. 
> Strict ISO-8601 requires a 'T' between the date and time portion. 
> ISO-8601 allows a ' ' space as a variant. 
> JDBC uses a ' ' space between the date and time portion. 

Michael Howard (JIRA | 29 May 18:14 2015

[Created] (PIG-4578) ToDateISO should support optional ' ' space variant used by JDBC

Michael Howard created PIG-4578:
-----------------------------------

             Summary: ToDateISO should support optional ' ' space variant used by JDBC
                 Key: PIG-4578
                 URL: https://issues.apache.org/jira/browse/PIG-4578
             Project: Pig
          Issue Type: Improvement
          Components: internal-udfs
            Reporter: Michael Howard
            Assignee: Michael Howard
            Priority: Minor

ISO-8601 standardizes text representation of dates and times. 
Strict ISO-8601 requires a 'T' between the date and time portion. 
ISO-8601 allows a ' ' space as a variant. 
JDBC uses a ' ' space between the date and time portion. 
Hive (& Impala) adopt the JDBC ' ' space variant. 

The Pig built-in UDF ToDateISO(chararray) currently accepts only the strict 'T' format. This makes it
difficult to integrate with data from JDBC sources, including Hive. 

ToDateISO(chararray) should allow either the 'T' or ' ' space variant when parsing string
representations of datetime primitives. 
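
As a rough sketch of the requested behavior, the separator can be normalized before a strict parser sees the string. This is not the attached patch and not Pig's actual code: the class and method names are hypothetical, java.time is used as a stand-in parser, and the sketch assumes the date portion is the 10-character form yyyy-MM-dd with no time zone offset.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper illustrating the 'T' or ' ' separator; not taken from ToDate.java.
final class IsoSpaceVariant {

    // Index of the date/time separator, ASSUMING a "yyyy-MM-dd" date portion.
    private static final int SEPARATOR_INDEX = 10;

    static LocalDateTime parseLenient(String text) {
        String s = text.trim();
        // Rewrite the JDBC-style space separator to the strict 'T' form.
        if (s.length() > SEPARATOR_INDEX && s.charAt(SEPARATOR_INDEX) == ' ') {
            s = s.substring(0, SEPARATOR_INDEX) + 'T' + s.substring(SEPARATOR_INDEX + 1);
        }
        // Strict ISO-8601 local date-time parser from the JDK.
        return LocalDateTime.parse(s, DateTimeFormatter.ISO_LOCAL_DATE_TIME);
    }

    public static void main(String[] args) {
        System.out.println(parseLenient("2015-05-29T18:14:07"));
        System.out.println(parseLenient("2015-05-29 18:14:07"));
    }
}
```

Both calls in main yield the same value, which is the point of the request: a timestamp exported from a JDBC source or Hive parses the same way as its strict 'T' form.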


Michael Howard (JIRA | 29 May 16:19 2015

[Commented] (PIG-4450) Make Pig Eclipse Setup Easier


    [
https://issues.apache.org/jira/browse/PIG-4450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14564876#comment-14564876
] 

Michael Howard commented on PIG-4450:
-------------------------------------

Fri 29 May 2015

Current trunk version is:
URL: http://svn.apache.org/repos/asf/pig/trunk
Revision: 1682438

The following changes should be made to: 
https://cwiki.apache.org/confluence/display/PIG/How+to+set+up+Eclipse+environment

Generate Eclipse Files 
-----
should say:
ant clean -Dhadoopversion=23 eclipse-files

If you do not specify "-Dhadoopversion=23", you get a bunch of Tez-related errors. This seems to be
caused by finding an old version of FileInputFormat from hadoop-core-1.0.4.jar instead of hadoop-common-2.6.0.jar.

Pre-compile and generate sources
---

This should say "Generate sources & pre-compile" ... we need to generate source code before we compile. 
This should also define hadoopversion=23

Apache Jenkins Server | 29 May 16:03 2015

Build failed in Jenkins: Pig-trunk-commit #2147

See <https://builds.apache.org/job/Pig-trunk-commit/2147/>

------------------------------------------
[...truncated 4394 lines...]
    [junit] Running org.apache.pig.test.TestNewPlanFilterRule
    [junit] Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.116 sec
    [junit] Running org.apache.pig.test.TestNewPlanImplicitSplit
    [junit] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 35.007 sec
    [junit] Running org.apache.pig.test.TestNewPlanListener
    [junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.368 sec
    [junit] Running org.apache.pig.test.TestNewPlanLogToPhyTranslationVisitor
    [junit] Tests run: 27, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.831 sec
    [junit] Running org.apache.pig.test.TestNewPlanLogicalOptimizer
    [junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.408 sec
    [junit] Running org.apache.pig.test.TestNewPlanOperatorPlan
    [junit] Tests run: 47, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.271 sec
    [junit] Running org.apache.pig.test.TestNewPlanPruneMapKeys
    [junit] Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.01 sec
    [junit] Running org.apache.pig.test.TestNewPlanPushDownForeachFlatten
    [junit] Tests run: 45, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 7.3 sec
    [junit] Running org.apache.pig.test.TestNewPlanPushUpFilter
    [junit] Tests run: 46, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 7.821 sec
    [junit] Running org.apache.pig.test.TestNewPlanRule
    [junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.356 sec
    [junit] Running org.apache.pig.test.TestNotEqualTo
    [junit] Tests run: 28, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.473 sec
    [junit] Running org.apache.pig.test.TestNull
    [junit] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.466 sec
    [junit] Running org.apache.pig.test.TestNullConstant
    [junit] Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 22.626 sec

Xuefu Zhang (JIRA | 29 May 15:17 2015

[Updated] (PIG-4565) Support custom MR partitioners for Spark engine


     [
https://issues.apache.org/jira/browse/PIG-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated PIG-4565:
-----------------------------
    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Committed to Spark branch. Thanks, Mohit.

> Support custom MR partitioners for Spark engine 
> ------------------------------------------------
>
>                 Key: PIG-4565
>                 URL: https://issues.apache.org/jira/browse/PIG-4565
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>    Affects Versions: spark-branch
>            Reporter: Mohit Sabharwal
>            Assignee: Mohit Sabharwal
>             Fix For: spark-branch
>
>         Attachments: PIG-4565.1.patch, PIG-4565.2.patch, PIG-4565.3.patch, PIG-4565.patch
>
>
> Shuffle operations like DISTINCT, GROUP, JOIN, CROSS allow custom MR partitioners to be specified.
> Example:
> {code}

liyunzhang_intel (JIRA | 29 May 08:24 2015

[Commented] (PIG-4565) Support custom MR partitioners for Spark engine


    [
https://issues.apache.org/jira/browse/PIG-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14564276#comment-14564276
] 

liyunzhang_intel commented on PIG-4565:
---------------------------------------

[~mohitsabharwal]:  For PIG-4565.3.patch, +1

> Support custom MR partitioners for Spark engine 
> ------------------------------------------------
>
>                 Key: PIG-4565
>                 URL: https://issues.apache.org/jira/browse/PIG-4565
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>    Affects Versions: spark-branch
>            Reporter: Mohit Sabharwal
>            Assignee: Mohit Sabharwal
>             Fix For: spark-branch
>
>         Attachments: PIG-4565.1.patch, PIG-4565.2.patch, PIG-4565.3.patch, PIG-4565.patch
>
>
> Shuffle operations like DISTINCT, GROUP, JOIN, CROSS allow custom MR partitioners to be specified.
> Example:
> {code}
> B = GROUP A BY $0 PARTITION BY org.apache.pig.test.utils.SimpleCustomPartitioner PARALLEL 2;

