Jonathan Coveney | 1 Mar 2012 02:47

Better map support?

Hey all! In the last couple of weeks I've found myself wanting better
map support in Pig. I'd be willing to do the work; I just wanted to get a
sense of what people think would be useful. And maybe some of this exists!

TOBAG - given a Map, outputs {(key,value)}
KEYSET - returns {(key)}, one tuple per unique key
VALUESET - returns {(value)}, one tuple per unique value
CONCAT - takes some number of Maps, and merges them together. The first
one's values will override the rest. Or the latest one will. Whatevs.
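
To make the proposed semantics concrete, here is a rough sketch in plain
Python rather than as Pig UDFs. The function names, the use of lists of
tuples to stand in for bags, and the first_wins flag are all assumptions for
illustration; the proposal above leaves the CONCAT precedence open.

```python
def to_bag(m):
    """TOBAG: turn a map into a bag of (key, value) tuples."""
    return [(k, v) for k, v in m.items()]


def keyset(m):
    """KEYSET: one single-field tuple per key (map keys are already unique)."""
    return [(k,) for k in m]


def valueset(m):
    """VALUESET: one single-field tuple per distinct value, in first-seen order."""
    unique = []
    for v in m.values():
        if (v,) not in unique:
            unique.append((v,))
    return unique


def concat_maps(*maps, first_wins=True):
    """CONCAT: merge maps; first_wins picks which map's value survives a key clash.

    The precedence rule is an assumption, exposed as a flag because the
    proposal does not settle it.
    """
    merged = {}
    # Apply the winning map last so that its values overwrite the others.
    ordered = reversed(maps) if first_wins else maps
    for m in ordered:
        merged.update(m)
    return merged
```

For example, concat_maps({"k": 1}, {"k": 2, "j": 3}) keeps k from the first
map, while passing first_wins=False keeps k from the last one.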

Not sure what else would be useful. I'm including user@ on this because I'd
love to hear any map manipulation features you crave. Maps, up to now,
have been pretty second-class, and I'd love to help change that.

Jon

[Updated] (PIG-2564) Build fails - Hadoop 0.23.1-SNAPSHOT no longer available


     [
https://issues.apache.org/jira/browse/PIG-2564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Weise updated PIG-2564:
------------------------------

    Status: Patch Available  (was: Open)

> Build fails - Hadoop 0.23.1-SNAPSHOT no longer available
> --------------------------------------------------------
>
>                 Key: PIG-2564
>                 URL: https://issues.apache.org/jira/browse/PIG-2564
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.9.2, 0.10, 0.11
>            Reporter: Thomas Weise
>            Assignee: Thomas Weise
>            Priority: Critical
>             Fix For: 0.10, 0.9.3, 0.11
>
>         Attachments: PIG-2564.patch
>
>
> Builds for 0.23 currently fail: 0.23.1-SNAPSHOT is no longer available, but 0.23.2-SNAPSHOT is:
> https://repository.apache.org/content/groups/snapshots-group/org/apache/hadoop/hadoop-hdfs/0.23.1-SNAPSHOT/
> https://repository.apache.org/content/groups/snapshots-group/org/apache/hadoop/hadoop-hdfs/0.23.2-SNAPSHOT/
> We should switch to the 0.23.1 release.


[Commented] (PIG-2564) Build fails - Hadoop 0.23.1-SNAPSHOT no longer available


    [
https://issues.apache.org/jira/browse/PIG-2564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219730#comment-13219730
] 

Thomas Weise commented on PIG-2564:
-----------------------------------

Tests pass (run for 0.9 branch).

                
> Build fails - Hadoop 0.23.1-SNAPSHOT no longer available
> --------------------------------------------------------
>
>                 Key: PIG-2564
>                 URL: https://issues.apache.org/jira/browse/PIG-2564
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.9.2, 0.10, 0.11
>            Reporter: Thomas Weise
>            Assignee: Thomas Weise
>            Priority: Critical
>             Fix For: 0.10, 0.9.3, 0.11
>
>         Attachments: PIG-2564.patch
>
>
> Builds for 0.23 currently fail: 0.23.1-SNAPSHOT is no longer available, but 0.23.2-SNAPSHOT is:
> https://repository.apache.org/content/groups/snapshots-group/org/apache/hadoop/hadoop-hdfs/0.23.1-SNAPSHOT/
> https://repository.apache.org/content/groups/snapshots-group/org/apache/hadoop/hadoop-hdfs/0.23.2-SNAPSHOT/

[Commented] (PIG-2532) Registered classes fail deserialization in frontend


    [
https://issues.apache.org/jira/browse/PIG-2532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219749#comment-13219749
] 

Travis Crawford commented on PIG-2532:
--------------------------------------

Thomas, thanks for the report. I will take a look at this test failure.

> Registered classes fail deserialization in frontend
> ---------------------------------------------------
>
>                 Key: PIG-2532
>                 URL: https://issues.apache.org/jira/browse/PIG-2532
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Travis Crawford
>            Assignee: Travis Crawford
>             Fix For: 0.10, 0.9.3
>
>         Attachments: PIG-2532-v2.patch, PIG-2532-v3.patch, PIG-2532-v4-branch-0.9.patch,
>                      PIG-2532-v4.patch, PIG-2532.patch, PIG-253_javax.zip
>
>
> This issue came up while integrating HCatalog with our environment. HCatalog jars are added to the pig
> command-line with {{-Dpig.additional.jars}} but fails (exception below). When added to the pig
> classpath the error goes away.
> We identified the issue as deserialization using the root class loader, not the context class loader set
> when the thread is created. This causes HCatSchema which is serialized into the context to fail

[Resolved] (PIG-2541) Automatic record provenance (source tagging) for PigStorage


     [
https://issues.apache.org/jira/browse/PIG-2541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai resolved PIG-2541.
-----------------------------

       Resolution: Fixed
    Fix Version/s: 0.11
     Release Note: 
We add a new option, -tagsource, to PigStorage. With this flag, INPUT_FILE_NAME is included as the first
column of the output data, e.g.:

a = load '1.txt' using PigStorage('\t', '-tagsource');
     Hadoop Flags: Reviewed
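
As a rough illustration of what the release note describes, the sketch below
mimics source tagging in plain Python: each loaded record gets the name of
its input file prepended as the first column. The function name
load_with_tagsource and the in-memory files mapping are hypothetical, and
whether Pig reports a bare file name or a full path is not stated in the
note, so the bare name used here is an assumption.

```python
def load_with_tagsource(files, delimiter="\t"):
    """Yield one tuple per input line, with the source file name as the first field.

    files maps a file name to that file's text; this stands in for real
    input files and is only for illustration.
    """
    for name, text in files.items():
        for line in text.splitlines():
            # Prepend the source name, then the delimiter-separated fields.
            yield (name, *line.split(delimiter))


if __name__ == "__main__":
    sample = {"1.txt": "alice\t30\nbob\t25"}
    for record in load_with_tagsource(sample):
        print(record)
```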

Unit tests pass. test-patch:
     [exec] -1 overall.  
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
     [exec] 
     [exec]     -1 javadoc.  The javadoc tool appears to have generated 1 warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec] 
     [exec]     -1 release audit.  The applied patch generated 533 release audit warnings (more than the trunk's

[Created] (PIG-2565) Support IMPORT for macros stored in S3 Buckets

Support IMPORT for macros stored in S3 Buckets
----------------------------------------------

                 Key: PIG-2565
                 URL: https://issues.apache.org/jira/browse/PIG-2565
             Project: Pig
          Issue Type: Improvement
          Components: impl
    Affects Versions: 0.9.2
            Reporter: Nikolai Avteniev

Macros enable building modular Pig programs. It would be useful to extend this functionality to Pig users
who are running Pig on the Amazon platform.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

Daniel Dai (Updated) (JIRA | 1 Mar 2012 08:18

[Updated] (PIG-2534) Pig generating infinite map outputs


     [
https://issues.apache.org/jira/browse/PIG-2534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-2534:
----------------------------

    Attachment: PIG-2534-2.patch

Attached PIG-2534-2.patch, which adds comments to setOutputUids and fixes a FindBugs warning. We can
certainly do the refactoring, but since that is lower priority, I will open a separate ticket to address it.

> Pig generating infinite map outputs
> -----------------------------------
>
>                 Key: PIG-2534
>                 URL: https://issues.apache.org/jira/browse/PIG-2534
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.9.1
>            Reporter: Vivek Padmanabhan
>         Attachments: PIG-2534-1.patch, PIG-2534-2.patch
>
>
> I am getting a strange behavior by Pig in the below script for Pig 0.9.
> {code}
> event_serve = LOAD 'input1'   AS (s, m, l);
> cm_data_raw = LOAD 'input2'  AS (s, m, l);
> SPLIT cm_data_raw INTO
>     cm_serve_raw IF (( (chararray) (s#'key1') == '0') AND ( (chararray) (s#'key2') == '5')),

[Resolved] (PIG-2534) Pig generating infinite map outputs


     [
https://issues.apache.org/jira/browse/PIG-2534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai resolved PIG-2534.
-----------------------------

       Resolution: Fixed
    Fix Version/s: 0.11
                   0.9.3
                   0.10
         Assignee: Daniel Dai
     Hadoop Flags: Reviewed

Unit tests pass. test-patch:
     [exec] -1 overall.  
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
     [exec] 
     [exec]     -1 javadoc.  The javadoc tool appears to have generated 1 warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec] 
     [exec]     -1 release audit.  The applied patch generated 533 release audit warnings (more than the trunk's
current 530 warnings).


Daniel Dai (Created) (JIRA | 1 Mar 2012 08:26

[Created] (PIG-2566) Refactory ColumnPruneHelper not throw exception in normal flow

Refactory ColumnPruneHelper not throw exception in normal flow
--------------------------------------------------------------

                 Key: PIG-2566
                 URL: https://issues.apache.org/jira/browse/PIG-2566
             Project: Pig
          Issue Type: Improvement
            Reporter: Daniel Dai



[Commented] (PIG-2534) Pig generating infinite map outputs


    [
https://issues.apache.org/jira/browse/PIG-2534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219877#comment-13219877
] 

Daniel Dai commented on PIG-2534:
---------------------------------

Opened PIG-2566 to track Dmitriy's comment.

> Pig generating infinite map outputs
> -----------------------------------
>
>                 Key: PIG-2534
>                 URL: https://issues.apache.org/jira/browse/PIG-2534
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.9.1
>            Reporter: Vivek Padmanabhan
>            Assignee: Daniel Dai
>             Fix For: 0.10, 0.9.3, 0.11
>
>         Attachments: PIG-2534-1.patch, PIG-2534-2.patch
>
>
> I am getting a strange behavior by Pig in the below script for Pig 0.9.
> {code}
> event_serve = LOAD 'input1'   AS (s, m, l);
> cm_data_raw = LOAD 'input2'  AS (s, m, l);
> SPLIT cm_data_raw INTO