Niels Basjes (JIRA) | 5 Aug 11:05 2015

[Updated] (PIG-4638) Allow TOMAP to accept dynamically sized input


     [
https://issues.apache.org/jira/browse/PIG-4638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Niels Basjes updated PIG-4638:
------------------------------
    Status: Patch Available  (was: Open)

> Allow TOMAP to accept dynamically sized input 
> ----------------------------------------------
>
>                 Key: PIG-4638
>                 URL: https://issues.apache.org/jira/browse/PIG-4638
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.15.0
>            Reporter: Niels Basjes
>            Assignee: Niels Basjes
>             Fix For: 0.16.0
>
>         Attachments: PIG-4638-20150723.patch, PIG-4638-20150805-1058.patch
>
>
> Currently the TOMAP function only supports a parameter list of values.
> Triggered by reading http://stackoverflow.com/q/17847970/ 
> {quote}I want to convert a bag of tuples to a map with specific value in each tuple as key. Basically I want to change:
> \{(id1, value1),(id2, value2), ...\} into \[id1#value1, id2#value2\]{quote}
> I propose to make the TOMAP accept both the current form
> {code}TOMAP($0, $1, $2, $3){code}
> and a new form that takes a single parameter:
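
For illustration only, the two call shapes discussed in this issue can be sketched in plain Java. The second method assumes that the new form takes one bag of (key, value) tuples, as in the quoted Stack Overflow question. The names TomapSketch, fromFlatArgs and fromPairs are hypothetical, plain Java collections stand in for Pig's tuple, bag and map types, and this is not the code from the attached patch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch: not Pig's TOMAP implementation.
class TomapSketch {

    // Current form: alternating keys and values, as in TOMAP($0, $1, $2, $3).
    static Map<String, Object> fromFlatArgs(Object... args) {
        if (args.length % 2 != 0) {
            throw new IllegalArgumentException("expected an even number of arguments");
        }
        Map<String, Object> result = new LinkedHashMap<>();
        for (int i = 0; i < args.length; i += 2) {
            // Keys are rendered as strings in this sketch.
            result.put(String.valueOf(args[i]), args[i + 1]);
        }
        return result;
    }

    // Assumed new form: a single bag holding any number of (key, value) pairs.
    static Map<String, Object> fromPairs(List<Object[]> bag) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Object[] pair : bag) {
            if (pair.length != 2) {
                throw new IllegalArgumentException("each tuple must hold a key and a value");
            }
            result.put(String.valueOf(pair[0]), pair[1]);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Object[]> bag = new ArrayList<>();
        bag.add(new Object[] {"id1", "value1"});
        bag.add(new Object[] {"id2", "value2"});

        System.out.println(fromFlatArgs("id1", "value1", "id2", "value2"));
        System.out.println(fromPairs(bag));
    }
}
```

With the bag form the number of entries does not have to be known when the script is written, which is the "dynamically sized input" in the issue title.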

Niels Basjes (JIRA) | 5 Aug 11:05 2015

[Updated] (PIG-4405) Adding 'map[]' support to mock/Storage


     [
https://issues.apache.org/jira/browse/PIG-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Niels Basjes updated PIG-4405:
------------------------------
    Status: Patch Available  (was: Open)

> Adding 'map[]' support to mock/Storage
> --------------------------------------
>
>                 Key: PIG-4405
>                 URL: https://issues.apache.org/jira/browse/PIG-4405
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.14.0
>            Reporter: Niels Basjes
>            Assignee: Niels Basjes
>             Fix For: 0.16.0
>
>         Attachments: PIG-4405-20150723.patch, PIG-4405-20150805-1058.patch
>
>
> The mock/Storage contains convenience methods for creating a bag and a tuple when doing unit tests. Pig, however, has three complex data types (see http://pig.apache.org/docs/r0.14.0/basic.html#Simple+and+Complex ), and the third one (the map) does not yet have such a convenience method.
> Feature request: Add such a method to facilitate testing map[] output better.


Niels Basjes (JIRA) | 5 Aug 11:05 2015

[Updated] (PIG-4638) Allow TOMAP to accept dynamically sized input


     [
https://issues.apache.org/jira/browse/PIG-4638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Niels Basjes updated PIG-4638:
------------------------------
    Attachment: PIG-4638-20150805-1058.patch

Updated patch to follow the changes in PIG-4405.

> Allow TOMAP to accept dynamically sized input 
> ----------------------------------------------
>
>                 Key: PIG-4638
>                 URL: https://issues.apache.org/jira/browse/PIG-4638
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.15.0
>            Reporter: Niels Basjes
>            Assignee: Niels Basjes
>             Fix For: 0.16.0
>
>         Attachments: PIG-4638-20150723.patch, PIG-4638-20150805-1058.patch
>
>
> Currently the TOMAP function only supports a parameter list of values.
> Triggered by reading http://stackoverflow.com/q/17847970/ 
> {quote}I want to convert a bag of tuples to a map with specific value in each tuple as key. Basically I want to change:
> \{(id1, value1),(id2, value2), ...\} into \[id1#value1, id2#value2\]{quote}
> I propose to make the TOMAP accept both the current form

Niels Basjes (JIRA) | 5 Aug 11:03 2015

[Updated] (PIG-4405) Adding 'map[]' support to mock/Storage


     [
https://issues.apache.org/jira/browse/PIG-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Niels Basjes updated PIG-4405:
------------------------------
    Attachment: PIG-4405-20150805-1058.patch

Updated patch that uses the form indicated by [~daijy]

> Adding 'map[]' support to mock/Storage
> --------------------------------------
>
>                 Key: PIG-4405
>                 URL: https://issues.apache.org/jira/browse/PIG-4405
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.14.0
>            Reporter: Niels Basjes
>            Assignee: Niels Basjes
>             Fix For: 0.16.0
>
>         Attachments: PIG-4405-20150723.patch, PIG-4405-20150805-1058.patch
>
>
> The mock/Storage contains convenience methods for creating a bag and a tuple when doing unit tests. Pig, however, has three complex data types (see http://pig.apache.org/docs/r0.14.0/basic.html#Simple+and+Complex ), and the third one (the map) does not yet have such a convenience method.
> Feature request: Add such a method to facilitate testing map[] output better.

Niels Basjes (JIRA) | 5 Aug 09:20 2015

[Updated] (PIG-4526) Make setting up the build environment easier


     [
https://issues.apache.org/jira/browse/PIG-4526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Niels Basjes updated PIG-4526:
------------------------------
    Status: Open  (was: Patch Available)

Ran into the same problem as HADOOP-11936

> Make setting up the build environment easier
> --------------------------------------------
>
>                 Key: PIG-4526
>                 URL: https://issues.apache.org/jira/browse/PIG-4526
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Niels Basjes
>            Assignee: Niels Basjes
>             Fix For: 0.16.0
>
>         Attachments: PIG-4526-2015-04-30-1632.patch, PIG-4526-2015-05-01-1545.patch, PIG-4526-2015-05-03-0910.patch
>
>
> In AVRO-1537 and HADOOP-11843 a Docker-based solution was created to set up all the tools needed for a full build. This makes it much easier to reproduce issues and for new developers to get up and running.
> This issue is to 'copy/port' that setup into the Pig project.

--
This message was sent by Atlassian JIRA

Niels Basjes (JIRA) | 5 Aug 09:06 2015

[Updated] (PIG-4405) Adding 'map[]' support to mock/Storage


     [
https://issues.apache.org/jira/browse/PIG-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Niels Basjes updated PIG-4405:
------------------------------
    Status: Open  (was: Patch Available)

> Adding 'map[]' support to mock/Storage
> --------------------------------------
>
>                 Key: PIG-4405
>                 URL: https://issues.apache.org/jira/browse/PIG-4405
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.14.0
>            Reporter: Niels Basjes
>            Assignee: Niels Basjes
>             Fix For: 0.16.0
>
>         Attachments: PIG-4405-20150723.patch
>
>
> The mock/Storage contains convenience methods for creating a bag and a tuple when doing unit tests. Pig, however, has three complex data types (see http://pig.apache.org/docs/r0.14.0/basic.html#Simple+and+Complex ), and the third one (the map) does not yet have such a convenience method.
> Feature request: Add such a method to facilitate testing map[] output better.


Niels Basjes (JIRA) | 5 Aug 09:06 2015

[Commented] (PIG-4405) Adding 'map[]' support to mock/Storage


    [
https://issues.apache.org/jira/browse/PIG-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654920#comment-14654920
] 

Niels Basjes commented on PIG-4405:
-----------------------------------

I agree, we should go for consistency.
I was unaware of the 'bag' syntax you showed.

> Adding 'map[]' support to mock/Storage
> --------------------------------------
>
>                 Key: PIG-4405
>                 URL: https://issues.apache.org/jira/browse/PIG-4405
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.14.0
>            Reporter: Niels Basjes
>            Assignee: Niels Basjes
>             Fix For: 0.16.0
>
>         Attachments: PIG-4405-20150723.patch
>
>
> The mock/Storage contains convenience methods for creating a bag and a tuple when doing unit tests. Pig, however, has three complex data types (see http://pig.apache.org/docs/r0.14.0/basic.html#Simple+and+Complex ), and the third one (the map) does not yet have such a convenience method.

Niels Basjes (JIRA) | 5 Aug 09:05 2015

[Updated] (PIG-4638) Allow TOMAP to accept dynamically sized input


     [
https://issues.apache.org/jira/browse/PIG-4638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Niels Basjes updated PIG-4638:
------------------------------
    Status: Open  (was: Patch Available)

> Allow TOMAP to accept dynamically sized input 
> ----------------------------------------------
>
>                 Key: PIG-4638
>                 URL: https://issues.apache.org/jira/browse/PIG-4638
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.15.0
>            Reporter: Niels Basjes
>            Assignee: Niels Basjes
>             Fix For: 0.16.0
>
>         Attachments: PIG-4638-20150723.patch
>
>
> Currently the TOMAP function only supports a parameter list of values.
> Triggered by reading http://stackoverflow.com/q/17847970/ 
> {quote}I want to convert a bag of tuples to a map with specific value in each tuple as key. Basically I want to change:
> \{(id1, value1),(id2, value2), ...\} into \[id1#value1, id2#value2\]{quote}
> I propose to make the TOMAP accept both the current form
> {code}TOMAP($0, $1, $2, $3){code}
> and a new form that takes a single parameter:

jira | 5 Aug 08:00 2015

Subscription: PIG patch available

Issue Subscription
Filter: PIG patch available (31 issues)

Subscriber: pigdaily

Key         Summary
PIG-4649    [Pig on Tez] Union followed by HCatStorer misses some data
            https://issues.apache.org/jira/browse/PIG-4649
PIG-4644    PORelationToExprProject.clone() is broken
            https://issues.apache.org/jira/browse/PIG-4644
PIG-4638    Allow TOMAP to accept dynamically sized input 
            https://issues.apache.org/jira/browse/PIG-4638
PIG-4629    org.apache.hadoop.hive.ql.exec.FunctionRegistry#getFunctionInfo() throws SemanticException since Hive 1.1.0
            https://issues.apache.org/jira/browse/PIG-4629
PIG-4628    Pig 0.14 job with order by fails in mapreduce mode with Oozie
            https://issues.apache.org/jira/browse/PIG-4628
PIG-4598    Allow user defined plan optimizer rules
            https://issues.apache.org/jira/browse/PIG-4598
PIG-4581    thread safe issue in NodeIdGenerator
            https://issues.apache.org/jira/browse/PIG-4581
PIG-4539    New PigUnit
            https://issues.apache.org/jira/browse/PIG-4539
PIG-4526    Make setting up the build environment easier
            https://issues.apache.org/jira/browse/PIG-4526
PIG-4515    org.apache.pig.builtin.Distinct throws ClassCastException
            https://issues.apache.org/jira/browse/PIG-4515
PIG-4468    Pig's jackson version conflicts with that of hadoop 2.6.0
            https://issues.apache.org/jira/browse/PIG-4468
PIG-4455    Should use DependencyOrderWalker instead of DepthFirstWalker in MRPrinter

Daniel Dai (JIRA) | 5 Aug 07:14 2015

[Commented] (PIG-4612) accumulating upon filters is still accumulating


    [
https://issues.apache.org/jira/browse/PIG-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654829#comment-14654829
] 

Daniel Dai commented on PIG-4612:
---------------------------------

You will need to call initial/intermediate; otherwise, who constructs the intermediate output? The
knowledge of how to construct the intermediate output is captured in the user-defined Intermediate function.
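
As a rough sketch of the point above, the three stages of an algebraic function can be written as plain Java methods. An average is used here as a stand-in aggregate, because its partial result (a sum and a count) visibly differs from both the raw input and the final value. The class and method names are hypothetical and this is not Pig's Algebraic interface.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative sketch of initial / intermediate / final stages.
class AlgebraicAverageSketch {

    // Initial: turns one raw input value into a partial result {sum, count}.
    static long[] initial(long value) {
        return new long[] {value, 1L};
    }

    // Intermediate: merges several partial results into one partial result.
    // Only this function knows what a partial result looks like.
    static long[] intermediate(List<long[]> partials) {
        long sum = 0L;
        long count = 0L;
        for (long[] partial : partials) {
            sum += partial[0];
            count += partial[1];
        }
        return new long[] {sum, count};
    }

    // Final: merges the remaining partial results and produces the answer.
    static double finalStage(List<long[]> partials) {
        long[] total = intermediate(partials);
        if (total[1] == 0L) {
            throw new IllegalArgumentException("no input values");
        }
        return (double) total[0] / total[1];
    }

    public static void main(String[] args) {
        // Two groups of partials, as if they were combined separately.
        long[] left = intermediate(Arrays.asList(initial(2), initial(4)));
        long[] right = intermediate(Collections.singletonList(initial(9)));
        System.out.println(finalStage(Arrays.asList(left, right)));
    }
}
```

If the initial and intermediate steps were skipped, the final step would receive raw values rather than {sum, count} pairs and would have nothing valid to merge.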

> accumulating upon filters is still accumulating
> -----------------------------------------------
>
>                 Key: PIG-4612
>                 URL: https://issues.apache.org/jira/browse/PIG-4612
>             Project: Pig
>          Issue Type: Improvement
>          Components: internal-udfs
>    Affects Versions: 0.15.0
>         Environment: I use YARN, not Tez or Spark, but I think the problem also exists in those environments
>            Reporter: Remi Catherinot
>              Labels: performance
>
> Accumulators are not used when accumulating filter results. Here is a script with no filters, which ends up with a map-combine-reduce plan that efficiently uses the Accumulator design.
> A = LOAD '/some/data' AS (a:chararray,b:long,c:chararray);
> B = FOREACH (GROUP A BY (a)) {
>    GENERATE MAX(A.b) AS accumulated;
> }

Daniel Dai (JIRA) | 5 Aug 06:41 2015

[Commented] (PIG-4623) Fixed the 'new line' character inside double-quote causing the csv parsing failure


    [
https://issues.apache.org/jira/browse/PIG-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654796#comment-14654796
] 

Daniel Dai commented on PIG-4623:
---------------------------------

Can you generate a patch and attach it to the ticket?

> Fixed the 'new line' character inside double-quote causing the csv parsing failure
> ----------------------------------------------------------------------------------
>
>                 Key: PIG-4623
>                 URL: https://issues.apache.org/jira/browse/PIG-4623
>             Project: Pig
>          Issue Type: Bug
>          Components: piggybank
>            Reporter: Ken Wu
>            Assignee: Ken Wu
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> A newline character inside a double-quoted field should be allowed in a valid CSV document. For example, the following CSV document should be treated as a SINGLE valid CSV record:
> Iphone,"{ ItemName : Cheez-It
> 21 Ounce}",
> However, the current implementation of getNext() in the org.apache.pig.piggybank.storage.CSVLoader class fails to handle this case: it sees two lines of data where it should see a single record.
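
The behaviour asked for here can be sketched in plain Java as a reader that keeps appending physical lines while a double-quoted field is still open. This is only a minimal sketch under stated assumptions: QuotedRecordSketch and its methods are hypothetical names, it is not the CSVLoader code or the proposed fix, and it rejoins lines with '\n' regardless of the original line terminator.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: groups physical lines into logical CSV records.
class QuotedRecordSketch {

    // Reads one logical record; returns null at end of input.
    static String readRecord(BufferedReader in) throws IOException {
        String line = in.readLine();
        if (line == null) {
            return null;
        }
        StringBuilder record = new StringBuilder(line);
        while (hasOpenQuote(record)) {
            String next = in.readLine();
            if (next == null) {
                break; // unterminated quote: return what was read so far
            }
            record.append('\n').append(next);
        }
        return record.toString();
    }

    // An odd number of double quotes means a quoted field is still open.
    // An escaped quote ("") adds two quotes, so it does not change the parity.
    static boolean hasOpenQuote(CharSequence text) {
        boolean open = false;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == '"') {
                open = !open;
            }
        }
        return open;
    }

    // Splits a whole CSV document into logical records.
    static List<String> readAll(String csv) {
        List<String> records = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new StringReader(csv))) {
            String record;
            while ((record = readRecord(in)) != null) {
                records.add(record);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }

    public static void main(String[] args) {
        String csv = "Iphone,\"{ ItemName : Cheez-It\n21 Ounce}\",\nAndroid,plain,\n";
        for (String record : readAll(csv)) {
            System.out.println("[" + record + "]");
        }
    }
}
```

Applied to the example from the description, the first two physical lines come back as one record, with the embedded newline preserved inside the quoted field.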

