Xuefu Zhang (JIRA) | 1 Apr 01:20 2011

[Commented] (PIG-1931) Integrate Macro Expansion with New Parser


    [
https://issues.apache.org/jira/browse/PIG-1931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13014274#comment-13014274
] 

Xuefu Zhang commented on PIG-1931:
----------------------------------

+1 PIG-1931_4.patch looks good.

> Integrate Macro Expansion with New Parser
> -----------------------------------------
>
>                 Key: PIG-1931
>                 URL: https://issues.apache.org/jira/browse/PIG-1931
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.9.0
>            Reporter: Richard Ding
>            Assignee: Richard Ding
>             Fix For: 0.9.0
>
>         Attachments: PIG-1931_1.patch, PIG-1931_2.patch, PIG-1931_3.patch, PIG-1931_4.patch
>
>
> Currently Macro expansion is implemented as a preprocessor (PIG-1793) so that it can work with the old
parser. Now that the new parser has replaced the old parser in trunk, we can integrate macro expansion into the new
parser. This has many advantages, such as better error reporting, less code, and making Macro part of Pig Latin.
> To aid debugging, Pig command line option -r (dryrun) will produce a script with expanded macros (in

Alan Gates | 1 Apr 01:45 2011

Re: transform sql to pig

In Pig Latin data flows are more linear than SQL, so SQL subqueries  
tend to come first in Pig Latin scripts.  Given that, your first SQL  
query would look roughly like:

-- Find the records from s that will form the 'in' clause
A = load 'supplier' as (<whatever its schema is>);
B = filter A by s_comment matches '.*Customer.*Complaints.*';
C = foreach B generate s_suppkey; -- project out everything but the field you're interested in

D = load 'partsupp' as (<schema>);
E = load 'part' as (<schema>);
F = join D by ps_partkey, E by p_partkey; -- do the join
-- do the big hairy filter, pretty much the same except Pig doesn't support the IN syntax
G = filter F by p_brand != '[BRAND]' and not (p_type matches '[TYPE].*')
    and (p_size == '[SIZE1]' or p_size == '[SIZE2]' ...);

-- Rewrite the subquery as an anti-join
H = cogroup G by ps_suppkey, C by s_suppkey;
I = filter H by COUNT(C) == 0;
J = foreach I generate flatten(G);

-- Now do the group by
-- (order by supplier_cnt desc, p_brand, p_type, p_size is a separate step in Pig)
K = group J by (p_brand, p_type, p_size);
L = foreach K {
	L1 = J.ps_suppkey;
	L2 = distinct L1;
	generate group, COUNT(L2);
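
For readers less familiar with the cogroup trick, the anti-join step above (cogroup, keep the groups whose right side is empty, then flatten the left side) can be sketched in plain Python. This is only an illustration of the idea, not Pig's implementation; the helper names `cogroup` and `anti_join` and the sample rows are hypothetical and do not come from the thread.

```python
from collections import defaultdict


def cogroup(left, right, left_key, right_key):
    """Group two lists of dicts by key: key -> (left rows, right rows)."""
    groups = defaultdict(lambda: ([], []))
    for row in left:
        groups[row[left_key]][0].append(row)
    for row in right:
        groups[row[right_key]][1].append(row)
    return groups


def anti_join(left, right, left_key, right_key):
    """Keep left rows whose key has no match on the right side."""
    result = []
    for left_rows, right_rows in cogroup(left, right, left_key, right_key).values():
        if len(right_rows) == 0:      # plays the role of: filter H by COUNT(C) == 0
            result.extend(left_rows)  # plays the role of: generate flatten(G)
    return result


# Hypothetical sample data, invented for illustration only.
partsupp = [
    {"ps_suppkey": 1, "p_brand": "Brand#1"},
    {"ps_suppkey": 2, "p_brand": "Brand#2"},
    {"ps_suppkey": 3, "p_brand": "Brand#1"},
]
complaints = [{"s_suppkey": 2}]

kept = anti_join(partsupp, complaints, "ps_suppkey", "s_suppkey")
print(sorted(r["ps_suppkey"] for r in kept))  # prints [1, 3]
```

Supplier 2 appears on the complaints side, so its rows are dropped; suppliers 1 and 3 survive, which is what the SQL `not in` subquery would return.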

Xuefu Zhang (JIRA) | 1 Apr 02:00 2011

[Updated] (PIG-1947) Incorrect line number is reported during parsing


     [
https://issues.apache.org/jira/browse/PIG-1947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated PIG-1947:
-----------------------------

    Attachment: PIG-1947.patch

Temporary fix for the issue. With this patch, Pig correctly reports the line/column number for batch
processing. In interactive mode, the line number can be wrong, which probably doesn't matter since the user
should assume it's the last line entered. The column number can also be wrong, as in
previous versions.

We shall fully address the issue in the next release when we replace the grunt parser. 

> Incorrect line number is reported during parsing
> ------------------------------------------------
>
>                 Key: PIG-1947
>                 URL: https://issues.apache.org/jira/browse/PIG-1947
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.9.0
>            Reporter: Xuefu Zhang
>            Assignee: Xuefu Zhang
>             Fix For: 0.9.0
>
>         Attachments: PIG-1947.patch
>

deepak kumar v | 1 Apr 06:38 2011

Re: How to group on a group id that is present inside a complex hierarchy

Any response?

On Tue, Mar 29, 2011 at 3:32 PM, deepak kumar v <deepu.pig@...> wrote:

> Hi,
> Below is a list of tuples generated by a UDF.
>
> ( ( [stdout#{ (day, age, name, address, ['k1#v1','k2#v2'] ) } ] ) )
> ( ( [stdout#{ (12/2,22,deepak,newyork,  ['k1#v2','k2#v2'] ) } ] ) )
> ( ( [stdout#{ (12/3,22,deepak,newyork,  ['k1#v1','k2#v2'] ) } ] ) )
> group a -- ( v1 , { (day, age, name, address, ['k1#v1','k2#v2']
> ), (12/3,22,deepak,newjersy,  ['k1#v1','k2#v2']) } )
> group b -- ( v2 , { (12/2,22,deepak,newyork,  ['k1#v2','k2#v2'])} )
>
> I need to run group by on k1 so that I have two groups.
>
> *Approach #1*
> grped = group inputTuples by $0.$0.#'stdout'.$0.$0.$5#'k1'
>
> Error:
> 2011-03-29 15:16:44,589 [main] WARN  org.apache.pig.PigServer - Encountered
> Warning IMPLICIT_CAST_TO_MAP 1 time(s).
> 2011-03-29 15:16:44,589 [main] INFO
> org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script:
> GROUP_BY
> 2011-03-29 15:16:44,589 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -
> pig.usenewlogicalplan is set to true. New logical plan will be used.
> 2011-03-29 15:16:44,593 [main] ERROR org.apache.pig.tools.grunt.Grunt -
> ERROR 2042: Error in new logical plan. Try -Dpig.usenewlogicalplan=false.

Corinne Chandel (JIRA) | 1 Apr 18:08 2011

[Updated] (PIG-1772) Pig 090 Documentation


     [
https://issues.apache.org/jira/browse/PIG-1772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Corinne Chandel updated PIG-1772:
---------------------------------

    Attachment: pig-1772-beta-1-3.patch

pig-1772-beta-1-3.patch

Includes:
> JavaScript fix UDF doc (udf.xml)
> minor update for Control Flow doc (cont.xml)

> Pig 090 Documentation
> ---------------------
>
>                 Key: PIG-1772
>                 URL: https://issues.apache.org/jira/browse/PIG-1772
>             Project: Pig
>          Issue Type: Task
>          Components: documentation
>    Affects Versions: 0.9.0
>            Reporter: Corinne Chandel
>            Assignee: Corinne Chandel
>             Fix For: 0.9.0
>
>         Attachments: pig-1772-1.patch, pig-1772-2.patch, pig-1772-3.patch, pig-1772-beta-1-2.patch,
pig-1772-beta-1-3.patch, pig-1772-beta-1.patch

Richard Ding (JIRA) | 1 Apr 18:23 2011

[Commented] (PIG-1931) Integrate Macro Expansion with New Parser


    [
https://issues.apache.org/jira/browse/PIG-1931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13014719#comment-13014719
] 

Richard Ding commented on PIG-1931:
-----------------------------------

test-patch result:

{code}
     [exec] -1 overall.  
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     +1 tests included.  The patch appears to include 8 new or modified tests.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec] 
     [exec]     -1 release audit.  The applied patch generated 556 release audit warnings (more than the trunk's
current 555 warnings).
{code}

The release audit warning is HTML-related.

Unit tests pass.

Richard Ding (JIRA) | 1 Apr 18:31 2011

[Commented] (PIG-1931) Integrate Macro Expansion with New Parser


    [
https://issues.apache.org/jira/browse/PIG-1931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13014721#comment-13014721
] 

Richard Ding commented on PIG-1931:
-----------------------------------

patch 4 committed to trunk.

> Integrate Macro Expansion with New Parser
> -----------------------------------------
>
>                 Key: PIG-1931
>                 URL: https://issues.apache.org/jira/browse/PIG-1931
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.9.0
>            Reporter: Richard Ding
>            Assignee: Richard Ding
>             Fix For: 0.9.0
>
>         Attachments: PIG-1931_1.patch, PIG-1931_2.patch, PIG-1931_3.patch, PIG-1931_4.patch
>
>
> Currently Macro expansion is implemented as a preprocessor (PIG-1793) so that it can work with the old
parser. Now that the new parser has replaced the old parser in trunk, we can integrate macro expansion into the new
parser. This has many advantages, such as better error reporting, less code, and making Macro part of Pig Latin.
> To aid debugging, Pig command line option -r (dryrun) will produce a script with expanded macros (in

Olga Natkovich (JIRA) | 1 Apr 19:38 2011

[Assigned] (PIG-1948) java.lang.ClassCastException while using double value from result of a group


     [
https://issues.apache.org/jira/browse/PIG-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich reassigned PIG-1948:
-----------------------------------

    Assignee: Thejas M Nair

> java.lang.ClassCastException while using double value from result of a group
> ----------------------------------------------------------------------------
>
>                 Key: PIG-1948
>                 URL: https://issues.apache.org/jira/browse/PIG-1948
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.7.0, 0.8.0, 0.9.0
>            Reporter: Vivek Padmanabhan
>            Assignee: Thejas M Nair
>             Fix For: 0.8.0
>
>
> I have a fairly simple script (but with too many columns) which is failing with a class cast exception.
> {code}
> register myudf.jar;
> A = load 'newinput' as (datestamp: chararray,vtestid: chararray,src_kt1: chararray,f1:
chararray,f2: chararray,f3: chararray,f4: chararray,f5: chararray,f6: int,ipc: chararray,woeid:
long,woeid_place: chararray,f7: chararray,f8: double,woeid_latitude: double,f9:
chararray,woeid_town: chararray,woeid_county: chararray,a1: chararray,a2:

Olga Natkovich (JIRA) | 1 Apr 19:38 2011

[Updated] (PIG-1948) java.lang.ClassCastException while using double value from result of a group


     [
https://issues.apache.org/jira/browse/PIG-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich updated PIG-1948:
--------------------------------

    Fix Version/s: 0.8.0

> java.lang.ClassCastException while using double value from result of a group
> ----------------------------------------------------------------------------
>
>                 Key: PIG-1948
>                 URL: https://issues.apache.org/jira/browse/PIG-1948
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.7.0, 0.8.0, 0.9.0
>            Reporter: Vivek Padmanabhan
>            Assignee: Thejas M Nair
>             Fix For: 0.8.0
>
>
> I have a fairly simple script (but with too many columns) which is failing with a class cast exception.
> {code}
> register myudf.jar;
> A = load 'newinput' as (datestamp: chararray,vtestid: chararray,src_kt1: chararray,f1:
chararray,f2: chararray,f3: chararray,f4: chararray,f5: chararray,f6: int,ipc: chararray,woeid:
long,woeid_place: chararray,f7: chararray,f8: double,woeid_latitude: double,f9:
chararray,woeid_town: chararray,woeid_county: chararray,a1: chararray,a2:

Xuefu Zhang (JIRA) | 1 Apr 21:09 2011

[Created] (PIG-1956) Pig parser shouldn't log error code 0

Pig parser shouldn't log error code 0
-------------------------------------

                 Key: PIG-1956
                 URL: https://issues.apache.org/jira/browse/PIG-1956
             Project: Pig
          Issue Type: Bug
    Affects Versions: 0.9.0
            Reporter: Xuefu Zhang
            Assignee: Xuefu Zhang
            Priority: Minor
             Fix For: 0.9.0

For the following pig script:

a = load 'x' as (name, age, gpa);
b = group a by name;
c = foreach b { ba = filter a by age < '25'; bb = foreach ba generate gpa; generate group, flatten(bb);}

Parser gives the following error:

ERROR 0: <line 3, column 14> ...

It should give a more meaningful error message.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
