Rohini Palaniswamy (JIRA) | 31 Aug 13:51 2015

[Commented] (PIG-3622) Allow casting bytearray fields to bytearray type


    [ https://issues.apache.org/jira/browse/PIG-3622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14723323#comment-14723323 ]

Rohini Palaniswamy commented on PIG-3622:
-----------------------------------------

TestTypeCheckingValidatorNewLP.testExpressionTypeCheckingFail11 is failing because of this
patch. 

> Allow casting bytearray fields to bytearray type
> ------------------------------------------------
>
>                 Key: PIG-3622
>                 URL: https://issues.apache.org/jira/browse/PIG-3622
>             Project: Pig
>          Issue Type: Improvement
>         Environment: 0.12
>            Reporter: Redis Liu
>            Assignee: Redis Liu
>            Priority: Minor
>             Fix For: 0.16.0
>
>         Attachments: 3622-v2.patch
>
>
> test.pig:
> AA = load '1.txt' USING PigStorage(' ') as (a:bytearray, b:chararray, c:chararray);
> AA1 = filter AA by a == '1';

Apache Jenkins Server | 31 Aug 12:54 2015

Build failed in Jenkins: Pig-trunk-commit #2232

See <https://builds.apache.org/job/Pig-trunk-commit/2232/>

------------------------------------------
[...truncated 4424 lines...]
    [junit] Running org.apache.pig.test.TestNewPlanListener
    [junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.368 sec
    [junit] Running org.apache.pig.test.TestNewPlanLogToPhyTranslationVisitor
    [junit] Tests run: 27, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.859 sec
    [junit] Running org.apache.pig.test.TestNewPlanLogicalOptimizer
    [junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.408 sec
    [junit] Running org.apache.pig.test.TestNewPlanOperatorPlan
    [junit] Tests run: 47, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.09 sec
    [junit] Running org.apache.pig.test.TestNewPlanPruneMapKeys
    [junit] Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.995 sec
    [junit] Running org.apache.pig.test.TestNewPlanPushDownForeachFlatten
    [junit] Tests run: 45, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 7.038 sec
    [junit] Running org.apache.pig.test.TestNewPlanPushUpFilter
    [junit] Tests run: 46, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 7.071 sec
    [junit] Running org.apache.pig.test.TestNewPlanRule
    [junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.329 sec
    [junit] Running org.apache.pig.test.TestNotEqualTo
    [junit] Tests run: 28, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.451 sec
    [junit] Running org.apache.pig.test.TestNull
    [junit] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.443 sec
    [junit] Running org.apache.pig.test.TestNullConstant
    [junit] Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 23.519 sec
    [junit] Running org.apache.pig.test.TestNumberOfReducers
    [junit] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 516.569 sec
    [junit] Running org.apache.pig.test.TestOptimizeLimit
    [junit] Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.351 sec

Srikanth Sundarrajan (JIRA) | 31 Aug 10:13 2015

[Created] (PIG-4667) Enable Pig on Spark to run on Yarn Client/Cluster mode

Srikanth Sundarrajan created PIG-4667:
-----------------------------------------

             Summary: Enable Pig on Spark to run on Yarn Client/Cluster mode
                 Key: PIG-4667
                 URL: https://issues.apache.org/jira/browse/PIG-4667
             Project: Pig
          Issue Type: Sub-task
          Components: spark
            Reporter: Srikanth Sundarrajan
            Assignee: Srikanth Sundarrajan

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Xianda Ke (JIRA) | 31 Aug 10:04 2015

[Commented] (PIG-4655) Support InputStats in spark mode


    [ https://issues.apache.org/jira/browse/PIG-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14723187#comment-14723187 ]

Xianda Ke commented on PIG-4655:
--------------------------------

A new patch (PIG-4655-2.patch) is attached. [~mohitsabharwal], please help review it. Thanks.

> Support InputStats in spark mode
> --------------------------------
>
>                 Key: PIG-4655
>                 URL: https://issues.apache.org/jira/browse/PIG-4655
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: Xianda Ke
>            Assignee: Xianda Ke
>             Fix For: spark-branch
>
>         Attachments: PIG-4655-2.patch, PIG-4655.patch
>
>
> Currently, InputStats is not implemented in spark mode. 
> The JUnit case TestPigRunner.testEmptyFileCounter() will fail.


Xianda Ke (JIRA) | 31 Aug 10:02 2015

[Updated] (PIG-4655) Support InputStats in spark mode


     [ https://issues.apache.org/jira/browse/PIG-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xianda Ke updated PIG-4655:
---------------------------
    Attachment: PIG-4655-2.patch

RB request: https://reviews.apache.org/r/37636/

> Support InputStats in spark mode
> --------------------------------
>
>                 Key: PIG-4655
>                 URL: https://issues.apache.org/jira/browse/PIG-4655
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: Xianda Ke
>            Assignee: Xianda Ke
>             Fix For: spark-branch
>
>         Attachments: PIG-4655-2.patch, PIG-4655.patch
>
>
> Currently, InputStats is not implemented in spark mode. 
> The JUnit case TestPigRunner.testEmptyFileCounter() will fail.


Xianda Ke (JIRA) | 31 Aug 09:58 2015

[Commented] (PIG-4655) Support InputStats in spark mode


    [ https://issues.apache.org/jira/browse/PIG-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14723181#comment-14723181 ]

Xianda Ke commented on PIG-4655:
--------------------------------

Hi [~mohitsabharwal], Thanks for your comments.
1. The member declarations have been moved to the top of the class. Thanks.

2. addInputInfoForSparkOper() is a helper function that calls SparkJobStats.addInputStats().
For each POStore, we start a job and then create a SparkJobStats to collect the I/O statistics. When a
SparkOperator has multiple POStores, we create multiple SparkJobStats. But the input info (POLoads) of
a SparkOperator should be collected only once. To avoid collecting the input info repeatedly, we need a
Set of SparkOperators that records whether the input info of a SparkOperator has already been computed. I
think it is better to put this Set in SparkPigStats. That's why I created the helper function
addInputInfoForSparkOper() and didn't put it in class SparkJobStats. Any comments?
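
The collect-once idea in point 2 can be sketched roughly as follows. This is only an illustration, not the code from the patch: SketchOperator, SketchJobStats and SketchPigStats are hypothetical stand-ins for SparkOperator, SparkJobStats and SparkPigStats, and the string "loads" stand in for POLoads. It assumes the Set lives in the aggregate stats object and that each per-store job stats object is handed to the helper.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a SparkOperator: only its identity and its loads matter here.
final class SketchOperator {
    final String name;
    final List<String> loads; // stand-ins for the operator's POLoads

    SketchOperator(String name, List<String> loads) {
        this.name = name;
        this.loads = loads;
    }
}

// Hypothetical stand-in for the per-store job statistics.
final class SketchJobStats {
    final List<String> inputs = new ArrayList<>();

    void addInputStats(String load) {
        inputs.add(load);
    }
}

// Hypothetical stand-in for the aggregate stats object that owns the "already seen" Set.
final class SketchPigStats {
    private final Set<SketchOperator> operatorsWithInputInfo = new HashSet<>();

    // Records the operator's inputs on jobStats only the first time the operator is seen.
    // Returns true if the inputs were recorded and false if they were skipped.
    boolean addInputInfoForSparkOper(SketchOperator op, SketchJobStats jobStats) {
        if (!operatorsWithInputInfo.add(op)) {
            return false; // input info for this operator was already collected
        }
        for (String load : op.loads) {
            jobStats.addInputStats(load);
        }
        return true;
    }
}
```

With this shape, an operator that has two stores produces two job stats objects, but only the first one receives the input entries, which is the behaviour the comment argues for.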

Thanks,
Xianda

> Support InputStats in spark mode
> --------------------------------
>
>                 Key: PIG-4655
>                 URL: https://issues.apache.org/jira/browse/PIG-4655
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark

jira | 31 Aug 08:00 2015

Subscription: PIG patch available

Issue Subscription
Filter: PIG patch available (29 issues)

Subscriber: pigdaily

Key         Summary
PIG-4663    HBaseStorage should allow the MaxResultsPerColumnFamily limit to avoid memory or scan timeout issues
            https://issues.apache.org/jira/browse/PIG-4663
PIG-4656    Improve String serialization and comparator performance in BinInterSedes
            https://issues.apache.org/jira/browse/PIG-4656
PIG-4644    PORelationToExprProject.clone() is broken
            https://issues.apache.org/jira/browse/PIG-4644
PIG-4629    org.apache.hadoop.hive.ql.exec.FunctionRegistry#getFunctionInfo() throws SemanticException since Hive 1.1.0
            https://issues.apache.org/jira/browse/PIG-4629
PIG-4598    Allow user defined plan optimizer rules
            https://issues.apache.org/jira/browse/PIG-4598
PIG-4581    thread safe issue in NodeIdGenerator
            https://issues.apache.org/jira/browse/PIG-4581
PIG-4539    New PigUnit
            https://issues.apache.org/jira/browse/PIG-4539
PIG-4534    Pig 0.14.0 with Hive 1.1.0, gives unresolved dependency error for hive-shims-common-secure
            https://issues.apache.org/jira/browse/PIG-4534
PIG-4515    org.apache.pig.builtin.Distinct throws ClassCastException
            https://issues.apache.org/jira/browse/PIG-4515
PIG-4468    Pig's jackson version conflicts with that of hadoop 2.6.0
            https://issues.apache.org/jira/browse/PIG-4468
PIG-4455    Should use DependencyOrderWalker instead of DepthFirstWalker in MRPrinter
            https://issues.apache.org/jira/browse/PIG-4455
PIG-4417    Pig's register command should support automatic fetching of jars from repo.

Jeff Zhang (JIRA) | 31 Aug 07:44 2015

[Created] (PIG-4666) Add BUILDING.txt for building instruction

Jeff Zhang created PIG-4666:
-------------------------------

             Summary: Add BUILDING.txt for building instruction
                 Key: PIG-4666
                 URL: https://issues.apache.org/jira/browse/PIG-4666
             Project: Pig
          Issue Type: Improvement
          Components: build
    Affects Versions: 0.15.0
            Reporter: Jeff Zhang

Copy the building instruction here wiki page (https://cwiki.apache.org/confluence/display/PIG/HowToContribute)
{noformat}
Compilation
Make sure that your code introduces no new warnings into the javac compilation.
To compile with Hadoop 1.x 
> ant clean jar
To compile with Hadoop 2.x 
> ant clean jar -Dhadoopversion=23
The hadoopversion setting has 2 values - 20 and 23. -Dhadoopversion=20 which is the default denotes the
Hadoop 0.20.x and 1.x releases which are the old versions with JobTracker. -Dhadoopversion=23 denotes
the Hadoop 0.23.x and Hadoop 2.x releases which are the next gen versions of Hadoop which are based on YARN
and have separate Resource Manager and Application Masters instead of a single JobTracker that managed
both resources (cpu, memory) and running of mapreduce applications.  The exact versions of Hadoop 1.x or
2.x pig compiles against is configured in ivy/libraries.properties and is usually updated to compile
against the latest stable releases.
{noformat}
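
As a rough illustration of the two hadoopversion values described above, the sketch below builds the ant arguments for each one. PigBuildCommand and antArgs are hypothetical names invented for this example and are not part of Pig's build; the only facts taken from the text are the two accepted values (20 and 23), that 20 is the default, and the two command lines shown.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: maps a hadoopversion value to the ant invocation quoted above.
final class PigBuildCommand {

    // Returns the ant arguments for hadoopversion "20" (the default, Hadoop 0.20.x and 1.x)
    // or "23" (Hadoop 0.23.x and 2.x). Any other value is rejected.
    static List<String> antArgs(String hadoopVersion) {
        if (!hadoopVersion.equals("20") && !hadoopVersion.equals("23")) {
            throw new IllegalArgumentException(
                    "hadoopversion must be 20 or 23, got: " + hadoopVersion);
        }
        List<String> args = new ArrayList<>();
        args.add("ant");
        args.add("clean");
        args.add("jar");
        if (hadoopVersion.equals("23")) {
            // 20 is the default, so the flag is only needed for 23.
            args.add("-Dhadoopversion=23");
        }
        return args;
    }
}
```

For example, String.join(" ", PigBuildCommand.antArgs("23")) gives the second command line from the instructions, and passing "20" gives the first.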


Jeff Zhang (JIRA) | 31 Aug 07:44 2015

[Updated] (PIG-4666) Add BUILDING.txt for building instruction


     [ https://issues.apache.org/jira/browse/PIG-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Zhang updated PIG-4666:
----------------------------
    Description: 
Copy the building instruction from wiki page (https://cwiki.apache.org/confluence/display/PIG/HowToContribute)
{noformat}
Compilation
Make sure that your code introduces no new warnings into the javac compilation.
To compile with Hadoop 1.x 
> ant clean jar
To compile with Hadoop 2.x 
> ant clean jar -Dhadoopversion=23
The hadoopversion setting has 2 values - 20 and 23. -Dhadoopversion=20 which is the default denotes the
Hadoop 0.20.x and 1.x releases which are the old versions with JobTracker. -Dhadoopversion=23 denotes
the Hadoop 0.23.x and Hadoop 2.x releases which are the next gen versions of Hadoop which are based on YARN
and have separate Resource Manager and Application Masters instead of a single JobTracker that managed
both resources (cpu, memory) and running of mapreduce applications.  The exact versions of Hadoop 1.x or
2.x pig compiles against is configured in ivy/libraries.properties and is usually updated to compile
against the latest stable releases.
{noformat}

  was:
Copy the building instruction here wiki page (https://cwiki.apache.org/confluence/display/PIG/HowToContribute)
{noformat}
Compilation
Make sure that your code introduces no new warnings into the javac compilation.
To compile with Hadoop 1.x 

Xianda Ke (JIRA) | 31 Aug 04:48 2015

[Commented] (PIG-4634) Fix records count issues in output statistics


    [ https://issues.apache.org/jira/browse/PIG-4634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14721892#comment-14721892 ]

Xianda Ke commented on PIG-4634:
--------------------------------

Hi [~mohitsabharwal], I have created the RB ([RB37627 | https://reviews.apache.org/r/37627/]). Thanks.

> Fix records count issues in output statistics
> ---------------------------------------------
>
>                 Key: PIG-4634
>                 URL: https://issues.apache.org/jira/browse/PIG-4634
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: Xianda Ke
>            Assignee: Xianda Ke
>             Fix For: spark-branch
>
>         Attachments: PIG-4634-3.patch, PIG-4634.patch, PIG-4634_2.patch
>
>
> Test cases simpleTest() and simpleTest2() in TestPigRunner failed because of the following issues:
> 1. The pig context in SparkPigStats isn't initialized.
> 2. The records count logic hasn't been implemented.
> 3. getOutputAlias(), getPigProperties(), getBytesWritten() and getRecordWritten() have not been implemented.


Rohini Palaniswamy (JIRA) | 30 Aug 17:12 2015

[Updated] (PIG-3102) Option for PigStorage load to error out when input record is incomplete (instead of filling in null)


     [ https://issues.apache.org/jira/browse/PIG-3102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-3102:
------------------------------------
         Assignee: Rohini Palaniswamy
    Fix Version/s: 0.16.0

> Option for PigStorage load to error out when input record is incomplete (instead of filling in null)
> ----------------------------------------------------------------------------------------------------
>
>                 Key: PIG-3102
>                 URL: https://issues.apache.org/jira/browse/PIG-3102
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Koji Noguchi
>            Assignee: Rohini Palaniswamy
>            Priority: Minor
>             Fix For: 0.16.0
>
>
> Continuing from PIG-3100. 
> If users know that all input records have the correct number of fields, then enforcing that (with an option) would let us catch any input corruption early.
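
The requested behaviour might look something like the sketch below. It is not PigStorage code and does not use any real PigStorage option: FieldCountSketch, parse and the strict flag are hypothetical names, and the sketch assumes "incomplete" means fewer fields than the schema expects. In lenient mode missing fields are filled with null, and in strict mode a short record raises an error.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical illustration of "error out on incomplete records" versus "fill in null".
final class FieldCountSketch {

    // Splits line on delimiter. If the record has fewer fields than expectedFields,
    // strict mode throws, and lenient mode pads the missing fields with null.
    static String[] parse(String line, char delimiter, int expectedFields, boolean strict) {
        // Limit -1 keeps trailing empty fields so they still count as present.
        String[] fields = line.split(Pattern.quote(String.valueOf(delimiter)), -1);
        if (fields.length < expectedFields) {
            if (strict) {
                throw new IllegalArgumentException(
                        "expected " + expectedFields + " fields but found " + fields.length);
            }
            // Arrays.copyOf pads the new trailing slots with null.
            return Arrays.copyOf(fields, expectedFields);
        }
        return fields;
    }
}
```

With a three-field schema, a record such as "1 x" is padded to three fields in lenient mode, while strict mode rejects it immediately, which is how a truncated input file would be caught early.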
