
[Commented] (PIG-2347) Fix Pig Unit tests for hadoop 23


    [
https://issues.apache.org/jira/browse/PIG-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13178097#comment-13178097
] 

Thomas Weise commented on PIG-2347:
-----------------------------------

Thanks Daniel. Please change the default hadoopversion back to 20. As a result of the Avro update from 1.4.1 to 1.5.3,
there is a compile error in piggybank. I will update PIG-2410 to account for it.

                
> Fix Pig Unit tests for hadoop 23
> --------------------------------
>
>                 Key: PIG-2347
>                 URL: https://issues.apache.org/jira/browse/PIG-2347
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.9.1, 0.10, 0.11
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.10, 0.9.2, 0.11
>
>         Attachments: PIG-2347-1.patch, PIG-2347-2.patch, PIG-2347-3.patch, PIG-2347-3_0.9.patch, PIG-2347.patch, PIG-2347.patch, PIG-2347.patch, syslog
>
>
> This is a continuation of the work in PIG-2125. There are still 20+ unit test suites that fail on hadoop 23; we need to fix them.


[Updated] (PIG-2410) Piggybank does not compile in 23


     [
https://issues.apache.org/jira/browse/PIG-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Weise updated PIG-2410:
------------------------------

    Attachment: PIG-2410_branch-0.9-1.patch

Updated patch for branch-0.9 with Avro 1.5.3 (includes PIG-2202)

> Piggybank does not compile in 23
> --------------------------------
>
>                 Key: PIG-2410
>                 URL: https://issues.apache.org/jira/browse/PIG-2410
>             Project: Pig
>          Issue Type: Bug
>          Components: piggybank
>    Affects Versions: 0.10, 0.9.2, 0.11
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>              Labels: hadoop2.0
>             Fix For: 0.10, 0.9.2, 0.11
>
>         Attachments: PIG-2410-0.patch, PIG-2410-1.patch, PIG-2410_branch-0.9-1.patch, PIG-2410_branch-0.9.patch
>
>
> These do not compile:
> AllLoader.java


[Commented] (PIG-2347) Fix Pig Unit tests for hadoop 23


    [
https://issues.apache.org/jira/browse/PIG-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13178100#comment-13178100
] 

Daniel Dai commented on PIG-2347:
---------------------------------

Yes, the default version should be 20. I have changed it back. Thanks Thomas.

> Fix Pig Unit tests for hadoop 23
> --------------------------------
>
>                 Key: PIG-2347
>                 URL: https://issues.apache.org/jira/browse/PIG-2347
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.9.1, 0.10, 0.11
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.10, 0.9.2, 0.11
>
>         Attachments: PIG-2347-1.patch, PIG-2347-2.patch, PIG-2347-3.patch, PIG-2347-3_0.9.patch, PIG-2347.patch, PIG-2347.patch, PIG-2347.patch, syslog
>
>
> This is a continuation of the work in PIG-2125. There are still 20+ unit test suites that fail on hadoop 23; we need to fix them.


Harsh J (Commented) (JIRA | 1 Jan 06:22 2012

[Commented] (PIG-2441) rmf does not work in local mode


    [
https://issues.apache.org/jira/browse/PIG-2441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13178121#comment-13178121
] 

Harsh J commented on PIG-2441:
------------------------------

Per your comments, it looks like even rm/rmr looked at the wrong location, not just rmf.

> rmf does not work in local mode
> -------------------------------
>
>                 Key: PIG-2441
>                 URL: https://issues.apache.org/jira/browse/PIG-2441
>             Project: Pig
>          Issue Type: Bug
>          Components: grunt
>    Affects Versions: 0.9.1, 0.10, 0.9.2
>         Environment: Mac OS X 10.6.8
>            Reporter: Russell Jurney
>              Labels: fun, grunt, happy, mac, osx, pants, pig, rm_rf
>
> russell-jurneys-macbook-pro:Collecting-Data peyomp$ pig -l /tmp -v -x local
> 2011-12-20 18:58:54,074 [main] INFO  org.apache.pig.Main - Logging error messages to: /private/tmp/pig_1324436334061.log
> 2011-12-20 18:58:54,324 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
> grunt> rmf '/tmp/mail_pairs.avro'
> grunt> rmf '/tmp/mail_pairs.avro'


[Updated] (PIG-2441) rmf does not work in local mode


     [
https://issues.apache.org/jira/browse/PIG-2441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Russell Jurney updated PIG-2441:
--------------------------------

    Description: 
$ pig -l /tmp -v -x local
2011-12-20 18:58:54,074 [main] INFO  org.apache.pig.Main - Logging error messages to: /private/tmp/pig_1324436334061.log
2011-12-20 18:58:54,324 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
grunt> rmf '/tmp/mail_pairs.avro'
grunt> rmf '/tmp/mail_pairs.avro'
grunt> rm '/tmp/mail_pairs.avro' 
2011-12-20 18:59:02,968 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2997: Encountered IOException. File or directory '/tmp/mail_pairs.avro' does not exist.
2011-12-20 18:59:02,968 [main] ERROR org.apache.pig.tools.grunt.Grunt - java.io.IOException: File or directory '/tmp/mail_pairs.avro' does not exist.
	at org.apache.pig.tools.grunt.GruntParser.processRemove(GruntParser.java:957)
	at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:430)
	at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:188)
	at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:164)
	at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
	at org.apache.pig.Main.run(Main.java:523)
	at org.apache.pig.Main.main(Main.java:148)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)


[Commented] (PIG-2441) rmf does not work in local mode


    [
https://issues.apache.org/jira/browse/PIG-2441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13178132#comment-13178132
] 

Daniel Dai commented on PIG-2441:
---------------------------------

I am having trouble reproducing the issue. Which version of Pig are you using? Can you also run
"pig -secretDebugCmd -x local" and paste the output?

> rmf does not work in local mode
> -------------------------------
>
>                 Key: PIG-2441
>                 URL: https://issues.apache.org/jira/browse/PIG-2441
>             Project: Pig
>          Issue Type: Bug
>          Components: grunt
>    Affects Versions: 0.9.1, 0.10, 0.9.2
>         Environment: Mac OS X 10.6.8
>            Reporter: Russell Jurney
>            Priority: Critical
>              Labels: fun, grunt, happy, mac, osx, pants, pig, rm_rf
>
> $ pig -l /tmp -v -x local
> 2011-12-20 18:58:54,074 [main] INFO  org.apache.pig.Main - Logging error messages to: /private/tmp/pig_1324436334061.log
> 2011-12-20 18:58:54,324 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///


[Commented] (PIG-2362) Rework Ant build.xml to use macrodef instead of antcall


    [
https://issues.apache.org/jira/browse/PIG-2362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13178253#comment-13178253
] 

Daniel Dai commented on PIG-2362:
---------------------------------

It seems "include-meta" is not being called after the patch? This is necessary for hadoop 23.

> Rework Ant build.xml to use macrodef instead of antcall
> -------------------------------------------------------
>
>                 Key: PIG-2362
>                 URL: https://issues.apache.org/jira/browse/PIG-2362
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Gianmarco De Francisci Morales
>            Assignee: Gianmarco De Francisci Morales
>            Priority: Minor
>             Fix For: 0.11
>
>         Attachments: PIG-2362.1.patch, PIG-2362.2.patch, PIG-2362.3.patch, PIG-2362.4.patch
>
>
> Antcall is evil: http://www.build-doctor.com/2008/03/13/antcall-is-evil/
> We'd better use macrodef and let Ant build a clean dependency graph.
> http://ant.apache.org/manual/Tasks/macrodef.html
> Right now we do it like this:
> {code}


[Created] (PIG-2453) Fetching schema can be very slow for multi-thousand LOADs

Fetching schema can be very slow for multi-thousand LOADs
---------------------------------------------------------

                 Key: PIG-2453
                 URL: https://issues.apache.org/jira/browse/PIG-2453
             Project: Pig
          Issue Type: Bug
            Reporter: Dmitriy V. Ryaboy
            Assignee: Dmitriy V. Ryaboy

When a user tries to load resources with thousands of files using PigStorage, we spend an inordinate amount
of time looking for schema files. This is because we check for a schema file per loaded file.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira


[Commented] (PIG-2453) Fetching schema can be very slow for multi-thousand LOADs


    [
https://issues.apache.org/jira/browse/PIG-2453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13178258#comment-13178258
] 

Dmitriy V. Ryaboy commented on PIG-2453:
----------------------------------------

One proposed solution is to check for .pig_schema files only once per directory instead of once per file.
We can probably also make fewer NameNode (NN) calls by caching all found schema files and not checking
metaFilePath.exists() redundantly.
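
To make the per-directory check and the caching idea concrete, here is a minimal sketch in Python. It is not the code from the attached patch or from JsonMetadata: the class name SchemaLocator, the lookups counter, and the injected exists callable (a stand-in for the expensive NameNode call) are all hypothetical names chosen for illustration.

```python
import posixpath
from typing import Callable, Dict, Optional

SCHEMA_FILE = ".pig_schema"  # schema file name mentioned in the comment above


class SchemaLocator:
    """Hypothetical helper: checks for a schema file once per directory."""

    def __init__(self, exists: Callable[[str], bool]):
        # `exists` stands in for the costly filesystem (NameNode) call.
        self._exists = exists
        # directory -> schema path, or None when the directory has no schema
        self._cache: Dict[str, Optional[str]] = {}
        # number of real existence checks performed, for illustration only
        self.lookups = 0

    def schema_for(self, data_file: str) -> Optional[str]:
        directory = posixpath.dirname(data_file)
        if directory not in self._cache:
            candidate = posixpath.join(directory, SCHEMA_FILE)
            self.lookups += 1
            self._cache[directory] = candidate if self._exists(candidate) else None
        # Cached answers, including "no schema here", never hit the filesystem again.
        return self._cache[directory]


# Usage: 3000 files in one directory cost a single existence check.
present = {"/data/logs/.pig_schema"}
locator = SchemaLocator(exists=lambda path: path in present)
for i in range(3000):
    locator.schema_for(f"/data/logs/part-{i:05d}")
```

Caching negative results matters as much as caching hits here: a directory without a schema file would otherwise trigger one check per loaded file.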

> Fetching schema can be very slow for multi-thousand LOADs
> ---------------------------------------------------------
>
>                 Key: PIG-2453
>                 URL: https://issues.apache.org/jira/browse/PIG-2453
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Dmitriy V. Ryaboy
>            Assignee: Dmitriy V. Ryaboy
>
> When a user tries to load resources with thousands of files using PigStorage, we spend an inordinate amount of time looking for schema files. This is because we check for a schema file per loaded file.



[Updated] (PIG-2453) Fetching schema can be very slow for multi-thousand LOADs


     [
https://issues.apache.org/jira/browse/PIG-2453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy V. Ryaboy updated PIG-2453:
-----------------------------------

    Attachment: PIG-2453.patch

Added caching, changed JsonMetadata behavior to ignore file-specific schemas. I don't think this
feature was used anywhere except tests anyway (?).

> Fetching schema can be very slow for multi-thousand LOADs
> ---------------------------------------------------------
>
>                 Key: PIG-2453
>                 URL: https://issues.apache.org/jira/browse/PIG-2453
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Dmitriy V. Ryaboy
>            Assignee: Dmitriy V. Ryaboy
>         Attachments: PIG-2453.patch
>
>
> When a user tries to load resources with thousands of files using PigStorage, we spend an inordinate amount of time looking for schema files. This is because we check for a schema file per loaded file.


