David Starina | 8 Feb 17:02 2016

Running Mahout on Hadoop cluster

Hi,

I am not sure why I cannot find the info I am looking for online; I am
probably not searching in the right way, so I am hoping you guys will be
able to point me in the right direction.

I have set up a Mahout project in IntelliJ IDEA on my machine. I created a
class extending AbstractJob to run some MapReduce code. I have a remote
Hadoop cluster (actually two virtual machines to simulate a remote cluster)
with Mahout installed.

How do I build/package my custom class to run it on the remote cluster using
the remotely installed Mahout libraries? Or should I not use the remote
libraries, and instead pack and distribute all dependencies together?

Thank you for any suggestions.

David
Alok Tanna | 4 Feb 06:08 2016

Re: Mahout error : seq2sparse

This command works, thank you. Yes, I am seeing a lot of empty lines in my
input files. Is there any magic command to remove these lines? That would
save a lot of time.
I will rerun this once I have removed the empty lines.
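
One possible way to do this, as a minimal sketch rather than a tested fix for the error in this thread: the Python below walks a directory of plain-text input files and rewrites each one without empty or whitespace-only lines. The function names are hypothetical, and it assumes the files are the raw text inputs (before `seqdirectory`), not the generated sequence files. Whether the blank lines are what triggers the EOFException is not established here.

```python
import os
import tempfile


def strip_blank_lines(path):
    """Rewrite one file without empty or whitespace-only lines.

    Returns the number of lines dropped.
    """
    dropped = 0
    directory = os.path.dirname(os.path.abspath(path))
    # Stream into a temporary file in the same directory and swap it in at
    # the end, so an interrupted run never leaves a half-written input file.
    # Note: the rewritten file gets the temp file's permissions (owner-only).
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as dst, open(path, "rb") as src:
            for line in src:
                if line.strip():  # keep lines with any non-whitespace byte
                    dst.write(line)
                else:
                    dropped += 1
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise
    return dropped


def strip_blank_lines_in_tree(root):
    """Apply strip_blank_lines to every file under root; return total dropped."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            total += strip_blank_lines(os.path.join(dirpath, name))
    return total
```

Calling `strip_blank_lines_in_tree("input-file-directory")` on a copy of the data first would be the cautious route, since the files are modified in place.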

It would be great if I could get this working in local mode; otherwise I will
have to spend a few days getting it working on a Hadoop/Spark cluster.

Thanks,
Alok Tanna

On Wed, Feb 3, 2016 at 11:38 PM, Andrew Musselman <
andrew.musselman <at> gmail.com> wrote:

> Ah; looks like that config can be set in Hadoop's core-site.xml but if
> you're running Mahout in local mode that shouldn't help.
>
> Can you try this with local mode off, in other words on a running
> Hadoop/Spark cluster?
>
> Looking for empty lines could be run via a command like `grep -r "^$"
> input-file-directory`; blank lines will show up before your next prompt if
> so.
>
> On Wed, Feb 3, 2016 at 8:30 PM, Alok Tanna <tannaalok <at> gmail.com> wrote:
>
>> Thank you Andrew for the quick response. I have around 300 input files.
>> It would take a while for me to go through each file. I will try to look
>> into that, but then I had successfully generated the sequence file using mahout
>> seqdirectory for the same dataset. How can I find which mahout release I am

Alok Tanna | 4 Feb 04:33 2016

Mahout error : seq2sparse

Mahout in local mode

I am able to successfully run the command below on a smaller data set, but when I run it on a large data set I get the error below. It looks like I need to increase the size of some parameter, but I am not sure which one. It fails with java.io.EOFException while creating the dictionary-0 file.

Please find the attached file for more details.

command: mahout seq2sparse -i /home/ubuntu/AT/AT-Seq/ -o /home/ubuntu/AT/AT-vectors/ -lnorm -nv -wt tfidf

Main error : 


16/02/03 23:02:06 INFO mapred.LocalJobRunner: reduce > reduce
16/02/03 23:02:17 INFO mapred.LocalJobRunner: reduce > reduce
16/02/03 23:02:18 WARN mapred.LocalJobRunner: job_local1308764206_0003
java.io.EOFException
        at java.io.DataInputStream.readByte(DataInputStream.java:267)
        at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:299)
        at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:320)
        at org.apache.hadoop.io.Text.readFields(Text.java:263)
        at org.apache.mahout.common.StringTuple.readFields(StringTuple.java:142)
        at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
        at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
        at org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:117)
        at org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:649)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:418)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:398)
16/02/03 23:02:18 INFO mapred.JobClient: Job complete: job_local1308764206_0003
16/02/03 23:02:18 INFO mapred.JobClient: Counters: 20
16/02/03 23:02:18 INFO mapred.JobClient:   File Output Format Counters
16/02/03 23:02:18 INFO mapred.JobClient:     Bytes Written=14923244
16/02/03 23:02:18 INFO mapred.JobClient:   FileSystemCounters
16/02/03 23:02:18 INFO mapred.JobClient:     FILE_BYTES_READ=1412144036729
16/02/03 23:02:18 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=323876626568
16/02/03 23:02:18 INFO mapred.JobClient:   File Input Format Counters
16/02/03 23:02:18 INFO mapred.JobClient:     Bytes Read=11885543289
16/02/03 23:02:18 INFO mapred.JobClient:   Map-Reduce Framework
16/02/03 23:02:18 INFO mapred.JobClient:     Reduce input groups=223
16/02/03 23:02:18 INFO mapred.JobClient:     Map output materialized bytes=2214020551
16/02/03 23:02:18 INFO mapred.JobClient:     Combine output records=0
16/02/03 23:02:18 INFO mapred.JobClient:     Map input records=223
16/02/03 23:02:18 INFO mapred.JobClient:     Reduce shuffle bytes=0
16/02/03 23:02:18 INFO mapred.JobClient:     Physical memory (bytes) snapshot=0
16/02/03 23:02:18 INFO mapred.JobClient:     Reduce output records=222
16/02/03 23:02:18 INFO mapred.JobClient:     Spilled Records=638
16/02/03 23:02:18 INFO mapred.JobClient:     Map output bytes=2214019100
16/02/03 23:02:18 INFO mapred.JobClient:     CPU time spent (ms)=0
16/02/03 23:02:18 INFO mapred.JobClient:     Total committed heap usage (bytes)=735978192896
16/02/03 23:02:18 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=0
16/02/03 23:02:18 INFO mapred.JobClient:     Combine input records=0
16/02/03 23:02:18 INFO mapred.JobClient:     Map output records=223
16/02/03 23:02:18 INFO mapred.JobClient:     SPLIT_RAW_BYTES=9100
16/02/03 23:02:18 INFO mapred.JobClient:     Reduce input records=222
Exception in thread "main" java.lang.IllegalStateException: Job failed!
        at org.apache.mahout.vectorizer.DictionaryVectorizer.makePartialVectors(DictionaryVectorizer.java:329)
        at org.apache.mahout.vectorizer.DictionaryVectorizer.createTermFrequencyVectors(DictionaryVectorizer.java:199)
        at org.apache.mahout.vectorizer.SparseVectorsFromSequenceFiles.run(SparseVectorsFromSequenceFiles.java:274)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.mahout.vectorizer.SparseVectorsFromSequenceFiles.main(SparseVectorsFromSequenceFiles.java:56)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
.
.



--
Thanks & Regards,
 
Alok R. Tanna
 

ubuntu <at> :~/mahout/trunk/bin$ ./mahout seq2sparse -i /home/ubuntu/AT/AT-Seq/ -o /home/ubuntu/AT/AT-vectors -lnorm -nv -wt tfidf
MAHOUT_LOCAL is set, so we don't add HADOOP_CONF_DIR to classpath.
MAHOUT_LOCAL is set, running locally
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/ubuntu/mahout/trunk/examples/target/mahout-examples-1.0-SNAPSHOT-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/ubuntu/mahout/trunk/examples/target/dependency/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
16/02/03 22:43:55 INFO vectorizer.SparseVectorsFromSequenceFiles: Maximum n-gram size is: 1
16/02/03 22:43:55 INFO vectorizer.SparseVectorsFromSequenceFiles: Minimum LLR value: 1.0
16/02/03 22:43:55 INFO vectorizer.SparseVectorsFromSequenceFiles: Number of reduce tasks: 1
16/02/03 22:43:55 INFO vectorizer.SparseVectorsFromSequenceFiles: Tokenizing documents in /home/ubuntu/AT/AT-Seq
16/02/03 22:43:55 INFO common.HadoopUtil: Deleting /home/ubuntu/AT/AT-vectors/tokenized-documents
16/02/03 22:43:56 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/02/03 22:43:56 INFO input.FileInputFormat: Total input paths to process : 1
16/02/03 22:43:56 INFO mapred.JobClient: Running job: job_local1577040045_0001
16/02/03 22:43:56 INFO mapred.LocalJobRunner: Waiting for map tasks
16/02/03 22:43:56 INFO mapred.LocalJobRunner: Starting task: attempt_local1577040045_0001_m_000000_0
16/02/03 22:43:56 INFO util.ProcessTree: setsid exited with exit code 0
16/02/03 22:43:56 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 3e3d6c70
16/02/03 22:43:56 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-Seq/part-m-00000:0+33554432
16/02/03 22:43:56 INFO compress.CodecPool: Got brand-new decompressor
16/02/03 22:43:57 INFO mapred.JobClient:  map 0% reduce 0%
16/02/03 22:44:04 INFO mapred.LocalJobRunner:
16/02/03 22:44:05 INFO mapred.JobClient:  map 8% reduce 0%
16/02/03 22:44:10 INFO mapred.LocalJobRunner:
16/02/03 22:44:20 INFO mapred.LocalJobRunner:
16/02/03 22:44:21 INFO mapred.JobClient:  map 14% reduce 0%
16/02/03 22:44:29 INFO mapred.LocalJobRunner:
16/02/03 22:44:35 INFO mapred.LocalJobRunner:
16/02/03 22:44:35 INFO mapred.JobClient:  map 16% reduce 0%
16/02/03 22:44:41 INFO mapred.LocalJobRunner:
16/02/03 22:44:41 INFO mapred.JobClient:  map 25% reduce 0%
16/02/03 22:45:29 INFO mapred.LocalJobRunner:
16/02/03 22:46:01 INFO mapred.Task: Task:attempt_local1577040045_0001_m_000000_0 is done. And is in the process of commiting
16/02/03 22:46:01 INFO mapred.LocalJobRunner:
16/02/03 22:46:01 INFO mapred.Task: Task attempt_local1577040045_0001_m_000000_0 is allowed to commit now
16/02/03 22:46:01 INFO output.FileOutputCommitter: Saved output of task 'attempt_local1577040045_0001_m_000000_0' to /home/ubuntu/AT/AT-vectors/tokenized-documents
16/02/03 22:46:01 INFO mapred.LocalJobRunner:
16/02/03 22:46:01 INFO mapred.Task: Task 'attempt_local1577040045_0001_m_000000_0' done.
16/02/03 22:46:01 INFO mapred.LocalJobRunner: Finishing task: attempt_local1577040045_0001_m_000000_0
16/02/03 22:46:01 INFO mapred.LocalJobRunner: Starting task: attempt_local1577040045_0001_m_000001_0
16/02/03 22:46:01 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 2e6e1156
16/02/03 22:46:01 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-Seq/part-m-00000:33554432+33554432
16/02/03 22:46:07 INFO mapred.LocalJobRunner:
16/02/03 22:46:07 INFO mapred.JobClient:  map 37% reduce 0%
16/02/03 22:46:10 INFO mapred.LocalJobRunner:
16/02/03 22:46:10 INFO mapred.JobClient:  map 48% reduce 0%
16/02/03 22:46:12 INFO mapred.Task: Task:attempt_local1577040045_0001_m_000001_0 is done. And is in the process of commiting
16/02/03 22:46:12 INFO mapred.LocalJobRunner:
16/02/03 22:46:12 INFO mapred.Task: Task attempt_local1577040045_0001_m_000001_0 is allowed to commit now
16/02/03 22:46:12 INFO output.FileOutputCommitter: Saved output of task 'attempt_local1577040045_0001_m_000001_0' to /home/ubuntu/AT/AT-vectors/tokenized-documents
16/02/03 22:46:12 INFO mapred.LocalJobRunner:
16/02/03 22:46:12 INFO mapred.Task: Task 'attempt_local1577040045_0001_m_000001_0' done.
16/02/03 22:46:12 INFO mapred.LocalJobRunner: Finishing task: attempt_local1577040045_0001_m_000001_0
16/02/03 22:46:12 INFO mapred.LocalJobRunner: Starting task: attempt_local1577040045_0001_m_000002_0
16/02/03 22:46:12 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 57dfddf4
16/02/03 22:46:12 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-Seq/part-m-00000:67108864+33554432
16/02/03 22:46:13 INFO mapred.JobClient:  map 50% reduce 0%
16/02/03 22:46:18 INFO mapred.LocalJobRunner:
16/02/03 22:46:18 INFO mapred.JobClient:  map 61% reduce 0%
16/02/03 22:46:21 INFO mapred.LocalJobRunner:
16/02/03 22:46:21 INFO mapred.JobClient:  map 66% reduce 0%
16/02/03 22:46:24 INFO mapred.LocalJobRunner:
16/02/03 22:46:27 INFO mapred.LocalJobRunner:
16/02/03 22:46:28 INFO mapred.JobClient:  map 68% reduce 0%
16/02/03 22:46:30 INFO mapred.LocalJobRunner:
16/02/03 22:46:31 INFO mapred.JobClient:  map 72% reduce 0%
16/02/03 22:46:33 INFO mapred.LocalJobRunner:
16/02/03 22:46:34 INFO mapred.JobClient:  map 73% reduce 0%
16/02/03 22:46:36 INFO mapred.LocalJobRunner:
16/02/03 22:46:37 INFO mapred.JobClient:  map 75% reduce 0%
16/02/03 22:46:39 INFO mapred.LocalJobRunner:
16/02/03 22:46:42 INFO mapred.Task: Task:attempt_local1577040045_0001_m_000002_0 is done. And is in the process of commiting
16/02/03 22:46:42 INFO mapred.LocalJobRunner:
16/02/03 22:46:42 INFO mapred.Task: Task attempt_local1577040045_0001_m_000002_0 is allowed to commit now
16/02/03 22:46:42 INFO output.FileOutputCommitter: Saved output of task 'attempt_local1577040045_0001_m_000002_0' to /home/ubuntu/AT/AT-vectors/tokenized-documents
16/02/03 22:46:42 INFO mapred.LocalJobRunner:
16/02/03 22:46:42 INFO mapred.Task: Task 'attempt_local1577040045_0001_m_000002_0' done.
16/02/03 22:46:42 INFO mapred.LocalJobRunner: Finishing task: attempt_local1577040045_0001_m_000002_0
16/02/03 22:46:42 INFO mapred.LocalJobRunner: Starting task: attempt_local1577040045_0001_m_000003_0
16/02/03 22:46:42 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 629ee12
16/02/03 22:46:42 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-Seq/part-m-00000:100663296+33416929
16/02/03 22:46:48 INFO mapred.LocalJobRunner:
16/02/03 22:46:48 INFO mapred.JobClient:  map 79% reduce 0%
16/02/03 22:46:51 INFO mapred.LocalJobRunner:
16/02/03 22:46:54 INFO mapred.LocalJobRunner:
16/02/03 22:47:00 INFO mapred.LocalJobRunner:
16/02/03 22:47:00 INFO mapred.JobClient:  map 83% reduce 0%
16/02/03 22:47:06 INFO mapred.LocalJobRunner:
16/02/03 22:47:06 INFO mapred.JobClient:  map 87% reduce 0%
16/02/03 22:47:15 INFO mapred.LocalJobRunner:
16/02/03 22:47:21 INFO mapred.LocalJobRunner:
16/02/03 22:47:22 INFO mapred.JobClient:  map 88% reduce 0%
16/02/03 22:47:24 INFO mapred.LocalJobRunner:
16/02/03 22:47:24 INFO mapred.JobClient:  map 90% reduce 0%
16/02/03 22:47:27 INFO mapred.LocalJobRunner:
16/02/03 22:47:27 INFO mapred.JobClient:  map 92% reduce 0%
16/02/03 22:47:30 INFO mapred.LocalJobRunner:
16/02/03 22:47:30 INFO mapred.JobClient:  map 94% reduce 0%
16/02/03 22:47:33 INFO mapred.LocalJobRunner:
16/02/03 22:47:36 INFO mapred.LocalJobRunner:
16/02/03 22:47:36 INFO mapred.JobClient:  map 96% reduce 0%
16/02/03 22:47:39 INFO mapred.LocalJobRunner:
16/02/03 22:47:42 INFO mapred.LocalJobRunner:
16/02/03 22:47:42 INFO mapred.JobClient:  map 98% reduce 0%
16/02/03 22:47:45 INFO mapred.LocalJobRunner:
16/02/03 22:47:45 INFO mapred.JobClient:  map 99% reduce 0%
16/02/03 22:47:47 INFO mapred.Task: Task:attempt_local1577040045_0001_m_000003_0 is done. And is in the process of commiting
16/02/03 22:47:47 INFO mapred.LocalJobRunner:
16/02/03 22:47:47 INFO mapred.Task: Task attempt_local1577040045_0001_m_000003_0 is allowed to commit now
16/02/03 22:47:47 INFO output.FileOutputCommitter: Saved output of task 'attempt_local1577040045_0001_m_000003_0' to /home/ubuntu/AT/AT-vectors/tokenized-documents
16/02/03 22:47:47 INFO mapred.LocalJobRunner:
16/02/03 22:47:47 INFO mapred.Task: Task 'attempt_local1577040045_0001_m_000003_0' done.
16/02/03 22:47:47 INFO mapred.LocalJobRunner: Finishing task: attempt_local1577040045_0001_m_000003_0
16/02/03 22:47:47 INFO mapred.LocalJobRunner: Map task executor complete.
16/02/03 22:47:47 INFO mapred.JobClient:  map 100% reduce 0%
16/02/03 22:47:47 INFO mapred.JobClient: Job complete: job_local1577040045_0001
16/02/03 22:47:47 INFO mapred.JobClient: Counters: 12
16/02/03 22:47:47 INFO mapred.JobClient:   File Output Format Counters
16/02/03 22:47:47 INFO mapred.JobClient:     Bytes Written=2331715562
16/02/03 22:47:47 INFO mapred.JobClient:   FileSystemCounters
16/02/03 22:47:47 INFO mapred.JobClient:     FILE_BYTES_READ=500146067
16/02/03 22:47:47 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=6713694238
16/02/03 22:47:47 INFO mapred.JobClient:   File Input Format Counters
16/02/03 22:47:47 INFO mapred.JobClient:     Bytes Read=150434481
16/02/03 22:47:47 INFO mapred.JobClient:   Map-Reduce Framework
16/02/03 22:47:47 INFO mapred.JobClient:     Map input records=223
16/02/03 22:47:47 INFO mapred.JobClient:     Physical memory (bytes) snapshot=0
16/02/03 22:47:47 INFO mapred.JobClient:     Spilled Records=0
16/02/03 22:47:47 INFO mapred.JobClient:     CPU time spent (ms)=0
16/02/03 22:47:47 INFO mapred.JobClient:     Total committed heap usage (bytes)=54023159808
16/02/03 22:47:47 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=0
16/02/03 22:47:47 INFO mapred.JobClient:     Map output records=223
16/02/03 22:47:47 INFO mapred.JobClient:     SPLIT_RAW_BYTES=424
16/02/03 22:47:47 INFO vectorizer.SparseVectorsFromSequenceFiles: Creating Term Frequency Vectors
16/02/03 22:47:47 INFO vectorizer.DictionaryVectorizer: Creating dictionary from /home/ubuntu/AT/AT-vectors/tokenized-documents and saving at /home/ubuntu/AT/AT-vectors/wordcount
16/02/03 22:47:47 INFO common.HadoopUtil: Deleting /home/ubuntu/AT/AT-vectors/wordcount
16/02/03 22:47:48 INFO input.FileInputFormat: Total input paths to process : 4
16/02/03 22:47:48 INFO mapred.JobClient: Running job: job_local1001281779_0002
16/02/03 22:47:48 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000000_0
16/02/03 22:47:48 INFO mapred.LocalJobRunner: Waiting for map tasks
16/02/03 22:47:48 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 63d46a99
16/02/03 22:47:48 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00002:469762048+36442589
16/02/03 22:47:48 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:47:48 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:47:48 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:47:49 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:47:49 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000000_0 is done. And is in the process of commiting
16/02/03 22:47:49 INFO mapred.LocalJobRunner:
16/02/03 22:47:49 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000000_0' done.
16/02/03 22:47:49 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000000_0
16/02/03 22:47:49 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000001_0
16/02/03 22:47:49 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> c7ab4a1
16/02/03 22:47:49 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00003:0+33554432
16/02/03 22:47:49 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:47:49 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:47:49 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:47:49 INFO mapred.JobClient:  map 0% reduce 0%
16/02/03 22:47:52 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:47:52 INFO mapred.MapTask: Finished spill 0
16/02/03 22:47:52 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000001_0 is done. And is in the process of commiting
16/02/03 22:47:52 INFO mapred.LocalJobRunner:
16/02/03 22:47:52 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000001_0' done.
16/02/03 22:47:52 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000001_0
16/02/03 22:47:52 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000002_0
16/02/03 22:47:52 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 7e310eb0
16/02/03 22:47:52 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00003:33554432+33554432
16/02/03 22:47:52 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:47:52 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:47:52 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:47:52 INFO mapred.JobClient:  map 1% reduce 0%
16/02/03 22:47:53 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:47:53 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000002_0 is done. And is in the process of commiting
16/02/03 22:47:53 INFO mapred.LocalJobRunner:
16/02/03 22:47:53 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000002_0' done.
16/02/03 22:47:53 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000002_0
16/02/03 22:47:53 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000003_0
16/02/03 22:47:53 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 2486e704
16/02/03 22:47:53 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00003:67108864+33554432
16/02/03 22:47:53 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:47:53 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:47:53 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:47:56 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:47:56 INFO mapred.MapTask: Finished spill 0
16/02/03 22:47:56 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000003_0 is done. And is in the process of commiting
16/02/03 22:47:58 INFO mapred.LocalJobRunner:
16/02/03 22:47:58 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000003_0' done.
16/02/03 22:47:58 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000003_0
16/02/03 22:47:58 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000004_0
16/02/03 22:47:58 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 63109aa1
16/02/03 22:47:58 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00003:100663296+33554432
16/02/03 22:47:58 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:47:58 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:47:58 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:47:58 INFO mapred.JobClient:  map 2% reduce 0%
16/02/03 22:47:59 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:47:59 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000004_0 is done. And is in the process of commiting
16/02/03 22:47:59 INFO mapred.LocalJobRunner:
16/02/03 22:47:59 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000004_0' done.
16/02/03 22:47:59 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000004_0
16/02/03 22:47:59 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000005_0
16/02/03 22:47:59 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 6e7f3fbd
16/02/03 22:47:59 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00003:134217728+33554432
16/02/03 22:47:59 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:47:59 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:47:59 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:48:04 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:48:04 INFO mapred.MapTask: Finished spill 0
16/02/03 22:48:04 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000005_0 is done. And is in the process of commiting
16/02/03 22:48:04 INFO mapred.LocalJobRunner:
16/02/03 22:48:04 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000005_0' done.
16/02/03 22:48:04 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000005_0
16/02/03 22:48:04 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000006_0
16/02/03 22:48:04 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 64ba64c8
16/02/03 22:48:04 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00003:167772160+33554432
16/02/03 22:48:04 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:48:04 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:48:04 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:48:05 INFO mapred.JobClient:  map 4% reduce 0%
16/02/03 22:48:07 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:48:07 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000006_0 is done. And is in the process of commiting
16/02/03 22:48:07 INFO mapred.LocalJobRunner:
16/02/03 22:48:07 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000006_0' done.
16/02/03 22:48:07 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000006_0
16/02/03 22:48:07 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000007_0
16/02/03 22:48:07 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 1253b973
16/02/03 22:48:07 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00003:201326592+33554432
16/02/03 22:48:07 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:48:07 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:48:07 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:48:08 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:48:09 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000007_0 is done. And is in the process of commiting
16/02/03 22:48:09 INFO mapred.LocalJobRunner:
16/02/03 22:48:09 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000007_0' done.
16/02/03 22:48:09 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000007_0
16/02/03 22:48:09 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000008_0
16/02/03 22:48:09 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 11bbad83
16/02/03 22:48:09 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00003:234881024+33554432
16/02/03 22:48:09 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:48:09 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:48:09 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:48:10 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:48:10 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000008_0 is done. And is in the process of commiting
16/02/03 22:48:10 INFO mapred.LocalJobRunner:
16/02/03 22:48:10 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000008_0' done.
16/02/03 22:48:10 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000008_0
16/02/03 22:48:10 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000009_0
16/02/03 22:48:10 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 3a296a42
16/02/03 22:48:10 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00003:268435456+33554432
16/02/03 22:48:10 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:48:10 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:48:10 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:48:11 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:48:11 INFO mapred.MapTask: Finished spill 0
16/02/03 22:48:11 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000009_0 is done. And is in the process of commiting
16/02/03 22:48:11 INFO mapred.LocalJobRunner:
16/02/03 22:48:11 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000009_0' done.
16/02/03 22:48:11 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000009_0
16/02/03 22:48:11 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000010_0
16/02/03 22:48:11 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 5dda7d56
16/02/03 22:48:11 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00003:301989888+33554432
16/02/03 22:48:11 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:48:11 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:48:11 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:48:12 INFO mapred.JobClient:  map 5% reduce 0%
16/02/03 22:48:13 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:48:14 INFO mapred.MapTask: Finished spill 0
16/02/03 22:48:14 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000010_0 is done. And is in the process of commiting
16/02/03 22:48:14 INFO mapred.LocalJobRunner:
16/02/03 22:48:14 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000010_0' done.
16/02/03 22:48:14 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000010_0
16/02/03 22:48:14 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000011_0
16/02/03 22:48:14 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 5bcf59f5
16/02/03 22:48:14 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00003:335544320+33554432
16/02/03 22:48:14 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:48:14 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:48:14 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:48:15 INFO mapred.JobClient:  map 7% reduce 0%
16/02/03 22:48:16 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:48:16 INFO mapred.MapTask: Finished spill 0
16/02/03 22:48:16 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000011_0 is done. And is in the process of commiting
16/02/03 22:48:16 INFO mapred.LocalJobRunner:
16/02/03 22:48:16 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000011_0' done.
16/02/03 22:48:16 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000011_0
16/02/03 22:48:16 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000012_0
16/02/03 22:48:16 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 3826515f
16/02/03 22:48:16 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00003:369098752+33554432
16/02/03 22:48:16 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:48:16 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:48:16 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:48:17 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:48:17 INFO mapred.MapTask: Finished spill 0
16/02/03 22:48:17 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000012_0 is done. And is in the process of commiting
16/02/03 22:48:17 INFO mapred.LocalJobRunner:
16/02/03 22:48:17 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000012_0' done.
16/02/03 22:48:17 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000012_0
16/02/03 22:48:17 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000013_0
16/02/03 22:48:17 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 59c34d14
16/02/03 22:48:17 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00003:402653184+33554432
16/02/03 22:48:17 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:48:17 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:48:17 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:48:17 INFO mapred.JobClient:  map 10% reduce 0%
16/02/03 22:48:18 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:48:18 INFO mapred.MapTask: Finished spill 0
16/02/03 22:48:18 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000013_0 is done. And is in the process of commiting
16/02/03 22:48:18 INFO mapred.LocalJobRunner:
16/02/03 22:48:18 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000013_0' done.
16/02/03 22:48:18 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000013_0
16/02/03 22:48:18 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000014_0
16/02/03 22:48:18 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@7f49d071
16/02/03 22:48:18 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00003:436207616+33554432
16/02/03 22:48:18 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:48:18 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:48:18 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:48:19 INFO mapred.JobClient:  map 11% reduce 0%
16/02/03 22:48:19 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:48:20 INFO mapred.MapTask: Finished spill 0
16/02/03 22:48:20 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000014_0 is done. And is in the process of commiting
16/02/03 22:48:20 INFO mapred.LocalJobRunner:
16/02/03 22:48:20 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000014_0' done.
16/02/03 22:48:20 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000014_0
16/02/03 22:48:20 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000015_0
16/02/03 22:48:20 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@6b364c80
16/02/03 22:48:20 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:0+33554432
16/02/03 22:48:20 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:48:20 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:48:20 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:48:20 INFO mapred.JobClient:  map 12% reduce 0%
16/02/03 22:48:25 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:48:25 INFO mapred.MapTask: Finished spill 0
16/02/03 22:48:25 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000015_0 is done. And is in the process of commiting
16/02/03 22:48:25 INFO mapred.LocalJobRunner:
16/02/03 22:48:25 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000015_0' done.
16/02/03 22:48:25 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000015_0
16/02/03 22:48:25 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000016_0
16/02/03 22:48:25 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@696ea882
16/02/03 22:48:25 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:33554432+33554432
16/02/03 22:48:25 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:48:25 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:48:25 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:48:26 INFO mapred.JobClient:  map 14% reduce 0%
16/02/03 22:48:28 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:48:28 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000016_0 is done. And is in the process of commiting
16/02/03 22:48:28 INFO mapred.LocalJobRunner:
16/02/03 22:48:28 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000016_0' done.
16/02/03 22:48:28 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000016_0
16/02/03 22:48:28 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000017_0
16/02/03 22:48:28 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@4feaa03f
16/02/03 22:48:28 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:67108864+33554432
16/02/03 22:48:28 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:48:28 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:48:28 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:48:29 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:48:29 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000017_0 is done. And is in the process of commiting
16/02/03 22:48:29 INFO mapred.LocalJobRunner:
16/02/03 22:48:29 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000017_0' done.
16/02/03 22:48:29 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000017_0
16/02/03 22:48:29 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000018_0
16/02/03 22:48:29 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@4308d92c
16/02/03 22:48:29 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:100663296+33554432
16/02/03 22:48:29 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:48:29 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:48:29 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:48:30 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:48:30 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000018_0 is done. And is in the process of commiting
16/02/03 22:48:30 INFO mapred.LocalJobRunner:
16/02/03 22:48:30 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000018_0' done.
16/02/03 22:48:30 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000018_0
16/02/03 22:48:30 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000019_0
16/02/03 22:48:30 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@c47eaa2
16/02/03 22:48:30 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:134217728+33554432
16/02/03 22:48:30 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:48:30 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:48:30 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:48:31 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:48:31 INFO mapred.MapTask: Finished spill 0
16/02/03 22:48:31 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000019_0 is done. And is in the process of commiting
16/02/03 22:48:31 INFO mapred.LocalJobRunner:
16/02/03 22:48:31 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000019_0' done.
16/02/03 22:48:31 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000019_0
16/02/03 22:48:31 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000020_0
16/02/03 22:48:31 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@5fb7e03a
16/02/03 22:48:31 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:167772160+33554432
16/02/03 22:48:31 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:48:31 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:48:31 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:48:32 INFO mapred.JobClient:  map 15% reduce 0%
16/02/03 22:48:37 INFO mapred.LocalJobRunner:
16/02/03 22:48:38 INFO mapred.JobClient:  map 17% reduce 0%
16/02/03 22:48:39 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:48:39 INFO mapred.MapTask: Finished spill 0
16/02/03 22:48:39 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000020_0 is done. And is in the process of commiting
16/02/03 22:48:39 INFO mapred.LocalJobRunner:
16/02/03 22:48:39 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000020_0' done.
16/02/03 22:48:39 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000020_0
16/02/03 22:48:39 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000021_0
16/02/03 22:48:39 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@1d559265
16/02/03 22:48:39 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:201326592+33554432
16/02/03 22:48:39 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:48:39 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:48:39 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:48:42 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:48:42 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000021_0 is done. And is in the process of commiting
16/02/03 22:48:42 INFO mapred.LocalJobRunner:
16/02/03 22:48:42 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000021_0' done.
16/02/03 22:48:42 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000021_0
16/02/03 22:48:42 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000022_0
16/02/03 22:48:42 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@3dc44d71
16/02/03 22:48:42 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:234881024+33554432
16/02/03 22:48:42 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:48:42 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:48:42 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:48:45 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:48:45 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000022_0 is done. And is in the process of commiting
16/02/03 22:48:45 INFO mapred.LocalJobRunner:
16/02/03 22:48:45 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000022_0' done.
16/02/03 22:48:45 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000022_0
16/02/03 22:48:45 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000023_0
16/02/03 22:48:45 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@1581392b
16/02/03 22:48:45 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:268435456+33554432
16/02/03 22:48:45 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:48:45 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:48:45 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:48:47 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:48:47 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000023_0 is done. And is in the process of commiting
16/02/03 22:48:47 INFO mapred.LocalJobRunner:
16/02/03 22:48:47 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000023_0' done.
16/02/03 22:48:47 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000023_0
16/02/03 22:48:47 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000024_0
16/02/03 22:48:47 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@2b9cbd43
16/02/03 22:48:47 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:301989888+33554432
16/02/03 22:48:47 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:48:47 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:48:47 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:48:49 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:48:49 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000024_0 is done. And is in the process of commiting
16/02/03 22:48:49 INFO mapred.LocalJobRunner:
16/02/03 22:48:49 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000024_0' done.
16/02/03 22:48:49 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000024_0
16/02/03 22:48:49 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000025_0
16/02/03 22:48:49 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@34b57c75
16/02/03 22:48:49 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:335544320+33554432
16/02/03 22:48:49 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:48:49 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:48:49 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:48:50 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:48:50 INFO mapred.MapTask: Finished spill 0
16/02/03 22:48:50 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000025_0 is done. And is in the process of commiting
16/02/03 22:48:50 INFO mapred.LocalJobRunner:
16/02/03 22:48:50 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000025_0' done.
16/02/03 22:48:50 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000025_0
16/02/03 22:48:50 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000026_0
16/02/03 22:48:50 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@3236213e
16/02/03 22:48:50 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:369098752+33554432
16/02/03 22:48:50 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:48:50 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:48:50 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:48:51 INFO mapred.JobClient:  map 18% reduce 0%
16/02/03 22:48:56 INFO mapred.LocalJobRunner:
16/02/03 22:49:29 INFO mapred.LocalJobRunner:
16/02/03 22:49:29 INFO mapred.JobClient:  map 20% reduce 0%
16/02/03 22:49:33 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:49:33 INFO mapred.MapTask: Finished spill 0
16/02/03 22:49:33 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000026_0 is done. And is in the process of commiting
16/02/03 22:49:33 INFO mapred.LocalJobRunner:
16/02/03 22:49:33 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000026_0' done.
16/02/03 22:49:33 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000026_0
16/02/03 22:49:33 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000027_0
16/02/03 22:49:33 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@3101e292
16/02/03 22:49:33 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:402653184+33554432
16/02/03 22:49:33 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:49:33 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:49:33 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:49:48 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:49:48 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000027_0 is done. And is in the process of commiting
16/02/03 22:49:48 INFO mapred.LocalJobRunner:
16/02/03 22:49:48 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000027_0' done.
16/02/03 22:49:48 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000027_0
16/02/03 22:49:48 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000028_0
16/02/03 22:49:48 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@ab4e449
16/02/03 22:49:48 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:436207616+33554432
16/02/03 22:49:48 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:49:48 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:49:48 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:50:02 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:50:02 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000028_0 is done. And is in the process of commiting
16/02/03 22:50:02 INFO mapred.LocalJobRunner:
16/02/03 22:50:02 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000028_0' done.
16/02/03 22:50:02 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000028_0
16/02/03 22:50:02 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000029_0
16/02/03 22:50:02 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@3e12c345
16/02/03 22:50:02 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:469762048+33554432
16/02/03 22:50:02 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:50:02 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:50:02 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:50:15 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:50:15 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000029_0 is done. And is in the process of commiting
16/02/03 22:50:15 INFO mapred.LocalJobRunner:
16/02/03 22:50:15 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000029_0' done.
16/02/03 22:50:15 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000029_0
16/02/03 22:50:15 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000030_0
16/02/03 22:50:15 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@4457d912
16/02/03 22:50:15 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:503316480+33554432
16/02/03 22:50:15 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:50:22 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:50:22 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:50:34 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:50:34 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000030_0 is done. And is in the process of commiting
16/02/03 22:50:34 INFO mapred.LocalJobRunner:
16/02/03 22:50:34 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000030_0' done.
16/02/03 22:50:34 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000030_0
16/02/03 22:50:34 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000031_0
16/02/03 22:50:34 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@5a34e632
16/02/03 22:50:34 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:536870912+33554432
16/02/03 22:50:34 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:50:34 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:50:34 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:50:46 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:50:46 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000031_0 is done. And is in the process of commiting
16/02/03 22:50:46 INFO mapred.LocalJobRunner:
16/02/03 22:50:46 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000031_0' done.
16/02/03 22:50:46 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000031_0
16/02/03 22:50:46 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000032_0
16/02/03 22:50:46 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@508b1bf4
16/02/03 22:50:46 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:570425344+33554432
16/02/03 22:50:46 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:50:46 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:50:46 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:50:57 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:50:57 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000032_0 is done. And is in the process of commiting
16/02/03 22:50:57 INFO mapred.LocalJobRunner:
16/02/03 22:50:57 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000032_0' done.
16/02/03 22:50:57 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000032_0
16/02/03 22:50:57 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000033_0
16/02/03 22:50:57 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@7df2e609
16/02/03 22:50:57 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:603979776+33554432
16/02/03 22:50:57 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:50:57 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:50:57 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:51:08 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:51:08 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000033_0 is done. And is in the process of commiting
16/02/03 22:51:08 INFO mapred.LocalJobRunner:
16/02/03 22:51:08 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000033_0' done.
16/02/03 22:51:08 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000033_0
16/02/03 22:51:08 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000034_0
16/02/03 22:51:08 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@22a41ee1
16/02/03 22:51:08 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:637534208+33554432
16/02/03 22:51:08 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:51:08 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:51:08 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:51:17 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:51:17 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000034_0 is done. And is in the process of commiting
16/02/03 22:51:17 INFO mapred.LocalJobRunner:
16/02/03 22:51:17 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000034_0' done.
16/02/03 22:51:17 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000034_0
16/02/03 22:51:17 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000035_0
16/02/03 22:51:17 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@70b5cd50
16/02/03 22:51:17 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:671088640+33554432
16/02/03 22:51:17 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:51:17 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:51:17 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:51:26 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:51:26 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000035_0 is done. And is in the process of commiting
16/02/03 22:51:26 INFO mapred.LocalJobRunner:
16/02/03 22:51:26 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000035_0' done.
16/02/03 22:51:26 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000035_0
16/02/03 22:51:26 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000036_0
16/02/03 22:51:26 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@dad5960
16/02/03 22:51:26 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:704643072+33554432
16/02/03 22:51:26 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:51:26 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:51:26 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:51:35 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:51:35 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000036_0 is done. And is in the process of commiting
16/02/03 22:51:35 INFO mapred.LocalJobRunner:
16/02/03 22:51:35 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000036_0' done.
16/02/03 22:51:35 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000036_0
16/02/03 22:51:35 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000037_0
16/02/03 22:51:35 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@7317851
16/02/03 22:51:35 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:738197504+33554432
16/02/03 22:51:35 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:51:35 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:51:35 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:51:42 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:51:42 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000037_0 is done. And is in the process of commiting
16/02/03 22:51:42 INFO mapred.LocalJobRunner:
16/02/03 22:51:42 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000037_0' done.
16/02/03 22:51:42 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000037_0
16/02/03 22:51:42 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000038_0
16/02/03 22:51:42 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@3ba75545
16/02/03 22:51:42 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:771751936+33554432
16/02/03 22:51:42 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:51:42 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:51:42 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:51:49 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:51:49 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000038_0 is done. And is in the process of commiting
16/02/03 22:51:49 INFO mapred.LocalJobRunner:
16/02/03 22:51:49 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000038_0' done.
16/02/03 22:51:49 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000038_0
16/02/03 22:51:49 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000039_0
16/02/03 22:51:49 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@2f3e32c7
16/02/03 22:51:49 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:805306368+33554432
16/02/03 22:51:49 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:51:49 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:51:49 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:51:55 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:51:55 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000039_0 is done. And is in the process of commiting
16/02/03 22:51:55 INFO mapred.LocalJobRunner:
16/02/03 22:51:55 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000039_0' done.
16/02/03 22:51:55 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000039_0
16/02/03 22:51:55 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000040_0
16/02/03 22:51:55 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@7594b386
16/02/03 22:51:55 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:838860800+33554432
16/02/03 22:51:55 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:51:55 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:51:55 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:52:01 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:52:01 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000040_0 is done. And is in the process of commiting
16/02/03 22:52:01 INFO mapred.LocalJobRunner:
16/02/03 22:52:01 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000040_0' done.
16/02/03 22:52:01 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000040_0
16/02/03 22:52:01 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000041_0
16/02/03 22:52:01 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@4e8a22c9
16/02/03 22:52:01 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:872415232+33554432
16/02/03 22:52:01 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:52:01 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:52:01 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:52:06 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:52:06 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000041_0 is done. And is in the process of commiting
16/02/03 22:52:06 INFO mapred.LocalJobRunner:
16/02/03 22:52:06 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000041_0' done.
16/02/03 22:52:06 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000041_0
16/02/03 22:52:06 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000042_0
16/02/03 22:52:06 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@2165ed11
16/02/03 22:52:06 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:905969664+33554432
16/02/03 22:52:06 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:52:06 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:52:06 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:52:10 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:52:10 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000042_0 is done. And is in the process of commiting
16/02/03 22:52:10 INFO mapred.LocalJobRunner:
16/02/03 22:52:10 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000042_0' done.
16/02/03 22:52:10 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000042_0
16/02/03 22:52:10 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000043_0
16/02/03 22:52:10 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@15580e1b
16/02/03 22:52:10 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:939524096+33554432
16/02/03 22:52:10 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:52:10 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:52:10 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:52:13 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:52:13 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000043_0 is done. And is in the process of commiting
16/02/03 22:52:13 INFO mapred.LocalJobRunner:
16/02/03 22:52:13 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000043_0' done.
16/02/03 22:52:13 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000043_0
16/02/03 22:52:13 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000044_0
16/02/03 22:52:13 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@4814c78
16/02/03 22:52:13 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:973078528+33554432
16/02/03 22:52:13 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:52:13 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:52:13 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:52:16 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:52:16 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000044_0 is done. And is in the process of commiting
16/02/03 22:52:16 INFO mapred.LocalJobRunner:
16/02/03 22:52:16 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000044_0' done.
16/02/03 22:52:16 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000044_0
16/02/03 22:52:16 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000045_0
16/02/03 22:52:16 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 3e61061d
16/02/03 22:52:16 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:1006632960+33554432
16/02/03 22:52:16 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:52:16 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:52:16 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:52:18 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:52:18 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000045_0 is done. And is in the process of commiting
16/02/03 22:52:18 INFO mapred.LocalJobRunner:
16/02/03 22:52:18 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000045_0' done.
16/02/03 22:52:18 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000045_0
16/02/03 22:52:18 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000046_0
16/02/03 22:52:18 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 67296cd0
16/02/03 22:52:18 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:1040187392+33554432
16/02/03 22:52:18 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:52:18 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:52:18 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:52:19 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:52:19 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000046_0 is done. And is in the process of commiting
16/02/03 22:52:19 INFO mapred.LocalJobRunner:
16/02/03 22:52:19 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000046_0' done.
16/02/03 22:52:19 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000046_0
16/02/03 22:52:19 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000047_0
16/02/03 22:52:19 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 4c05b3a6
16/02/03 22:52:19 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00001:0+33554432
16/02/03 22:52:19 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:52:19 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:52:19 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:52:27 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:52:28 INFO mapred.MapTask: Finished spill 0
16/02/03 22:52:28 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000047_0 is done. And is in the process of commiting
16/02/03 22:52:28 INFO mapred.LocalJobRunner:
16/02/03 22:52:28 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000047_0' done.
16/02/03 22:52:28 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000047_0
16/02/03 22:52:28 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000048_0
16/02/03 22:52:28 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 293c8965
16/02/03 22:52:28 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00001:33554432+33554432
16/02/03 22:52:28 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:52:28 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:52:28 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:52:28 INFO mapred.JobClient:  map 21% reduce 0%
16/02/03 22:52:28 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:52:28 INFO mapred.MapTask: Finished spill 0
16/02/03 22:52:28 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000048_0 is done. And is in the process of commiting
16/02/03 22:52:28 INFO mapred.LocalJobRunner:
16/02/03 22:52:28 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000048_0' done.
16/02/03 22:52:28 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000048_0
16/02/03 22:52:28 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000049_0
16/02/03 22:52:28 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 573cd86d
16/02/03 22:52:28 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00001:67108864+33554432
16/02/03 22:52:28 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:52:28 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:52:28 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:52:36 INFO mapred.JobClient:  map 22% reduce 0%
16/02/03 22:52:37 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:52:37 INFO mapred.MapTask: Finished spill 0
16/02/03 22:52:37 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000049_0 is done. And is in the process of commiting
16/02/03 22:52:37 INFO mapred.LocalJobRunner:
16/02/03 22:52:37 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000049_0' done.
16/02/03 22:52:37 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000049_0
16/02/03 22:52:37 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000050_0
16/02/03 22:52:37 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 15dc6730
16/02/03 22:52:37 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00001:100663296+33554432
16/02/03 22:52:37 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:52:37 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:52:37 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:52:37 INFO mapred.JobClient:  map 24% reduce 0%
16/02/03 22:52:47 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:52:47 INFO mapred.MapTask: Finished spill 0
16/02/03 22:52:47 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000050_0 is done. And is in the process of commiting
16/02/03 22:52:47 INFO mapred.LocalJobRunner:
16/02/03 22:52:47 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000050_0' done.
16/02/03 22:52:47 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000050_0
16/02/03 22:52:47 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000051_0
16/02/03 22:52:47 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 21daa31e
16/02/03 22:52:47 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00001:134217728+33554432
16/02/03 22:52:47 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:52:47 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:52:47 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:52:48 INFO mapred.JobClient:  map 25% reduce 0%
16/02/03 22:52:48 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:52:48 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000051_0 is done. And is in the process of commiting
16/02/03 22:52:48 INFO mapred.LocalJobRunner:
16/02/03 22:52:48 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000051_0' done.
16/02/03 22:52:48 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000051_0
16/02/03 22:52:48 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000052_0
16/02/03 22:52:48 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 5b7b5f28
16/02/03 22:52:48 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00001:167772160+33554432
16/02/03 22:52:48 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:52:48 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:52:48 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:52:50 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:52:50 INFO mapred.MapTask: Finished spill 0
16/02/03 22:52:50 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000052_0 is done. And is in the process of commiting
16/02/03 22:52:50 INFO mapred.LocalJobRunner:
16/02/03 22:52:50 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000052_0' done.
16/02/03 22:52:50 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000052_0
16/02/03 22:52:50 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000053_0
16/02/03 22:52:50 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 18def165
16/02/03 22:52:50 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00002:0+33554432
16/02/03 22:52:50 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:52:50 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:52:50 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:52:50 INFO mapred.JobClient:  map 27% reduce 0%
16/02/03 22:52:55 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:52:56 INFO mapred.MapTask: Finished spill 0
16/02/03 22:52:56 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000053_0 is done. And is in the process of commiting
16/02/03 22:52:56 INFO mapred.LocalJobRunner:
16/02/03 22:52:56 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000053_0' done.
16/02/03 22:52:56 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000053_0
16/02/03 22:52:56 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000054_0
16/02/03 22:52:56 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 5f40d2b9
16/02/03 22:52:56 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00002:33554432+33554432
16/02/03 22:52:56 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:52:56 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:52:56 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:52:56 INFO mapred.JobClient:  map 28% reduce 0%
16/02/03 22:52:58 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:52:59 INFO mapred.MapTask: Finished spill 0
16/02/03 22:52:59 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000054_0 is done. And is in the process of commiting
16/02/03 22:52:59 INFO mapred.LocalJobRunner:
16/02/03 22:52:59 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000054_0' done.
16/02/03 22:52:59 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000054_0
16/02/03 22:52:59 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000055_0
16/02/03 22:52:59 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 35089735
16/02/03 22:52:59 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00002:67108864+33554432
16/02/03 22:52:59 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:52:59 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:52:59 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:52:59 INFO mapred.JobClient:  map 30% reduce 0%
16/02/03 22:53:00 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:53:00 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000055_0 is done. And is in the process of commiting
16/02/03 22:53:00 INFO mapred.LocalJobRunner:
16/02/03 22:53:00 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000055_0' done.
16/02/03 22:53:00 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000055_0
16/02/03 22:53:00 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000056_0
16/02/03 22:53:00 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 20408eb0
16/02/03 22:53:00 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00002:100663296+33554432
16/02/03 22:53:00 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:53:00 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:53:00 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:53:04 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:53:04 INFO mapred.MapTask: Finished spill 0
16/02/03 22:53:04 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000056_0 is done. And is in the process of commiting
16/02/03 22:53:04 INFO mapred.LocalJobRunner:
16/02/03 22:53:04 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000056_0' done.
16/02/03 22:53:04 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000056_0
16/02/03 22:53:04 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000057_0
16/02/03 22:53:04 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 267279fd
16/02/03 22:53:04 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00002:134217728+33554432
16/02/03 22:53:04 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:53:04 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:53:04 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:53:05 INFO mapred.JobClient:  map 31% reduce 0%
16/02/03 22:53:06 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:53:06 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000057_0 is done. And is in the process of commiting
16/02/03 22:53:06 INFO mapred.LocalJobRunner:
16/02/03 22:53:06 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000057_0' done.
16/02/03 22:53:06 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000057_0
16/02/03 22:53:06 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000058_0
16/02/03 22:53:06 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 6349a3ca
16/02/03 22:53:06 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00002:167772160+33554432
16/02/03 22:53:06 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:53:06 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:53:06 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:53:07 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:53:07 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000058_0 is done. And is in the process of commiting
16/02/03 22:53:07 INFO mapred.LocalJobRunner:
16/02/03 22:53:07 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000058_0' done.
16/02/03 22:53:07 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000058_0
16/02/03 22:53:07 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000059_0
16/02/03 22:53:07 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 3cebbff7
16/02/03 22:53:07 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00002:201326592+33554432
16/02/03 22:53:07 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:53:07 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:53:07 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:53:09 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:53:09 INFO mapred.MapTask: Finished spill 0
16/02/03 22:53:09 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000059_0 is done. And is in the process of commiting
16/02/03 22:53:09 INFO mapred.LocalJobRunner:
16/02/03 22:53:09 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000059_0' done.
16/02/03 22:53:09 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000059_0
16/02/03 22:53:09 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000060_0
16/02/03 22:53:09 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 4fa6f29e
16/02/03 22:53:09 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00002:234881024+33554432
16/02/03 22:53:09 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:53:09 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:53:09 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:53:10 INFO mapred.JobClient:  map 32% reduce 0%
16/02/03 22:53:10 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:53:10 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000060_0 is done. And is in the process of commiting
16/02/03 22:53:10 INFO mapred.LocalJobRunner:
16/02/03 22:53:10 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000060_0' done.
16/02/03 22:53:10 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000060_0
16/02/03 22:53:10 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000061_0
16/02/03 22:53:10 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 214dd657
16/02/03 22:53:10 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00002:268435456+33554432
16/02/03 22:53:10 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:53:10 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:53:10 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:53:13 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:53:13 INFO mapred.MapTask: Finished spill 0
16/02/03 22:53:13 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000061_0 is done. And is in the process of commiting
16/02/03 22:53:13 INFO mapred.LocalJobRunner:
16/02/03 22:53:13 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000061_0' done.
16/02/03 22:53:13 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000061_0
16/02/03 22:53:13 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000062_0
16/02/03 22:53:13 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> ae9bd16
16/02/03 22:53:13 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00002:301989888+33554432
16/02/03 22:53:13 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:53:13 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:53:13 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:53:14 INFO mapred.JobClient:  map 34% reduce 0%
16/02/03 22:53:15 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:53:15 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000062_0 is done. And is in the process of commiting
16/02/03 22:53:15 INFO mapred.LocalJobRunner:
16/02/03 22:53:15 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000062_0' done.
16/02/03 22:53:15 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000062_0
16/02/03 22:53:15 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000063_0
16/02/03 22:53:15 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 2c2f4d91
16/02/03 22:53:15 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00002:335544320+33554432
16/02/03 22:53:15 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:53:15 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:53:15 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:53:16 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:53:16 INFO mapred.MapTask: Finished spill 0
16/02/03 22:53:16 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000063_0 is done. And is in the process of commiting
16/02/03 22:53:16 INFO mapred.LocalJobRunner:
16/02/03 22:53:16 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000063_0' done.
16/02/03 22:53:16 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000063_0
16/02/03 22:53:16 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000064_0
16/02/03 22:53:16 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 3445a62e
16/02/03 22:53:16 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00002:369098752+33554432
16/02/03 22:53:16 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:53:16 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:53:16 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:53:16 INFO mapred.JobClient:  map 35% reduce 0%
16/02/03 22:53:20 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:53:20 INFO mapred.MapTask: Finished spill 0
16/02/03 22:53:20 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000064_0 is done. And is in the process of commiting
16/02/03 22:53:20 INFO mapred.LocalJobRunner:
16/02/03 22:53:20 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000064_0' done.
16/02/03 22:53:20 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000064_0
16/02/03 22:53:20 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000065_0
16/02/03 22:53:20 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> b1fe9d5
16/02/03 22:53:20 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00002:402653184+33554432
16/02/03 22:53:20 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:53:20 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:53:20 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:53:21 INFO mapred.JobClient:  map 37% reduce 0%
16/02/03 22:53:23 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:53:23 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000065_0 is done. And is in the process of commiting
16/02/03 22:53:23 INFO mapred.LocalJobRunner:
16/02/03 22:53:23 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000065_0' done.
16/02/03 22:53:23 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000065_0
16/02/03 22:53:23 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000066_0
16/02/03 22:53:23 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 69f8d767
16/02/03 22:53:23 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00002:436207616+33554432
16/02/03 22:53:23 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:53:23 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:53:23 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:53:24 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:53:24 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000066_0 is done. And is in the process of commiting
16/02/03 22:53:24 INFO mapred.LocalJobRunner:
16/02/03 22:53:24 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000066_0' done.
16/02/03 22:53:24 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000066_0
16/02/03 22:53:24 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000067_0
16/02/03 22:53:24 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 758c2762
16/02/03 22:53:24 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:1073741824+24468851
16/02/03 22:53:24 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:53:24 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:53:24 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:53:25 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:53:25 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000067_0 is done. And is in the process of commiting
16/02/03 22:53:25 INFO mapred.LocalJobRunner:
16/02/03 22:53:25 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000067_0' done.
16/02/03 22:53:25 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000067_0
16/02/03 22:53:25 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000068_0
16/02/03 22:53:25 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 4e8059e
16/02/03 22:53:25 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00001:201326592+21730977
16/02/03 22:53:25 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:53:25 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:53:25 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:53:25 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:53:25 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000068_0 is done. And is in the process of commiting
16/02/03 22:53:25 INFO mapred.LocalJobRunner:
16/02/03 22:53:25 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000068_0' done.
16/02/03 22:53:25 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000068_0
16/02/03 22:53:25 INFO mapred.LocalJobRunner: Starting task: attempt_local1001281779_0002_m_000069_0
16/02/03 22:53:25 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 2412cc06
16/02/03 22:53:25 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00003:469762048+16405281
16/02/03 22:53:25 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:53:25 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:53:25 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:53:26 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:53:26 INFO mapred.MapTask: Finished spill 0
16/02/03 22:53:26 INFO mapred.Task: Task:attempt_local1001281779_0002_m_000069_0 is done. And is in the process of commiting
16/02/03 22:53:26 INFO mapred.LocalJobRunner:
16/02/03 22:53:26 INFO mapred.Task: Task 'attempt_local1001281779_0002_m_000069_0' done.
16/02/03 22:53:26 INFO mapred.LocalJobRunner: Finishing task: attempt_local1001281779_0002_m_000069_0
16/02/03 22:53:26 INFO mapred.LocalJobRunner: Map task executor complete.
16/02/03 22:53:26 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 59e1cd7f
16/02/03 22:53:26 INFO mapred.LocalJobRunner:
16/02/03 22:53:26 INFO mapred.Merger: Merging 70 sorted segments
16/02/03 22:53:26 INFO mapred.Merger: Merging 7 intermediate segments out of a total of 27
16/02/03 22:53:26 INFO mapred.JobClient:  map 38% reduce 0%
16/02/03 22:53:26 INFO mapred.Merger: Merging 10 intermediate segments out of a total of 21
16/02/03 22:53:26 INFO mapred.Merger: Merging 10 intermediate segments out of a total of 12
16/02/03 22:53:26 INFO mapred.Merger: Down to the last merge-pass, with 3 segments left of total size: 26272610 bytes
16/02/03 22:53:26 INFO mapred.LocalJobRunner:
16/02/03 22:53:27 INFO mapred.Task: Task:attempt_local1001281779_0002_r_000000_0 is done. And is in the process of commiting
16/02/03 22:53:27 INFO mapred.LocalJobRunner:
16/02/03 22:53:27 INFO mapred.Task: Task attempt_local1001281779_0002_r_000000_0 is allowed to commit now
16/02/03 22:53:27 INFO output.FileOutputCommitter: Saved output of task 'attempt_local1001281779_0002_r_000000_0' to /home/ubuntu/AT/AT-vectors/wordcount
16/02/03 22:53:27 INFO mapred.LocalJobRunner: reduce > reduce
16/02/03 22:53:27 INFO mapred.Task: Task 'attempt_local1001281779_0002_r_000000_0' done.
16/02/03 22:53:28 INFO mapred.JobClient:  map 38% reduce 100%
16/02/03 22:53:28 INFO mapred.JobClient: Job complete: job_local1001281779_0002
16/02/03 22:53:28 INFO mapred.JobClient: Counters: 20
16/02/03 22:53:28 INFO mapred.JobClient:   File Output Format Counters
16/02/03 22:53:28 INFO mapred.JobClient:     Bytes Written=10568796
16/02/03 22:53:28 INFO mapred.JobClient:   FileSystemCounters
16/02/03 22:53:28 INFO mapred.JobClient:     FILE_BYTES_READ=482085220251
16/02/03 22:53:28 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=170116836260
16/02/03 22:53:28 INFO mapred.JobClient:   File Input Format Counters
16/02/03 22:53:28 INFO mapred.JobClient:     Bytes Read=11885543289
16/02/03 22:53:28 INFO mapred.JobClient:   Map-Reduce Framework
16/02/03 22:53:28 INFO mapred.JobClient:     Reduce input groups=669034
16/02/03 22:53:28 INFO mapred.JobClient:     Map output materialized bytes=26273024
16/02/03 22:53:28 INFO mapred.JobClient:     Combine output records=1212071
16/02/03 22:53:28 INFO mapred.JobClient:     Map input records=223
16/02/03 22:53:28 INFO mapred.JobClient:     Reduce shuffle bytes=0
16/02/03 22:53:28 INFO mapred.JobClient:     Physical memory (bytes) snapshot=0
16/02/03 22:53:28 INFO mapred.JobClient:     Reduce output records=366380
16/02/03 22:53:28 INFO mapred.JobClient:     Spilled Records=3526970
16/02/03 22:53:28 INFO mapred.JobClient:     Map output bytes=30942276
16/02/03 22:53:28 INFO mapred.JobClient:     CPU time spent (ms)=0
16/02/03 22:53:28 INFO mapred.JobClient:     Total committed heap usage (bytes)=834207744000
16/02/03 22:53:28 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=0
16/02/03 22:53:28 INFO mapred.JobClient:     Combine input records=1663367
16/02/03 22:53:28 INFO mapred.JobClient:     Map output records=1663367
16/02/03 22:53:28 INFO mapred.JobClient:     SPLIT_RAW_BYTES=9100
16/02/03 22:53:28 INFO mapred.JobClient:     Reduce input records=1212071
16/02/03 22:53:28 INFO common.HadoopUtil: Deleting /home/ubuntu/AT/AT-vectors/partial-vectors-0
16/02/03 22:53:29 INFO input.FileInputFormat: Total input paths to process : 4
16/02/03 22:53:29 INFO filecache.TrackerDistributedCacheManager: Creating dictionary.file-0 in /tmp/hadoop-ubuntu/mapred/local/archive/7146177558952407944_1005254572_693578981/file/home/ubuntu/AT/AT-vectors-work--7007506440757752248 with rwxr-xr-x
16/02/03 22:53:29 INFO filecache.TrackerDistributedCacheManager: Cached /home/ubuntu/AT/AT-vectors/dictionary.file-0 as /tmp/hadoop-ubuntu/mapred/local/archive/7146177558952407944_1005254572_693578981/file/home/ubuntu/AT/AT-vectors/dictionary.file-0
16/02/03 22:53:29 INFO filecache.TrackerDistributedCacheManager: Cached /home/ubuntu/AT/AT-vectors/dictionary.file-0 as /tmp/hadoop-ubuntu/mapred/local/archive/7146177558952407944_1005254572_693578981/file/home/ubuntu/AT/AT-vectors/dictionary.file-0
16/02/03 22:53:29 INFO mapred.JobClient: Running job: job_local1308764206_0003
16/02/03 22:53:29 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000000_0
16/02/03 22:53:29 INFO mapred.LocalJobRunner: Waiting for map tasks
16/02/03 22:53:29 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 58e58e3d
16/02/03 22:53:29 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00002:469762048+36442589
16/02/03 22:53:29 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:53:29 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:53:29 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:53:30 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:53:30 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000000_0 is done. And is in the process of commiting
16/02/03 22:53:30 INFO mapred.LocalJobRunner:
16/02/03 22:53:30 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000000_0' done.
16/02/03 22:53:30 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000000_0
16/02/03 22:53:30 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000001_0
16/02/03 22:53:30 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@1d163e1a
16/02/03 22:53:30 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00003:0+33554432
16/02/03 22:53:30 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:53:30 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:53:30 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:53:30 INFO mapred.JobClient:  map 0% reduce 0%
16/02/03 22:53:39 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:53:39 INFO mapred.MapTask: Finished spill 0
16/02/03 22:53:39 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000001_0 is done. And is in the process of commiting
16/02/03 22:53:39 INFO mapred.LocalJobRunner:
16/02/03 22:53:39 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000001_0' done.
16/02/03 22:53:39 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000001_0
16/02/03 22:53:39 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000002_0
16/02/03 22:53:39 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@6fc2c671
16/02/03 22:53:39 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00003:33554432+33554432
16/02/03 22:53:39 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:53:39 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:53:39 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:53:40 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:53:40 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000002_0 is done. And is in the process of commiting
16/02/03 22:53:40 INFO mapred.LocalJobRunner:
16/02/03 22:53:40 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000002_0' done.
16/02/03 22:53:40 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000002_0
16/02/03 22:53:40 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000003_0
16/02/03 22:53:40 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@2196e0f0
16/02/03 22:53:40 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00003:67108864+33554432
16/02/03 22:53:40 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:53:40 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:53:40 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:53:40 INFO mapred.JobClient:  map 1% reduce 0%
16/02/03 22:53:44 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:53:44 INFO mapred.MapTask: Finished spill 0
16/02/03 22:53:44 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000003_0 is done. And is in the process of commiting
16/02/03 22:53:44 INFO mapred.LocalJobRunner:
16/02/03 22:53:44 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000003_0' done.
16/02/03 22:53:44 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000003_0
16/02/03 22:53:44 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000004_0
16/02/03 22:53:44 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@5305036e
16/02/03 22:53:44 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00003:100663296+33554432
16/02/03 22:53:44 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:53:44 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:53:44 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:53:45 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:53:45 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000004_0 is done. And is in the process of commiting
16/02/03 22:53:45 INFO mapred.LocalJobRunner:
16/02/03 22:53:45 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000004_0' done.
16/02/03 22:53:45 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000004_0
16/02/03 22:53:45 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000005_0
16/02/03 22:53:45 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@13c03044
16/02/03 22:53:45 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00003:134217728+33554432
16/02/03 22:53:45 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:53:45 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:53:45 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:53:45 INFO mapred.JobClient:  map 2% reduce 0%
16/02/03 22:53:53 INFO mapred.MapTask: Record too large for in-memory buffer: 99614729 bytes
16/02/03 22:53:57 INFO mapred.LocalJobRunner:
16/02/03 22:53:58 INFO mapred.JobClient:  map 4% reduce 0%
16/02/03 22:53:59 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:53:59 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000005_0 is done. And is in the process of commiting
16/02/03 22:53:59 INFO mapred.LocalJobRunner:
16/02/03 22:53:59 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000005_0' done.
16/02/03 22:53:59 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000005_0
16/02/03 22:53:59 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000006_0
16/02/03 22:53:59 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@19e55091
16/02/03 22:53:59 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00003:167772160+33554432
16/02/03 22:53:59 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:53:59 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:53:59 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:54:02 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:54:02 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000006_0 is done. And is in the process of commiting
16/02/03 22:54:02 INFO mapred.LocalJobRunner:
16/02/03 22:54:02 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000006_0' done.
16/02/03 22:54:02 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000006_0
16/02/03 22:54:02 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000007_0
16/02/03 22:54:02 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@2208f9fb
16/02/03 22:54:02 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00003:201326592+33554432
16/02/03 22:54:02 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:54:02 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:54:02 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:54:03 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:54:03 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000007_0 is done. And is in the process of commiting
16/02/03 22:54:03 INFO mapred.LocalJobRunner:
16/02/03 22:54:03 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000007_0' done.
16/02/03 22:54:03 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000007_0
16/02/03 22:54:03 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000008_0
16/02/03 22:54:03 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@409e3341
16/02/03 22:54:03 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00003:234881024+33554432
16/02/03 22:54:03 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:54:03 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:54:03 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:54:04 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:54:04 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000008_0 is done. And is in the process of commiting
16/02/03 22:54:04 INFO mapred.LocalJobRunner:
16/02/03 22:54:04 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000008_0' done.
16/02/03 22:54:04 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000008_0
16/02/03 22:54:04 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000009_0
16/02/03 22:54:04 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@1d6867a4
16/02/03 22:54:04 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00003:268435456+33554432
16/02/03 22:54:04 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:54:04 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:54:04 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:54:06 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:54:06 INFO mapred.MapTask: Finished spill 0
16/02/03 22:54:06 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000009_0 is done. And is in the process of commiting
16/02/03 22:54:06 INFO mapred.LocalJobRunner:
16/02/03 22:54:06 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000009_0' done.
16/02/03 22:54:06 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000009_0
16/02/03 22:54:06 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000010_0
16/02/03 22:54:06 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@110910a1
16/02/03 22:54:06 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00003:301989888+33554432
16/02/03 22:54:06 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:54:06 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:54:06 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:54:07 INFO mapred.JobClient:  map 5% reduce 0%
16/02/03 22:54:09 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:54:09 INFO mapred.MapTask: Finished spill 0
16/02/03 22:54:09 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000010_0 is done. And is in the process of commiting
16/02/03 22:54:09 INFO mapred.LocalJobRunner:
16/02/03 22:54:09 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000010_0' done.
16/02/03 22:54:09 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000010_0
16/02/03 22:54:09 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000011_0
16/02/03 22:54:09 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@1edb4489
16/02/03 22:54:09 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00003:335544320+33554432
16/02/03 22:54:09 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:54:09 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:54:09 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:54:10 INFO mapred.JobClient:  map 7% reduce 0%
16/02/03 22:54:11 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:54:11 INFO mapred.MapTask: Finished spill 0
16/02/03 22:54:11 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000011_0 is done. And is in the process of commiting
16/02/03 22:54:11 INFO mapred.LocalJobRunner:
16/02/03 22:54:11 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000011_0' done.
16/02/03 22:54:11 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000011_0
16/02/03 22:54:11 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000012_0
16/02/03 22:54:11 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@7df9088f
16/02/03 22:54:11 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00003:369098752+33554432
16/02/03 22:54:11 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:54:11 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:54:11 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:54:12 INFO mapred.JobClient:  map 8% reduce 0%
16/02/03 22:54:12 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:54:12 INFO mapred.MapTask: Finished spill 0
16/02/03 22:54:12 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000012_0 is done. And is in the process of commiting
16/02/03 22:54:12 INFO mapred.LocalJobRunner:
16/02/03 22:54:12 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000012_0' done.
16/02/03 22:54:12 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000012_0
16/02/03 22:54:12 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000013_0
16/02/03 22:54:12 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@300f2577
16/02/03 22:54:12 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00003:402653184+33554432
16/02/03 22:54:12 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:54:12 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:54:12 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:54:13 INFO mapred.JobClient:  map 10% reduce 0%
16/02/03 22:54:13 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:54:13 INFO mapred.MapTask: Finished spill 0
16/02/03 22:54:13 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000013_0 is done. And is in the process of commiting
16/02/03 22:54:13 INFO mapred.LocalJobRunner:
16/02/03 22:54:13 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000013_0' done.
16/02/03 22:54:13 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000013_0
16/02/03 22:54:13 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000014_0
16/02/03 22:54:13 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@63155812
16/02/03 22:54:13 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00003:436207616+33554432
16/02/03 22:54:13 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:54:13 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:54:13 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:54:14 INFO mapred.JobClient:  map 11% reduce 0%
16/02/03 22:54:15 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:54:15 INFO mapred.MapTask: Finished spill 0
16/02/03 22:54:15 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000014_0 is done. And is in the process of commiting
16/02/03 22:54:15 INFO mapred.LocalJobRunner:
16/02/03 22:54:15 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000014_0' done.
16/02/03 22:54:15 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000014_0
16/02/03 22:54:15 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000015_0
16/02/03 22:54:15 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@66e9a2c4
16/02/03 22:54:15 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:0+33554432
16/02/03 22:54:15 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:54:15 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:54:15 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:54:16 INFO mapred.JobClient:  map 12% reduce 0%
16/02/03 22:54:20 INFO mapred.MapTask: Spilling map output: buffer full= true
16/02/03 22:54:20 INFO mapred.MapTask: bufstart = 0; bufend = 19767646; bufvoid = 99614720
16/02/03 22:54:20 INFO mapred.MapTask: kvstart = 0; kvend = 12; length = 327680
16/02/03 22:54:20 INFO mapred.MapTask: Finished spill 0
16/02/03 22:54:21 INFO mapred.LocalJobRunner:
16/02/03 22:54:22 INFO mapred.MapTask: Record too large for in-memory buffer: 99614721 bytes
16/02/03 22:54:22 INFO mapred.JobClient:  map 14% reduce 0%
16/02/03 22:54:24 INFO mapred.LocalJobRunner:
16/02/03 22:54:24 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:54:24 INFO mapred.Merger: Merging 2 sorted segments
16/02/03 22:54:24 INFO mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 141254097 bytes
16/02/03 22:54:25 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000015_0 is done. And is in the process of commiting
16/02/03 22:54:25 INFO mapred.LocalJobRunner:
16/02/03 22:54:25 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000015_0' done.
16/02/03 22:54:25 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000015_0
16/02/03 22:54:25 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000016_0
16/02/03 22:54:25 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@b6c1c07
16/02/03 22:54:25 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:33554432+33554432
16/02/03 22:54:25 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:54:25 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:54:25 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:54:27 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:54:27 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000016_0 is done. And is in the process of commiting
16/02/03 22:54:27 INFO mapred.LocalJobRunner:
16/02/03 22:54:27 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000016_0' done.
16/02/03 22:54:27 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000016_0
16/02/03 22:54:27 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000017_0
16/02/03 22:54:27 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@52d7f51e
16/02/03 22:54:27 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:67108864+33554432
16/02/03 22:54:27 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:54:27 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:54:27 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:54:29 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:54:29 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000017_0 is done. And is in the process of commiting
16/02/03 22:54:29 INFO mapred.LocalJobRunner:
16/02/03 22:54:29 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000017_0' done.
16/02/03 22:54:29 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000017_0
16/02/03 22:54:29 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000018_0
16/02/03 22:54:29 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@4c46261c
16/02/03 22:54:29 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:100663296+33554432
16/02/03 22:54:29 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:54:29 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:54:29 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:54:29 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:54:29 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000018_0 is done. And is in the process of commiting
16/02/03 22:54:29 INFO mapred.LocalJobRunner:
16/02/03 22:54:29 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000018_0' done.
16/02/03 22:54:29 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000018_0
16/02/03 22:54:29 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000019_0
16/02/03 22:54:29 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@1fbf2845
16/02/03 22:54:29 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:134217728+33554432
16/02/03 22:54:29 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:54:29 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:54:29 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:54:31 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:54:31 INFO mapred.MapTask: Finished spill 0
16/02/03 22:54:31 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000019_0 is done. And is in the process of commiting
16/02/03 22:54:31 INFO mapred.LocalJobRunner:
16/02/03 22:54:31 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000019_0' done.
16/02/03 22:54:31 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000019_0
16/02/03 22:54:31 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000020_0
16/02/03 22:54:31 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@647bf4bc
16/02/03 22:54:31 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:167772160+33554432
16/02/03 22:54:31 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:54:31 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:54:31 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:54:32 INFO mapred.JobClient:  map 15% reduce 0%
16/02/03 22:54:37 INFO mapred.LocalJobRunner:
16/02/03 22:54:38 INFO mapred.JobClient:  map 17% reduce 0%
16/02/03 22:54:38 INFO mapred.MapTask: Spilling map output: buffer full= true
16/02/03 22:54:38 INFO mapred.MapTask: bufstart = 0; bufend = 732445; bufvoid = 99614720
16/02/03 22:54:38 INFO mapred.MapTask: kvstart = 0; kvend = 2; length = 327680
16/02/03 22:54:38 INFO mapred.MapTask: Finished spill 0
16/02/03 22:54:38 INFO mapred.MapTask: Record too large for in-memory buffer: 99614722 bytes
16/02/03 22:54:40 INFO mapred.LocalJobRunner:
16/02/03 22:54:43 INFO mapred.LocalJobRunner:
16/02/03 22:54:44 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:54:44 INFO mapred.Merger: Merging 2 sorted segments
16/02/03 22:54:44 INFO mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 197457402 bytes
16/02/03 22:54:45 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000020_0 is done. And is in the process of commiting
16/02/03 22:54:45 INFO mapred.LocalJobRunner:
16/02/03 22:54:45 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000020_0' done.
16/02/03 22:54:45 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000020_0
16/02/03 22:54:45 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000021_0
16/02/03 22:54:45 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@56ae1008
16/02/03 22:54:45 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:201326592+33554432
16/02/03 22:54:45 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:54:45 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:54:45 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:54:48 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:54:48 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000021_0 is done. And is in the process of commiting
16/02/03 22:54:48 INFO mapred.LocalJobRunner:
16/02/03 22:54:48 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000021_0' done.
16/02/03 22:54:48 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000021_0
16/02/03 22:54:48 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000022_0
16/02/03 22:54:48 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@46f94b51
16/02/03 22:54:48 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:234881024+33554432
16/02/03 22:54:48 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:54:48 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:54:48 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:54:51 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:54:51 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000022_0 is done. And is in the process of commiting
16/02/03 22:54:51 INFO mapred.LocalJobRunner:
16/02/03 22:54:51 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000022_0' done.
16/02/03 22:54:51 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000022_0
16/02/03 22:54:51 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000023_0
16/02/03 22:54:51 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@3c2beb47
16/02/03 22:54:51 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:268435456+33554432
16/02/03 22:54:51 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:54:51 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:54:51 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:54:53 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:54:53 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000023_0 is done. And is in the process of commiting
16/02/03 22:54:53 INFO mapred.LocalJobRunner:
16/02/03 22:54:53 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000023_0' done.
16/02/03 22:54:53 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000023_0
16/02/03 22:54:53 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000024_0
16/02/03 22:54:53 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@26f57041
16/02/03 22:54:53 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:301989888+33554432
16/02/03 22:54:53 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:54:53 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:54:53 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:54:55 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:54:55 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000024_0 is done. And is in the process of commiting
16/02/03 22:54:55 INFO mapred.LocalJobRunner:
16/02/03 22:54:55 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000024_0' done.
16/02/03 22:54:55 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000024_0
16/02/03 22:54:55 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000025_0
16/02/03 22:54:55 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@1e81432
16/02/03 22:54:55 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:335544320+33554432
16/02/03 22:54:55 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:54:55 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:54:55 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:54:56 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:54:56 INFO mapred.MapTask: Finished spill 0
16/02/03 22:54:57 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000025_0 is done. And is in the process of commiting
16/02/03 22:54:57 INFO mapred.LocalJobRunner:
16/02/03 22:54:57 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000025_0' done.
16/02/03 22:54:57 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000025_0
16/02/03 22:54:57 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000026_0
16/02/03 22:54:57 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@26c0e0f3
16/02/03 22:54:57 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:369098752+33554432
16/02/03 22:54:57 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:54:57 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:54:57 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:54:57 INFO mapred.JobClient:  map 18% reduce 0%
16/02/03 22:55:03 INFO mapred.LocalJobRunner:
16/02/03 22:55:43 INFO mapred.MapTask: Spilling map output: buffer full= true
16/02/03 22:55:43 INFO mapred.MapTask: bufstart = 0; bufend = 12065; bufvoid = 99614720
16/02/03 22:55:43 INFO mapred.MapTask: kvstart = 0; kvend = 1; length = 327680
16/02/03 22:55:43 INFO mapred.MapTask: Finished spill 0
16/02/03 22:55:43 INFO mapred.MapTask: Record too large for in-memory buffer: 99614751 bytes
16/02/03 22:55:46 INFO mapred.LocalJobRunner:
16/02/03 22:55:47 INFO mapred.JobClient:  map 20% reduce 0%
16/02/03 22:55:49 INFO mapred.LocalJobRunner:
16/02/03 22:55:56 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:55:56 INFO mapred.Merger: Merging 2 sorted segments
16/02/03 22:55:58 INFO mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 703162844 bytes
16/02/03 22:55:59 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000026_0 is done. And is in the process of commiting
16/02/03 22:55:59 INFO mapred.LocalJobRunner:
16/02/03 22:55:59 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000026_0' done.
16/02/03 22:55:59 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000026_0
16/02/03 22:55:59 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000027_0
16/02/03 22:55:59 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@2c111e4d
16/02/03 22:55:59 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:402653184+33554432
16/02/03 22:55:59 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:55:59 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:55:59 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:56:14 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:56:14 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000027_0 is done. And is in the process of commiting
16/02/03 22:56:14 INFO mapred.LocalJobRunner:
16/02/03 22:56:14 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000027_0' done.
16/02/03 22:56:14 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000027_0
16/02/03 22:56:14 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000028_0
16/02/03 22:56:14 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@69d81fd2
16/02/03 22:56:14 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:436207616+33554432
16/02/03 22:56:14 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:56:14 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:56:14 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:56:28 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:56:28 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000028_0 is done. And is in the process of commiting
16/02/03 22:56:28 INFO mapred.LocalJobRunner:
16/02/03 22:56:28 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000028_0' done.
16/02/03 22:56:28 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000028_0
16/02/03 22:56:28 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000029_0
16/02/03 22:56:28 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@68428ea9
16/02/03 22:56:28 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:469762048+33554432
16/02/03 22:56:28 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:56:28 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:56:28 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:56:41 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:56:41 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000029_0 is done. And is in the process of commiting
16/02/03 22:56:41 INFO mapred.LocalJobRunner:
16/02/03 22:56:41 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000029_0' done.
16/02/03 22:56:41 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000029_0
16/02/03 22:56:41 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000030_0
16/02/03 22:56:41 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@7301c3be
16/02/03 22:56:41 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:503316480+33554432
16/02/03 22:56:41 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:56:41 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:56:41 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:56:54 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:56:54 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000030_0 is done. And is in the process of commiting
16/02/03 22:56:54 INFO mapred.LocalJobRunner:
16/02/03 22:56:54 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000030_0' done.
16/02/03 22:56:54 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000030_0
16/02/03 22:56:54 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000031_0
16/02/03 22:56:54 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@66aade2a
16/02/03 22:56:54 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:536870912+33554432
16/02/03 22:56:54 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:56:54 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:56:54 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:57:06 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:57:06 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000031_0 is done. And is in the process of commiting
16/02/03 22:57:06 INFO mapred.LocalJobRunner:
16/02/03 22:57:06 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000031_0' done.
16/02/03 22:57:06 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000031_0
16/02/03 22:57:06 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000032_0
16/02/03 22:57:06 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@7c9c6aa5
16/02/03 22:57:06 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:570425344+33554432
16/02/03 22:57:06 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:57:06 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:57:06 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:57:17 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:57:17 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000032_0 is done. And is in the process of commiting
16/02/03 22:57:17 INFO mapred.LocalJobRunner:
16/02/03 22:57:17 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000032_0' done.
16/02/03 22:57:17 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000032_0
16/02/03 22:57:17 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000033_0
16/02/03 22:57:17 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@4b22adee
16/02/03 22:57:17 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:603979776+33554432
16/02/03 22:57:17 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:57:17 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:57:17 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:57:27 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:57:27 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000033_0 is done. And is in the process of commiting
16/02/03 22:57:27 INFO mapred.LocalJobRunner:
16/02/03 22:57:27 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000033_0' done.
16/02/03 22:57:27 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000033_0
16/02/03 22:57:27 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000034_0
16/02/03 22:57:27 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@35efc0a
16/02/03 22:57:27 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:637534208+33554432
16/02/03 22:57:27 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:57:27 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:57:27 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:57:37 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:57:37 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000034_0 is done. And is in the process of commiting
16/02/03 22:57:37 INFO mapred.LocalJobRunner:
16/02/03 22:57:37 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000034_0' done.
16/02/03 22:57:37 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000034_0
16/02/03 22:57:37 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000035_0
16/02/03 22:57:37 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@2564575
16/02/03 22:57:37 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:671088640+33554432
16/02/03 22:57:37 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:57:37 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:57:37 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:57:46 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:57:46 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000035_0 is done. And is in the process of commiting
16/02/03 22:57:46 INFO mapred.LocalJobRunner:
16/02/03 22:57:46 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000035_0' done.
16/02/03 22:57:46 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000035_0
16/02/03 22:57:46 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000036_0
16/02/03 22:57:46 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@4b492c47
16/02/03 22:57:46 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:704643072+33554432
16/02/03 22:57:46 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:57:46 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:57:46 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:57:54 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:57:54 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000036_0 is done. And is in the process of commiting
16/02/03 22:57:54 INFO mapred.LocalJobRunner:
16/02/03 22:57:54 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000036_0' done.
16/02/03 22:57:54 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000036_0
16/02/03 22:57:54 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000037_0
16/02/03 22:57:54 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@6dc74dfb
16/02/03 22:57:54 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:738197504+33554432
16/02/03 22:57:54 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:57:54 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:57:54 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:58:02 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:58:02 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000037_0 is done. And is in the process of commiting
16/02/03 22:58:02 INFO mapred.LocalJobRunner:
16/02/03 22:58:02 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000037_0' done.
16/02/03 22:58:02 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000037_0
16/02/03 22:58:02 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000038_0
16/02/03 22:58:02 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@4b8a8992
16/02/03 22:58:02 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:771751936+33554432
16/02/03 22:58:02 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:58:02 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:58:02 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:58:09 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:58:09 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000038_0 is done. And is in the process of commiting
16/02/03 22:58:09 INFO mapred.LocalJobRunner:
16/02/03 22:58:09 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000038_0' done.
16/02/03 22:58:09 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000038_0
16/02/03 22:58:09 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000039_0
16/02/03 22:58:09 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@212ce10a
16/02/03 22:58:09 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:805306368+33554432
16/02/03 22:58:09 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:58:09 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:58:09 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:58:15 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:58:15 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000039_0 is done. And is in the process of commiting
16/02/03 22:58:15 INFO mapred.LocalJobRunner:
16/02/03 22:58:15 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000039_0' done.
16/02/03 22:58:15 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000039_0
16/02/03 22:58:15 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000040_0
16/02/03 22:58:15 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@3ee82600
16/02/03 22:58:15 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:838860800+33554432
16/02/03 22:58:15 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:58:15 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:58:15 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:58:20 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:58:20 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000040_0 is done. And is in the process of commiting
16/02/03 22:58:20 INFO mapred.LocalJobRunner:
16/02/03 22:58:20 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000040_0' done.
16/02/03 22:58:20 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000040_0
16/02/03 22:58:20 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000041_0
16/02/03 22:58:20 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@3c603dc5
16/02/03 22:58:20 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:872415232+33554432
16/02/03 22:58:20 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:58:20 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:58:20 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:58:25 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:58:25 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000041_0 is done. And is in the process of commiting
16/02/03 22:58:25 INFO mapred.LocalJobRunner:
16/02/03 22:58:25 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000041_0' done.
16/02/03 22:58:25 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000041_0
16/02/03 22:58:25 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000042_0
16/02/03 22:58:25 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@61dee8db
16/02/03 22:58:25 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:905969664+33554432
16/02/03 22:58:25 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:58:25 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:58:25 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:58:29 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:58:29 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000042_0 is done. And is in the process of commiting
16/02/03 22:58:29 INFO mapred.LocalJobRunner:
16/02/03 22:58:29 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000042_0' done.
16/02/03 22:58:29 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000042_0
16/02/03 22:58:29 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000043_0
16/02/03 22:58:29 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@11b2a12e
16/02/03 22:58:29 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:939524096+33554432
16/02/03 22:58:29 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:58:29 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:58:29 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:58:32 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:58:32 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000043_0 is done. And is in the process of commiting
16/02/03 22:58:32 INFO mapred.LocalJobRunner:
16/02/03 22:58:32 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000043_0' done.
16/02/03 22:58:32 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000043_0
16/02/03 22:58:32 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000044_0
16/02/03 22:58:32 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@51038e5
16/02/03 22:58:32 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:973078528+33554432
16/02/03 22:58:32 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:58:32 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:58:32 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:58:35 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:58:35 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000044_0 is done. And is in the process of commiting
16/02/03 22:58:35 INFO mapred.LocalJobRunner:
16/02/03 22:58:35 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000044_0' done.
16/02/03 22:58:35 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000044_0
16/02/03 22:58:35 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000045_0
16/02/03 22:58:35 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@6e6175da
16/02/03 22:58:35 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:1006632960+33554432
16/02/03 22:58:35 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:58:35 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:58:35 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:58:37 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:58:37 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000045_0 is done. And is in the process of commiting
16/02/03 22:58:37 INFO mapred.LocalJobRunner:
16/02/03 22:58:37 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000045_0' done.
16/02/03 22:58:37 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000045_0
16/02/03 22:58:37 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000046_0
16/02/03 22:58:37 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@75fbe6b7
16/02/03 22:58:37 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:1040187392+33554432
16/02/03 22:58:37 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:58:37 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:58:37 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:58:38 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:58:38 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000046_0 is done. And is in the process of commiting
16/02/03 22:58:38 INFO mapred.LocalJobRunner:
16/02/03 22:58:38 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000046_0' done.
16/02/03 22:58:38 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000046_0
16/02/03 22:58:38 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000047_0
16/02/03 22:58:38 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@16d6b576
16/02/03 22:58:38 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00001:0+33554432
16/02/03 22:58:38 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:58:38 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:58:38 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:58:41 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:58:41 INFO mapred.MapTask: Finished spill 0
16/02/03 22:58:41 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000047_0 is done. And is in the process of commiting
16/02/03 22:58:41 INFO mapred.LocalJobRunner:
16/02/03 22:58:41 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000047_0' done.
16/02/03 22:58:41 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000047_0
16/02/03 22:58:41 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000048_0
16/02/03 22:58:41 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@584ced04
16/02/03 22:58:41 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00001:33554432+33554432
16/02/03 22:58:41 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:58:41 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:58:41 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:58:41 INFO mapred.JobClient:  map 21% reduce 0%
16/02/03 22:58:42 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:58:42 INFO mapred.MapTask: Finished spill 0
16/02/03 22:58:42 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000048_0 is done. And is in the process of commiting
16/02/03 22:58:42 INFO mapred.LocalJobRunner:
16/02/03 22:58:42 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000048_0' done.
16/02/03 22:58:42 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000048_0
16/02/03 22:58:42 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000049_0
16/02/03 22:58:42 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@692dfdbb
16/02/03 22:58:42 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00001:67108864+33554432
16/02/03 22:58:42 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:58:42 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:58:42 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:58:42 INFO mapred.JobClient:  map 22% reduce 0%
16/02/03 22:58:43 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:58:43 INFO mapred.MapTask: Finished spill 0
16/02/03 22:58:43 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000049_0 is done. And is in the process of commiting
16/02/03 22:58:43 INFO mapred.LocalJobRunner:
16/02/03 22:58:43 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000049_0' done.
16/02/03 22:58:43 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000049_0
16/02/03 22:58:43 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000050_0
16/02/03 22:58:43 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@b3b6d82
16/02/03 22:58:43 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00001:100663296+33554432
16/02/03 22:58:43 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:58:43 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:58:43 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:58:44 INFO mapred.JobClient:  map 24% reduce 0%
16/02/03 22:58:47 INFO mapred.MapTask: Spilling map output: buffer full= true
16/02/03 22:58:47 INFO mapred.MapTask: bufstart = 0; bufend = 10164150; bufvoid = 99614720
16/02/03 22:58:47 INFO mapred.MapTask: kvstart = 0; kvend = 9; length = 327680
16/02/03 22:58:47 INFO mapred.MapTask: Finished spill 0
16/02/03 22:58:47 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:58:48 INFO mapred.MapTask: Finished spill 1
16/02/03 22:58:48 INFO mapred.Merger: Merging 2 sorted segments
16/02/03 22:58:48 INFO mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 94226336 bytes
16/02/03 22:58:48 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000050_0 is done. And is in the process of commiting
16/02/03 22:58:48 INFO mapred.LocalJobRunner:
16/02/03 22:58:48 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000050_0' done.
16/02/03 22:58:48 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000050_0
16/02/03 22:58:48 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000051_0
16/02/03 22:58:48 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@394220b0
16/02/03 22:58:48 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00001:134217728+33554432
16/02/03 22:58:48 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:58:48 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:58:48 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:58:48 INFO mapred.JobClient:  map 25% reduce 0%
16/02/03 22:58:49 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:58:49 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000051_0 is done. And is in the process of commiting
16/02/03 22:58:49 INFO mapred.LocalJobRunner:
16/02/03 22:58:49 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000051_0' done.
16/02/03 22:58:49 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000051_0
16/02/03 22:58:49 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000052_0
16/02/03 22:58:49 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@2711815d
16/02/03 22:58:49 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00001:167772160+33554432
16/02/03 22:58:49 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:58:49 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:58:49 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:58:51 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:58:51 INFO mapred.MapTask: Finished spill 0
16/02/03 22:58:51 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000052_0 is done. And is in the process of commiting
16/02/03 22:58:51 INFO mapred.LocalJobRunner:
16/02/03 22:58:51 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000052_0' done.
16/02/03 22:58:51 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000052_0
16/02/03 22:58:51 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000053_0
16/02/03 22:58:51 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@25902a6a
16/02/03 22:58:51 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00002:0+33554432
16/02/03 22:58:51 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:58:51 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:58:51 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:58:51 INFO mapred.JobClient:  map 27% reduce 0%
16/02/03 22:58:54 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:58:54 INFO mapred.MapTask: Finished spill 0
16/02/03 22:58:54 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000053_0 is done. And is in the process of commiting
16/02/03 22:58:54 INFO mapred.LocalJobRunner:
16/02/03 22:58:54 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000053_0' done.
16/02/03 22:58:54 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000053_0
16/02/03 22:58:54 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000054_0
16/02/03 22:58:54 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@2ce038b6
16/02/03 22:58:54 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00002:33554432+33554432
16/02/03 22:58:54 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:58:54 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:58:54 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:58:54 INFO mapred.JobClient:  map 28% reduce 0%
16/02/03 22:58:57 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:58:57 INFO mapred.MapTask: Finished spill 0
16/02/03 22:58:57 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000054_0 is done. And is in the process of commiting
16/02/03 22:58:57 INFO mapred.LocalJobRunner:
16/02/03 22:58:57 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000054_0' done.
16/02/03 22:58:57 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000054_0
16/02/03 22:58:57 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000055_0
16/02/03 22:58:57 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@6f34a809
16/02/03 22:58:57 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00002:67108864+33554432
16/02/03 22:58:57 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:58:58 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:58:58 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:58:58 INFO mapred.JobClient:  map 30% reduce 0%
16/02/03 22:58:59 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:58:59 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000055_0 is done. And is in the process of commiting
16/02/03 22:58:59 INFO mapred.LocalJobRunner:
16/02/03 22:58:59 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000055_0' done.
16/02/03 22:58:59 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000055_0
16/02/03 22:58:59 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000056_0
16/02/03 22:58:59 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@51b267b7
16/02/03 22:58:59 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00002:100663296+33554432
16/02/03 22:58:59 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:58:59 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:58:59 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:59:04 INFO mapred.MapTask: Record too large for in-memory buffer: 99614722 bytes
16/02/03 22:59:05 INFO mapred.LocalJobRunner:
16/02/03 22:59:06 INFO mapred.JobClient:  map 31% reduce 0%
16/02/03 22:59:06 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:59:06 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000056_0 is done. And is in the process of commiting
16/02/03 22:59:06 INFO mapred.LocalJobRunner:
16/02/03 22:59:06 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000056_0' done.
16/02/03 22:59:06 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000056_0
16/02/03 22:59:06 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000057_0
16/02/03 22:59:06 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@6d224703
16/02/03 22:59:06 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00002:134217728+33554432
16/02/03 22:59:06 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:59:06 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:59:06 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:59:08 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:59:08 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000057_0 is done. And is in the process of commiting
16/02/03 22:59:08 INFO mapred.LocalJobRunner:
16/02/03 22:59:08 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000057_0' done.
16/02/03 22:59:08 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000057_0
16/02/03 22:59:08 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000058_0
16/02/03 22:59:08 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@7da054f5
16/02/03 22:59:08 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00002:167772160+33554432
16/02/03 22:59:08 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:59:08 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:59:08 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:59:10 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:59:10 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000058_0 is done. And is in the process of commiting
16/02/03 22:59:10 INFO mapred.LocalJobRunner:
16/02/03 22:59:10 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000058_0' done.
16/02/03 22:59:10 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000058_0
16/02/03 22:59:10 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000059_0
16/02/03 22:59:10 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 18749cf8
16/02/03 22:59:10 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00002:201326592+33554432
16/02/03 22:59:10 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:59:10 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:59:10 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:59:12 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:59:12 INFO mapred.MapTask: Finished spill 0
16/02/03 22:59:12 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000059_0 is done. And is in the process of commiting
16/02/03 22:59:12 INFO mapred.LocalJobRunner:
16/02/03 22:59:12 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000059_0' done.
16/02/03 22:59:12 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000059_0
16/02/03 22:59:12 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000060_0
16/02/03 22:59:12 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 529720c9
16/02/03 22:59:12 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00002:234881024+33554432
16/02/03 22:59:12 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:59:12 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:59:12 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:59:13 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:59:13 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000060_0 is done. And is in the process of commiting
16/02/03 22:59:13 INFO mapred.LocalJobRunner:
16/02/03 22:59:13 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000060_0' done.
16/02/03 22:59:13 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000060_0
16/02/03 22:59:13 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000061_0
16/02/03 22:59:13 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 53f7eb48
16/02/03 22:59:13 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00002:268435456+33554432
16/02/03 22:59:13 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:59:13 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:59:13 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:59:13 INFO mapred.JobClient:  map 32% reduce 0%
16/02/03 22:59:17 INFO mapred.MapTask: Spilling map output: buffer full= true
16/02/03 22:59:17 INFO mapred.MapTask: bufstart = 0; bufend = 9027753; bufvoid = 99614720
16/02/03 22:59:17 INFO mapred.MapTask: kvstart = 0; kvend = 4; length = 327680
16/02/03 22:59:17 INFO mapred.MapTask: Finished spill 0
16/02/03 22:59:17 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:59:17 INFO mapred.MapTask: Finished spill 1
16/02/03 22:59:17 INFO mapred.Merger: Merging 2 sorted segments
16/02/03 22:59:17 INFO mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 87965296 bytes
16/02/03 22:59:17 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000061_0 is done. And is in the process of commiting
16/02/03 22:59:17 INFO mapred.LocalJobRunner:
16/02/03 22:59:17 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000061_0' done.
16/02/03 22:59:17 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000061_0
16/02/03 22:59:17 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000062_0
16/02/03 22:59:17 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> ff35374
16/02/03 22:59:17 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00002:301989888+33554432
16/02/03 22:59:17 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:59:17 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:59:17 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:59:18 INFO mapred.JobClient:  map 34% reduce 0%
16/02/03 22:59:19 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:59:19 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000062_0 is done. And is in the process of commiting
16/02/03 22:59:19 INFO mapred.LocalJobRunner:
16/02/03 22:59:19 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000062_0' done.
16/02/03 22:59:19 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000062_0
16/02/03 22:59:19 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000063_0
16/02/03 22:59:19 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> af3e7bc
16/02/03 22:59:19 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00002:335544320+33554432
16/02/03 22:59:19 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:59:19 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:59:19 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:59:20 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:59:21 INFO mapred.MapTask: Finished spill 0
16/02/03 22:59:21 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000063_0 is done. And is in the process of commiting
16/02/03 22:59:21 INFO mapred.LocalJobRunner:
16/02/03 22:59:21 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000063_0' done.
16/02/03 22:59:21 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000063_0
16/02/03 22:59:21 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000064_0
16/02/03 22:59:21 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 6bc10171
16/02/03 22:59:21 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00002:369098752+33554432
16/02/03 22:59:21 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:59:21 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:59:21 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:59:22 INFO mapred.JobClient:  map 35% reduce 0%
16/02/03 22:59:26 INFO mapred.MapTask: Spilling map output: buffer full= true
16/02/03 22:59:26 INFO mapred.MapTask: bufstart = 0; bufend = 16485499; bufvoid = 99614720
16/02/03 22:59:26 INFO mapred.MapTask: kvstart = 0; kvend = 12; length = 327680
16/02/03 22:59:26 INFO mapred.MapTask: Finished spill 0
16/02/03 22:59:27 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:59:27 INFO mapred.MapTask: Finished spill 1
16/02/03 22:59:27 INFO mapred.Merger: Merging 2 sorted segments
16/02/03 22:59:27 INFO mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 30808565 bytes
16/02/03 22:59:27 INFO mapred.LocalJobRunner:
16/02/03 22:59:28 INFO mapred.JobClient:  map 37% reduce 0%
16/02/03 22:59:30 INFO mapred.LocalJobRunner:
16/02/03 22:59:31 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000064_0 is done. And is in the process of commiting
16/02/03 22:59:31 INFO mapred.LocalJobRunner:
16/02/03 22:59:31 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000064_0' done.
16/02/03 22:59:31 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000064_0
16/02/03 22:59:31 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000065_0
16/02/03 22:59:31 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 62d6a33f
16/02/03 22:59:31 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00002:402653184+33554432
16/02/03 22:59:31 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:59:31 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:59:31 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:59:34 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:59:34 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000065_0 is done. And is in the process of commiting
16/02/03 22:59:34 INFO mapred.LocalJobRunner:
16/02/03 22:59:34 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000065_0' done.
16/02/03 22:59:34 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000065_0
16/02/03 22:59:34 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000066_0
16/02/03 22:59:34 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 112740e8
16/02/03 22:59:34 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00002:436207616+33554432
16/02/03 22:59:34 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:59:34 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:59:34 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:59:35 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:59:35 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000066_0 is done. And is in the process of commiting
16/02/03 22:59:35 INFO mapred.LocalJobRunner:
16/02/03 22:59:35 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000066_0' done.
16/02/03 22:59:35 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000066_0
16/02/03 22:59:35 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000067_0
16/02/03 22:59:35 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 71679f5c
16/02/03 22:59:35 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00000:1073741824+24468851
16/02/03 22:59:35 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:59:35 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:59:35 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:59:36 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:59:36 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000067_0 is done. And is in the process of commiting
16/02/03 22:59:36 INFO mapred.LocalJobRunner:
16/02/03 22:59:36 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000067_0' done.
16/02/03 22:59:36 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000067_0
16/02/03 22:59:36 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000068_0
16/02/03 22:59:36 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 6bb2e497
16/02/03 22:59:36 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00001:201326592+21730977
16/02/03 22:59:36 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:59:36 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:59:36 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:59:36 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:59:38 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000068_0 is done. And is in the process of commiting
16/02/03 22:59:38 INFO mapred.LocalJobRunner:
16/02/03 22:59:38 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000068_0' done.
16/02/03 22:59:38 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000068_0
16/02/03 22:59:38 INFO mapred.LocalJobRunner: Starting task: attempt_local1308764206_0003_m_000069_0
16/02/03 22:59:38 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 23b804dd
16/02/03 22:59:38 INFO mapred.MapTask: Processing split: file:/home/ubuntu/AT/AT-vectors/tokenized-documents/part-m-00003:469762048+16405281
16/02/03 22:59:38 INFO mapred.MapTask: io.sort.mb = 100
16/02/03 22:59:38 INFO mapred.MapTask: data buffer = 79691776/99614720
16/02/03 22:59:38 INFO mapred.MapTask: record buffer = 262144/327680
16/02/03 22:59:39 INFO mapred.MapTask: Starting flush of map output
16/02/03 22:59:39 INFO mapred.MapTask: Finished spill 0
16/02/03 22:59:39 INFO mapred.Task: Task:attempt_local1308764206_0003_m_000069_0 is done. And is in the process of commiting
16/02/03 22:59:39 INFO mapred.LocalJobRunner:
16/02/03 22:59:39 INFO mapred.Task: Task 'attempt_local1308764206_0003_m_000069_0' done.
16/02/03 22:59:39 INFO mapred.LocalJobRunner: Finishing task: attempt_local1308764206_0003_m_000069_0
16/02/03 22:59:39 INFO mapred.LocalJobRunner: Map task executor complete.
16/02/03 22:59:39 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin <at> 1548f4ac
16/02/03 22:59:39 INFO mapred.LocalJobRunner:
16/02/03 22:59:39 INFO mapred.Merger: Merging 70 sorted segments
16/02/03 22:59:40 INFO mapred.JobClient:  map 38% reduce 0%
16/02/03 22:59:44 INFO mapred.Merger: Merging 7 intermediate segments out of a total of 27
16/02/03 22:59:45 INFO mapred.Merger: Merging 10 intermediate segments out of a total of 21
16/02/03 22:59:48 INFO mapred.Merger: Merging 10 intermediate segments out of a total of 12
16/02/03 22:59:48 INFO mapred.LocalJobRunner: reduce > sort
16/02/03 22:59:49 INFO mapred.JobClient:  map 38% reduce 33%
16/02/03 22:59:51 INFO mapred.LocalJobRunner: reduce > sort
16/02/03 22:59:51 INFO mapred.Merger: Down to the last merge-pass, with 3 segments left of total size: 2214020137 bytes
16/02/03 22:59:51 INFO mapred.LocalJobRunner: reduce > sort
16/02/03 22:59:51 INFO common.HadoopUtil: trying find a file in distributed cache containing [dictionary.file-] in its name
16/02/03 22:59:51 INFO common.HadoopUtil: found file [/home/ubuntu/AT/AT-vectors/dictionary.file-0] containing [dictionary.file-]
16/02/03 22:59:57 INFO mapred.LocalJobRunner: reduce > reduce
16/02/03 22:59:58 INFO mapred.JobClient:  map 38% reduce 70%
16/02/03 23:00:00 INFO mapred.LocalJobRunner: reduce > reduce
16/02/03 23:00:01 INFO mapred.JobClient:  map 38% reduce 72%
16/02/03 23:00:03 INFO mapred.LocalJobRunner: reduce > reduce
16/02/03 23:00:04 INFO mapred.JobClient:  map 38% reduce 74%
16/02/03 23:00:06 INFO mapred.LocalJobRunner: reduce > reduce
16/02/03 23:00:09 INFO mapred.LocalJobRunner: reduce > reduce
16/02/03 23:00:10 INFO mapred.JobClient:  map 38% reduce 75%
16/02/03 23:00:12 INFO mapred.LocalJobRunner: reduce > reduce
16/02/03 23:00:13 INFO mapred.JobClient:  map 38% reduce 77%
16/02/03 23:00:15 INFO mapred.LocalJobRunner: reduce > reduce
16/02/03 23:00:18 INFO mapred.LocalJobRunner: reduce > reduce
16/02/03 23:00:19 INFO mapred.JobClient:  map 38% reduce 79%
16/02/03 23:00:21 INFO mapred.LocalJobRunner: reduce > reduce
16/02/03 23:00:24 INFO mapred.LocalJobRunner: reduce > reduce
16/02/03 23:00:27 INFO mapred.LocalJobRunner: reduce > reduce
16/02/03 23:00:28 INFO mapred.JobClient:  map 38% reduce 82%
16/02/03 23:00:30 INFO mapred.LocalJobRunner: reduce > reduce
16/02/03 23:00:36 INFO mapred.LocalJobRunner: reduce > reduce
16/02/03 23:00:37 INFO mapred.JobClient:  map 38% reduce 83%
16/02/03 23:00:39 INFO mapred.LocalJobRunner: reduce > reduce
16/02/03 23:00:40 INFO mapred.JobClient:  map 38% reduce 96%
16/02/03 23:00:42 INFO mapred.LocalJobRunner: reduce > reduce
16/02/03 23:01:25 INFO mapred.LocalJobRunner: reduce > reduce
16/02/03 23:01:55 INFO mapred.LocalJobRunner: reduce > reduce
16/02/03 23:01:56 INFO mapred.JobClient:  map 38% reduce 97%
16/02/03 23:01:58 INFO mapred.LocalJobRunner: reduce > reduce
16/02/03 23:02:01 INFO mapred.LocalJobRunner: reduce > reduce
16/02/03 23:02:01 INFO mapred.JobClient:  map 38% reduce 100%
16/02/03 23:02:06 INFO mapred.LocalJobRunner: reduce > reduce
16/02/03 23:02:17 INFO mapred.LocalJobRunner: reduce > reduce
16/02/03 23:02:18 WARN mapred.LocalJobRunner: job_local1308764206_0003
java.io.EOFException
        at java.io.DataInputStream.readByte(DataInputStream.java:267)
        at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:299)
        at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:320)
        at org.apache.hadoop.io.Text.readFields(Text.java:263)
        at org.apache.mahout.common.StringTuple.readFields(StringTuple.java:142)
        at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
        at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
        at org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:117)
        at org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:649)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:418)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:398)
16/02/03 23:02:18 INFO mapred.JobClient: Job complete: job_local1308764206_0003
16/02/03 23:02:18 INFO mapred.JobClient: Counters: 20
16/02/03 23:02:18 INFO mapred.JobClient:   File Output Format Counters
16/02/03 23:02:18 INFO mapred.JobClient:     Bytes Written=14923244
16/02/03 23:02:18 INFO mapred.JobClient:   FileSystemCounters
16/02/03 23:02:18 INFO mapred.JobClient:     FILE_BYTES_READ=1412144036729
16/02/03 23:02:18 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=323876626568
16/02/03 23:02:18 INFO mapred.JobClient:   File Input Format Counters
16/02/03 23:02:18 INFO mapred.JobClient:     Bytes Read=11885543289
16/02/03 23:02:18 INFO mapred.JobClient:   Map-Reduce Framework
16/02/03 23:02:18 INFO mapred.JobClient:     Reduce input groups=223
16/02/03 23:02:18 INFO mapred.JobClient:     Map output materialized bytes=2214020551
16/02/03 23:02:18 INFO mapred.JobClient:     Combine output records=0
16/02/03 23:02:18 INFO mapred.JobClient:     Map input records=223
16/02/03 23:02:18 INFO mapred.JobClient:     Reduce shuffle bytes=0
16/02/03 23:02:18 INFO mapred.JobClient:     Physical memory (bytes) snapshot=0
16/02/03 23:02:18 INFO mapred.JobClient:     Reduce output records=222
16/02/03 23:02:18 INFO mapred.JobClient:     Spilled Records=638
16/02/03 23:02:18 INFO mapred.JobClient:     Map output bytes=2214019100
16/02/03 23:02:18 INFO mapred.JobClient:     CPU time spent (ms)=0
16/02/03 23:02:18 INFO mapred.JobClient:     Total committed heap usage (bytes)=735978192896
16/02/03 23:02:18 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=0
16/02/03 23:02:18 INFO mapred.JobClient:     Combine input records=0
16/02/03 23:02:18 INFO mapred.JobClient:     Map output records=223
16/02/03 23:02:18 INFO mapred.JobClient:     SPLIT_RAW_BYTES=9100
16/02/03 23:02:18 INFO mapred.JobClient:     Reduce input records=222
Exception in thread "main" java.lang.IllegalStateException: Job failed!
        at org.apache.mahout.vectorizer.DictionaryVectorizer.makePartialVectors(DictionaryVectorizer.java:329)
        at org.apache.mahout.vectorizer.DictionaryVectorizer.createTermFrequencyVectors(DictionaryVectorizer.java:199)
        at org.apache.mahout.vectorizer.SparseVectorsFromSequenceFiles.run(SparseVectorsFromSequenceFiles.java:274)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.mahout.vectorizer.SparseVectorsFromSequenceFiles.main(SparseVectorsFromSequenceFiles.java:56)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
ubuntu <at> :~/mahout/trunk/bin$
Mahmood N | 3 Feb 19:35 2016

Code execution path of mahout

Hi,
This is a question about Mahout 0.6, which I know is pretty old. Consider this command (I don't
know whether it is still valid in newer versions):

./bin/mahout testclassifier -m $CLASSIFICATION_MODEL -d $CLASSIFICATION_INPUT --method mapreduce

I want to know which parts of the code are executed by that command, that is, the execution path and the functions involved.

Although the question is about an old version, I would appreciate it if you could shed some light on
this (even for newer versions).

 
Regards,
Mahmood
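
One way to start tracing this, offered as a sketch rather than a definitive answer: the stack traces elsewhere in this archive show bin/mahout entering MahoutDriver.main, which hands off to Hadoop's ProgramDriver, which then invokes the main method of the job class. MahoutDriver appears to resolve the short program name (here testclassifier) through a properties file in the conf directory. The file name, the path and the line format used below are assumptions that may differ between releases, and driver_class_for is a hypothetical helper name.

```shell
# Sketch: find the class that bin/mahout would run for a short program name.
# Assumption: the properties file has lines of the form
#   fully.qualified.ClassName = shortName : description
driver_class_for() {
  # $1 = properties file, $2 = short program name
  awk -F'=' -v name="$2" '
    /^[ \t]*#/ { next }                 # skip comment lines
    NF >= 2 {
      cls = $1; short = $2
      sub(/:.*/, "", short)             # drop the description after the colon
      gsub(/[ \t]/, "", cls)            # trim whitespace around the class name
      gsub(/[ \t]/, "", short)          # trim whitespace around the short name
      if (short == name) { print cls; exit }
    }' "$1"
}

# Usage (the path is an assumption, adjust it to your installation):
# driver_class_for "$MAHOUT_HOME/conf/driver.classes.props" testclassifier
```

The class printed by the lookup is the entry point; reading its main and run methods from there gives the execution path for the chosen options.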

Suet Lam Felix CHUNG | 2 Feb 17:25 2016

Mahout output to plot graph

I use Mahout to run various algorithms, e.g. k-means, and I would like to
use the results to plot a graph. I'm using R as my plotting tool. I use
the seqsdumper and export in graph_ml format. However, the output file
(graph_ml) contains different results from the CSV output, and R cannot
plot it either.

My question: is there any way to plot the Mahout results with R?
On 2016/2/3 at 12:22 AM, "BahaaEddin AlAila" <bahaelaila7 <at> gmail.com> wrote:

> Greetings mahout users,
>
> I have been trying to use mahout samsara as a library with scala/spark, but
> I haven't been successful in doing so.
>
> I am running spark 1.6.0 binaries, didn't build it myself.
> However, I tried both readily available binaries on Apache mirrors, and
> cloning and compiling mahout's repo, but neither worked.
>
> I keep getting
>
> Exception in thread "main" java.lang.NoClassDefFoundError:
> org/apache/mahout/sparkbindings/SparkDistributedContext
>
> The way I am doing things is:
> I have spark in ~/spark-1.6
> and mahout in ~/mahout
> I have set both $SPARK_HOME and $MAHOUT_HOME accordingly, along with
> $MAHOUT_LOCAL=true
>
> and I have:
>
> ~/app1/build.sbt
> ~/app1/src/main/scala/App1.scala
>
> in build.sbt I have these lines to declare mahout dependencies:
>
> libraryDependencies += "org.apache.mahout" %% "mahout-math-scala" %
> "0.11.1"
>
> libraryDependencies += "org.apache.mahout" % "mahout-math" % "0.11.1"
>
> libraryDependencies += "org.apache.mahout" % "mahout-spark_2.10" % "0.11.1"
>
> along with other spark dependencies
>
> and in App1.scala, in the main function, I construct a context object using
> mahoutSparkContext, and of course, the sparkbindings are imported
>
> everything compiles successfully
>
> however, when I submit to spark, I get the above mentioned error.
>
> I have a general idea of why this is happening: because the compiled app1
> jar depends on mahout-spark dependency jar but it cannot find it in the
> class path upon being submitted to spark.
>
> In the instructions I couldn't find how to explicitly add the mahout-spark
> dependency jar to the class path.
>
> The question is: Am I doing the configurations correctly or not?
>
> Sorry for the lengthy email
>
> Kind Regards,
> Bahaa
>
BahaaEddin AlAila | 2 Feb 17:22 2016

Confusion regarding Samsara's configuration

Greetings mahout users,

I have been trying to use mahout samsara as a library with scala/spark, but
I haven't been successful in doing so.

I am running spark 1.6.0 binaries, didn't build it myself.
However, I tried both readily available binaries on Apache mirrors, and
cloning and compiling mahout's repo, but neither worked.

I keep getting

Exception in thread "main" java.lang.NoClassDefFoundError:
org/apache/mahout/sparkbindings/SparkDistributedContext

The way I am doing things is:
I have spark in ~/spark-1.6
and mahout in ~/mahout
I have set both $SPARK_HOME and $MAHOUT_HOME accordingly, along with
$MAHOUT_LOCAL=true

and I have:

~/app1/build.sbt
~/app1/src/main/scala/App1.scala

in build.sbt I have these lines to declare mahout dependencies:

libraryDependencies += "org.apache.mahout" %% "mahout-math-scala" % "0.11.1"

libraryDependencies += "org.apache.mahout" % "mahout-math" % "0.11.1"

libraryDependencies += "org.apache.mahout" % "mahout-spark_2.10" % "0.11.1"

along with other spark dependencies

and in App1.scala, in the main function, I construct a context object using
mahoutSparkContext, and of course, the sparkbindings are imported

everything compiles successfully

however, when I submit to spark, I get the above mentioned error.

I have a general idea of why this is happening: because the compiled app1
jar depends on mahout-spark dependency jar but it cannot find it in the
class path upon being submitted to spark.

In the instructions I couldn't find how to explicitly add the mahout-spark
dependency jar to the class path.

The question is: Am I doing the configurations correctly or not?

Sorry for the lengthy email

Kind Regards,
Bahaa
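
A possible way to put the Mahout jars on the class path at submit time, sketched under assumptions: spark-submit accepts a comma-separated list of jars through its --jars option, which adds them to the driver and executor class paths. The helper below only builds that list. The jar name patterns, the directory that holds the jars, the application jar path and the main class App1 are assumptions, not details confirmed in the message above, and mahout_jars is a hypothetical helper name.

```shell
# Sketch: build a comma-separated list of Mahout jars for spark-submit --jars.
# Assumption: the math, hdfs and spark binding jars sit directly in the given
# directory and follow the mahout-<module>-<version>.jar naming pattern.
mahout_jars() {
  # $1 = directory that holds the Mahout jars
  list=""
  for jar in "$1"/mahout-*.jar; do
    [ -f "$jar" ] || continue            # the glob matched nothing
    case "${jar##*/}" in
      mahout-math-*|mahout-hdfs-*|mahout-spark_*)
        list="${list:+$list,}$jar" ;;    # append with a comma separator
    esac
  done
  printf '%s\n' "$list"
}

# Usage (class name and application jar path are hypothetical):
# spark-submit --class App1 --master "local[*]" \
#   --jars "$(mahout_jars "$MAHOUT_HOME")" \
#   target/scala-2.10/app1_2.10-1.0.jar
```

Building a single assembly jar that bundles the Mahout dependencies is an alternative design; the list approach keeps the application jar small but requires the jars to be present on the submitting machine.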
jgali | 1 Feb 10:24 2016

Exception in task 0.0 in stage 13.0 (TID 13) java.lang.OutOfMemoryError: Java heap space

Hello everybody,

We are experiencing problems when we use the "mahout spark-rowsimilarity" operation. We have an input
matrix with 100k rows and 100 items, and the process throws the exception "Exception in task 0.0 in stage
13.0 (TID 13) java.lang.OutOfMemoryError: Java heap space". We have tried increasing the Java heap memory,
the Mahout heap memory and spark.driver.memory.

Environment versions:
Mahout: 0.11.1
Spark: 1.6.0.

Mahout command line:
    /opt/mahout/bin/mahout spark-rowsimilarity -i 50k_rows__50items.dat -o test_output.tmp \
        --maxObservations 500 --maxSimilaritiesPerRow 100 --omitStrength --master local \
        --sparkExecutorMem 8g

This process is running on a machine with the following specifications:
RAM: 8 GB
CPU: 8 cores
	
.profile file:
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export HADOOP_HOME=/opt/hadoop-2.6.0
export SPARK_HOME=/opt/spark
export MAHOUT_HOME=/opt/mahout
export MAHOUT_HEAPSIZE=8192

Throws exception:
	
16/01/22 11:45:06 ERROR Executor: Exception in task 0.0 in stage 13.0 (TID 13)
java.lang.OutOfMemoryError: Java heap space
        at org.apache.mahout.math.DenseMatrix.<init>(DenseMatrix.java:66)
        at org.apache.mahout.sparkbindings.drm.package$$anonfun$blockify$1.apply(package.scala:70)
        at org.apache.mahout.sparkbindings.drm.package$$anonfun$blockify$1.apply(package.scala:59)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
        at org.apache.spark.scheduler.Task.run(Task.scala:89)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
16/01/22 11:45:06 WARN NettyRpcEndpointRef: Error sending message [message =
Heartbeat(driver,[Lscala.Tuple2; <at> 12498227,BlockManagerId(driver, localhost, 42107))] in 1 attempts
org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [120 seconds]. This timeout is
controlled by spark.rpc.askTimeout
        at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
        at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
        at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
        at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76)
        at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:101)
        at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:77)
        at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$reportHeartBeat(Executor.scala:448)
        at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply$mcV$sp(Executor.scala:468)
        at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:468)
        at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:468)
        at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1741)
        at org.apache.spark.executor.Executor$$anon$1.run(Executor.scala:468)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
16/01/22 11:45:06 WARN NettyRpcEndpointRef: Error sending message [message =
Heartbeat(driver,[Lscala.Tuple2; <at> 12498227,BlockManagerId(driver, localhost, 42107))] in 1 attempts
org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [120 seconds]. This timeout is
controlled by spark.rpc.askTimeout
        at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
        at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
        at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
        at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76)
        at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:101)
        at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:77)
        at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$reportHeartBeat(Executor.scala:448)
        at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply$mcV$sp(Executor.scala:468)
        at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:468)
        at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:468)
        at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1741)
        at org.apache.spark.executor.Executor$$anon$1.run(Executor.scala:468)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [120 seconds]
        at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
        at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
        at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
        at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
        at scala.concurrent.Await$.result(package.scala:107)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
        ...

Can you please advise?

Thanks in advance.
Cheers.
David Starina | 1 Feb 08:59 2016

Mahout - problem importing to Eclipse

Hi,

I have a problem importing the project into Eclipse: I get the error "Could
not update project mahout-mr configuration". I am attaching the error as an
image. Has anyone seen this problem before? I am using Eclipse 4.5.1 (Mars.1)
on Fedora 22. I ran a Maven build successfully, installed the m2eclipse and
m2eclipse-scala plugins in Eclipse, and then imported the Maven project into
Eclipse, which is when I get the error.



Thank you for any help,
Best regards,
David

Andrew Musselman | 27 Jan 19:30 2016

User interview

To the list: if anyone would be open to being interviewed as a user of
Mahout for an article, please let me know. I can share details and
put you in touch with the writer.

Thanks!
Alok Tanna | 14 Jan 20:31 2016

Mahout : 20-newsgroups Classification Example : Split command

Hi,

This request is in reference to the 20-newsgroups Classification Example at
the link below:
https://mahout.apache.org/users/classification/twenty-newsgroups.html

I am able to run the example and get the results described at the link,
but when I try to run the example without the split command, the
results are not the same. Also, when I run other test data against
the same model, the results are not accurate.

Can this example be run without the split command?

Basically, this is what I am trying to do:

I took two separate datasets, one for training and one for testing.

I ran the commands below on both sets:
1. seqdirectory
2. seq2sparse

Now I have vectors generated for both datasets.
- I run the trainnb command on the first dataset's vector output. So instead of
training a model on 80% of the data, I am using the whole dataset.
- I run the testnb command on the second dataset's vector output. This is not the
20% of the data; it is a completely new dataset, used solely for testing.

So instead of using mahout split, I have specified a separate dataset for
testing the model.

The results of this exercise are totally different from what I get when I
use the split command to split the data.
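
One possible explanation, offered as an assumption rather than a confirmed
diagnosis: if each seq2sparse run builds its own dictionary, the same term
can end up with a different index in the two vector sets, so a model trained
on one set would misread vectors from the other. The plain-Python sketch
below only illustrates that effect; build_dictionary and vectorize are
hypothetical helpers, not Mahout functions, and the documents are made up.

```python
def build_dictionary(documents):
    """Assign each distinct term an integer index in order of first appearance."""
    dictionary = {}
    for doc in documents:
        for term in doc.lower().split():
            dictionary.setdefault(term, len(dictionary))
    return dictionary


def vectorize(doc, dictionary):
    """Build a sparse term-count vector keyed by dictionary index.

    Terms missing from the dictionary are dropped.
    """
    vector = {}
    for term in doc.lower().split():
        if term in dictionary:
            idx = dictionary[term]
            vector[idx] = vector.get(idx, 0) + 1
    return vector


# Made-up corpora standing in for the two separately vectorized datasets.
train_docs = ["hockey game tonight", "space shuttle launch"]
test_docs = ["shuttle launch delayed", "hockey season opens"]

train_dict = build_dictionary(train_docs)
test_dict = build_dictionary(test_docs)

# The same text maps to different indices under the two dictionaries.
doc = "hockey shuttle"
print("with training dictionary:", vectorize(doc, train_dict))
print("with its own dictionary: ", vectorize(doc, test_dict))
```

If this is what is happening, the index for "hockey" that the model learned
during training would point at a different term in the test vectors.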

Thanks & Regards,

Alok R. Tanna
Peter K | 3 Jan 16:01 2016

User similarity in Mahout

Hi all,

I'm trying to implement a recommender based on Mahout to recommend jobs
for users. There are two actions: a user either applied for a job or
viewed a job. As weights, I'm using 5 for an apply and 2 for a view.

Now I'm trying to find the best user similarity to capture these relations.
For example:
User1 applied to jobs: J1,J2,J3,J4,J5
User2 applied to jobs: J1,J2,J3,J4,J6
User3 applied to jobs: J1, J7

With Euclidean distance similarity, if I'm not mistaken, users 2 and 3
are equal (when calculating similarity to User1). But I feel User2 is
more similar, and thus J6 should rank higher in the recommendations than J7.
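
A minimal sketch of the comparison above, in plain Python rather than
Mahout's API. It assumes a simplified Euclidean similarity of
1 / (1 + distance), computed only over jobs both users acted on, which may
differ in detail from Mahout's EuclideanDistanceSimilarity. The Tanimoto
coefficient is shown as one overlap-based alternative. The function names
are illustrative.

```python
from math import sqrt

APPLY_WEIGHT = 5.0  # weight for an application, as stated in the message

# Preferences from the example; every action here is an application.
prefs = {
    "User1": {j: APPLY_WEIGHT for j in ("J1", "J2", "J3", "J4", "J5")},
    "User2": {j: APPLY_WEIGHT for j in ("J1", "J2", "J3", "J4", "J6")},
    "User3": {j: APPLY_WEIGHT for j in ("J1", "J7")},
}


def euclidean_similarity(a, b):
    """Return 1 / (1 + distance), with distance taken over shared items only."""
    shared = a.keys() & b.keys()
    if not shared:
        return 0.0
    dist = sqrt(sum((a[j] - b[j]) ** 2 for j in shared))
    return 1.0 / (1.0 + dist)


def tanimoto_similarity(a, b):
    """Return the size of the item overlap divided by the size of the item union."""
    union = a.keys() | b.keys()
    if not union:
        return 0.0
    return len(a.keys() & b.keys()) / len(union)


for other in ("User2", "User3"):
    e = euclidean_similarity(prefs["User1"], prefs[other])
    t = tanimoto_similarity(prefs["User1"], prefs[other])
    print(f"User1 vs {other}: euclidean={e:.3f} tanimoto={t:.3f}")
```

Under these assumptions both Euclidean scores are 1.0, because all shared
jobs carry the same weight, while the Tanimoto score is 4/6 for User2 and
1/6 for User3, since it counts how many jobs the users have in common.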

More generally, I'm looking for suggestions on which algorithms might be
best for this case.

Thank you very much for any suggestions.

P.

