pol | 18 Sep 19:00 2014

How to create a recommender with spark-itemsimilarity

Hi all,
I read the spark-itemsimilarity doc at
http://mahout.apache.org/users/recommender/intro-cooccurrence-spark.html, but I don't
understand how to create a recommender with spark-itemsimilarity. In particular, I don't
understand the "3 Creating a Recommender" chapter.
For input of the form:
  |-- indicator-matrix - TDF part files
  \-- cross-indicator-matrix - TDF part-files

Donni Khan | 17 Sep 10:31 2014

Map RowSimilarity results to the original documents

Hi all,

I ran RowSimilarity on a set of text documents. My documents are ordered as follows:
*DocID                DocText*
    0                    xxxxx
    1                    xxxxxxxx
    2                    xxxxxxxx
  ......                 ......
The DocIDs start at 0 and increase sequentially. I added all the documents to a
sequence file (tokenization) as:
 writer.append(new Text(Integer.toString(DocID)), new Text(DocText));

Then I created TF-IDF vectors from the sequence file and ran RowSimilarity on
them. After dumping, the output looked like this:

key: xx        value:   key1,  key 2 ............
Everything looks good.

My question is: how do I know whether the "key" is the same as the original
"DocID"? I'm not sure they are the same. In more detail, the final output of
RowSimilarity consists of "keys" and "values"; how can I map the keys back to
the original "DocID"s?

Thank you,
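
The mapping asked about above can be sketched as follows. This is only an illustration under assumptions: it supposes the vectors were given integer row keys before RowSimilarity ran and that a row-key-to-DocID index was recorded at that step (Mahout's rowid job is understood to write such a docIndex file, but treat that as an assumption to verify). If so, the dumped keys are row numbers and should be translated through the index rather than assumed equal to DocID. The class RowKeyMapper, its methods, and the in-memory maps standing in for the dumped files are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: translates integer row keys found in similarity
// output back to the original DocID strings via a row-key -> DocID index.
class RowKeyMapper {

    // similarRows: row key -> keys of its most similar rows (as dumped).
    // index: row key -> original DocID (assumed to have been recorded when
    // the vectors were assigned integer row keys).
    static Map<String, List<String>> toDocIds(Map<Integer, List<Integer>> similarRows,
                                              Map<Integer, String> index) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<Integer, List<Integer>> row : similarRows.entrySet()) {
            List<String> neighbours = new ArrayList<>();
            for (Integer key : row.getValue()) {
                neighbours.add(lookup(index, key));
            }
            result.put(lookup(index, row.getKey()), neighbours);
        }
        return result;
    }

    // Fails loudly instead of silently guessing when a key has no DocID.
    private static String lookup(Map<Integer, String> index, Integer key) {
        String docId = index.get(key);
        if (docId == null) {
            throw new IllegalArgumentException("No DocID recorded for row key " + key);
        }
        return docId;
    }

    public static void main(String[] args) {
        // Made-up index in which row keys do NOT coincide with DocIDs.
        Map<Integer, String> index = new LinkedHashMap<>();
        index.put(0, "2");
        index.put(1, "0");
        index.put(2, "1");

        // Made-up similarity output: row 0 is most similar to rows 1 and 2.
        Map<Integer, List<Integer>> similarRows = new LinkedHashMap<>();
        similarRows.put(0, Arrays.asList(1, 2));

        System.out.println(toDocIds(similarRows, index));
    }
}
```

Failing on a missing key is a deliberate choice here: a silent fallback to the raw row number would hide exactly the mismatch the question is worried about.
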
Wei Li | 15 Sep 09:15 2014

Question on RecommenderJob

Hi Mahout Users:

    We are using RecommenderJob to perform item-based recommendations with
the following settings:

Other parameters are set to their default values.

However, the number of recommendations for some users is less than 20, only
1 or 2. Since we have no time to dive into the source code right now, we do
not know whether we are using the right parameters. Can anyone help us with
this issue? Many thanks :)

kalmohsen | 14 Sep 21:15 2014


Hello all!

Where can I download spark-itemsimilarity as a library?

I would really like to look at the source code and see:
- how the co-occurrence matrix is generated and stored
- which data structure is used

I would like to study it and see whether I can improve it.

Your reply is highly appreciated.
Thanks in advance

Peter Wolf | 13 Sep 18:06 2014

Combining knowledge of users with ratings

Hello, I am new to Mahout but not ML in general

I want to create a Recommender that combines things I know about Users with
their Ratings.

For example, perhaps I know the sex, age and nationality of my users.  I'd
like to use that information to improve the recommendations.

How is this information represented in the Mahout API?  I have not been
able to find any documentation or examples about this.

Phil Wills | 12 Sep 18:42 2014

ItemSimilarityDriver failing to write text file

I've been experimenting with the fairly new ItemSimilarityDriver, which is
working fine up until the point it tries to write out its results.
Initially I was getting an issue with the akka frameSize being too small,
but after expanding that I'm now getting a much more cryptic error:

14/09/12 15:54:55 INFO scheduler.DAGScheduler: Failed to run saveAsTextFile
at TextDelimitedReaderWriter.scala:288
Exception in thread "main" org.apache.spark.SparkException: Job aborted due
to stage failure: Task 8.0:3 failed 4 times, most recent failure: TID 448
on host ip-10-105-176-77.eu-west-1.compute.internal failed for unknown

This is from the master node, but there doesn't seem to be anything more
intelligible in the slave node logs.

I've tried writing to the local file system as well as s3n and can see it's
not an access problem, as I am seeing a zero length file appear.

Thanks for any pointers and apologies if this would be better to ask on the
Spark list,

Frank Scholten | 12 Sep 15:46 2014

Using ItemSimilarity.scala from Java

Hi all,

I am trying out the new spark-itemsimilarity code, but I am new to Scala and
am having a hard time calling certain methods from Java.

Here is a Gist with a Java main that runs the cooccurrence analysis:


When I run this I get an exception:

Exception in thread "main" java.lang.NoSuchMethodError:

What do I have to do here to use the Scala readers from Java?


Manuel Blechschmidt | 11 Sep 00:48 2014

AW: Re: deployment of mahout recommender as a web service

Hi Alexander,


It is for Mahout 0.8 but should work for 0.9 as well.


Manuel Blechschmidt
Mobil: +49 (0) 173 6322621

-------- Original message --------
From: Alexandros Kontogeorgos <alejandro.konto <at> gmail.com>
Date: 10.09.2014 15:48 (GMT-05:00)
To: user <at> mahout.apache.org
Subject: Re: deployment of mahout recommender as a web service

Hello and thank you very much for the prompt reply.

I'm not sure this is what I'm looking for, because what I need is simply
a way to package my Maven project into a ready-to-deploy .war file with
no further complexity.

To give you a clearer picture, it's just a small item-based recommender that
works on a CSV file and runs perfectly in Eclipse, but I don't
know how to deploy it on a Tomcat server other than by creating a .war file
following the guidance of the Manning Mahout book.

This is for my diploma thesis, which concerns the small synthetic data set I've
created so far

aishsesh | 10 Sep 22:55 2014

LogLikelihoodSimilarity calculation


I have the following case where numItems = 1,000,000, prefs1Size = 900,000
and prefs2Size = 100.

This is the case where I have two users: one who has seen 90% of the movies in
the database and another who has seen only 100 of the million movies. If they
have 90 movies in common (user 2 has seen only 100 movies in total), I would
expect the similarity to be higher than when they have only 10 movies
in common. But the similarities I am getting are
0.9971 for intersection size 10 and
0 for intersection size 90.

This seems counterintuitive.

Am I missing something? Is there an explanation for the above mentioned
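
The numbers above can be reproduced with a small stand-alone calculation. This is a sketch, not Mahout's code: it assumes the similarity is 1 - 1/(1 + LLR), with LLR the log-likelihood ratio of the 2x2 table (k11 = intersection, k12 = prefs2Size - intersection, k21 = prefs1Size - intersection, k22 = numItems - prefs1Size - prefs2Size + intersection). The class and method names are made up for illustration. Under that assumption, an intersection of 90 is exactly what chance predicts (100 movies x 0.9 = 90 expected overlaps), so the LLR is about zero, while an intersection of 10 is far from chance, so the LLR is large. The statistic measures departure from independence in either direction, not the amount of overlap.

```java
// Stand-alone sketch of a log-likelihood ratio (G^2) similarity on a 2x2
// contingency table. Names are hypothetical; the 1 - 1/(1 + LLR) mapping
// is an assumption about how the similarity is derived from the LLR.
class LlrSketch {

    // x * ln(x), with the convention 0 * ln(0) = 0.
    static double xLogX(long x) {
        return x == 0 ? 0.0 : x * Math.log(x);
    }

    // Unnormalized entropy: N ln N - sum(x ln x), where N is the sum of counts.
    static double entropy(long... counts) {
        long sum = 0;
        double parts = 0.0;
        for (long c : counts) {
            parts += xLogX(c);
            sum += c;
        }
        return xLogX(sum) - parts;
    }

    // G^2 = 2 * (H(rows) + H(columns) - H(cells)).
    static double logLikelihoodRatio(long k11, long k12, long k21, long k22) {
        double rowEntropy = entropy(k11 + k12, k21 + k22);
        double colEntropy = entropy(k11 + k21, k12 + k22);
        double matEntropy = entropy(k11, k12, k21, k22);
        // Rounding can push the result slightly below zero; clamp it.
        return Math.max(0.0, 2.0 * (rowEntropy + colEntropy - matEntropy));
    }

    static double similarity(long numItems, long prefs1Size, long prefs2Size, long intersection) {
        double llr = logLikelihoodRatio(
                intersection,
                prefs2Size - intersection,
                prefs1Size - intersection,
                numItems - prefs1Size - prefs2Size + intersection);
        return 1.0 - 1.0 / (1.0 + llr);
    }

    public static void main(String[] args) {
        // The post reports 0.9971 for this case.
        System.out.println(similarity(1_000_000L, 900_000L, 100L, 10L));
        // The post reports 0 for this case.
        System.out.println(similarity(1_000_000L, 900_000L, 100L, 90L));
    }
}
```
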


Alexandros Kontogeorgos | 10 Sep 21:06 2014

deployment of mahout recommender as a web service

Hello all,

I have a big problem deploying a simple Mahout recommender as a web
service with Mahout version 0.9. I've been following the guides in the
Manning Mahout book, but these seem to apply only to Mahout 0.5 and
earlier versions.

I have found a related thread on this mailing list that covers this
matter, but I have only very basic knowledge of web
services/Maven/Mahout and it has not helped me at all.

Maybe the answer was in front of me and I could not see it.

Is there any chance I could find some step-by-step guidance online, or
from someone who has had the same problem?

Yang | 10 Sep 20:07 2014

Can I read a Mahout sparse vector using R?

We generated some output using Mahout packages, and
now we'd like to explore it further on an interactive basis.

R is much faster for that, so I'd like to use R to do it.
I'd like to read a sparse Vector and find the original text representation,
probably using the dictionary from Mahout too.

Is it possible to do this?