Re: mapreduce ItemSimilarity input optimization
Pat Ferrel <pat.ferrel <at> gmail.com>
2014-08-19 19:08:58 GMT
If you have purchase data, train with that. Recommending from purchase data is always much better than recommending from views.
Don’t worry that you have 100 views per purchase. Trust me, this will give you much better recommendations.
Filter out unrelated categories. So for electronics, throw away clothes and home appliances
but do not filter out other possibly related categories like accessories. There are other ways to do this
but they are more complicated; let’s get this working first.
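
A minimal sketch of that kind of category filter, in Python. The category names, the UNRELATED table and the filter_recs helper are all made up for illustration; they are not part of Mahout or of your data.

```python
# Hypothetical mapping: source category -> categories treated as unrelated to it.
# Anything not listed (e.g. "accessories") is kept as possibly related.
UNRELATED = {
    "electronics": {"clothes", "home appliances"},
}


def filter_recs(source_category, recs, unrelated=UNRELATED):
    """Drop recs whose category is listed as unrelated to the source category.

    recs is a list of (item, category) pairs in their original ranked order.
    """
    blocked = unrelated.get(source_category, set())
    return [(item, cat) for item, cat in recs if cat not in blocked]


# Made-up raw recommendations for an electronics item.
recs = [
    ("iphone 5s", "electronics"),
    ("iphone case", "accessories"),
    ("t-shirt", "clothes"),
    ("toaster", "home appliances"),
]

filtered = filter_recs("electronics", recs)
print(filtered)
```

Because this is a blocklist of unrelated categories rather than a "same category only" rule, accessories survive the filter.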
You should consider doing personalized recommendations next instead of only “other people purchased
these items”. Right now all users see the same recs, since you have no personalization yet.
If you are keeping user purchase history, you can train itemsimilarity with that. It will give you
“other people purchased these items”; these are the similar items. Then index the similarity data with
a search engine (Solr, Elasticsearch) and use the current user's purchase history as the query. The
results will be an ordered list of items that are personalized recs.
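
To make the idea concrete, here is a rough Python sketch of what the search engine is doing, without any search engine. The similar_items data and the personalized_recs function are hypothetical, and the score is a plain overlap count; a real engine such as Solr would use its own relevance scoring instead.

```python
from collections import defaultdict

# Hypothetical itemsimilarity output: each item is a "document" whose
# indexed field is the list of items similar to it.
similar_items = {
    "iphone": ["iphone case", "charger", "headphones"],
    "iphone case": ["iphone", "screen protector"],
    "charger": ["iphone", "usb cable"],
    "toaster": ["kettle", "bread knife"],
}


def personalized_recs(history, similar_items, top_n=5):
    """Rank items by how many of the user's purchased items appear in
    each item's similar-items field (a stand-in for the search query)."""
    history_set = set(history)
    scores = defaultdict(int)
    for item, indicators in similar_items.items():
        if item in history_set:
            continue  # do not recommend something already purchased
        overlap = history_set.intersection(indicators)
        if overlap:
            scores[item] = len(overlap)
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]


recs = personalized_recs(["iphone"], similar_items)
print(recs)
```

The user's history plays the role of the query terms, and each item's similar-items list plays the role of the indexed field, so two users with different histories get different ordered lists.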
I think you can get a free copy of “Practical Machine Learning” from the MapR site:
https://www.mapr.com/practical-machine-learning It describes the Solr recommender method that
On Aug 19, 2014, at 10:23 AM, Serega Sheypak <serega.sheypak <at> gmail.com> wrote:
>> #2 data includes #1 data, right?
Yes, #1 is the "raw" output of ItemSimilarity recommendations.
#2 is the #1 recommendations with the category filter applied.
I can't drop #1 "look-with" since #2 (with category filter) doesn't have
accessories. The category filter would remove accessory recommendations for
an iphone and leave only other iphones.