Thursday 10 May 2012

My first experience with Mahout (as a researcher)

One of the problems in the area of Recommender Systems is that it is hard to reproduce someone else's experiments. I find that in papers the implementation and configuration of algorithms are normally described vaguely and incompletely. Little tweaks in an algorithm can lead to big changes in the output, and it's hard to repeat experiments if these small (but crucial) details are not known.

Recently I have seen that researchers are using frameworks to implement existing algorithms, which I think is a necessary step towards making experiments repeatable. For this reason I decided to try Mahout and implement some traditional recommendation algorithms, in particular User-based Collaborative Filtering (UBCF). I want to describe my experience with this framework in case there are other researchers who are thinking about using it, or who have already started using it.

The Mahout distribution I've been using is 0.6. I've been looking at the taste package which includes the code for recommendations.

The first thing I noticed when looking at some classes is that the default algorithms in Mahout aren't ideal. For example, the predictor used by Mahout for CF is just a simple weighted average predictor. Resnick's predictor, which also takes each user's mean rating into account, is generally reported to work better, so I found it strange that they haven't implemented it. Indeed, after implementing a new Resnick predictor and running it on the MovieLens dataset I got better results. So if you are going to run UBCF, first of all try to modify the predictor; you will likely see better results with Resnick.
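To make the difference between the two predictors concrete, here is a minimal, self-contained sketch in plain Java. It does not use any Mahout classes: the class and method names are hypothetical, and the formulas are the usual textbook forms (with the absolute similarity in the denominator), not Mahout's actual code or the predictor I implemented.

```java
// Hypothetical sketch: none of these names come from Mahout.
final class PredictorSketch {

    /** Weighted average: sum(sim * rating) / sum(|sim|) over the neighbours. */
    static double weightedAverage(double[] sims, double[] neighborRatings) {
        double num = 0.0;
        double den = 0.0;
        for (int k = 0; k < sims.length; k++) {
            num += sims[k] * neighborRatings[k];
            den += Math.abs(sims[k]);
        }
        // No usable neighbours: no prediction can be made.
        return den == 0.0 ? Double.NaN : num / den;
    }

    /** Resnick: userMean + sum(sim * (rating - neighborMean)) / sum(|sim|). */
    static double resnick(double userMean, double[] sims,
                          double[] neighborRatings, double[] neighborMeans) {
        double num = 0.0;
        double den = 0.0;
        for (int k = 0; k < sims.length; k++) {
            // Use each neighbour's deviation from their own mean rating.
            num += sims[k] * (neighborRatings[k] - neighborMeans[k]);
            den += Math.abs(sims[k]);
        }
        // No usable neighbours: fall back to the user's own mean.
        return den == 0.0 ? userMean : userMean + num / den;
    }

    public static void main(String[] args) {
        double[] sims = {0.5, 0.5};     // similarity to two neighbours
        double[] ratings = {5.0, 4.0};  // neighbours' ratings for the item
        double[] means = {4.5, 3.0};    // neighbours' mean ratings
        double userMean = 2.0;          // the target user rates low on average

        System.out.println(weightedAverage(sims, ratings));
        System.out.println(resnick(userMean, sims, ratings, means));
    }
}
```

With these made-up numbers the weighted average predicts 4.5 for a user whose own mean rating is 2.0, while the Resnick form predicts 2.75, because it only transfers how far above their own average the neighbours rated the item.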


The second thing I realised is that when I ran User-based CF it was a bit slow. Then I found out that Mahout computes the user similarities on the fly. This is because in a live system users tend to join and leave the system. But when running experiments we don't need this! Computing similarities in advance is common practice, so I would recommend modifying this in Mahout if your datasets are big and you don't want to be waiting for days to get your results.
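As an illustration of what "computing similarities in advance" can look like, here is a small sketch in plain Java. It is not Mahout code: the class name, the dense ratings matrix (with NaN for "not rated") and the choice of Pearson correlation over co-rated items are all assumptions made for the example, and a dense n x n matrix only makes sense when the number of users is modest.

```java
// Hypothetical sketch: precompute all user-user similarities once,
// then look them up during the experiment instead of recomputing them.
final class PrecomputedSimilaritySketch {

    /** Pearson correlation over items rated by both users (NaN = not rated). */
    static double pearson(double[] a, double[] b) {
        int n = 0;
        double sumA = 0.0;
        double sumB = 0.0;
        for (int i = 0; i < a.length; i++) {
            if (!Double.isNaN(a[i]) && !Double.isNaN(b[i])) {
                n++;
                sumA += a[i];
                sumB += b[i];
            }
        }
        if (n < 2) {
            return 0.0; // not enough overlap to say anything
        }
        double meanA = sumA / n;
        double meanB = sumB / n;
        double cov = 0.0;
        double varA = 0.0;
        double varB = 0.0;
        for (int i = 0; i < a.length; i++) {
            if (!Double.isNaN(a[i]) && !Double.isNaN(b[i])) {
                double da = a[i] - meanA;
                double db = b[i] - meanB;
                cov += da * db;
                varA += da * da;
                varB += db * db;
            }
        }
        double denom = Math.sqrt(varA * varB);
        return denom == 0.0 ? 0.0 : cov / denom;
    }

    /** Fills a symmetric matrix with the similarity of every pair of users. */
    static double[][] precompute(double[][] ratings) {
        int n = ratings.length;
        double[][] sims = new double[n][n];
        for (int u = 0; u < n; u++) {
            sims[u][u] = 1.0;
            for (int v = u + 1; v < n; v++) {
                double s = pearson(ratings[u], ratings[v]);
                sims[u][v] = s;
                sims[v][u] = s; // similarity is symmetric, so compute once
            }
        }
        return sims;
    }

    public static void main(String[] args) {
        double[][] ratings = {
            {1.0, 2.0, 3.0},
            {2.0, 3.0, 4.0},
            {3.0, 2.0, 1.0}
        };
        double[][] sims = precompute(ratings);
        System.out.println(sims[0][1]);
        System.out.println(sims[0][2]);
    }
}
```

Since the training data does not change during an offline experiment, the matrix can be built once before the evaluation loop and reused for every prediction.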

Finally there are a couple of silly mistakes I made because of a non-obvious implementation in Mahout.

Be careful when loading files that share a base name into a FileDataModel!

When loading a particular file into a FileDataModel I realised I was getting a model with a lot more users and items in it than I expected. I almost went crazy trying to find out what was happening. Well, it seems that if you load a particular file (for example "data.test") and there is another file in the same folder with the same name but a different file extension (for example, "data.train"), it will also load that file into your model. Apparently, the reason they do this is to let you provide updated data alongside the main file, so that new updates can be pushed without having to copy the same data again.

But as researchers we often use the same name for our test and training files. Funnily enough, this is the format of the commonly used MovieLens dataset test and training files... You can imagine how wrong (and "better") the results can be if you are trying to load your training data and at the same time you are also loading the test data!! So I think this is something to be extremely careful about!
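One cheap safeguard is to check the data folder for such files before building the model. The sketch below is plain Java with hypothetical names, and it assumes the match is on the part of the file name before the first period, which is what the behaviour described above suggests; I have not checked it against the exact rule Mahout applies.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical guard: list files in the same folder whose names start with
// the same prefix (the text before the first '.') as the data file.
final class SiblingFileCheck {

    /** Pure name-matching logic, kept separate so it is easy to test. */
    static List<String> conflictingNames(String dataFileName, String[] namesInFolder) {
        int dot = dataFileName.indexOf('.');
        String prefix = dot < 0 ? dataFileName : dataFileName.substring(0, dot);
        List<String> conflicts = new ArrayList<String>();
        for (String name : namesInFolder) {
            if (!name.equals(dataFileName) && name.startsWith(prefix)) {
                conflicts.add(name);
            }
        }
        return conflicts;
    }

    /** Convenience wrapper that reads the names from the file's folder. */
    static List<String> conflictingNames(File dataFile) {
        File folder = dataFile.getAbsoluteFile().getParentFile();
        String[] names = folder == null ? null : folder.list();
        if (names == null) {
            names = new String[0]; // folder missing or unreadable
        }
        return conflictingNames(dataFile.getName(), names);
    }

    public static void main(String[] args) {
        String[] folder = {"data.test", "data.train", "other.csv"};
        // Warn before loading "data.test" if anything else could be picked up.
        System.out.println(conflictingNames("data.test", folder));
    }
}
```

If the returned list is not empty, the simplest fix is to rename the files or move the training and test files into separate folders before loading them.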

Be careful when manipulating a PreferenceArray.

Do not create a PreferenceArray using the constructor GenericUserPreferenceArray(int size) if you don't know the size of the array. It's not dynamic like a Vector for example, so if you don't fill it all, there will be elements with default values, which can be dangerous and most probably lead to wrong outputs!!
PreferenceArray newPrefs = new GenericUserPreferenceArray(100); //Avoid this!!
newPrefs.set(0, p0);
newPrefs.set(1, p1);

Instead fill an ArrayList of Preferences first and then use that to create the PreferenceArray.

List<Preference> list = new ArrayList<Preference>();
list.add(p0); // add() appends; set() would fail on an empty list
list.add(p1);

PreferenceArray newPrefs = new GenericUserPreferenceArray(list);

My conclusion after a first "taste" ;) of the framework is that although Mahout might be very useful for someone trying to deploy a live recommender system, I think it is not appropriate for someone working on research. Researchers might use it as a framework to plug components into, but even in that case I'm not sure it's the best framework, since the default configuration might lead to errors in the results. It would have been interesting to see a spin-off of Mahout focused on research, but unfortunately that doesn't seem to be the direction they are taking with it.

Lately I started hearing good things about other recommender systems frameworks: LensKit (created by the GroupLens research group) and MyMediaLite (developed at the University of Hildesheim, Germany). Although I haven't used them yet, the fact that they are focused on research gives me a good feeling and I'll probably be using one of them in the future. Both were presented at the last ACM Recommender Systems Conference (2011):


MyMediaLite: a free recommender system library


Rethinking the recommender research ecosystem: reproducibility, openness, and LensKit

LensKit: A modular Recommender Framework