LION COMMUNITY USAGE CASE
Collaborative recommendation: movies and viewers.
Collaborative filtering and recommendation.
Word of mouth has always been a powerful and effective technique to spread
information and opinions from person to person, in a viral manner. It is a distributed
and human way to mine the data implicitly contained in many human minds. A similar
process can be simulated through data mining and modeling methods. Starting from
the raw data (potentially huge quantities, ranging from thousands to billions of items)
one extracts information bits which are relevant for the specific final user, based on
models of that user's explicit or implicit preferences, and on similarities with other people.
An interesting application is in the marketing sector: data collected about users
and products, either bought or at least evaluated, can be used to estimate how a customer
would evaluate a product not seen before. The final purpose of predicting
evaluations is to encourage the user to buy, for example by recommending a list of
items corresponding to the highest predicted evaluations. Advertising is more effective
if the presented products are filtered based on the user preferences.
Collaborative filtering and recommendation is a method of predicting the interests
of a person by collecting taste information from many other collaborating people.
The underlying assumption is that those who agreed in the past tend to agree again
in the future. For example, a collaborative recommendation system for movies could
make personal predictions about which movie a user should like, given some knowledge
of the user's tastes and the information gathered from many other users.
Getting data from a database (MySQL)
In this demo we start from data saved in a MySQL database. Any other database can be linked to LIONoso in a similar way.
The figures below show how the records for movies, users, and the votes expressed by users on movies
are organized.
The data can be loaded into LION by using the Data Sources - MySQL table tool and inserting the data as shown below.
Building the models for movies and viewers.
The flow in the LION workbench to extract the model is shown below. The first Python script (matrify.py) transforms the
original table into the appropriate matrix form. Then the external program recommend.exe produces the desired model in
the Vectors table. The Values table contains the predictions for the missing cells. Additional tables are provided for checks
(Check contains the difference, on the existing votes, between the original votes and the predicted votes) and for additional
parameters of the model (please see the book in the bibliography for details).
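The reshaping step can be sketched in a few lines of Python. The source of matrify.py is not shown on this page, so the function below only illustrates the idea, assuming the votes arrive as (viewer, movie, vote) records. The function body and the sample records are hypothetical.

```python
def matrify(votes):
    """Pivot (viewer, movie, vote) records into a viewer-by-movie matrix.

    Illustrative only: this is not the actual matrify.py script.
    Cells without a vote are stored as None.
    """
    viewers = sorted({viewer for viewer, _, _ in votes})
    movies = sorted({movie for _, movie, _ in votes})
    row = {viewer: i for i, viewer in enumerate(viewers)}
    col = {movie: j for j, movie in enumerate(movies)}
    matrix = [[None] * len(movies) for _ in viewers]
    for viewer, movie, vote in votes:
        matrix[row[viewer]][col[movie]] = vote
    return viewers, movies, matrix


# Made-up sample records, one per vote, as they might come out of the database.
votes = [
    ("ann", "Alien", 5), ("ann", "Brazil", 3),
    ("bob", "Alien", 4), ("bob", "Casablanca", 2),
]
viewers, movies, matrix = matrify(votes)
for viewer, cells in zip(viewers, matrix):
    print(viewer, cells)
```

The None cells are exactly the missing votes that the model is later asked to predict.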
By using the LIONoso 7D plot and similarity map, movies and viewers can be analyzed in the same
space. In the plots below, each movie is represented by a red ball, each viewer by a green one.
Movies that a user will like tend to be close to the user. Similar movies and viewers with
similar tastes tend to be mapped to nearby positions.
This kind of analysis is a powerful way to group entities and discover interesting similarity
relationships. Entities can be users, customers, business partners, stocks, social network friends, etc.
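Once movies and viewers share one space, "close to the user" can be read as a nearest-neighbor query. The sketch below assumes each entity is described by a short coordinate vector and that closeness is plain Euclidean distance. The two-dimensional vectors are invented for illustration and are not taken from the Vectors table.

```python
import math


def nearest(target, candidates, k=2):
    """Return the names of the k candidates closest to the target vector."""
    ranked = sorted(candidates, key=lambda name: math.dist(target, candidates[name]))
    return ranked[:k]


# Hypothetical coordinates for one viewer and three movies.
viewer_ann = (1.0, 0.2)
movie_vectors = {
    "Alien": (0.9, 0.1),
    "Blade Runner": (0.8, 0.4),
    "Casablanca": (-0.7, 0.9),
}
print(nearest(viewer_ann, movie_vectors))  # ['Alien', 'Blade Runner']
```

The same function applied to viewer vectors would list the viewers with the most similar tastes.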
Predicting the unknown votes.
After the model for each viewer and each movie is derived, the missing votes (the votes for movies that the
viewer has not watched yet) can be predicted by the system. They are saved in the "Values" table.
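The algorithm inside recommend.exe is not described on this page. One common way to obtain a vector per viewer and per movie is a latent-factor model fitted by stochastic gradient descent, where a predicted vote is the overall mean plus the dot product of the two vectors. The sketch below assumes that technique. All names, parameter values, and sample votes are hypothetical.

```python
import random


def factorize(votes, n_factors=2, n_epochs=500, lr=0.02, reg=0.02, seed=0):
    """Fit one small vector per viewer and per movie by stochastic gradient descent.

    Assumed technique (regularized matrix factorization), not necessarily
    the one used by recommend.exe.
    """
    rng = random.Random(seed)
    viewers = sorted({viewer for viewer, _, _ in votes})
    movies = sorted({movie for _, movie, _ in votes})
    p = {v: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for v in viewers}
    q = {m: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for m in movies}
    mean = sum(vote for _, _, vote in votes) / len(votes)
    for _ in range(n_epochs):
        for viewer, movie, vote in votes:
            pv, qm = p[viewer], q[movie]
            err = vote - (mean + sum(a * b for a, b in zip(pv, qm)))
            for k in range(n_factors):
                pk, qk = pv[k], qm[k]
                # Gradient step on the squared error with L2 regularization.
                pv[k] += lr * (err * qk - reg * pk)
                qm[k] += lr * (err * pk - reg * qk)
    return mean, p, q


def predict(model, viewer, movie):
    """Predicted vote: overall mean plus viewer-movie dot product."""
    mean, p, q = model
    return mean + sum(a * b for a, b in zip(p[viewer], q[movie]))


# Made-up votes on a 1-5 scale; each viewer has one movie left unrated.
votes = [
    ("ann", "Alien", 5), ("ann", "Blade Runner", 5), ("ann", "Casablanca", 1),
    ("bob", "Alien", 4), ("bob", "Blade Runner", 5), ("bob", "Titanic", 2),
    ("cat", "Casablanca", 5), ("cat", "Titanic", 4), ("cat", "Alien", 1),
    ("dan", "Casablanca", 4), ("dan", "Titanic", 5), ("dan", "Blade Runner", 2),
]
model = factorize(votes)
for viewer, movie in [("ann", "Titanic"), ("cat", "Blade Runner")]:
    print(viewer, movie, round(predict(model, viewer, movie), 2))
```

In this sketch the fitted vectors play the role of the Vectors table and the printed numbers play the role of entries in the "Values" table.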
The predictions can be used for customized suggestions or for marketing strategies,
when trying to sell a new product to an existing customer.
Note that no deep knowledge of the techniques is required to use these models in
an effective manner. Starting from a large quantity of data and testing the prediction
results on a testing set is sufficient to develop effective recommendations with an estimate
of their precision.
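Testing on a held-out set can be sketched as follows: part of the known votes is hidden, a predictor is built from the rest, and the root mean squared error (RMSE) on the hidden votes gives the precision estimate. The per-movie average used here is only a stand-in for the real model, and all names and sample votes are hypothetical.

```python
import math
import random


def split_votes(votes, test_fraction=0.25, seed=0):
    """Shuffle the votes and hold out a fraction of them for testing."""
    shuffled = list(votes)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def rmse(predict, votes):
    """Root mean squared error of predict(viewer, movie) on known votes."""
    squared = [(predict(viewer, movie) - vote) ** 2 for viewer, movie, vote in votes]
    return math.sqrt(sum(squared) / len(squared))


def movie_mean_predictor(train):
    """Stand-in model: predict the average training vote of each movie."""
    totals = {}
    for _, movie, vote in train:
        total, count = totals.get(movie, (0.0, 0))
        totals[movie] = (total + vote, count + 1)
    overall = sum(vote for _, _, vote in train) / len(train)

    def predict(viewer, movie):
        if movie in totals:
            total, count = totals[movie]
            return total / count
        return overall  # movie never seen in training

    return predict


# Made-up votes on a 1-5 scale.
votes = [
    ("ann", "Alien", 5), ("ann", "Blade Runner", 5), ("ann", "Casablanca", 1),
    ("bob", "Alien", 4), ("bob", "Blade Runner", 5), ("bob", "Titanic", 2),
    ("cat", "Casablanca", 5), ("cat", "Titanic", 4), ("cat", "Alien", 1),
    ("dan", "Casablanca", 4), ("dan", "Titanic", 5), ("dan", "Blade Runner", 2),
]
train, test = split_votes(votes)
print("test RMSE:", rmse(movie_mean_predictor(train), test))
```

Any other predictor with the same (viewer, movie) signature can be passed to rmse, which makes it easy to compare models on the same held-out votes.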
Please contact us if you are interested in this usage case and we will be happy to send you more details.
The web mining and network analytics tools of LIONoso can be used for any situation with interacting entities,
like products, customers, employees, biological systems, etc.
Roberto Battiti and Mauro Brunato. The LION way. LIONlab, University of Trento, Feb 2014.
MovieLens, http://www.movielens.org/. Free and personalized movie recommendations. Department of Computer Science and Engineering at the University of Minnesota.