Information Retrieval and Filtering

Information retrieval evolved in response to the need to ask questions of a large collection of documents. The content base is static and the information need (a query) is dynamic, so the effort goes into indexing the content base. A common approach is TF-IDF (term frequency-inverse document frequency), which weights terms and uses those weights to rank documents.
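
To make the indexing idea concrete, here is a minimal sketch of TF-IDF ranking. The toy documents, the whitespace tokenizer, and the function names tfidf_index and rank are assumptions made for illustration, and the weighting shown (relative term frequency times the log of inverse document frequency) is only one of several common variants.

```python
import math
from collections import Counter

def tfidf_index(docs):
    """Return one {term: weight} dict per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    index = []
    for tokens in tokenized:
        tf = Counter(tokens)
        index.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return index

def rank(query, index):
    """Order document positions by the summed weights of the query terms."""
    terms = query.lower().split()
    scores = [sum(weights.get(t, 0.0) for t in terms) for weights in index]
    return sorted(range(len(index)), key=lambda i: scores[i], reverse=True)

# Hypothetical toy collection.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "recommender systems rank items for the user",
]
index = tfidf_index(docs)
print(rank("cat mat", index))
```

A term that occurs in every document, such as "the" here, gets weight zero, so it does not influence the ranking.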

Over time, the assumption behind information retrieval was reversed: the information need is more or less static, while the content base is dynamic. Information filtering therefore shifts the effort to modeling the user’s need.



Collaborative Filtering

Collaborative filtering emerged in response to the problem that you want really good content, not just content that is on topic. The first efforts were manual, based on the premise that keywords were insufficient. Automated collaborative filtering, the first kind of system to become known as a recommender system, started with the GroupLens project. The premise of GroupLens was that users would rate articles as they read them. Users with similar tastes would be matched to each other, and you would get personalized predictions of what you would like or dislike.
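
The matching idea can be sketched as user-user collaborative filtering. The code below is an illustration in that spirit, not a reproduction of the GroupLens system: the ratings, the user names, and the helpers mean, similarity and predict are all hypothetical, and Pearson correlation with a mean-centered weighted average is just one common formulation.

```python
import math

# Hypothetical ratings: user -> {article: rating on a 1-5 scale}.
ratings = {
    "ann": {"a1": 5, "a2": 4, "a3": 1},
    "bob": {"a1": 4, "a2": 5, "a3": 2, "a4": 5},
    "cy":  {"a1": 1, "a2": 2, "a3": 5, "a4": 1},
}

def mean(user):
    values = ratings[user].values()
    return sum(values) / len(values)

def similarity(u, v):
    """Pearson correlation over the articles both users have rated."""
    common = set(ratings[u]) & set(ratings[v])
    if len(common) < 2:
        return 0.0
    mu_u = sum(ratings[u][i] for i in common) / len(common)
    mu_v = sum(ratings[v][i] for i in common) / len(common)
    num = sum((ratings[u][i] - mu_u) * (ratings[v][i] - mu_v) for i in common)
    den = math.sqrt(sum((ratings[u][i] - mu_u) ** 2 for i in common)) * \
          math.sqrt(sum((ratings[v][i] - mu_v) ** 2 for i in common))
    return num / den if den else 0.0

def predict(user, item):
    """User's mean plus the similarity-weighted deviations of other raters."""
    num = den = 0.0
    for other in ratings:
        if other == user or item not in ratings[other]:
            continue
        s = similarity(user, other)
        num += s * (ratings[other][item] - mean(other))
        den += abs(s)
    return mean(user) if den == 0 else mean(user) + num / den

print(round(predict("ann", "a4"), 2))
```

In this toy data ann agrees with bob and disagrees with cy, so bob's high rating and cy's low rating of a4 both push ann's predicted rating above her own average.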

In the mid-to-late 1990s a great deal of work was done, and people took these ideas into commercial practice. Today personalized recommender systems are deployed pretty much everywhere.

Recommenders

We can define recommenders as tools that help people find worthwhile stuff. We can break them down by the kind of interface they present:

filtering interface – takes a stream of content and identifies the items you want
recommendation interface – suggestion list, top-10 list, offers and promotions
prediction interface – evaluate candidates, predicted rating, etc.

Recommendation Approaches

Non-personalized and stereotyped – something popular, or a group preference
Product association – people who like / buy this also like / buy that
Content-based – learn what an individual likes and build a profile
Collaborative – learn what an individual likes and use others’ experience to recommend
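
As an illustration of the product association approach, the sketch below counts how often items appear together in the same basket and ranks "also bought" candidates by the share of baskets containing the anchor item that also contain the candidate. The basket data and the function name also_bought are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase baskets.
baskets = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"milk", "cereal"},
    {"bread", "butter", "milk"},
]

# How many baskets contain each item, and each unordered pair of items.
item_counts = Counter(item for basket in baskets for item in basket)
pair_counts = Counter()
for basket in baskets:
    for a, b in combinations(sorted(basket), 2):
        pair_counts[(a, b)] += 1

def also_bought(item, top_n=3):
    """Rank other items by the fraction of `item` baskets that contain them."""
    scores = {}
    for (a, b), n in pair_counts.items():
        if item == a:
            scores[b] = n / item_counts[item]
        elif item == b:
            scores[a] = n / item_counts[item]
    # Highest share first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]

print(also_bought("butter"))
```

A plain share like this tends to favor items that are popular with everyone, so in practice the score is often normalized by the candidate's overall popularity.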

Preference and Ratings

Very broadly, we want to learn preferences. What do users do that might tell us something about their preferences? Explicit preferences include ratings, reviews, votes / likes, continuous scales, pairwise preferences, etc. Ratings are not always accurate, and users’ preferences may change over time. A rating could be given:

  1. during consumption (rating the item while experiencing it)
  2. some time after consumption (rating from memory), or
  3. before consumption, for a high-cost, low-volume item (rating an expectation)

Implicit preferences are inferred from users’ actions. The time a user spends on an item correlates reasonably well with their rating. There are also binary actions, including search, click, follow, and purchase.

Predictions and Recommendations

Predictions are estimates of how much you will like an item. Recommendations are suggestions for items you might like; they don’t make the bold statement that predictions make.

Predictions
  Pros – helps quantify; gives you a clear understanding on some scale
  Cons – gives you something that can be wrong (falsifiable)

Recommendations
  Pros – provides a set of good choices
  Cons – poor items can result in failure to explore

Explicit predictions or recommendations may make customers feel that they are being pushed or manipulated.

Analytical Framework

This is a general framework for analyzing recommender systems. It has 8 dimensions:

  1. Domain – what is being recommended.
  2. Purpose – sales, information, education, building community
  3. Recommendation Context – what is the user doing at the time of recommendation?
  4. Whose Opinions – experts, ordinary folks, or people like you
  5. Personalization Level
    • non-personalized
    • demographic
    • ephemeral
    • persistent
  6. Privacy and Trustworthiness
    • personal info revealed, identity, deniability.
    • is the recommendation honest? biased?
  7. Interfaces – prediction, recommendation, filtering, organic or explicit representation
  8. Recommendation Algorithms

Recommendation Algorithms

In the basic model of recommenders, there are 3 concepts:

  1. users – with user attributes (demographics, etc.), user model
  2. items – with item attributes (properties, etc.)
  3. ratings – the space of ratings is where users meet items.

Broadly there are 4 categories of algorithms:

non-personalized summary statistics – simple summary of statistics, no model, no user and item attributes
content-based filtering – build models using user ratings and item attributes
collaborative filtering – common core is a sparse matrix of ratings; fill in missing values (predict), select promising cells (recommend)
others – interactive approaches; hybrids of various techniques
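
The two operations on the sparse rating matrix, filling in a missing cell (predict) and selecting promising cells (recommend), can be sketched with the simplest possible predictor: a non-personalized item mean. The ratings, identifiers and function names below are hypothetical, and a real collaborative filter would replace item_mean with a personalized estimate.

```python
# Hypothetical sparse rating matrix: only observed (user, item) cells are stored.
ratings = {
    ("u1", "i1"): 5, ("u1", "i2"): 3,
    ("u2", "i1"): 4, ("u2", "i3"): 2,
    ("u3", "i2"): 4, ("u3", "i3"): 1, ("u3", "i4"): 5,
}
items = sorted({item for _, item in ratings})

def item_mean(item):
    """Summary statistic: the average of all observed ratings for an item."""
    values = [r for (_, i), r in ratings.items() if i == item]
    return sum(values) / len(values)

def predict(user, item):
    """Fill in one cell: the observed rating if present, else the item mean."""
    if (user, item) in ratings:
        return float(ratings[(user, item)])
    return item_mean(item)

def recommend(user, top_n=2):
    """Select the most promising cells among items the user has not rated."""
    unseen = [i for i in items if (user, i) not in ratings]
    return sorted(unseen, key=lambda i: (-predict(user, i), i))[:top_n]

print(predict("u2", "i2"))
print(recommend("u1"))
```

Prediction answers "how much would this user like this item?", while recommendation only needs the order of the candidates, which is why the same scores can drive both interfaces.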

Types of evaluation:

  1. Accuracy of predictions
  2. Usefulness of recommendations
  3. Computational performance
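
For the first type, accuracy of predictions, the notes do not name a metric. Two common choices are mean absolute error (MAE) and root mean squared error (RMSE), sketched below on made-up numbers; the predicted and actual lists are hypothetical held-out ratings.

```python
import math

def mae(predicted, actual):
    """Mean absolute error: the average size of the prediction errors."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def rmse(predicted, actual):
    """Root mean squared error: penalizes large errors more than MAE does."""
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))

# Hypothetical predictions and the ratings users actually gave.
predicted = [4.5, 3.0, 2.0, 5.0]
actual = [5, 3, 4, 4]

print(mae(predicted, actual))
print(round(rmse(predicted, actual), 3))
```

Both are measured in rating units. Neither says anything about the second type of evaluation, usefulness, which depends on whether the top of the list contains items the user actually wants.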


My Certificate

For more on recommender systems, please refer to the wonderful course here https://www.coursera.org/learn/recommender-systems-introduction




I am Kesler Zhu, thank you for visiting my website. Check out more course reviews at https://KZHU.ai
