- English only
Distributed Programming Laboratory LPD
INRIA/EPFLWeb-Alter-Egos service focuses on a novel architecture for personalization of Web services across multiple applications. It addresses the problem of extracting the alter-egos of a Web user, namely profiles of users who share similar interests (across multiple Internet applications), in real-time and in the presence of high dynamics. The profiles of alter-egos of a user are crucial to personalize the Web navigation of that user. Web-Alter-Egos is sponsored by Google Inc.
The major challenges addressed by Web-Alter-Egos service are Heterogeneity, Privacy and Scalability.
Heterogeneity of user profiles are applicable for both source and target applications. Heterogeneity of source applications enables user profiles to be extracted from their navigation properties. To achieve heterogeneity of target applications, these extracted profiles should be represented in a form general enough to be leveraged (e.g. similarity computation) in the context of other web activities.
Privacy is a growing concern for users which makes them reluctant to reveal their profiles, unless they have significant incentive to improve navigation experience and sufficient guarantees about anonymity of their sensitive data. Web-Alter-Egos service provides a trade-off between navigation experience (personalization) and privacy for a user.
Scalability for the web-alter-ego service would provide a good quality of clustering of the alter-egos despite the number of users and dynamics of a system.
1. A. Boutet, D. Frey, R. Guerraoui, A.-M. Kermarrec, and R. Patra. Hyrec: Leveraging browsers for scalable recommenders. In MIDDLEWARE 2014.
2. R. Guerraoui, A.-M. Kermarrec, R. Patra, and M. Taziki. D2P: Distance-Based Differential Privacy in Recommenders. In Volume 8 Issue 8, VLDB, 2015.
3. A. Boutet, D.Frey, R. Guerraoui, A.-M. Kermarrec, A. Rault, F. Taïani and J. Wang. Hide & Share: Landmark-based Similarity for Private KNN Computation. In DSN, 2015.
4. D. Frey, R. Guerraoui, A.-M. Kermarrec, and A. Rault. Collaborative Filtering Under a Sybil Attack: Analysis of a Privacy Threat. In EuroSec, 2015.
5. G. Giakkoupis, R. Guerraoui, A. Jégou, A.-M. Kermarrec, N. Mittal. Privacy-conscious information diffusion in social networks. In DISC, 2015.
6. A. Boutet, A.-M. Kermarrec, N. Mittal, F. Taïani. Being prepared in a sparse world: the case of KNN graph construction. In ICDE, 2016.
- 2016/03: "Being prepared in a sparse world: the case of KNN graph construction." gets accepted in ICDE 2016.
- 2015/09: "Privacy-conscious information diffusion in social networks." gets accepted in DISC 2015.
- 2015/03: "Collaborative Filtering Under a Sybil Attack: Analysis of a Privacy Threat" gets accepted in EuroSec 2015 .
- 2015/03: Talk at Google Zurich.
- 2015/03: "Hide & Share: Landmark-based Similarity for Private KNN Computation" gets accepted in DSN 2015 .
- 2015/02: "D2P: Distance-Based Differential Privacy in Recommenders" gets accepted in VLDB 2015 .
- 2014/08: "Hyrec: Leveraging browsers for scalable recommenders" gets accepted in MIDDLEWARE 2014 .
- 2014/01: Workshop on "Personalizing the Internet" at EPFL.
Team Members & Research
A graph is a natural representation for capturing the relationships among users and their interests across multiple internet applications. With such modelling, personalization for each user can be essentially viewed as a link prediction problem which concerns with predicting future interests for a particular user based on her history. My work focuses on scaling matrix factorization techniques to address this problem on heterogeneous compute units.
With an increasing advent of various online services, spanning almost every domain like social networks, music, movies and e-commerce etc, recommendation systems play a vital role in catering to the interest-specific needs of the customers. With such services, comes the need for personalization, that is most popularly taken care by collaborative filtering approaches. Due to the large computation power involved, the vast amount of contents is mostly compromised (and unused) in centralized recommendation systems. This not only neglects the available data but also compromises the quality of recommendation. My thesis is aimed to explore the content based aspect of the recommendation systems in a more scalable setting (with the computation constraints) and to come up with a preciser recommendation system.
The steady growth in the number of online users has led to the emergence of various online services such as Social Networks (Google+, Facebook, Twitter), e-commerce services (Movies: IMDB, Music: last.fm, Books: Goodreads). These online services leverage personalization schemes mostly Collaborative Filtering. Collaborative filtering schemes leverage profiles of other users to improve personalization quality. On the other hand, it opens up scalability and privacy issues. Scalability for online services stems from the fact that these services need to provide personalized services to millions of customers in real-time. Furthermore, these customers would stop using the online services if they experience privacy concerns which leads us to the privacy issue in online services. My research work aims to address these two primary issues in the context of online services mainly recommender systems.
Personalization is now the norm for web services, and is most often done through recommendation. This pervasiveness of recommendation requires recommender systems to ensure two properties: scalability and privacy. Scalability enables good recommendation quality while accommodating web services of any size. Privacy guarantees are incentives for users to share personal information, which in turn improves the accuracy of recommendations. My work aims at addressing scalability and privacy issues in recommender systems through techniques inspired from distributed and decentralized systems.
Due to the overfloading amount of products in online services, recommenders systems have become essential to help users to meet their needs. Until recently, the main concern of the research about recommender systems was to improve the precision of the algorithm, without any concern about the complexity of the resulting system. This had led to very powerful algorithms which however are sometimes hardly used in practice: Netflix could not use the algorithm that has won its competition because of its prohibitive complexity. Indeed it was using an adaptive recommender systems combining 107 other recommender systems requiring more than 2000 hours of training. My work aims at designing an adaptive recommender systems which is efficient and scalable. The idea is to lower the computing cost while minimizing the loss in precision.
In the modern generation of web-based services, the number of web users is increasing exponentially. Moreover, the web has become a big storehouse of information. This makes it impossible for an individual to explore the whole web contents to extract relevant data. Hence, personalization has become crucial in most web applications. It raises, however, privacy concerns as its quality relies on leveraging user profiles.
It was shown even anonymising data before releasing it publicly is not good enough to preserve privacy. Differential privacy is a recent and novel notation of privacy which implies that the output of a given function becomes significantly more or less likely based on a parameter, epsilon, if the inputs differ in one element. Still, the adversary can know about a user’s preference without knowing the exact items rated by the user. My research work focuses on privacy preserving algorithms in recommender systems and design attacks to existing ones.
MovieLens, for example, collects users' rating to provide personalized recommendations. It is important for users to protect the privacy of their profiles from the service provider and, at the same time, receive the satisfiable recommendation service. However, privacy-preserving solutions based on cryptography are inefficient. The killing factor is the multiplication. Either they use expensive secure two-party computation or multiple rounds based on additive homomorphic encryption.
The goal of AnonRec is to understand the trade-off between privacy and efficiency of current solutions, and to propose potential improvements. The tasks are to (1) identify the potential weakness, e.g. collusion of multiple users, a malicious server (a hacker controlling the server); corruption of a user; leakage of profile patterns, etc. ; (2) propose potential improvements: e.g. improve the communication efficiency by adapting homomorphic encryption supporting both multiplication and addition, protecting users' profiles when users are offline by key-switching the encrypted profiles, etc.
Collaborative filtering systems such as Whatsup seem to be dynamic and scalable, thanks to heterogeneity in amplitude and orientation. However, even if the system is decentralized, some privacy threats still remain. AnonRec offers an interesting point of view, via homomorphic encryption, with the system computing directly on encrypted data. My research consists in studying the scalability of AnonRec over large computing clusters, trying to bring privacy to Whatsup and cross-domain recommendation.
According to the previous research works in privacy preserving recommenders, they assumed honest-but-curious parties and demanded most of the users to be online for good recommendation quality. Many of these are not practical because of the significant recommendation latency. Using an efficient encryption scheme, we designed AnonRec which provides data confidentiality while providing good recommendations in real-time.