distributed_ml [2018/04/10 16:05] (current), damaskin

=== Asynchronous ML on Android devices ===
This project is related to training ML algorithms asynchronously on Android devices. The challenges here are primarily mobile churn, latency, energy consumption, memory, bandwidth, and accuracy.
It comprises multiple semester projects that tackle subsets of these challenges from both the algorithmic (SGD variants) and the systems (a framework for Android) perspective.
  
Related papers:\\
[1] __[[http://net.pku.edu.cn/~cuibin/Papers/2017%20sigmod.pdf|Heterogeneity-aware Distributed Parameter Servers]]__ \\
[2] __[[http://proceedings.mlr.press/v70/zhang17e.html|ZipML: Training Linear Models with End-to-End Low Precision, and a Little Bit of Deep Learning]]__
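As a toy illustration of the asynchronous setup, the sketch below runs lock-free (Hogwild-style) SGD with Python threads standing in for Android devices. The model, data, and learning rate are all illustrative, not part of any project code.

```python
import random
import threading

# Toy sketch of lock-free asynchronous SGD: several threads stand in
# for Android devices and update a shared weight without synchronization.

DATA = [(x, 2.0 * x) for x in (0.5, 1.0, 1.5, 2.0)]  # ground truth: y = 2x
w = [0.0]  # shared model parameter

def worker(steps, lr, seed):
    rng = random.Random(seed)
    for _ in range(steps):
        x, y = rng.choice(DATA)
        grad = 2.0 * (w[0] * x - y) * x   # gradient of (w*x - y)^2
        w[0] -= lr * grad                 # unsynchronized update

threads = [threading.Thread(target=worker, args=(200, 0.05, s)) for s in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(round(w[0], 2))  # converges close to 2.0
```

Despite the racy updates, the shared weight still converges, which is the basic observation behind lock-free asynchronous SGD.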
  
===Personalized/Private ML in P2P network===
[2] __[[https://www.cs.cornell.edu/~shmat/shmat_ccs15.pdf|Privacy-Preserving Deep Learning]]__\\
[3] __[[http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45428.pdf|Deep Learning with Differential Privacy]]__
  
===P2P data market===
[3] __[[http://pages.cs.wisc.edu/~paris/papers/data_pricing.pdf|Query-Based Data Pricing]]__\\
  
===Federated optimization: distributed SGD with fault tolerance===
This project explores the setting where data never leaves each user device while arbitrary devices fail and recover. The challenge is to accelerate learning in this setting by leveraging techniques such as importance sampling.

Related papers:\\
[1] __[[https://arxiv.org/pdf/1405.3080.pdf|Accelerating Minibatch Stochastic Gradient Descent using Stratified Sampling]]__ \\
[2] __[[https://arxiv.org/pdf/1401.2753.pdf|Stochastic Optimization with Importance Sampling]]__
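The importance-sampling idea can be sketched as follows: draw example i with probability p_i proportional to an importance score, then scale its gradient by 1/(n · p_i) so the update stays unbiased in expectation. The data, scores, and step sizes below are illustrative.

```python
import random

# Sketch of SGD with importance sampling: examples are drawn with
# probability proportional to a per-example importance score, and each
# gradient is reweighted by 1/(n * p_i) to keep the update unbiased.

def sample_index(probs, rng):
    """Draw an index according to the given probabilities."""
    r, acc = rng.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1

def sgd_importance(data, scores, steps=500, lr=0.1, seed=0):
    rng = random.Random(seed)
    n = len(data)
    total = sum(scores)
    probs = [s / total for s in scores]
    w = 0.0  # single-weight linear model y = w * x
    for _ in range(steps):
        i = sample_index(probs, rng)
        x, y = data[i]
        grad = 2.0 * (w * x - y) * x
        w -= lr * grad / (n * probs[i])  # reweighting keeps E[update] unbiased
    return w

data = [(1.0, 3.0), (2.0, 6.0), (0.5, 1.5)]  # ground truth: y = 3x
scores = [abs(x) for x, _ in data]           # e.g. per-example gradient bounds
print(round(sgd_importance(data, scores), 2))  # close to 3.0
```

Sampling "important" (large-gradient) examples more often reduces the variance of the stochastic gradient, which is what accelerates learning in the papers above.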

===Byzantine-tolerant machine learning===
Each node in the distributed setting can exhibit arbitrary (Byzantine) behaviour during the learning procedure.
This project explores algorithms (SGD variants) in both the synchronous and the asynchronous setup.
The student will implement these algorithms in our code base, which is built on top of TensorFlow.

Related papers:\\
[1] __[[https://arxiv.org/pdf/1802.07928.pdf|Asynchronous Byzantine Machine Learning]]__\\
[2] __[[http://papers.nips.cc/paper/6617-machine-learning-with-adversaries-byzantine-tolerant-gradient-descent|Machine Learning with Adversaries: Byzantine Tolerant Gradient Descent]]__ \\
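One illustrative aggregation rule from this line of work replaces the server's gradient average with a coordinate-wise median, which a minority of Byzantine workers cannot drag arbitrarily far. The sketch below is a minimal single-machine illustration with made-up gradients, not the project code base.

```python
# Byzantine-tolerant aggregation (sketch): instead of averaging worker
# gradients, take a coordinate-wise median, so a single outlier gradient
# cannot move the aggregate far from the honest majority.

def median(xs):
    s = sorted(xs)
    m = len(s) // 2
    return s[m] if len(s) % 2 else 0.5 * (s[m - 1] + s[m])

def aggregate(gradients):
    """Coordinate-wise median of a list of gradient vectors."""
    dims = len(gradients[0])
    return [median([g[d] for g in gradients]) for d in range(dims)]

honest = [[0.9, -1.1], [1.0, -1.0], [1.1, -0.9]]
byzantine = [[1e6, 1e6]]  # one attacker sends a huge bogus gradient
print(aggregate(honest + byzantine))  # stays near [1.0, -1.0]
```

A plain average of the same four gradients would be dominated by the attacker; the median stays close to the honest cluster.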

===Black-Box attacks against recommender systems===
A recommender system can be viewed as a black box that users query with feedback (e.g., ratings, clicks) before receiving the output list of recommendations.
The goal is to infer properties of the recommendation algorithm by observing the outputs of different queries.
[2] __[[https://arxiv.org/pdf/1602.02697v3.pdf|Practical Black-Box Attacks against Deep Learning Systems using Adversarial Examples]]__\\
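A minimal sketch of the query-and-observe idea: submit two queries that differ only in the user's feedback and check whether the top recommendation changes. Both toy recommenders and the probing routine below are hypothetical stand-ins for a real black-box system.

```python
# Probing a recommender as a black box (sketch): controlled queries
# reveal a property of the hidden algorithm -- here, whether it
# personalizes at all or just returns the globally most popular item.

POPULARITY = {"a": 10, "b": 5, "c": 1}

def popularity_recsys(user_ratings):
    # hidden algorithm 1: ignores the user's feedback entirely
    return max(POPULARITY, key=POPULARITY.get)

def nearest_recsys(user_ratings):
    # hidden algorithm 2: recommends a fixed "neighbour" of the top-rated item
    best = max(user_ratings, key=user_ratings.get)
    return {"a": "b", "b": "c", "c": "a"}[best]

def is_personalized(recsys):
    # two queries that differ only in the submitted feedback
    out1 = recsys({"a": 5.0, "b": 1.0, "c": 1.0})
    out2 = recsys({"a": 1.0, "b": 1.0, "c": 5.0})
    return out1 != out2  # output changed => feedback influences the output

print(is_personalized(popularity_recsys), is_personalized(nearest_recsys))
# False True
```

Real attacks refine this idea with many adaptively chosen queries, as in the papers above.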
  

=== Multi-output multi-class classification ===
The goal of this project is to design a distributed ML algorithm suitable for multi-output classification (e.g., music tag prediction on mobile devices). Deep learning-based approaches seem promising for this task. Nevertheless, current methods target only single-output classification.

Related papers:\\
[1] __[[https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45530.pdf|Deep Neural Networks for YouTube Recommendations]]__ \\
[2] __[[http://papers.nips.cc/paper/5004-deep-content-based-music-recommendation.pdf|Deep content-based music recommendation]]__ \\
[3] __[[http://www.columbia.edu/~jwp2128/Papers/LiangPaisleyEllis2014.pdf|Codebook-based scalable music tagging with Poisson matrix factorization]]__
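One way to see the single-output vs. multi-output gap: a softmax head commits to exactly one class, while independent sigmoid outputs (a common multi-label head) let a track carry several tags at once. The features, tags, and weights below are made up for illustration.

```python
import math

# Multi-output classification head (sketch): one independent logistic
# output per tag, so a track can be e.g. both "rock" and "instrumental",
# which a single softmax output cannot express.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_tags(features, weights, tags, threshold=0.5):
    """Return all tags whose independent sigmoid score exceeds the threshold."""
    scores = {t: sigmoid(sum(w * f for w, f in zip(weights[t], features)))
              for t in tags}
    return sorted(t for t, s in scores.items() if s > threshold)

TAGS = ["rock", "jazz", "instrumental"]
WEIGHTS = {"rock": [2.0, -1.0], "jazz": [-2.0, 1.0], "instrumental": [1.0, 1.0]}
print(predict_tags([1.0, 0.5], WEIGHTS, TAGS))  # several tags can fire at once
```

Distributing the training of such a head across devices is the open part of the project; the head itself is standard.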
  
**Contact:** __[[http://people.epfl.ch/georgios.damaskinos|Georgios Damaskinos]]__