Differences

This shows you the differences between two versions of the page.

--- distributed_ml [2018/03/14 22:18] patra
+++ distributed_ml [2018/04/10 16:05] (current) damaskin
@@ Line 4: / Line 4: @@
 === Asynchronous ML on Android devices ===
-This project is related to training ML algorithms asynchronously on Android devices. The challenges here are primarily: mobile churn, latency, memory, bandwidth and accuracy. The main goal is building a framework to address these challenges.
+This project concerns training ML algorithms asynchronously on Android devices. The main challenges are mobile churn, latency, energy consumption, memory, bandwidth, and accuracy.
+It spans multiple semester projects that tackle subsets of these challenges from the algorithmic (SGD variants) and the system (framework for Android) perspectives.
 Related papers:\\
-[1] __[[http://ttic.uchicago.edu/~kgimpel/papers/gimpel+das+smith.conll10.pdf|Distributed Asynchronous Online Learning for Natural Language Processing]]__ \\
+[1] __[[http://net.pku.edu.cn/~cuibin/Papers/2017%20sigmod.pdf|Heterogeneity-aware Distributed Parameter Servers]]__ \\
-[2] __[[http://net.pku.edu.cn/~cuibin/Papers/2017%20sigmod.pdf|Heterogeneity-aware Distributed Parameter Servers]]__
+[2] __[[http://proceedings.mlr.press/v70/zhang17e.html|ZipML: Training Linear Models with End-to-End Low Precision, and a Little Bit of Deep Learning]]__
-=== Multi-output multi-class classification ===
-The goal of this project is to design a distributed ML algorithm suitable for multi-output classification (e.g. music tag prediction on mobile devices). Deep learning-based approaches seem promising for this task. Nevertheless, current methods target only single-output classification.
-Related papers:\\
-[1] __[[https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45530.pdf|Deep Neural Networks for YouTube Recommendations]]__ \\
-[2] __[[http://papers.nips.cc/paper/5004-deep-content-based-music-recommendation.pdf|Deep content-based music recommendation]]__ \\
-[3] __[[http://www.columbia.edu/~jwp2128/Papers/LiangPaisleyEllis2014.pdf|Codebook-based scalable music tagging with Poisson matrix factorization]]__
 ===Personalized/Private ML in P2P network===
@@ Line 27: / Line 19: @@
 [2] __[[https://www.cs.cornell.edu/~shmat/shmat_ccs15.pdf|Privacy-Preserving Deep Learning]]__\\
 [3] __[[http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45428.pdf|Deep Learning with Differential Privacy]]__
+===P2P data market===
+The goal is the design of a P2P infrastructure that enables service providers (peers) to buy and sell data.
+The main challenge for a candidate scheme is the definition and measurement of the data utility from the perspective of each peer.
+The revenue model and privacy guarantees are also two important challenges for this setting.
+Related papers:\\
+[1] __[[http://www.cs.utexas.edu/users/shmat/shmat_kdd08.pdf|The Cost of Privacy: Destruction of Data-Mining Utility in Anonymized Data Publishing]]__\\
+[2] __[[http://www.vldb.org/pvldb/vol9/p1695-upadhyaya.pdf|Price-Optimal Querying with Data APIs]]__\\
+[3] __[[http://pages.cs.wisc.edu/~paris/papers/data_pricing.pdf|Query-Based Data Pricing]]__\\
 ===Federated optimization: distributed SGD with fault tolerance===
@@ Line 35: / Line 37: @@
 [2] __[[https://arxiv.org/pdf/1401.2753.pdf|Stochastic Optimization with Importance Sampling]]__
+===Byzantine-tolerant machine learning===
+Each node in the distributed setting can exhibit arbitrary (Byzantine) behaviour during the learning procedure.
+This project explores algorithms (SGD variants) in both the synchronous and the asynchronous setup.
+The student will implement these algorithms in our code base, which is built on top of TensorFlow.
-===P2P data market===
+Related papers:\\
-The goal is the design of a P2P infrastructure that enables service providers (peers) to buy and sell data.
+[1] __[[https://arxiv.org/pdf/1802.07928.pdf|Asynchronous Byzantine Machine Learning]]__\\
-The main challenge for a candidate scheme is the definition and measurement of the data utility from the perspective of each peer.
+[2] __[[http://papers.nips.cc/paper/6617-machine-learning-with-adversaries-byzantine-tolerant-gradient-descent|Machine Learning with Adversaries: Byzantine Tolerant Gradient Descent]]__ \\
-The revenue model and privacy guarantees are also two important challenges for this setting.
+===Black-Box attacks against recommender systems===
+A recommender system can be viewed as a black box that users query with feedback (e.g., ratings, clicks) to obtain a list of recommendations.
+The goal is to infer properties of the recommendation algorithm by observing the output from different queries.
 Related papers:\\
-[1] __[[http://www.cs.utexas.edu/users/shmat/shmat_kdd08.pdf|The Cost of Privacy: Destruction of Data-Mining Utility in Anonymized Data Publishing]]__\\
+[1] __[[https://www.usenix.org/system/files/conference/usenixsecurity16/sec16_paper_tramer.pdf|Stealing Machine Learning Models via Prediction APIs]]__\\
-[2] __[[http://www.vldb.org/pvldb/vol9/p1695-upadhyaya.pdf|Price-Optimal Querying with Data APIs]]__\\
+[2] __[[https://arxiv.org/pdf/1602.02697v3.pdf|Practical Black-Box Attacks against Deep Learning Systems using Adversarial Examples]]__\\
-[3] __[[http://pages.cs.wisc.edu/~paris/papers/data_pricing.pdf|Query-Based Data Pricing]]__\\
+=== Multi-output multi-class classification ===
+The goal of this project is to design a distributed ML algorithm suitable for multi-output classification (e.g. music tag prediction on mobile devices). Deep learning-based approaches seem promising for this task. Nevertheless, current methods target only single-output classification.
+Related papers:\\
+[1] __[[https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45530.pdf|Deep Neural Networks for YouTube Recommendations]]__ \\
+[2] __[[http://papers.nips.cc/paper/5004-deep-content-based-music-recommendation.pdf|Deep content-based music recommendation]]__ \\
+[3] __[[http://www.columbia.edu/~jwp2128/Papers/LiangPaisleyEllis2014.pdf|Codebook-based scalable music tagging with Poisson matrix factorization]]__
 **Contact:** __[[http://people.epfl.ch/georgios.damaskinos|Georgios Damaskinos]]__