Project Details
Projekt Print View

Private Secure and Efficient Codes forDistributed Machine Learning

Subject Area Security and Dependability, Operating-, Communication- and Distributed Systems
Communication Technology and Networks, High-Frequency Technology and Photonic Systems, Signal Processing and Machine Learning for Information Technology
Mathematics
Term since 2023
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 508621276
 
In the era of Big Data, massive amount of information needs to be analyzed on a daily basis. Machine learning algorithms treating this data require intensive, time-consuming, calculations that may not even fit on a single computing node. Distributed computing emerged as the solution to enabling such massive computations and reducing its run-time. Distributed computing comes with its challenges, that need to be understood and overcome. The computational task is split into smaller subtasks, distributed to worker nodes. Slow or unresponsive workers, stragglers, are shown to be the Achilles' heel of distributed computing. Waiting for stragglers incurs significant delays and outweighs the benefit of distributing the computation. The second challenge is that of maintaining the privacy of the processed data. When dealing with medical records, face recognition problems, private data generated on social media and so on, privacy of the processed data is of utmost importance. The third challenge of distributed computing is that of maintaining the security of the computation against malicious worker nodes. It is enough that one worker returns erroneous computation without being caught to corrupt the whole process. Coded computing schemes based on error-correcting codes are proposed as a main method for mitigating stragglers, ensuring privacy and security. A scheme is deemed efficient if it can leverage the properties of the desired computation to either assign smaller computational tasks to the workers, or adapt to the behaviour of the workers and assign tasks that are proportional in size to the speed of the workers. In this project, we develop new efficient coding techniques for private and secure coded computing. For linear computations, i.e., matrix-matrix multiplication, we construct efficient schemes that maintain the privacy and security constraints. We consider applications such as face recognition in which the input matrices are sparse (have a significant number of zero entries). We build schemes that can leverage the sparsity of the input matrices even under the desired privacy constraints. For general gradient descent-based machine learning algorithms, we explore the fact that computing a good estimate of the gradient is sufficient. In this setting, we first explore the interplay between the number of tasks assigned to each worker and the number of the tolerated stragglers. We then explore the idea of learning the computational power of the workers while assigning computational tasks only to a subset of the workers, deemed to be the fastest.
DFG Programme Research Grants
 
 

Additional Information

Textvergrößerung und Kontrastanpassung