Federated Learning (FL) enables a multitude of devices, such as mobile phones and sensors, to collaboratively train a global machine learning model while keeping their data locally. A prominent example of FL in action is Google's Gboard, which uses an FL-trained model to predict a user's next inputs on their smartphone.
Two primary challenges arise during the training phase of FL:
Data Privacy: How can we ensure that user data remains confidential? Even though the data is kept locally on the devices, it has been shown that an honest-but-curious server can still reconstruct data samples, sensitive attributes, and the local model of a targeted device. Moreover, the server can perform membership inference attacks to identify whether a data sample was used in training, or source inference attacks to determine which device stores a given data sample.
Security Against Malicious Participants: How can we ensure that the learning process is not derailed by harmful actors? Recent research has demonstrated that, in the absence of protective measures, a malicious agent can degrade model performance by simply flipping the labels and/or the sign of the gradients, and can even inject backdoors into the model.
Differentially private algorithms have been proposed to tackle the challenge of protecting user privacy. These algorithms rely on FL clients clipping their gradients and adding noise to them before updating the model, ensuring that small alterations in a user's training dataset are not discernible to potential adversaries. By leveraging differentially private mechanisms, it has been shown that adversaries cannot deduce the exact local information of vehicles in ride-hailing applications such as Uber, and that the quality of data reconstruction attacks is significantly reduced when training a convolutional neural network on the CIFAR-10 dataset.
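To make the clip-and-noise step above concrete, here is a minimal sketch of a DP-SGD-style client update in PyTorch; the function name, clipping bound, and noise multiplier are illustrative placeholders, not values prescribed by this project.

    import torch

    def dp_client_update(per_sample_grads, clip_norm=1.0, noise_multiplier=1.0):
        # per_sample_grads: tensor of shape (batch_size, num_params).
        # Clip each per-sample gradient so its L2 norm is at most clip_norm.
        norms = per_sample_grads.norm(dim=1, keepdim=True)
        clipped = per_sample_grads * (clip_norm / (norms + 1e-12)).clamp(max=1.0)
        # Average over the batch and add Gaussian noise calibrated to the clipping bound.
        batch_size, num_params = per_sample_grads.shape
        noise_std = noise_multiplier * clip_norm / batch_size
        return clipped.mean(dim=0) + noise_std * torch.randn(num_params)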
To enhance system security against adversarial threats, Byzantine-resilient mechanisms are implemented on the server side. These algorithms are designed to identify and mitigate potentially detrimental actions or inputs from users, ensuring that even if some participants act maliciously or erratically, the overall system remains functional and secure. Experiments reveal that integrating these Byzantine-resilient mechanisms sustains neural network accuracy at 90.7% on the MNIST dataset even when 10% of the agents maliciously flip their labels, whereas without such protection the accuracy drops significantly.
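As an illustration of such a server-side rule, the following sketch implements a coordinate-wise trimmed mean, one common Byzantine-resilient aggregator; the choice of rule and the trimming fraction are assumptions made for illustration, not the specific defence evaluated in the experiment above.

    import torch

    def trimmed_mean_aggregate(client_updates, trim_ratio=0.1):
        # client_updates: tensor of shape (num_clients, num_params).
        # In each coordinate, drop the smallest and largest trim_ratio fraction
        # of values (where Byzantine outliers are likely to sit), then average the rest.
        num_clients = client_updates.shape[0]
        k = int(trim_ratio * num_clients)
        sorted_updates, _ = torch.sort(client_updates, dim=0)
        kept = sorted_updates[k:num_clients - k] if k > 0 else sorted_updates
        return kept.mean(dim=0)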
Integrating differential privacy with Byzantine resilience presents a notable challenge. Recent research suggests that when these two security measures are combined in their current forms, the effectiveness of the resulting algorithm depends unfavourably on the number of parameters of the machine learning model. In particular, it requires either the batch size to grow proportionally to the square root of the number of parameters, or the fraction of malicious agents in the system to shrink in inverse proportion to the square root of the number of parameters. For a realistic model such as ResNet-50, the batch size would have to exceed 5000, which is clearly impractical. Novel Byzantine-resilient algorithms have recently been proposed to tackle this problem, but they incur a significant computational cost at each communication round. Hence, there is a pressing need for innovative methods that integrate differential privacy and Byzantine resilience with low computational complexity, so that practical neural networks can be trained.
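A quick back-of-the-envelope check of the batch-size figure above, assuming ResNet-50 has roughly 25.6 million parameters (an approximate count used only for illustration):

    # sqrt(d) for d of roughly 25.6 million parameters
    num_params = 25_600_000
    print(round(num_params ** 0.5))  # about 5060, i.e. a batch size "larger than 5000"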
Objective
In this project, we aim to propose novel FL algorithms that effectively tackle these two intertwined challenges. In particular, we want to explore the potential of compression in FL training, as these techniques can drastically reduce the model dimension and may thus provide a path towards a computation-efficient, private, and secure FL system.
Compression techniques were initially introduced to alleviate communication costs in distributed training, where only a fraction of the model parameters is sent from each device to the server in every communication round. The primary objective of compression design is a communication-efficient machine learning/FL system: parameter-selection rules on the device side are chosen to optimize the performance of the trained model under a given communication budget. Recent work has combined Byzantine-resilient methods with compression to obtain communication-efficient and secure FL systems. However, in these studies, even though devices transmit compressed models to the server, the Byzantine-resilient methods still operate on the full model; consequently, these solutions still incur a high computational load.
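A typical example of such a parameter-selection rule is top-k sparsification, where each device transmits only the k largest-magnitude coordinates of its update. The sketch below is only illustrative; the function name and interface are assumptions, not part of the project's design.

    import torch

    def top_k_compress(update, k):
        # Keep only the k largest-magnitude coordinates of a flattened update.
        # The device transmits the selected values together with their indices.
        _, indices = torch.topk(update.abs(), k)
        return update[indices], indices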
In this project, our goal is different: we target the compression strategy best suited to a computation-efficient, private, and secure FL system. More precisely, the goal of this project is to study a compression strategy that provides the best trade-off among privacy, robustness against adversarial threats, computational complexity, and model performance. This is still an open question, and no prior research has delved into this specific direction.
There are two main challenges in this project. First, combining existing compression techniques with Byzantine-resilient algorithms is not straightforward. Traditional compression techniques may result in heterogeneous selections of model parameters across clients. Consequently, the aggregator would need to work on the union of all clients' selected parameters, which may still form a high-dimensional vector and thus lead to high computational costs. Hence, a careful design is required to synchronize the selection of model parameters among devices, while still providing analytical guarantees on model performance. Second, the impact of compression on the trade-off among privacy, robustness, and model utility is not apparent. Although some experimental studies show that compression can render models less vulnerable to attacks and can even boost differential privacy, it remains unclear whether these effects persist when differential privacy and Byzantine resilience are combined. The project calls for theoretical and experimental results showing that the proposed methods achieve low computational cost through compression while maintaining reasonable guarantees on privacy, robustness, and model utility.
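One simple way to synchronize the parameter selection discussed above is to let all clients draw the same random mask from a per-round seed broadcast by the server, so that the aggregator only ever operates on a k-dimensional vector. The sketch below is a hedged illustration of this idea under that assumption, not a method prescribed by the project.

    import torch

    def shared_random_mask(num_params, k, round_seed):
        # Every client calls this with the same per-round seed, so all clients
        # select the same k coordinates and the server aggregates k-dimensional vectors.
        gen = torch.Generator().manual_seed(round_seed)
        return torch.randperm(num_params, generator=gen)[:k]

    def compress_with_mask(update, mask):
        # Each client transmits only the coordinates selected by the shared mask.
        return update[mask]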
This project will be co-supervised with Giovanni Neglia (Inria, France) and Gupta Nirupam (EPFL, Switzerland).
Main activities: Research
The candidate should have a solid mathematical background, good programming skills, and previous experience with PyTorch or TensorFlow. He/she should also be knowledgeable about machine learning, especially federated learning, and have good analytical skills. We expect the candidate to be fluent in English.
Benefits
Gross Salary: 2788 € per month