At inference time, the model is applied to a new data instance. For example, in a B2B setting, a federated medical-imaging system may receive a new patient whose diagnosis comes from different hospitals. In this case, the parties collaborate in making a prediction. Finally, there should be a fair value-distribution mechanism to share the profit gained by the collaborative model. Mechanism design should be done in such a way as to make the federation sustainable.
In broad terms, federated learning is an algorithmic framework for building ML models that can be characterized by the following features, where a model is a function mapping a data instance at some party to an outcome.
• There are two or more parties interested in jointly building an ML model. Each party holds some data that it wishes to contribute to training the model.
• In the model-training process, the data held by each party does not leave that party.
• The model can be transferred in part from one party to another under an encryption scheme, such that other parties cannot re-engineer the data at any given party.
• The performance of the resulting model is a good approximation of an ideal model built with all data transferred to a single party.
More formally, consider N data owners {F_1, …, F_N} who wish to train an ML model using their respective datasets {D_1, …, D_N}. A conventional approach is to collect all the data at a single site and train a model M_SUM on the pooled dataset D = D_1 ∪ … ∪ D_N. Federated learning is an ML process in which the data owners collaboratively train a model M_FED without collecting all the data, so that no data owner F_i exposes its data D_i to the others. Denote the performance (e.g., accuracy) of M_SUM and M_FED on the same test data by V_SUM and V_FED, respectively.
We can capture what we mean by performance guarantee more precisely. Let δ be a non-negative real number. We say that the federated learning model M_FED has δ-performance loss if

|V_SUM − V_FED| < δ.
The previous equation expresses the following intuition: if we use secure federated learning to build an ML model on distributed data sources, this model's performance on future data is approximately the same as that of the model built by joining all the data sources together.
We allow the federated learning system to perform slightly worse than a jointly trained model because, in federated learning, data owners do not expose their data to a central server or to any other data owner. This additional security and privacy guarantee can be worth far more than the loss in accuracy, which is bounded by the δ value.
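To make the δ-performance-loss condition concrete, the following sketch (with hypothetical accuracy values, not taken from the book) checks the condition once both models have been evaluated on the same held-out test set:

# Hypothetical evaluation results on a shared held-out test set.
v_sum = 0.932  # accuracy of M_SUM, trained on the pooled data
v_fed = 0.925  # accuracy of the federated model M_FED
delta = 0.01   # tolerated performance loss

# M_FED has delta-performance loss if |V_SUM - V_FED| < delta.
print(abs(v_sum - v_fed) < delta)  # True: within delta of the joint model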
A federated learning system may or may not involve a central coordinating computer, depending on the application. An example involving a coordinator in a federated learning architecture is shown in Figure 1.1. In this setting, the coordinator is a central aggregation server (a.k.a. the parameter server), which sends an initial model to the local data owners A–C (a.k.a. clients or participants). The local data owners A–C each train a model using their respective datasets and send the model weight updates to the aggregation server. The aggregation server then combines the model updates received from the data owners (e.g., using federated averaging [McMahan et al., 2016a]) and sends the combined model updates back to the local data owners. This procedure is repeated until the model converges or until the maximum number of iterations is reached. Under this architecture, the raw data of a local data owner never leaves that owner. This approach not only ensures user privacy and data security, but also saves the communication overhead of sending raw data. The communication between the central aggregation server and the local data owners can be encrypted (e.g., using homomorphic encryption [Yang et al., 2019, Liu et al., 2019]) to guard against information leakage.
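The client-server procedure above can be sketched in a few lines of Python. This is only a schematic of federated averaging under simplifying assumptions, not the implementation of McMahan et al. [2016a]: the model is reduced to a NumPy weight vector, local_update stands in for a client's local training pass, and encryption of the exchanged updates is omitted.

import numpy as np

def local_update(weights, dataset, lr=0.1):
    # Stand-in for a client's local training pass: one gradient step
    # of linear least squares on the client's own (X, y).
    X, y = dataset
    grad = X.T @ (X @ weights - y) / len(y)
    return weights - lr * grad

def federated_averaging(global_weights, client_datasets, rounds=100):
    for _ in range(rounds):
        updates, sizes = [], []
        for dataset in client_datasets:
            # Each client trains locally; its raw data never leaves the client.
            updates.append(local_update(global_weights.copy(), dataset))
            sizes.append(len(dataset[1]))
        # Server step: weight-average the client models by local sample count.
        total = sum(sizes)
        global_weights = sum((n / total) * w for n, w in zip(sizes, updates))
    return global_weights

Only the weight vectors cross the network; in a deployed system these messages would additionally be encrypted, as described above.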
The federated learning architecture can also be designed in a peer-to-peer manner, which does not require a coordinator: the parties communicate directly with one another, without the help of a third party, as illustrated in Figure 1.2. The advantage of this architecture is a further security guarantee, but a drawback is potentially more computation to encrypt and decrypt messages.
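As an illustration of the coordinator-free variant (an assumed gossip-style protocol, not one prescribed by the book), each party can average its model directly with the models received from its neighbors; encryption of the exchanged weights is again omitted:

import numpy as np

def gossip_round(models, neighbors):
    # One coordinator-free round: each party averages its own weights with
    # the models received directly from its neighbors (no central server).
    new_models = []
    for i, w in enumerate(models):
        received = [models[j] for j in neighbors[i]]
        new_models.append(np.mean([w] + received, axis=0))
    return new_models

# Hypothetical fully connected 3-party topology.
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
models = [np.array([0.0, 1.0]), np.array([2.0, 3.0]), np.array([4.0, 5.0])]
models = gossip_round(models, neighbors)  # all parties move toward the average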
Federated learning brings several benefits. It preserves user privacy and data security by design, since no raw data transfer is required. It also enables several parties to collaboratively train an ML model so that each of them can enjoy a better model than it could achieve alone. For example, federated learning can be used by private commercial banks to detect multi-party borrowing, which has long been a headache in the banking industry, especially in Internet finance [WeBank AI, 2019]. With federated learning, there is no need to establish a central database: any financial institution participating in the federation can initiate a query about a new user to the other agencies, and those agencies only need to answer questions about local lending without revealing specific information about the user. This not only protects user privacy and data integrity, but also achieves an important business objective of identifying multi-party lending.
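The multi-party borrowing example can be pictured as a federated query. The sketch below is hypothetical: the initiating bank broadcasts only a hash of the user identifier, each peer answers a local yes/no about its own ledger, and a production system would replace the plain hashing with proper cryptographic protocols.

import hashlib

def user_token(user_id):
    # Parties exchange only a hash of the user ID, never the raw identity.
    return hashlib.sha256(user_id.encode()).hexdigest()

def has_local_loan(local_loan_tokens, token):
    # Each bank answers only a yes/no question about its own ledger.
    return token in local_loan_tokens

def detect_multi_party_borrowing(banks, user_id, threshold=2):
    token = user_token(user_id)
    hits = sum(has_local_loan(bank, token) for bank in banks)
    return hits >= threshold  # user borrows from several banks at once

# Hypothetical federation of three banks, each holding only its own tokens.
banks = [{user_token("alice")}, {user_token("alice")}, {user_token("bob")}]
print(detect_multi_party_borrowing(banks, "alice"))  # True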
While federated learning has great potential, it also faces several challenges. The communication links between the local data owners and the aggregation server may be slow and unstable [Hartmann, 2018]. There may be a very large number of local data owners (e.g., mobile users); in theory, every mobile user can participate in federated learning, making the system unstable and unpredictable. Data from different participants may follow non-identical distributions [Zhao et al., 2018, Sattler et al., 2019, van Lier, 2018], and different participants may hold unbalanced numbers of data samples, which may result in a biased model or even failure to train a model. Since the participants are distributed and difficult to authenticate, federated learning is vulnerable to model poisoning attacks [Bhagoji et al., 2019, Han, 2019], in which one or more malicious participants send ruinous model updates that render the federated model useless and confound the whole operation.
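To make the non-IID and unbalanced-data challenge concrete, the following hypothetical snippet simulates label-skewed, unbalanced client partitions of a toy dataset (all sizes and class counts are assumptions for illustration):

import numpy as np

rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=10_000)  # toy 10-class label vector

def skewed_partition(labels, n_clients=5, classes_per_client=2):
    # Label-skewed, unbalanced split: each client sees mostly a few classes
    # and a different number of samples, i.e., non-IID federated data.
    parts = []
    for _ in range(n_clients):
        own = rng.choice(10, size=classes_per_client, replace=False)
        idx = np.flatnonzero(np.isin(labels, own))
        take = int(rng.integers(200, len(idx)))  # unbalanced sample counts
        parts.append(rng.choice(idx, size=take, replace=False))
    return parts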
Figure 1.1: An example federated learning architecture: client-server model.
Figure 1.2: An example federated learning architecture: peer-to-peer model.
1.2.2 CATEGORIES OF FEDERATED LEARNING
Let matrix D_i denote the data held by the i-th data owner. Suppose that each row of D_i represents a data sample and each column represents a specific feature. Some datasets may also contain label data. We denote the feature space as X, the label space as Y, and we use I to denote the sample ID space. For example, in the financial field, labels may be users' credit; in the marketing field, labels may be users' purchasing desire; in the education field, Y may be the students' degrees.
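To make the notation concrete, a single party's dataset D_i can be held as a triple of sample IDs, features, and labels; the field names and values below are hypothetical:

import numpy as np

# One party's dataset D_i: rows are samples, columns are features.
I = np.array(["u001", "u002", "u003"])   # sample ID space
X = np.array([[35, 52_000.0],            # features, e.g., age and income
              [29, 48_500.0],
              [41, 61_200.0]])
Y = np.array([1, 0, 1])                  # labels, e.g., creditworthy or not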