
MatSwarm: trusted swarm transfer learning driven materials computation for secure big data sharing

ABSTRACT The rapid advancement of Industry 4.0 necessitates close collaboration among material research institutions to accelerate the development of novel materials. However,
multi-institutional cooperation faces significant challenges in protecting sensitive data, leading to data silos. Additionally, the heterogeneous and non-independent and identically
distributed (non-i.i.d.) nature of material data hinders model accuracy and generalization in collaborative computing. In this paper, we introduce the MatSwarm framework, built on swarm
learning, which integrates federated learning with blockchain technology. MatSwarm features two key innovations: a swarm transfer learning method with a regularization term to enhance the
alignment of local model parameters, and the use of Trusted Execution Environments (TEE) with Intel SGX for heightened security. These advancements significantly enhance accuracy,
generalization, and ensure data confidentiality throughout the model training and aggregation processes. Implemented within the National Material Data Management and Services (NMDMS)
platform, MatSwarm has successfully aggregated over 14 million material data entries from more than thirty research institutions across China. The framework has demonstrated superior
accuracy and generalization compared to models trained independently by individual institutions. INTRODUCTION The integration of the Industrial Internet of Things (IIoT) and machine learning is revolutionizing research and development in
materials science1,2. Under Industry 4.0, advanced sensors and data acquisition technologies enable real-time
monitoring of material parameters such as temperature, hardness, melting point, and boiling point, providing unprecedented data support3. Concurrently, machine learning algorithms analyze
this vast data, allowing researchers to predict material properties, optimize designs, and discover new materials based on performance, structural properties, and preparatory conditions4,5.
However, creating accurate predictive models requires large, diverse training datasets. Today, various materials and big data platforms6,7,8 have been developed, providing researchers with
aggregated data. Nonetheless, for sensitive datasets that cannot be publicly shared, material data mining and analysis remain limited due to small sample sizes9,10. This poses a challenge
for training effective models. While data augmentation11,12 offers a potential solution, relying on simulated data can compromise model accuracy and generalization13,14. Additionally, even
with sufficient samples, standardized testing environments and methodologies can limit data diversity, further hindering model generalization for new materials. Transfer learning15 is often
used as a solution, but it involves sharing complete models with third parties, which raises concerns about data security and potential leakage. To accelerate the development of new
materials, a secure and collaborative computing methodology is essential. This approach must ensure data protection while allowing collaborative modeling across different organizations to
improve model accuracy and generalization. Federated learning (FL)16 offers a viable solution by enabling organizations to collaborate without revealing their original data, sharing only
insights from local models. This protects sensitive data while allowing effective aggregation17. However, the traditional FL framework, which relies on a central server to aggregate local
model parameters18, raises concerns about the integrity and authenticity of the global model19. This centralization also makes the server susceptible to internal and external attacks20,21.
Moreover, most existing FL solutions have primarily been validated theoretically, using publicly available datasets and focusing on classification problems22,23. This theoretical focus fails
to address the practical challenges faced by non-i.i.d. datasets owned by different organizations, where issues of model accuracy and generalization are more pronounced. The lack of
empirical validation in real-world applications further questions the practicality and feasibility of these solutions. To truly harness the potential of FL in materials science, it is
crucial to develop methodologies that not only perform well in controlled, theoretical settings but also demonstrate robustness and effectiveness in diverse, real-world environments. This
will ensure the models are reliable, secure, and capable of advancing material discovery and development. To accelerate materials science research and development, building on the Materials
Genome Engineering (MGE) project24, we developed the NMDMS platform25,26 to facilitate the collection, storage, retrieval, and computation of material data. As the cornerstone of MGE’s data
applications, the NMDMS platform provides data consumers with access to an extensive repository of material data contributed by over thirty research institutions across China. This platform also
serves as a data exchange and sharing hub for materials researchers. Although the NMDMS platform provides basic collaborative computing services, it lacks solutions for handling the
inherent limitations of FL in the context of material science. For example, while it achieves relatively high prediction accuracy for i.i.d. (independent and identically distributed)
training sets, it falls short in generalization capability for non-i.i.d. training sets and cannot ensure the confidentiality and integrity of parameters during the training process. Here,
we introduce MatSwarm as part of the NMDMS platform to address the limitations in materials science collaboration, particularly in the context of Industry 4.0, where efficient cooperation
among research institutions is crucial for accelerating novel material development. MatSwarm tackles the challenges posed by non-i.i.d. data and ensures the confidentiality and integrity of
sensitive material information through a decentralized collaborative computing framework. To the best of our knowledge, this application of the MatSwarm framework is unprecedented in the
materials field. Validated with real datasets from NMDMS, MatSwarm significantly enhances model training accuracy and generalization under heterogeneous data conditions. Additionally, by
integrating trusted execution environments (TEE) based on Intel SGX, the framework ensures secure and accurate model aggregations. Ultimately, MatSwarm not only addresses the collaborative
computing challenges but also unlocks the full potential of material data, driving innovation and meeting the demands of high-throughput computing and experimentation, thus accelerating
material discovery. A general introduction to MatSwarm is available in Supplementary Movie 1. RESULTS To date, the MatSwarm platform for MGE has collected over
14 million pieces of valid material data27. The platform predominantly encompasses data on special alloys, material thermodynamics/kinetics, composite functional materials, catalytic
materials, first-principles calculations, and biomedical materials. Data consumers from various fields can submit sharing tasks via the framework according to their specific needs, enabling
collaborative prediction of material properties and the development of new materials with other stakeholders. In our experiments, we utilize the prediction of perovskite formation energies
as an illustrative example to evaluate the performance of the MatSwarm framework. The following research questions (RQs) are addressed: * RQ 1: How does MatSwarm address security issues
during the collaborative computing process in the materials science domain? * RQ 2: What advantages does MatSwarm offer compared to other existing methodologies? * RQ 3: How do
different factors affect the performance of MatSwarm, such as data distribution (non-i.i.d. vs i.i.d.), different local models and aggregation methods, and TEE? * RQ 4: How scalable is
MatSwarm in terms of its performance, including the size of the dataset, the number of features, and the number of participants? EXPERIMENTAL SETUP In this experiment, all services and
participants’ applications were deployed on cloud servers. A 16-core Intel Xeon (Ice Lake) Platinum 8369B processor with 32 GB RAM (16 GB as trusted RAM) was used to enable Intel® Software
Guard Extensions, allowing organizations to employ enclaves for protecting the confidentiality and integrity of their code and data. The MatSwarm framework was implemented on a consortium
blockchain based on Hyperledger Fabric, with each node initiated as a Docker container and connected to the blockchain network using Docker Swarm28. Local models and aggregation methods are
available for participants to choose from on the MatSwarm platform. The batch size was set to 128, the number of iterations was 200, and the learning rate was 0.002. In the training
objective, _γ_ and _λ_ were set to 0.5 and 1, respectively. Dataset and Model Selection. In our experiments, we illustrate our approach using the prediction of perovskite formation energies
as a case study. We utilized perovskite data from our NMDMS platform to evaluate the performance of the MatSwarm framework, selecting 4016 perovskite samples. The training set consists of
3694 samples, evenly distributed among organizations. The test set comprises 322 samples. Detailed feature engineering on the dataset is described in Supplementary Note 4. Unless specified
otherwise, the number of participants in the experiment is set to three. This experiment aimed to test the performance of MatSwarm for non-i.i.d. material data. To this end, we divided the
training dataset into non-independent and identically distributed (non-i.i.d.) and independent and identically distributed (i.i.d.) datasets for comparative testing. For the non-i.i.d.
dataset, since the label values are normally distributed, we divided the training set into three datasets with different means and variances. The distribution of label values in these
datasets is illustrated in Supplementary Fig. 10. Regarding model selection, unless otherwise specified, the local training models utilize a Multilayer Perceptron (MLP) neural network for
training, with a hidden size of 32 and three network layers. On the MatSwarm framework, the task issuer can select different local training models and aggregation methods based on the
sharing task. For joint training, all organizations’ data was combined, and model training was also conducted using the MLP neural network; the data partition and local model configuration are sketched below.
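As a concrete illustration, the following is a minimal sketch of the local model and the label-based non-i.i.d. partition, assuming a PyTorch implementation; the names (`LocalMLP`, `non_iid_split`) and the feature count are illustrative, not taken from the MatSwarm codebase.

```python
# Minimal sketch of the experimental setup (PyTorch assumed).
import torch
import torch.nn as nn

class LocalMLP(nn.Module):
    """Three-layer MLP with hidden size 32 for formation-energy regression."""
    def __init__(self, n_features):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 32), nn.ReLU(),
            nn.Linear(32, 32), nn.ReLU(),
            nn.Linear(32, 1),  # single regression output
        )

    def forward(self, x):
        return self.net(x)

def non_iid_split(X, y, n_orgs=3):
    """Partition samples by label value so each organization receives a
    slice with a different mean and variance of the (normally
    distributed) formation-energy labels."""
    order = torch.argsort(y)             # sort by label
    shards = torch.chunk(order, n_orgs)  # contiguous label ranges
    return [(X[idx], y[idx]) for idx in shards]

# Reported hyperparameters: batch size 128, 200 iterations, lr 0.002.
model = LocalMLP(n_features=20)          # feature count is illustrative
optimizer = torch.optim.SGD(model.parameters(), lr=0.002)
```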
Evaluated Attacks. In this scenario, four nodes participate in the collaborative task, with one acting as a Byzantine node launching the attack. Since all attack methods target gradients, we modify the model updates in this experiment
to gradients instead of the model parameters. The aggregation methods include the five provided by the MatSwarm framework. We evaluate the impact of different attack methods on the accuracy
of these aggregation methods both inside and outside the TEE. Given the susceptibility of existing swarm learning frameworks to data poisoning attacks29, our experiment aims to demonstrate
the robustness of MatSwarm against such attacks. We consider the following popular poisoning attacks (sketched in code after the list): * _Noise Attack_. The Byzantine nodes send noise-perturbed gradients generated by adding Gaussian noise to the honest gradients30. We set the Gaussian distribution parameters to \(\mathcal{N}(0.1, 0.1)\). * _Label-Flipping_. The Byzantine nodes flip the local sample labels during the training process to generate faulty gradients31. Specifically, a label _l_ is flipped to −_l_, where _l_ is the formation energy of perovskite in our experiment. * _Sign-Flipping_. During each round of learning, participants calculate the gradients \(\nabla f_{\boldsymbol{\theta}}\) of the local model, which are then uploaded to a central server for aggregation32. After calculating the local gradients, the Byzantine nodes flip their signs and send \(-\nabla f_{\boldsymbol{\theta}}\). * _A Little is Enough_. The Byzantine nodes send malicious gradient vectors with carefully crafted elements33. For each node _i_ ∈ [_d_], where _d_ is the number of Byzantine nodes, the Byzantine nodes calculate the mean \(\mu_i\) and standard deviation \(\sigma_i\) over benign updates, and set corrupted updates \(\Delta_i\) to values in the range \((\mu_i - z_{\max}\sigma_i,\, \mu_i + z_{\max}\sigma_i)\), where \(z_{\max}\) ranges from 0 to 1. We set \(z_{\max} = 0.3\) in our experiment. * _Inner Product Manipulation_ (IPM). The primary goal of IPM is to disrupt model performance by manipulating the inner product of gradients to affect the direction and speed of model training34. For example, an attacker could enhance or diminish the effects of gradients in a particular dimension. We set the scaling factor _α_ = 2, the gradient mean to \(\overline{\nabla f_{\boldsymbol{\theta}}}\), and the gradient sent by the attacker to \(-\alpha\,\overline{\nabla f_{\boldsymbol{\theta}}}\).
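The following NumPy sketch illustrates these five attacks as they apply to gradient vectors; the function names and shapes are ours, not part of the MatSwarm codebase.

```python
# Hedged sketches of the evaluated gradient-poisoning attacks (NumPy).
import numpy as np

def noise_attack(grad, mu=0.1, sigma=0.1):
    # Perturb the honest gradient with Gaussian noise N(0.1, 0.1).
    return grad + np.random.normal(mu, sigma, size=grad.shape)

def label_flip(y):
    # Flip formation-energy labels l -> -l before local training,
    # so the resulting gradients are computed on faulty targets.
    return -y

def sign_flip(grad):
    # Send the negated gradient instead of the honest one.
    return -grad

def little_is_enough(benign_grads, z_max=0.3):
    # Stay within z_max standard deviations of the benign mean so the
    # corrupted update evades distance-based robust aggregators.
    mu = benign_grads.mean(axis=0)
    sigma = benign_grads.std(axis=0)
    return mu + z_max * sigma

def ipm(benign_grads, alpha=2.0):
    # Inner Product Manipulation: send -alpha times the benign mean so
    # the aggregate correlates negatively with the true descent direction.
    return -alpha * benign_grads.mean(axis=0)
```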
SECURITY ANALYSIS (RQ1) Confidentiality protection for local datasets: this framework enables collaborative computing among multiple organizations while maintaining the confidentiality of local datasets. Traditional centralized machine learning methods require storing all datasets on a central server, posing risks of sensitive data
leakage. Through MatSwarm, each organization trains models on its local dataset without sharing the original data. Instead, organizations only share encrypted model parameters, not raw data.
This approach prevents the disclosure of sensitive information without disrupting the task processes of the participating organizations. Secure model training and aggregation based on TEE:
ensuring the security of model training processes during swarm learning is a significant challenge. To address this issue, MatSwarm employs a TEE established by Intel SGX. In this framework,
the original dataset is encrypted before being sent to the blockchain, using a shared key established through the Diffie-Hellman key exchange protocol. This ensures that data cannot be
stolen or tampered with during transmission. During model training and aggregation, the SGX Enclave performs these operations in a trusted execution environment, preventing attackers from
accessing or altering model parameters. Blockchain-based secure storage: this framework uses blockchain technology to replace untrusted third parties, significantly reducing the risk of data
leakage. Smart contracts are employed to standardize and automate model training and aggregation processes. Transactions are stored on the blockchain in hash form, and due to the uniqueness
of hash values, any tampering with transaction data will result in a change in the hash value. During the consensus process, nodes reject transactions with inconsistent hash values,
ensuring the integrity of global model storage and preventing network attacks. Additionally, digital signatures and hashes protect model updates, further enhancing the security of model
training and preventing tampering or contamination. Impact of attacks on different aggregation methods inside/outside TEE: Fig. 1a–e presents test results in a non-TEE
environment. The results indicate that different aggregation methods, by design, can resist various data poisoning attacks. However, no single aggregation method can resist all types of data
poisoning attacks. The convergence speed and final model accuracy are affected to varying degrees depending on the specific attack and aggregation method used. To verify the TEE’s
resistance to data poisoning attacks, we tested the aggregation methods that were ineffective in the non-TEE setting within the TEE. Figure 1f–j shows that the TEE effectively resists all types of
data poisoning attacks. The convergence speed and model accuracy remain virtually unaffected, closely matching the performance observed in the absence of attacks. This demonstrates that
MatSwarm can effectively mitigate the risk of data poisoning attacks. METHODOLOGIES COMPARISON (RQ2) Within the MatSwarm framework, we conducted comparative experiments on prediction
accuracy and response time between MatSwarm, local independent training (referred to as “Solo”), joint data training (referred to as “Joint”), and other existing solutions, including
FedAvg35, FedProx36, Homomorphic Encryption Federated Transfer Learning (referred to as “HE-FTM”)37, and a similar framework proposed by Kalapaaking et al. (referred to as “Trust-FL”)38, to
illustrate the performance advantages of this framework. The performance comparison between MatSwarm and other methodologies is presented in Fig. 2. The results of model accuracy, evaluated
using mean squared error (MSE), are shown in Fig. 2. MatSwarm significantly improves prediction accuracy compared to Solo while maintaining the privacy of local datasets across various
organizations. Among the methodologies compared, MatSwarm achieves prediction accuracy closest to that of Joint, which can be considered the upper bound of accuracy for collaborative
computation. In contrast, HE-FTM involves polynomial approximation for evaluating nonlinear functions, resulting in some accuracy loss during training. Trust-FL, employing a horizontal FL
model, is better suited to i.i.d. training data and is less effective at predicting non-i.i.d. material data. Overall, our model is more suitable for the materials science domain, demonstrating superior prediction accuracy. Regarding response time, as shown in Fig. 2, Solo exhibits the shortest response time for each organization. Due to the
communication required between organizations, MatSwarm takes longer to execute compared to Solo and Joint. The average response time of MatSwarm increases by approximately 4 seconds compared
to Joint. Despite this increase, the security and privacy protection offered by MatSwarm are highly valuable. Moreover, in practical applications, organizations typically do not require
real-time model training, and the response time difference remains within an acceptable range. Notably, compared to HE-FTM, our model demonstrates lower computational complexity and
significantly improved response time. Compared to Trust-FL, our framework shows a slight increase in response time, primarily due to the enhanced security measures. Model training in our
framework occurs in a trusted execution environment, adding some communication overhead. Additionally, the blockchain consensus algorithm, inspired by PBFT, effectively addresses security
concerns arising from Byzantine nodes. Although the consensus algorithm slightly impacts response time by increasing communication frequency, the trade-off is justified by the improved
security performance. ABLATION EXPERIMENT (RQ3) To understand how different factors affect the performance of MatSwarm, we conducted ablation experiments varying data label distribution,
local model architectures, and the use of a TEE. Unless specified otherwise, all local models use the same MLP architecture, and the aggregation algorithm is Mean. 1) non-i.i.d. vs i.i.d.: To
demonstrate the performance of the MatSwarm on non-i.i.d. material data, we tested both non-i.i.d. and i.i.d. datasets. * _i.i.d. Training Sets_: Fig. 3a depicts the prediction results for
perovskite formation energy using i.i.d. training sets selected independently for each organization. Our algorithm exhibits extremely high prediction accuracy for the i.i.d. dataset, nearing
the accuracy of Joint. * _Non-i.i.d. Training Sets_: As shown in Fig. 3b, the prediction accuracy for the non-i.i.d. dataset is slightly lower compared to the i.i.d. dataset but still close
to the accuracy of Joint. Compared to Solo, the accuracy for Org1 decreases due to the different label distributions between its data and the test set. A similar trend is observed for Org3.
However, as displayed in Table 1, using MatSwarm for predictions, the prediction MSE for Organization 1 decreased from 1.0291 to 0.2096, and for Organization 3, it decreased from 1.6159 to
0.5849, with the global model achieving an MSE as low as 0.0903. Despite Organization 2 having a label distribution similar to the test set and thus showing good prediction accuracy,
its local model prediction accuracy also improved slightly after training with MatSwarm. This demonstrates that MatSwarm has strong generalization capabilities for non-i.i.d. material data.
2) Different local models and aggregation methods: Since MatSwarm will perform various training tasks beyond predicting perovskite formation energy, the choice of local models and
aggregation methods significantly impacts the accuracy of model training for different tasks. In this experiment, we compared the performance of MatSwarm using different local models and
aggregation methods to identify the most suitable collaborative computing scheme for predicting perovskite formation energy. The local models capable of solving regression problems include
MLP, recurrent neural network (RNN), Lasso, and long short-term memory (LSTM). The aggregation methods considered are Mean, Median, MultiKrum, CenteredClipping, and GeoMed. Ultimately, we
obtained the prediction results shown in Fig. 3c, d, and Table 1. The results indicate that using MLP within MatSwarm is the most suitable for predicting perovskite formation energy.
Building on the MLP local model architecture, we tested the impact of different aggregation methods on model accuracy and response time. In terms of accuracy, Mean and CenteredClipping
achieved the highest accuracy, while Mean was the most efficient in terms of response time. Therefore, to choose a suitable aggregation method, one should balance the trade-offs among the
needs of efficiency, accuracy, and security to achieve an optimal solution. This modular development approach facilitates participants in selecting the most suitable solutions for training
tasks and simplifies platform iterations and updates to meet diverse training demands in the material science domain. 3) non-TEE vs TEE: To evaluate the impact of TEE on the accuracy and
efficiency of the MatSwarm framework, we compared the MSE and response time of MatSwarm before and after using TEE. The comparison, shown in Fig. 4, indicates that using TEE does not
significantly affect the prediction accuracy of the model, whether training is conducted individually, with MatSwarm, or on joint data. However, the use of TEE introduces some communication
overhead, leading to an increase in response time. In the materials science domain, unlike in transaction systems, there is typically no strong demand for real-time response, and large model
training often takes hours. Therefore, the increase in response time due to TEE is negligible compared to the enhancement in security it provides. The TEE-based MatSwarm fully meets the
performance requirements for model prediction in the materials science field. SCALABILITY TESTING (RQ4) In this experiment, we evaluated the scalability of MatSwarm by examining the impact
of different dataset sizes, the number of features, and the number of participants. It is important to note that our NMDMS platform operates within a limited number of material
organizations. Currently, the platform accommodates up to 30 registered material organizations, with typically no more than 10 participants in a sharing task. Therefore, in our experiments, we
tested the framework with a maximum of 15 participants (material organizations). 1) Dataset size: Fig. 5a illustrates the MSE and response time of MatSwarm across varying dataset sizes. The
results indicate that dataset size has a negligible impact on the response time of MatSwarm, while the model accuracy continues to improve with increasing amounts of data. Notably, even when
each organization’s dataset comprises only 30% of the original dataset, our method demonstrates high accuracy. This indicates that our approach can achieve highly accurate training models
even with small sample sizes within each organization, effectively addressing the small sample problem in the materials science domain. 2) Number of features: as shown in Fig. 5b, increasing
the number of features does not significantly affect the response time of MatSwarm, demonstrating good scalability in terms of computational efficiency. In terms of prediction accuracy,
even with sample features constituting only 30% of the total features, our method achieves an MSE value as low as 0.155, indicating high accuracy. Therefore, using MatSwarm, even if each
organization can only obtain a limited number of feature values, it is still possible to achieve highly accurate training models. This makes our approach particularly effective for scenarios
where organizations have limited data or feature availability, ensuring robust and reliable model performance. Furthermore, after reaching ~90% of all features, the addition of less
important features does not substantially impact accuracy. In practical applications, selecting an appropriate set of features is crucial for balancing accuracy and efficiency, often
involving feature extraction optimization methods39,40. 3) Number of participants: as shown in Fig. 5c, the response time of MatSwarm increases linearly with the number of participants. This
increase is primarily due to the additional time required for communication and data coordination. The observed increase in response time aligns with theoretical expectations. In terms of
accuracy, the prediction accuracy of MatSwarm shows a notable upward trend as the number of participants increases. However, after a certain threshold, the accuracy may slightly decline due
to issues such as communication delays, data inconsistencies, and model overfitting introduced by a higher number of participants. Therefore, in the participant selection process, more is
not necessarily better. This demonstrates that MatSwarm can effectively learn the data characteristics of each organization, achieving highly accurate training models without the need for a
large number of participants for collaborative training. Consequently, this approach can also enhance the efficiency of model training. DISCUSSION ADVANTAGES OF THE MATSWARM FRAMEWORK
Security: MatSwarm incorporates advanced security measures to ensure data confidentiality and integrity. A key component of our security strategy is the use of TEEs, specifically Intel SGX,
which protect code and data from external attacks during computation. This approach effectively mitigates poisoning attacks associated with traditional FL setups. Furthermore, our
experimental setup included various attack scenarios to test the resilience of the MatSwarm framework. These tests demonstrated that MatSwarm effectively maintains data integrity and model
accuracy, even in the face of malicious attempts to corrupt the training process. Feasibility: MatSwarm is crucial for enabling collaborative computation over non-i.i.d. material data, a
common challenge due to the diverse nature of data sources and formats in this field. Compared to independent training by organizations and other FL methodologies, our method significantly
improves prediction accuracy and generalization ability. This highlights MatSwarm’s potential to unlock the full value of material data, facilitating more informed and accurate materials
discovery and development processes. Extensive testing with real-world data from the material science domain validated the usability of the MatSwarm framework. By engaging with actual
datasets from participating institutions, we demonstrated the feasibility and accuracy of the models generated through our platform. This use of real data underscores the framework’s ability
to address the ‘data silos’ problem prevalent in materials science. Scalability: MatSwarm has been rigorously tested across multiple dimensions, including varying dataset sizes, feature
quantities, and participant counts. The results show that the model maintains high and stable predictive accuracy, demonstrating excellent scalability and practical applicability. This
consistent performance, even with smaller sample sizes and fewer features, underscores MatSwarm’s capability to adapt to a broad range of scenarios. Such robustness enhances its potential
for widespread adoption in collaborative settings that require handling complex, heterogeneous data landscapes. Additionally, the MatSwarm platform utilizes a modular architecture, allowing
participants to select appropriate local models and aggregation methods based on their training tasks. As task demands increase, we will continuously iterate and update the platform’s local
models and aggregation methods. This approach aims to address various challenges in the material science domain, including performance prediction, material classification, and structural
optimization, ultimately creating a versatile collaborative computing platform. Adaptability: MatSwarm is a secure collaborative computing framework designed for non-public data across
organizations on the NMDMS, specifically addressing key regression challenges in the materials science domain. In this paper, we demonstrate the capabilities of the MatSwarm framework by
using it to predict perovskite formation energies, selecting a perovskite dataset as our example case. Notably, our framework is suitable for general regression tasks within the
materials science domain, such as predicting the elastic properties of silicon materials and optimizing the microstructure of high-performance alloys. For each shared task, participants can
choose relevant datasets from their organization based on the task’s requirements. This ensures that the framework is not restricted to specific datasets during implementation; instead, it
dynamically adapts to select appropriate local datasets according to the specific needs of each task. Moreover, although MatSwarm is specifically designed for collaborative computing in the
materials science domain, its design principles can be leveraged by other domains with similar needs to construct their own swarm-based collaborative computing frameworks. For other domains
with similar application requirements, the framework can be adapted by modifying the objective function and selecting suitable local models and aggregation methods to fit specific needs.
Additionally, in Section 6 of the Supplementary Materials, we provide a detailed guide on how to extend and apply the MatSwarm framework to other domains. LIMITATIONS OF THE MATSWARM
FRAMEWORK Implementation complexity: while incorporating TEEs enhances security and privacy, it also increases the complexity of system setup and operations, necessitating robust
infrastructure and specialized expertise. To mitigate this, we provide detailed platform deployment and configuration documentation in the supplementary materials, which stakeholders can use
to deploy new training tasks on this platform. Potential latency issues: the decentralized nature of blockchain and remote attestation based on TEEs can introduce delays in model training
and aggregation. However, in the field of materials science, real-time requirements for training are not stringent. The minor increase in latency is negligible compared to the benefits of
resolving the issue of data silos in material data. Hardware dependency: dependence on TEEs, such as Intel SGX, to protect data during computation may limit the applicability of our
framework in environments without such hardware support. Nevertheless, our demonstration system offers the option to choose whether to use TEEs to secure the confidentiality of the model
aggregation process. Even without TEE protection, data security during transmission is ensured through data encryption and secure communication channels. In the future, we plan to offer
additional privacy protection technologies, such as homomorphic encryption and differential privacy, to support a broader range of application scenarios. METHODS OVERALL ARCHITECTURE OF
MATSWARM In this section, we present our proposed framework, MatSwarm, designed for the secure sharing of material big data using swarm transfer learning combined with TEEs. Table 2
summarizes the critical symbols used in our framework. The organizations illustrated in Fig. 6 represent examples of entities involved in materials science. It is noteworthy that our
MatSwarm framework is primarily used to address challenges in collaborative computing within the domain of materials science, as evidenced by its application to a regression problem, such as
predicting material properties like perovskite formation energies, as discussed in this paper. Nevertheless, the framework possesses the potential for extension and application in other
fields facing analogous collaborative computing challenges. Further elaboration on this aspect can be found in Supplementary Note 5. The MatSwarm framework enables collaborative computing
tasks between material organizations. As depicted in Fig. 6, MatSwarm involves multiple organizations (denoted as _N_) collaborating to execute shared tasks. Each organization is responsible
for training its own local models. The blockchain nodes provide a distributed computing environment for the participating organizations and store aggregated models. Additionally, the
trusted execution environment ensures the secure aggregation of local model parameters and collaborates with the blockchain to generate the swarm global model. * 1. Organizations: within the
MatSwarm framework, organizations _O_i (1 ≤ _i_ ≤ _N_) collaboratively train models to meet shared material performance prediction requirements. Initially, each organization conducts material feature sampling locally, and the collected samples are stored as local datasets on their respective cloud servers. Subsequently, organizations choose an appropriate machine
learning method to train a local model. To ensure security during model training, each organization deploys at least one blockchain node on an Intel SGX-enabled cloud server, and the local
model training is performed in SGX’s application enclave. This setup establishes encrypted and authenticated channels, allowing sensitive data to be securely transferred between the cloud
server and the Intel SGX Enclave. * 2. Blockchain Network: MatSwarm leverages the decentralized nature of blockchain to create a collaborative computing environment. Each organization joins
the blockchain network at local blockchain nodes. Within the MatSwarm framework, three transaction types are defined: retrieval, sharing, and uploading. The retrieval transaction verifies
the existence of relevant sharing global models on the blockchain before initiating a new sharing task. The sharing transactions involve organizations initiating new tasks, such as material
performance prediction, with the option for other organizations to participate. The uploading transactions store the final global model on the blockchain, ensuring its integrity and
preventing tampering, thus facilitating model retrieval and usage by other organizations. * 3. Trusted Execution Environment: the TEE, implemented via Intel SGX, ensures the confidentiality
and integrity of local and global models. Each organization applies for two Application Enclaves (denoted as AE) in SGX. AE1 is used to load encrypted local datasets and execute local
models, ensuring confidentiality and integrity during execution. AE2 is used to aggregate global models. This approach ensures the integrity of model aggregation, with all organizations
automatically executing the same model aggregation code through smart contracts41 in AE2. Smart contracts automate the enforcement and management of agreed-upon processes and conditions,
ensuring consistent execution, eliminating discrepancies, enhancing security, and reducing reliance on third-party intermediaries. Additionally, the Quoting Enclave (denoted as QE) generates an
attestation REPORT R to assist in remote authentication between AEs in various organizations. PROBLEM FORMULATION We consider a MatSwarm framework constructed by _N_(_N_ > 2)
organizations, where _K_(_K_ ≤ _N_) organizations are in a sharing task, each possessing a local dataset DLi, ∀ _i_ ∈ _K_. Each organization maintains a local model
\(f_{\boldsymbol{\theta}_i}: \mathbf{X}_i \to y_{\mathrm{pre}_i}\) with parameters _Θ_i, where \(\mathbf{X}_i\) and \(y_{\mathrm{pre}_i}\) denote the input and output spaces, respectively. In our study, we assume that all organizations have the same input/output specifications and
homogeneous local model architectures. However, they may choose different local models and aggregation methods based on the sharing task. The objective is to collaboratively train the local
models to ensure that each generalizes well on the joint data distribution, thereby improving prediction accuracy for non-i.i.d. material data. To achieve this objective, we propose a swarm
transfer learning method within the MatSwarm framework. The core of our method is to identify invariances between resource-rich source domains and resource-scarce target domains,
facilitating the learning of common representation spaces and enabling knowledge transfer across domains. The objective function reveals that during the swarm transfer learning process
between organization _O_i and organization _O_i+1, local model training is interdependent, necessitating the exchange of intermediate training results. The training process adheres to a
linear cycle method, with sequential training conducted between organizations in the order [_O_1→_O_2→. . .→_O_K→_O_1]. The completion of training between organizations _O_K and _O_1
signifies the end of a local training round. After each round of local model parameter updates, the parameters are aggregated, and the updated global model parameters are sent back to each
organization for the next round of local model updates. This iterative process continues until the model converges to a specified threshold. The training objective is typically formulated as
the following optimization problem:

$$\min_{\boldsymbol{\theta}_{i},\, \boldsymbol{\theta}_{(i+1) \bmod K}} f\left(\mathbf{X}_{i,(i+1) \bmod K},\, y_{i \bmod K}\right) = \sum_{i=1}^{K} \mathcal{L}_{1}\left(\mathbf{X}_{i \bmod K},\, y_{i \bmod K}\right) + \gamma\, \mathcal{L}_{2}\left(\mathbf{X}_{i,(i+1) \bmod K}\right) + \frac{\lambda}{2}\left(\Vert \boldsymbol{\theta}_{i \bmod K} \Vert^{2} + \Vert \boldsymbol{\theta}_{(i+1) \bmod K} \Vert^{2}\right)$$ (1)

$$\mathcal{L}_{1}\left(\mathbf{X}_{i \bmod K},\, y_{i \bmod K}\right) = \left(y_{i \bmod K} - \varphi\left(\mathbf{X}_{i \bmod K}\right)\right)^{2}$$ (2)

$$\mathcal{L}_{2}\left(\mathbf{X}_{i,(i+1) \bmod K}\right) = \left\Vert u_{i \bmod K}\left(\mathbf{X}_{i \bmod K}\right) - u_{(i+1) \bmod K}\left(\mathbf{X}_{(i+1) \bmod K}\right)\right\Vert_{F}^{2}$$ (3)

The loss function in Equation (1) is formulated to optimize the parameters _Θ_. It aims to minimize the overall loss by integrating multiple components, including
\(\mathcal{L}_1\), \(\mathcal{L}_2\), and regularization terms. \(\mathcal{L}_1\): this term captures the discrepancy between the predicted outputs and the true labels. Specifically, \(y_{i \bmod K}\) denotes the label of organization _O_i. The form of the prediction function _φ_ depends on the nature of the sharing task, such as classification or regression, and the chosen local model. \(\mathcal{L}_2\): this term is an alignment regularizer that penalizes the discrepancy between the intermediate representations of adjacent organizations, where _u_ denotes the representation converted from the original data and \(\Vert \cdot \Vert_F^2\) is the squared Frobenius norm; it encourages a common representation space, which helps prevent overfitting and promotes model generalization. _γ_: this parameter is the weight assigned to the \(\mathcal{L}_2\) component in the overall loss; adjusting _γ_ controls the trade-off between fitting the training data and aligning representations across organizations. _λ_: this parameter determines the weight of the regularization terms that penalize the magnitudes of the parameters \(\boldsymbol{\theta}_{i \bmod K}\) and \(\boldsymbol{\theta}_{(i+1) \bmod K}\), controlling the strength of the regularization and helping manage the model’s complexity. In summary, Equation (1) combines the task-specific loss \(\mathcal{L}_1\), the representation-alignment term \(\mathcal{L}_2\), and a penalty on parameter magnitudes, weighted by _γ_ and _λ_ to balance data fitting against model complexity. A concrete sketch of this objective is given below.
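The following is a hedged PyTorch sketch of Equations (1)–(3) for one adjacent pair (_O_i, _O_i+1); the function and variable names are ours, and the intermediate representations are assumed to have matching shapes.

```python
# Sketch of the swarm transfer-learning objective, Eqs. (1)-(3).
# gamma=0.5 and lam=1 follow the experimental setup reported above.
import torch

def swarm_transfer_loss(y_i, pred_i, u_i, u_next, theta_i, theta_next,
                        gamma=0.5, lam=1.0):
    # L1 (Eq. 2): squared error between O_i's labels and predictions.
    l1 = ((y_i - pred_i) ** 2).sum()
    # L2 (Eq. 3): squared Frobenius norm between the intermediate
    # representations of adjacent organizations (assumed same shape).
    l2 = torch.norm(u_i - u_next, p="fro") ** 2
    # Parameter-magnitude penalty on both local models.
    reg = sum((p ** 2).sum() for p in theta_i) + \
          sum((p ** 2).sum() for p in theta_next)
    return l1 + gamma * l2 + (lam / 2) * reg
```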
To ensure the confidentiality and integrity of local and global model generation, MatSwarm incorporates a TEE utilizing Intel SGX. Each organization’s cloud server is enabled with Intel SGX, and the blockchain network includes _M_ (_M_ ≥ _N_) blockchain nodes. Each organization can deploy
multiple blockchain nodes. To facilitate understanding, we assume that each organization has deployed a single blockchain node, denoted as BNi (_i_ ≤ _M_). The blockchain node BNi must be
deployed on an Intel SGX-enabled cloud server. To train local and global models, each organization requests the creation of two AEs. AE1 is used to load encrypted local datasets and smart
contract SC1 for training local models. AE2 is used to load encrypted local model parameters and smart contract SC2 for aggregating local model parameters. WORKING MECHANISMS The overall
working mechanism of MatSwarm includes three main stages: task submission, task execution, and task archive. Videos on the procedures and operations of MatSwarm are available as
Supplementary Movies 2 and 3. 1) Task Submission: assume that all material organizations willing to conduct joint model training have registered and stored their metadata on the blockchain.
Organization _O_1, as the task issuer among participants, initiates a retrieval transaction request to the local blockchain node with a task information digest. The local blockchain node
retrieves the blockchain history to determine whether an archived task related to the task information digest has been generated. If such a task exists, the corresponding retrieval result is
returned. If organization _O_1 does not obtain retrieval results for the archived task, it will retrieve the metadata of organizations from the blockchain. Once the task issuer identifies
the organizations to be invited, it will design the sharing task scheme, including the task description, metadata description, and the selection of local models and aggregation methods. The
task issuer subsequently initiates a sharing transaction request to organizations with relevant datasets to join the sharing task as participants. The blockchain nodes of participating
organizations become active nodes, while those of non-participating organizations remain passive. The active nodes participate in the global model consensus mechanism for the task. 2) Task
Execution: to facilitate model aggregation, it is essential to standardize the structure and format of the input datasets among participants. The task issuer should create a virtual dataset
and broadcast it to the blockchain network, enabling each participant to align their local datasets with the standardized format. Subsequently, participants can use their standardized
datasets to train their local models. The task issuer trains a local model and deploys the code into Smart Contract 1 (SC1) running in its AE1. Other participants can invoke SC1 via the
blockchain to train their local models in a similar manner, ensuring uniformity in the local model training code. After each round of local model training, remote attestation is required
between the AE1 of each participant to verify the credibility of the remote nodes and the integrity and confidentiality of the local model. Following remote attestation, encrypted local
models are shared among organizations to generate the global model. To ensure the integrity and confidentiality of the aggregation process, each organization’s blockchain node performs model
aggregation in its AE2. The steps involved are as follows: the task issuer deploys the aggregation algorithm onto the smart contract SC2 running in the AE2. Other participants invoke SC2
from the blockchain, subsequently loading the smart contract and encrypted local model sets submitted by others into their respective AE2. Each participant’s AE2 independently aggregates the
models to generate a global model. To ensure the credibility of each organization’s AE2 and the integrity of the global model, the blockchain network must receive all attestation reports
generated by each organization’s AE2. Consequently, the blockchain nodes complete remote attestation through a consensus mechanism. 3) Task Archive: during each round of training,
organizations obtain the current global model and use it to update their local model until the loss function converges to a specific threshold. However, before a credible global model is
ultimately generated, a consensus must be reached among participants. Once a consensus is achieved, the global model is stored on the blockchain to prevent tampering. Therefore, participants
must ensure that the final global model is recognized by all participants through a consensus mechanism. The model is then securely stored for future retrieval and use. LOCAL MODEL
GENERATION The initial step in local model training involves loading the encrypted local dataset. To ensure security, Intel SGX’s AE only accepts encrypted data. Therefore, before sending
the local dataset DLi to the local AE1, the blockchain node BNi must encrypt it using a symmetric encryption algorithm such as the Advanced Encryption Standard (AES)42 or Triple Data
Encryption Standard43, represented as Er(·, Kr). Symmetric encryption and decryption between BNi and its AE1 are performed using the key Kri, denoted as Er(·, Kri). The key Kri is transmitted through a secure channel established by the Diffie-Hellman key exchange protocol44, which allows two parties to establish a shared secret over an unsecured
communication channel, providing a foundation for encrypting further communications. BNi generates an encrypted local dataset Er(DLi, Kri) and sends it to AE1. Upon receipt, AE1 uses the
key Kri to decrypt Er(DLi, Kri) (1 ≤ _i_ ≤ _M_), obtaining the plaintext DLi of the local dataset. This process can be represented as BNi∣Er(DLi, Kri) → AE1∣Dr(Er(DLi, Kri), Kri); a hedged sketch of this step follows.
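The sketch below uses X25519 Diffie-Hellman and AES-GCM from the Python `cryptography` package as stand-ins; MatSwarm’s concrete cipher suite (AES or Triple DES, per the text) may differ.

```python
# Sketch: establish Kri via Diffie-Hellman, then Er/Dr with AES-GCM.
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

# Each side holds a private key and exchanges only public keys.
bn_priv = X25519PrivateKey.generate()   # blockchain node BNi
ae_priv = X25519PrivateKey.generate()   # enclave AE1
shared = bn_priv.exchange(ae_priv.public_key())  # same secret on both sides

# Derive the symmetric key Kri from the shared secret.
k_ri = HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
            info=b"matswarm-kri").derive(shared)

# Er(DLi, Kri): BNi encrypts the serialized local dataset.
nonce = os.urandom(12)
ciphertext = AESGCM(k_ri).encrypt(nonce, b"<serialized dataset DLi>", None)

# Dr(Er(DLi, Kri), Kri): AE1 decrypts inside the enclave.
plaintext = AESGCM(k_ri).decrypt(nonce, ciphertext, None)
```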
The second step involves local model training. Organizations deploy the local model using the smart contract SC1 to train their local datasets in AE1. For local model training, participants on our
platform can select the appropriate machine learning model based on their task requirements, including MLP45, Lasso46, RNN47, and LSTM48. As the platform evolves, it will offer a broader
range of local training models to meet diverse task requirements. In this paper, we demonstrate the local model training process using an example of predicting perovskite formation energy,
employing the MLP neural network for local model training and the stochastic gradient descent algorithm for parameter updating. As shown in Fig. 7, consider the training between organization
_O_1 and organization _O_2 as an example. Organization _O_2 calculates intermediate results and encrypts them with the public key PKIAS from Intel Authentication Service (IAS). The
encrypted intermediate results \([\mathbf{u}_2^t]_{PK_{IAS}}\) and \([\boldsymbol{\theta}_2^t]_{PK_{IAS}}\) are sent to organization _O_1. Organization _O_1 decrypts the intermediate results using the private key PRIAS and calculates the local model gradient \(\partial \mathcal{L}_1^t / \partial \boldsymbol{\theta}_1^t\) and loss function \(\mathcal{L}_1^t\). Similarly, organization _O_1 calculates a set of intermediate results, encrypts them with the public key PKIAS, and sends the encrypted intermediate results \([\boldsymbol{\theta}_1^t]_{PK_{IAS}}\) and \([\mathbf{u}_1^t]_{PK_{IAS}}\) to organization _O_2, which then calculates the local model gradient \(\partial \mathcal{L}_2^t / \partial \boldsymbol{\theta}_2^t\) and loss function \(\mathcal{L}_2^t\). Both organizations update their local model parameters \(\boldsymbol{\theta}_1^{t+1}\) and \(\boldsymbol{\theta}_2^{t+1}\) using the calculated local model gradients. After each
organization completes this round of local model training, the blockchain nodes of each organization perform remote certification of all AE1 through a consensus mechanism. Subsequently,
organizations encrypt and share the updated local model parameters \(\boldsymbol{\theta}_i^{t+1}\) with other participants to aggregate local model parameters for the current round; one such round is sketched below.
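The sketch below outlines one training round over the linear cycle [_O_1→_O_2→…→_O_K→_O_1], reusing the `swarm_transfer_loss` sketch from the Problem Formulation; the organization interface (`forward`, `parameters`, `step`) is illustrative.

```python
# One swarm-transfer training round over the linear cycle (illustrative).
def train_one_round(orgs, gamma=0.5, lam=1.0):
    K = len(orgs)
    for i in range(K):
        o_i, o_next = orgs[i], orgs[(i + 1) % K]  # adjacent pair in the cycle
        # Each side computes intermediate representations on its own data;
        # in MatSwarm these are exchanged encrypted between enclaves.
        u_i, pred_i = o_i.forward(o_i.X)
        u_next, _ = o_next.forward(o_next.X)
        loss = swarm_transfer_loss(o_i.y, pred_i, u_i, u_next,
                                   o_i.parameters(), o_next.parameters(),
                                   gamma, lam)
        loss.backward()            # gradients flow to both local models
        o_i.step()
        o_next.step()
    # Afterwards, the updated parameters are encrypted, shared, and aggregated.
```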
GLOBAL MODEL GENERATION This section elaborates on generating global models, covering crucial aspects such as smart contract deployment, model aggregation, remote
attestation, and consensus mechanisms. Figure 8 illustrates the process of global model generation. 1) Smart contract deployment: the task issuer _O_1 deploys the model aggregation algorithm
to the blockchain via the local blockchain node BN1 as a smart contract. Each participant can retrieve and invoke the smart contract from the blockchain. The blockchain node BNi loads the
smart contract and the encrypted local model parameter set \([[\mathbf{M}_{L}^{t}]]_{PK_{IAS}}\) into AE2 of TEEi. The parameters are then decrypted using PRIAS to construct the plaintext local model parameter set \(\mathbf{M}_{L}^{t} = (\boldsymbol{\theta}_{1}^{t}, \boldsymbol{\theta}_{2}^{t}, \ldots, \boldsymbol{\theta}_{K}^{t})\). The calculation of global
model parameters occurs in AE2 of TEEi to ensure the confidentiality of sensitive parameters. Smart contracts facilitate the transfer of global model parameters. 2) Model aggregation: the
global model parameters \(\mathbf{M}_{G}^{t+1}\) calculated in each participant’s AE2 are given by:

$$\mathbf{M}_{G}^{t+1} = \sum_{i=1}^{K} \frac{|\mathrm{DL}_{i}|}{n}\, \mathbf{M}_{L_{i}}^{t}, \qquad n = \sum_{i=1}^{K} |\mathrm{DL}_{i}|$$ (4)

where \(\mathbf{M}_{G}^{t+1}\) represents the global model updated in round _t_ + 1, _K_ denotes the number of participants, ∣DLi∣ represents the number of samples used by the _i_-th participant to train the local model, and _n_ is the total number of samples used to train all local models. \(\mathbf{M}_{L_{i}}^{t}\) is the local model parameter set updated by the _i_-th participant in round _t_. Notably, parameter aggregation is
illustrated using the Mean method35 in MatSwarm, which is the most widely used approach. However, various aggregation methods are available, such as MultiKrum49, CenteredClipping50,
GeoMed51, and Median52, among others. On our platform, participants can choose different model aggregation methods based on task requirements and robustness needs; the weighted Mean rule of Equation (4) is sketched below.
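A minimal NumPy sketch of this rule follows; the names are ours, and each local model is flattened to a parameter vector.

```python
# Sample-weighted Mean aggregation, Eq. (4).
import numpy as np

def aggregate_mean(local_params, sample_counts):
    """M_G^{t+1} = sum_i (|DL_i| / n) * M_{L_i}^t, with n = sum_i |DL_i|."""
    n = sum(sample_counts)
    return sum((c / n) * p for c, p in zip(sample_counts, local_params))

# Example: three organizations with different dataset sizes.
params = [np.ones(4), 2 * np.ones(4), 4 * np.ones(4)]
global_model = aggregate_mean(params, sample_counts=[1000, 2000, 1000])
# -> array([2.25, 2.25, 2.25])
```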
3) Remote attestation: during the global model generation process, remote attestation is used to verify the integrity of the global model generated by AE2. In this method, the blockchain node BNi facilitates
interaction between the AE2 of TEEi and the blockchain network, serving as both an aggregator and a verifier in the attestation process. Through the EREPORT function, TEEi generates a REPORT structure Ri containing the current enclave identity information Mi and other metadata, and signs Ri to produce a Message Authentication Code (MAC) tag MACi. AE2 sends Ri
and MAC tags to the Quoting Enclave in TEEi for mutual attestation. The Quoting Enclave calls the EGETKEY command to decrypt the MACi and verifies the decrypted information against Ri. After
successful mutual attestation within TEEi, the Quoting Enclave uses the private key (AKRi) of the attestation key, generated by the Intel Provisioning Service’s Provisioning Seal Key, to sign Ri and create a Quote QGi = Sign(Ri, AKRi). Only the Quoting Enclave can access the key used for attestation in the Intel Provisioning Service to verify the credibility of TEEi. The QGi is
then sent through the blockchain network to the blockchain nodes of other participants for verification. Once BNi receives K-1 Quotes, it will verify each Quote using the public key AKPi
generated by the Intel Provisioning Service. The verification is completed using the function verify(Sign(QGi, AKRi), AKPi). Once the Quotes are validated, BNi extracts the global models \(\mathbf{M}_{G}^{t+1} = (\mathbf{M}_{G_1}^{t+1}, \mathbf{M}_{G_2}^{t+1}, \ldots, \mathbf{M}_{G_K}^{t+1})\) from QG = \((\mathbf{QG}_1, \mathbf{QG}_2, \ldots, \mathbf{QG}_K)\) for subsequent consensus; a sketch of the signing and verification step is given below.
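The sketch below illustrates the sign/verify pattern with Ed25519 from the Python `cryptography` package as a stand-in; real SGX attestation uses Intel-provisioned EPID/ECDSA keys, so this is illustrative only.

```python
# Illustrative Quote signing and verification: QG_i = Sign(R_i, AKR_i),
# checked with verify(QG_i, AKP_i).
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

akr_i = Ed25519PrivateKey.generate()   # attestation private key AKR_i
akp_i = akr_i.public_key()             # matching public key AKP_i

report_ri = b"<REPORT R_i: enclave identity M_i + metadata>"
qg_i = akr_i.sign(report_ri)           # Quote QG_i

# A verifying blockchain node checks the Quote against AKP_i.
try:
    akp_i.verify(qg_i, report_ri)      # raises if R_i or QG_i was tampered with
    print("Quote valid: accept this participant's global model")
except InvalidSignature:
    print("Quote invalid: reject this participant's global model")
```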
4) Global Model Consensus: at this stage, the consensus mechanism is used to determine the global model accepted by the participants. We use the PBFT53 consensus, which can tolerate _f_ Byzantine fault nodes. We assume that three organizations
are participating in the shared task. The blockchain node BN1 is a blockchain node of the task issuer _O_1 acting as the primary node; BN2 and BN3 are blockchain nodes of the other two
participants _O_2 and _O_3 participating in the consensus mechanism as active nodes; BNj (_j_ ∈ _M_) denotes the blockchain node of organizations that are not participating in the sharing
task, referred to as passive nodes. The consensus mechanism consists of five steps: request, pre-prepare, prepare, commit, and reply. Request phase: the task issuer _O_1 initiates a global
model consensus request to the deployed blockchain node BN1. Pre-prepare stage: BN1 calculates Hash\((\mathbf{M}_{G_1}^{t+1}, \mathbf{M}_{G_2}^{t+1}, \ldots, \mathbf{M}_{G_K}^{t+1})\) and, if the hashes of all global models are equal, broadcasts Hash\((\mathbf{M}_{G_1}^{t+1})\) to BN2 and BN3. Prepare stage: after receiving the Hash\((\mathbf{M}_{G_1}^{t+1})\) sent by BN1, BN2 and BN3 calculate the hash value of the global model \(\mathbf{M}_{G_i}^{t+1}\) (1 ≤ _i_ ≤ _K_) sent by each organization. If all hash values are equal to Hash\((\mathbf{M}_{G_1}^{t+1})\), BN2 and BN3 broadcast Hash\((\mathbf{M}_{G_2}^{t+1})\) and Hash\((\mathbf{M}_{G_3}^{t+1})\) to the other two participants, respectively. Commit stage: after receiving the calculation results from the other
nodes, all participants verify whether a consistent global model has been agreed upon by all. If consensus is achieved, they broadcast confirmation messages to the other participants. Reply
stage: the consensus request is considered complete when each participant receives confirmation messages from at least two-thirds of the nodes. A Reply message is then constructed and sent
to _O_1. Once _O_1 receives confirmation messages from more than two-thirds of the nodes, it finalizes the global model and broadcasts its hash to all active and passive nodes for storage.
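To make the hash voting concrete, the sketch below counts digest votes with the more-than-two-thirds rule described above. It is a simplified stand-in with hypothetical helper names (local_check, reach_consensus), not the platform's PBFT implementation.

```python
# Simplified model of the pre-prepare/prepare/commit hash exchange: each node
# hashes the K global-model copies it received, and a digest is accepted once
# more than two-thirds of the nodes vote for the same value.
import hashlib
from typing import List, Optional

def model_digest(model_bytes: bytes) -> str:
    return hashlib.sha256(model_bytes).hexdigest()

def local_check(copies: List[bytes]) -> Optional[str]:
    """Return the common Hash(M_G^{t+1}) if all K copies agree, else None."""
    digests = {model_digest(c) for c in copies}
    return digests.pop() if len(digests) == 1 else None

def reach_consensus(votes: List[Optional[str]], n_nodes: int) -> Optional[str]:
    """Accept a digest reported by more than two-thirds of all nodes."""
    for digest in {v for v in votes if v is not None}:
        if votes.count(digest) * 3 > 2 * n_nodes:
            return digest
    return None

# Three organizations (BN_1..BN_3), each holding identical model copies.
copies = [b"serialized global model, round t+1"] * 3
votes = [local_check(copies) for _ in range(3)]
assert reach_consensus(votes, n_nodes=3) == model_digest(copies[0])
```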
The processes of local and global model generation are repeated round after round until the model converges to within a preset threshold. In the end, the final global model is stored on the blockchain,
ensuring tamper resistance while facilitating efficient retrieval by others. REPORTING SUMMARY Further information on research design is available in the Nature Portfolio Reporting Summary
linked to this article. DATA AVAILABILITY All datasets used are publicly available at https://github.com/SICC-Group/MatSwarm.git and Zenodo54. All data supporting the findings described in
this manuscript are available in the article and in the Supplementary Information and from the corresponding author upon request. Source data are provided with this paper. CODE AVAILABILITY
The code is available open source at https://github.com/SICC-Group/MatSwarm.git and Zenodo54. REFERENCES * Liu, C. et al. A transfer learning cnn-lstm network-based production progress
prediction approach in iiot-enabled manufacturing. _Int. J. Prod. Res._ 61, 4045–4068 (2023). Article Google Scholar * Chaudry, U. M., Hamad, K. & Abuhmed, T. Machine learning-aided
design of aluminum alloys with high performance. _Mater. Today Commun._ 26, 897 (2021). Google Scholar * Malik, P. K. et al. Industrial internet of things and its applications in industry
4.0. _Comput. Commun._ 166, 125–139 (2021). Article Google Scholar * Damewood, J. et al. Representations of materials for machine learning. _Annu. Rev. Mater. Res._ 53, 399–426 (2023).
Article ADS CAS Google Scholar * Stergiou, K. et al. Enhancing property prediction and process optimization in building materials through machine learning: a review. _Comput. Mater.
Sci._ 220, 031 (2023). Article Google Scholar * Aflow - Automatic FLOW for materials discovery. https://aflowlib.org/ (2024). * Crystallography open database.
http://www.crystallography.net/cod/ (2024). * Materials data repository home. https://materialsdata.nist.gov/. (2024). * Morgan, D. & Jacobs, R. Opportunities and challenges for machine
learning in materials science. _Annu. Rev. Mater. Res._ 50, 71–103 (2020). Article ADS CAS Google Scholar * Xu, P., Ji, X., Li, M. & Lu, W. Small data machine learning in materials
science. _npj Comput. Mater._ 9, 42 (2023). Article ADS Google Scholar * Kim, Y. et al. Deep learning framework for material design space exploration using active transfer learning and
data augmentation. _npj Comput. Mater._ 7, 140 (2021). Article ADS Google Scholar * Jain, S., Seth, G., Paruthi, A., Soni, U. & Kumar, G. Synthetic data augmentation for surface
defect detection and classification using deep learning. _J. Intell. Manuf._ 33, 1007–1020 (2022). Article Google Scholar * Hnewa, M. & Radha, H. Object detection under rainy
conditions for autonomous vehicles: a review of state-of-the-art and emerging techniques. _IEEE Signal Process. Mag._ 38, 53–67 (2020). Article Google Scholar * Wen, Y., Tran, D.,
Izmailov, P. & Wilson, A. G. Combining ensembles and data augmentation can harm your calibration. _In_: International Conference on Learning Representations https://arxiv.org/abs/2010.09875 (2021). * Lejeune, E. & Zhao, B. Exploring the potential of transfer learning for metamodels of heterogeneous material deformation. _J. Mech. Behav. Biomed. Mater._ 117, 104276 (2021). Article CAS Google Scholar * Zhang, C. et al. A survey on federated learning. _Knowl. Based Syst._ 216, 106775 (2021). Article Google Scholar * Mothukuri, V. et al. A survey on
security and privacy of federated learning. _Future Gener. Comput. Syst._ 115, 619–640 (2021). Article Google Scholar * Kairouz, P. et al. Advances and open problems in federated learning.
_Found. Trends Mach. Learn._ 14, 1–210 (2021). Article Google Scholar * Zhang, J. et al. Security and privacy threats to federated learning: Issues, methods, and challenges. _Secur.
Commun. Netw._ 2022 (2022). * Tolpegin, V., Truex, S., Gursoy, M.E., Liu, L. Data poisoning attacks against federated learning systems. _In_: Computer Security–ESORICS 2020: 25th European
Symposium on Research in Computer Security, pp. 480–501 (2020). * Xiao, X., Tang, Z., Li, C., Xiao, B. & Li, K. Sca: sybil-based collusion attacks of iiot data poisoning in federated
learning. _IEEE Trans. Ind. Inform._ 19, 2608–2618 (2022). Article Google Scholar * Bakopoulou, E., Tillman, B. & Markopoulou, A. Fedpacket: a federated learning approach to mobile
packet classification. _IEEE Trans. Mob. Comput._ 21, 3609–3628 (2021). Article Google Scholar * Wang, B., Li, A., Pang, M., Li, H., Chen, Y. Graphfl: a federated learning framework for
semi-supervised node classification on graphs. _In_: 2022 IEEE International Conference on Data Mining (ICDM) pp. 498–507 (2022). * Xie, J., Su, Y., Zhang, D. & Feng, Q. A vision of
materials genome engineering in china. _Engineering_ 10, 10–12 (2022). Article Google Scholar * Wang, R. et al. A secured big-data sharing platform for materials genome engineering:
state-of-the-art, challenges and architecture. _Future Gener. Comput. Syst._ 142, 59–74 (2023). Article Google Scholar * Wang, R., Xu, C., Ye, F., Tang, S. & Zhang, X. S-MBDA: a blockchain-based architecture for secure storage and sharing of material big-data. _IEEE Internet Things J._ 11, 15 (2024). * Liu, S. et al. An infrastructure with user-centered presentation data model for integrated management of materials data and services. _npj Comput. Mater._ 7, 88 (2021). Article ADS CAS Google Scholar * Ileana, M., Oproiu, M. I. & Marian, C. V. Using Docker Swarm to improve performance in distributed web systems. _In_: International Conference on Development and Application Systems (DAS) pp. 1–6 (2024). * Jere, M. S., Farnan, T. &
Koushanfar, F. A taxonomy of attacks on federated learning. _IEEE Secur. Priv._ 19, 20–28 (2020). Article Google Scholar * Romano, Y., Aberdam, A., Sulam, J. & Elad, M. Adversarial
noise attacks of deep learning architectures: stability analysis via sparse-modeled signals. _J. Math. Imaging Vis._ 62, 313–327 (2020). Article MathSciNet Google Scholar * Fang, M., Cao,
X., Jia, J. & Gong, N. Local model poisoning attacks to byzantine-robust federated learning. _In_: 29th USENIX Security Symposium (USENIX Security 20), pp. 1605–1622 (2020). * Li, L., Xu, W.,
Chen, T., Giannakis, G. B. & Ling, Q. Rsa: Byzantine-robust stochastic aggregation methods for distributed learning from heterogeneous datasets. _Proc. AAAI Conf. Artif. Intell._ 33,
1544–1551 (2019). Google Scholar * Baruch, G., Baruch, M. & Goldberg, Y. A little is enough: circumventing defenses for distributed learning. _Adv. Neural Inf. Process. Syst._ 32 (2019). *
Xie, C., Koyejo, O., Gupta, I. Fall of empires: breaking byzantine-tolerant SGD by inner product manipulation. https://arxiv.org/abs/1903.03936 (2020). * Li, X., Huang, K., Yang, W., Wang,
S., Zhang, Z. On the convergence of fedavg on non-iid data. _In_: International Conference on Learning Representations, https://openreview.net/forum?id=HJxNAnVtDS (2020). * Li, T. et al.
Federated optimization in heterogeneous networks. _Proc. Mach. Learn. Syst._ 2, 429–450 (2020). Google Scholar * Liu, Y., Kang, Y., Xing, C., Chen, T. & Yang, Q. Secure federated
transfer learning. _IEEE Intell. Syst._ 35, 70–82 (2020). Article Google Scholar * Kalapaaking, A. P. et al. Blockchain-based federated learning with secure aggregation in trusted
execution environment for internet-of-things. _IEEE Trans. Ind. Inform._ 19, 1703–1714 (2022). Article Google Scholar * Chowdhury, S., Mayilvahanan, P. & Govindaraj, R. Optimal feature
extraction and classification-oriented medical insurance prediction model: machine learning integrated with the internet of things. _Int. J. Comput. Appl._ 44, 278–290 (2022). Google
Scholar * Fatani, A., Dahou, A., Al-Qaness, M. A., Lu, S. & Abd Elaziz, M. Advanced feature extraction and selection approach using deep learning and aquila optimizer for iot intrusion
detection system. _Sensors_ 22, 140 (2022). Article ADS Google Scholar * Hewa, T., Ylianttila, M. & Liyanage, M. Survey on blockchain based smart contracts: applications,
opportunities and challenges. _J. Netw. Comput. Appl._ 177, 102857 (2021). Article Google Scholar * Daemen, J. & Rijmen, V. Rijndael: the advanced encryption standard. _Dr. Dobb's J._
26, 137–139 (2001). Google Scholar * Barker, E., Mouha, N. Recommendation for the triple data encryption algorithm (tdea) block cipher. Technical report, National Institute of Standards and
Technology (2017). * Naresh, V., Sivaranjani, R. & Murthy, N. Provable secure lightweight multiple shared key agreement based on hyper elliptic curve diffie-hellman for wireless sensor
networks. _Int. J. Crit. Infrastruct. Prot._ 28, 100371 (2020). Google Scholar * Trzepieciński, T. & Lemu, H. G. Improving prediction of springback in sheet metal forming using
multilayer perceptron-based genetic algorithm. _Materials_ 13, 3129 (2020). Article ADS PubMed PubMed Central Google Scholar * Maulud, D. & Abdulazeez, A. M. A review on linear
regression comprehensive in machine learning. _J. Appl. Sci. Technol. Trends_ 1, 140–147 (2020). Article Google Scholar * Wu, L. et al. A recurrent neural network-accelerated multi-scale
model for elasto-plastic heterogeneous materials subjected to random cyclic and non-proportional loading paths. _Comput. Methods Appl. Mech. Eng._ 369, 113234 (2020). Article MathSciNet
Google Scholar * Meng, H., Geng, M. & Han, T. Long short-term memory network with bayesian optimization for health prognostics of lithium-ion batteries based on partial incremental
capacity analysis. _Reliab. Eng. Syst. Saf._ 236, 109288 (2023). Article Google Scholar * Blanchard, P., El Mhamdi, E. M., Guerraoui, R. & Stainer, J. Machine learning with adversaries: byzantine tolerant gradient descent. _In_: International Conference on Neural Information Processing Systems pp. 118–128 (2017). * Karimireddy, S. P., He, L. & Jaggi, M. Learning from history
for byzantine robust optimization. _In_: International Conference on Machine Learning, pp. 5311–5319 (2021). * Chen, Y., Su, L. & Xu, J. Distributed statistical machine learning in
adversarial settings: Byzantine gradient descent. _Proc. ACM Meas. Anal. Comput. Syst._ 1, 1–25 (2017). CAS Google Scholar * Yin, D., Chen, Y., Kannan, R., Bartlett, P., Byzantine-robust
distributed learning: towards optimal statistical rates. _In_: International Conference on Machine Learning, pp. 5650–5659 (2018). * Zhang, G. et al. Reaching consensus in the byzantine
empire: a comprehensive review of BFT consensus algorithms. _ACM Comput. Surv._ 56, 1–41 (2024). Article Google Scholar * Wang, R. et al. Matswarm: trusted swarm transfer learning driven
materials computation for secure big data sharing, https://zenodo.org/records/13622509 (2024). ACKNOWLEDGEMENTS This work is supported in part by the National Key
Research and Development Program of China under Grant 2021YFB3702403, and in part by the National Natural Science Foundation of China under Grant 62101029. R.W. has been supported by the
China Scholarship Council Award under Grant 202306460078. C.X. has been supported in part by the China Scholarship Council Award under Grant 202006465043. AUTHOR INFORMATION AUTHORS AND
AFFILIATIONS * School of Computer and Communication Engineering, University of Science and Technology Beijing, 100083, Beijing, China Ran Wang, Cheng Xu, Fangwen Ye, Yusen Tang, Sisui Tang,
Hangning Zhang, Wendi Du & Xiaotong Zhang * Beijing Advanced Innovation Center for Materials Genome Engineering, University of Science and Technology Beijing, 100083, Beijing, China Ran
Wang, Cheng Xu & Xiaotong Zhang * College of Computing and Data Science, Nanyang Technological University, 639798, Singapore, Singapore Ran Wang & Shuhao Zhang * Shunde Innovation
School, University of Science and Technology Beijing, 528399, Guangdong, China Cheng Xu & Xiaotong Zhang CONTRIBUTIONS R.W. and C.X. conceived this project. C.X. and X.Z. funded and supervised the research. R.W. and F.Y. implemented the
algorithm, performed the experiments, and prepared the plots. Y.T., S.T., H.Z., and W.D. implemented the open-source prototype. R.W. and C.X. analyzed the results and drafted the main text.
C.X., S.Z. and X.Z. revised the manuscript. All authors commented on the manuscript. CORRESPONDING AUTHORS Correspondence to Cheng Xu or Xiaotong Zhang. ETHICS DECLARATIONS COMPETING
INTERESTS The authors declare no competing interests. PEER REVIEW INFORMATION _Nature Communications_ thanks Ernestina Mensalvas and the other anonymous reviewer(s) for their
contribution to the peer review of this work. A peer review file is available. ADDITIONAL INFORMATION PUBLISHER’S NOTE Springer Nature remains neutral with regard to jurisdictional claims in
published maps and institutional affiliations. SUPPLEMENTARY INFORMATION Supplementary Information, a Peer Review File, a Description of Additional Supplementary Files, Supplementary Movies 1–3, a Reporting Summary, and Source Data accompany this article. RIGHTS AND PERMISSIONS OPEN ACCESS This article is licensed under a Creative Commons Attribution 4.0
International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the
source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative
Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by
statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit
http://creativecommons.org/licenses/by/4.0/. ABOUT THIS ARTICLE CITE THIS ARTICLE Wang, R., Xu, C., Zhang, S. _et al._ MatSwarm: trusted swarm transfer learning driven materials computation for secure big data sharing. _Nat Commun_ 15, 9290 (2024). https://doi.org/10.1038/s41467-024-53431-x * Received: 11 November 2023 * Accepted: 07 October 2024 * Published: 28 October 2024 * DOI: https://doi.org/10.1038/s41467-024-53431-x