Synergos Is Officially Launched!

In previous posts (here and here), we gave a preview of AI Singapore’s Federated Learning platform Synergos. Today we are happy to announce that Synergos is officially launched!

We begin with a recap of what Synergos is and its main components.

The motivation for Synergos

Data is at the core of machine learning. Nevertheless, in many real-world projects, a single party’s data is often insufficient and needs to be augmented with data from other parties. However, there are also many concerns (regulatory, ethical, commercial etc.) stopping parties from exchanging data.

An example can be found in the healthcare domain. An individual hospital typically has too little local data to build a robust model. Existing studies (this one as an example) show the benefit of using data from more hospitals to build models. Nevertheless, even when hospitals are convinced of the value of sharing data, regulatory concerns often stop them from doing so, since healthcare data is usually treated as sensitive personal data in general or sectoral data protection regulations, e.g. GDPR or HIPAA.

Federated Learning is an emerging privacy-preserving machine learning technology. It enables multiple parties holding local data to collaboratively train machine learning models without exchanging their data with one another, hence preserving the confidentiality of different parties’ local data. 

The design of Synergos

Synergos is a platform that AI Singapore has been building to make Federated Learning more accessible and sustainable. The diagram below gives an overview of the key components of Synergos. For a detailed description of each component, please refer to this post.

Broadly, these components are grouped into three layers of functionalities.

  1. Federated training, whose aim is to make Federated Learning simple and user-friendly.
  2. Model management, whose goal is to make Federated MLOps simple.
  3. Platform management, whose goal is to make Federated Learning sustainable.

Synergos makes Federated Learning accessible

In conventional machine learning, it is commonly assumed that all data are independently and identically distributed (IID). In simple terms, all data are assumed to come from the same generative process, and that process has no memory of previously generated data. In Federated Learning, however, parties cannot see one another’s data, so it cannot be assumed that they all follow the same generative process. Special care is needed to address such non-IID data. Otherwise, the model derived with Federated Learning may take longer to converge, or may fail to converge and generalise across the different parties’ data. Many federated aggregation algorithms have been proposed to address this problem.
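To make the non-IID issue concrete, here is a small Python sketch on made-up data. It compares the label distributions of two hypothetical hospitals using the total variation distance (0 means identical distributions, 1 means completely disjoint). The party names and labels are illustrative assumptions, and such a comparison is only possible in a simulation, since real parties do not see each other's data.

```python
from collections import Counter


def label_distribution(labels):
    """Return the empirical probability of each label in a party's data."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


# Hypothetical toy data: the two hospitals see very different case mixes.
hospital_a = ["healthy"] * 8 + ["sick"] * 2
hospital_b = ["healthy"] * 2 + ["sick"] * 8

skew = total_variation(
    label_distribution(hospital_a),
    label_distribution(hospital_b),
)
print(f"label skew between parties: {skew:.2f}")
```

A large distance like this is one simple sign that the parties' data should not be treated as IID.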

Synergos makes Federated Learning user-friendly and accessible, taking away the burden from the users in implementing those federated aggregation algorithms. In Synergos, the Federation component implements a number of those algorithms. The most basic aggregation algorithm is FedAvg. Besides this, the current version of Synergos also supports more advanced aggregation algorithms, including FedProx, FedGKT, etc. More aggregation algorithms will be supported in future versions.
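For readers new to FedAvg, the core aggregation step can be sketched in a few lines of Python. This is a minimal illustration of the general idea (a weighted average of the parties' model parameters, weighted by local sample counts), not the code used inside Synergos. The function name `fedavg` and the flat-list representation of model weights are assumptions made for this example.

```python
from typing import List, Sequence


def fedavg(client_weights: Sequence[List[float]],
           client_sizes: Sequence[int]) -> List[float]:
    """Average parameter vectors, weighting each party by its sample count."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per party")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    aggregated = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all parties must share the same model shape")
        for i, value in enumerate(weights):
            aggregated[i] += value * size / total
    return aggregated


# Two hypothetical parties: the second holds three times as much data,
# so its parameters count three times as much in the global model.
global_weights = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_weights)
```

In each training round, the parties train locally, send their updated parameters for aggregation, and receive the averaged model back. Algorithms such as FedProx modify the local training objective rather than this averaging step.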

Synergos further reduces the burden with the Orchestration component, which supports auto-tuning of multiple Federated Learning models with different configurations of aggregation algorithms, aggregation settings, model hyper-parameters, etc.
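Conceptually, this kind of auto-tuning starts from a search space of candidate configurations. The Python sketch below expands a small grid into individual run configurations. The keys and values are hypothetical and do not reflect the actual configuration schema of the Orchestration component.

```python
from itertools import product

# Hypothetical search space; the names are illustrative, not Synergos settings.
search_space = {
    "algorithm": ["FedAvg", "FedProx"],
    "learning_rate": [0.01, 0.001],
    "rounds": [5, 10],
}


def expand_grid(space):
    """Return one configuration dict per combination of candidate values."""
    keys = list(space)
    return [
        dict(zip(keys, values))
        for values in product(*(space[key] for key in keys))
    ]


configs = expand_grid(search_space)
print(f"{len(configs)} candidate configurations")
print(configs[0])
```

Each configuration would then be trained as a separate federated model, and the best one selected on a validation metric.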

Synergos makes Federated Learning sustainable

Usually, different parties incur non-negligible costs in acquiring and cleaning their data. They rarely share their data altruistically, since doing so risks losing their competitive edge. These parties are more motivated to share their data when given enough incentives, such as a guaranteed benefit from the collaboration and a fairly higher reward for contributing more valuable data. If no party is motivated to contribute data, Federated Learning cannot be sustained.

Synergos makes Federated Learning sustainable by building the Contribution & Reward component to evaluate contributions and reward different parties fairly based on their contributions.  

We are still actively working on this component, and it is not yet available in the current version. We plan to implement model reward, which is the outcome of research supported by AI Singapore. Conventionally, reward is associated with monetary gain. While this remains a natural and viable option, there are scenarios where monetary returns are not preferred or are even impossible. Model reward instead rewards the participating parties with models of different quality based on their contributions.

The path forward

The launch today is not the end of the story. Rather, it is the start of a long journey. Moving forward, besides the Contribution & Reward component, there are already a few enhancements and new features planned, including:

  • Support of non-neural network models. 

Currently, the federated aggregation algorithms implemented in the Federation and Federated Grid components mostly support deep neural networks. Nevertheless, many commonly used machine learning models are not based on neural networks.

In the next version, we plan to integrate outstanding research outcomes in the field of Federated Learning to support more aggregation algorithms, e.g. Federated GBDT (SimFL), etc.

  • Support of Vertical Federated Learning

The first post of this series discussed two common paradigms of Federated Learning: Horizontal Federated Learning and Vertical Federated Learning. 

Horizontal Federated Learning is useful in scenarios where different parties have a big overlap in the feature space (columns) but small overlap in the user space (rows).

Horizontal Federated Learning

Vertical Federated Learning is useful in the scenarios where different parties have a big overlap in the user space (rows), but a small overlap in the feature space (columns).

Vertical Federated Learning

In the current version, Synergos only supports Horizontal Federated Learning. We are also working on the support of Vertical Federated Learning. 
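The difference between the two paradigms can be illustrated with a short Python sketch. It measures how much two parties overlap in users (rows) and in features (columns) and picks the paradigm that matches. The party data and the helper `suggest_paradigm` are hypothetical and serve only to illustrate the definitions above.

```python
def jaccard(a, b):
    """Overlap of two collections: |intersection| / |union|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def suggest_paradigm(party_a, party_b):
    """Compare feature overlap with user overlap between two parties."""
    feature_overlap = jaccard(party_a["features"], party_b["features"])
    user_overlap = jaccard(party_a["users"], party_b["users"])
    if feature_overlap > user_overlap:
        return "horizontal"  # same columns, different rows
    if user_overlap > feature_overlap:
        return "vertical"    # same rows, different columns
    return "unclear"


# Two hospitals record the same fields for different patients.
hospital_1 = {"users": {"p1", "p2", "p3"},
              "features": {"age", "blood_pressure", "diagnosis"}}
hospital_2 = {"users": {"p4", "p5", "p6"},
              "features": {"age", "blood_pressure", "diagnosis"}}

# A bank and a retailer serve mostly the same customers but record different fields.
bank = {"users": {"u1", "u2", "u3"}, "features": {"income", "credit_score"}}
retailer = {"users": {"u1", "u2", "u3", "u4"},
            "features": {"basket_size", "visits"}}

print(suggest_paradigm(hospital_1, hospital_2))
print(suggest_paradigm(bank, retailer))
```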

  • Integration with other compute and/or storage engines. 

Compute & Storage is an interface to compute and storage backends, which different parties use in local training. The current version of Synergos supports data that is managed by a local file system and S3-compatible storage, and the compute load is handled by a single node. 

We are actively working on support for other storage services and compute frameworks in the future versions, e.g., Spark, Horovod. 

  • Support of privacy-enhancing technologies (PET) 

In Federated Learning, what is exchanged among parties is mainly intermediate model information such as gradients and/or weights. This protects different parties’ local data, since it does not require sharing of raw data. Nevertheless, exchanging gradients could also lead to information leakage.

In future versions, we will support application of privacy-enhancing technologies (PET) like homomorphic encryption (HE) or secure multi-party computation (SMPC) to better protect participating parties’ data.
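As a taste of what SMPC can offer, the Python sketch below uses additive secret sharing so that a sum of private values is computed without any party revealing its own value. It is a textbook-style illustration under simplifying assumptions (honest parties, integer-encoded updates, an arbitrarily chosen prime modulus) and not the design planned for Synergos. Real gradients are floating-point numbers and would first need a fixed-point encoding.

```python
import secrets

PRIME = 2**61 - 1  # modulus chosen for illustration only


def share(value, n_parties):
    """Split an integer into n random shares that sum to it modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def reconstruct(shares):
    """Recombine shares into the value they encode."""
    return sum(shares) % PRIME


# Three hypothetical parties, each holding a private integer-encoded update.
updates = [12, 7, 30]
n = len(updates)

# Party i splits its update and sends share j to party j.
all_shares = [share(update, n) for update in updates]

# Party j adds up the shares it received and publishes only that partial sum.
partial_sums = [
    sum(all_shares[i][j] for i in range(n)) % PRIME
    for j in range(n)
]

# The partial sums reveal the aggregate but not any individual update.
total = reconstruct(partial_sums)
print(total)
```

Each individual share is a uniformly random number, so a party learns nothing about another party's update from the shares it receives.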

In summary, some of the planned enhancements and new features are as follows:

| Components | Status in current version | Enhancement/new features planned |
| --- | --- | --- |
| Contribution & Reward | Work-in-progress | Model reward, in which different parties will get a customised model of varying qualities based on their data contribution |
| Federation & Federated Grid | Support of neural network based models; support of Horizontal Federated Learning | Support of non-neural network based models, e.g., GBDT; support of Vertical Federated Learning |
| Compute & Storage | Data is managed by a local file system or mounted volume; compute load is handled by a single node | Support of other compute & storage engines, e.g., Spark, Horovod |
| Federation & Federated Grid | No privacy-enhancing technologies (PET) applied | Support of PET, like HE or SMPC, to better protect participating parties’ data |
| Serving | Support only those parties who have contributed in the training to use the federated model | Support of new parties’ requests to use the federated model. Those new parties did not participate in the federated training. |

 

“Synergos” is a Greek word. The English word “synergy” was derived from “synergos”, which means “to work together” or “to cooperate”. We therefore also invite you to work together with us in this journey. 

Check out our code repositories on GitHub, start using Synergos as a tool in your machine learning toolbox, and contribute your code to the platform. Synergos adopts a modular design, and different components are maintained in separate code repositories. Check out the key components like Synergos TTP and Synergos Worker. As a quick start, Synergos Simulator allows you to run all the different configurations of Synergos in a sandboxed environment on your local computer. The user guide is available here.

Do share with us your feedback and suggestions on new features or areas we could improve. Please also join the discussion in our community group.

 


 

Authors

  • A 20-year veteran in tech startups and MNCs, Najib focuses on High-Performance Computing (HPC) as well as Cloud, Data and Artificial Intelligence (AI). He has led engineering teams in several organisations, some of which were startups that were acquired or exited successfully. He has helped build several of the first-generation HPC cluster systems and infrastructure in Singapore and the region. He was also a lecturer for NUS School of Continuing and Lifelong Education (NUS SCALE), where he conducted workshops on Reproducible Data Science, Data Engineering and Conversational AI bots (Chatbots). He currently heads the AI Platforms Engineering team in the Industry Innovation Pillar at AI Singapore (AISG), where his team focuses on building the AI infrastructure and platforms for researchers, engineers and collaborators to solve challenging problems.

  • Wayyen manages the on-premise and on-cloud infrastructure resources used by AISG engineers and apprentices. He also works with MLOps, SecureAI and Synergos teams to bring out new tools and platforms for better CI/CD/CT in machine learning.

  • “Sometimes it is the people no one can imagine anything of who do the things no one can imagine.” ― Alan Turing In the data-driven world of today, knowledge is invaluable. My primary skills revolve around scientific computation, data-mining and network analysis.

  • My main goal over the near future is to leverage my technical AI knowledge and communication capabilities for good. I believe that the most good can come to society, in the next 10 years, from unlocking the potential of the ethical AI in industrial revolution 4.0. My best contribution to this is to introduce and enable the world of privacy-preserving machine learning, by building such systems for individuals and organisations alike.

  • Jianshu has many years of AI/Data Science research and consulting experience. He has a good track record of delivering value to clients as well as quality academic research. One of his papers has been awarded the Test of Time award by one of the leading AI conferences. In recent years, he has spent most of his time putting AI/ML into real-world use and promoting the ethical aspects of AI/ML, e.g. the explainability, fairness, robustness, and privacy preservation of AI/ML models.
