
Stepping out of my comfort zone and into the world of AI

Getting into the AI Apprenticeship Programme (AIAP) certainly isn’t easy. But for those who make the cut, a whole new world of opportunities awaits in the field of AI.

Shannen Lam had been working as a civil engineer for most of her life and was enjoying a stable job in a familiar trade when a sense of stagnation started to creep in.

After ten years in the industry, she decided to step out of her comfort zone and took up a Master’s course in analytics at the Singapore Management University. The decision was sparked by an interest in exploring how data could be analysed to help with various situations at work.

While the course exposed her to machine learning models and equipped her with a theoretical understanding of the algorithms behind them, she knew that it was just the tip of an iceberg. “There were gaps in the know-how to implement a model, software skills essential for deployment, and exposure to other more sophisticated learning algorithms out there in the industry.”

When a friend recommended that she apply for AIAP, she decided to go for it. What attracted her to the programme was that she would get to work on real business problems with clients and have the opportunity to gain experience that she could put on her resume.

The decision made, Shannen would soon discover that “it ain’t easy to get in!”

As part of the technical assessment for acceptance into the nine-month deep-skilling programme, Shannen had to submit a solution involving data extraction, exploratory data analysis and an end-to-end ML pipeline.

While she had done Python programming in an integrated development environment, the challenge now was to put the lines of code into a pipeline script so that they could be executed from a command-line environment. She also had to organise her thought processes and present her solution to the technical assessment panel.

The rigorous entry requirements set the tone for Shannen’s AIAP stint, which began in March this year.

“The learning curve was steep.” The courseware was structured to ensure apprentices develop a finer understanding of machine learning (ML) techniques and concepts, and covered many different aspects of ML such as loss functions, data augmentation, and more. “I had to comb through the learning resources and digest them before applying them to problems that were assigned to us.”


In addition, there were weekly peer reviews of code and presentations on topics that the apprentices had to research and present to the rest of the cohort. “The learning was not just from the mentors but also from fellow apprentices who had different coding styles and approaches to the same problem,” said Shannen.

Besides the coursework, a highlight of AIAP was the opportunity to work on 100 Experiments (100E) projects where AI Singapore engineers help organisations to solve problems for which there are no commercial off-the-shelf solutions.

Shannen was part of a team developing a network cyber attack classifier for a manufacturing client. The goal was to classify abnormal traffic detected in industrial controllers within the company’s manufacturing plant.

Being new to network security, she had to ramp up her knowledge through intensive research and literature reviews. But the project experience was gratifying. Apart from ML modelling, she had the opportunity to package and deploy the code in Docker containers, write unit tests for function modules and use GitLab extensively as a CI/CD tool. She also had the satisfaction of seeing the project through to fruition, with the ML model shipped as a working product to the end-user to achieve its intended purpose.

With two more months to go before she graduates from AIAP, the experience has taught her that there is more to an AI solution than building and tweaking ML models.

“Most of the courses out there focus on building and perfecting the algorithms. It was only after I entered AIAP and was involved in an actual ML project that I was exposed to other dynamics that are equally important in a project lifecycle, such as business expectations, data integrity, building a reproducible pipeline for sustainable ML training and experimentation, and so on.”

“Going through AIAP has helped me to build my skillsets and confidence in meeting the needs of the industry,” she said. “It has also connected me to a network of like-minded friends and given me a good start for a career in AI.”



Privacy Protection Using PeekingDuck

“Data is the new oil” is a catchphrase that reflects modern times. As more data is collected and shared, data privacy and protection become increasingly important, and governments around the world have been implementing regulations in response. In the European Union, the General Data Protection Regulation (GDPR) was adopted in 2016, while in Singapore, the Personal Data Protection Act (PDPA) was enacted in 2012.

Videos and images are also forms of data, and thus fall under such regulations. For example, Chapter 4 of Singapore’s PDPA states that consent is required from identifiable individuals who appear in photos or videos, before they can be used in some situations. As this can be impractical, Chapter 4 also mentions that if the identifiable features of the individuals are “masked”, then consent is not required. This process is sometimes known as “de-identification”.

De-identification of a single photo can be done easily with readily available tools such as MS Paint. But what if you need to de-identify thousands of videos? Doing so manually, frame by frame, would be extremely time-consuming and tedious! There are commercial solutions that promise to solve this (at a price), but at AI Singapore, we are now releasing our de-identification technology as open source, completely free of charge!

PeekingDuck v1.1 to the Rescue

Two months ago, we released PeekingDuck, a Computer Vision (CV) inference framework with built-in modularity using the concept of “nodes”. In our latest v1.1 release, we added new nodes which tackle the problem of privacy protection by de-identifying faces of individuals and license plates of vehicles. PeekingDuck can be configured to de-identify a folder of saved videos or images, and can even run in real time on live CCTV feeds.

Original video (above), with face de-identification (below)
License plate de-identification

Under the hood, the first step is to use object detection models which predict bounding boxes around faces and vehicle license plates. We trained YOLOv4 and YOLOv4-tiny models on datasets of human faces and license plates, and also used an MTCNN model pre-trained on faces, creating our new model.yolo_face, model.yolo_license_plate, and model.mtcnn nodes.

The second step is to mosaic or blur the pixels within the bounding boxes using the draw.mosaic_bbox or draw.blur_bbox nodes. The level of mosaic or blur is a configurable parameter which can be adjusted. If other forms of “masking” are required, such as blacking out the pixels, it is also possible to create your own custom node that does this.
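To make the masking step concrete, here is a minimal, dependency-free sketch of what mosaicking or blacking out a bounding box involves. It is not PeekingDuck's implementation or node API: the function names, the pixel-coordinate bounding box format, and the use of a grayscale image stored as nested lists are all assumptions made for illustration (real frames would normally be NumPy arrays).

```python
def mosaic_bbox(image, bbox, block=2):
    """Return a copy of a grayscale image with the bbox region pixelated.

    image: list of rows, each a list of ints (0-255).
    bbox:  (x1, y1, x2, y2) in pixel coordinates, x2/y2 exclusive (assumed format).
    block: side length, in pixels, of each mosaic tile.
    """
    x1, y1, x2, y2 = bbox
    out = [row[:] for row in image]  # copy so the input frame is untouched
    for ty in range(y1, y2, block):
        for tx in range(x1, x2, block):
            ys = range(ty, min(ty + block, y2))
            xs = range(tx, min(tx + block, x2))
            # Every pixel in the tile is replaced by the tile's mean intensity.
            mean = sum(image[y][x] for y in ys for x in xs) // (len(ys) * len(xs))
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out


def blackout_bbox(image, bbox):
    """Return a copy of the image with the bbox region set to black."""
    x1, y1, x2, y2 = bbox
    out = [row[:] for row in image]
    for y in range(y1, y2):
        for x in range(x1, x2):
            out[y][x] = 0
    return out
```

A larger `block` gives a coarser mosaic, which plays the same role as the adjustable mosaic level described above.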

Mosaic (left) vs blur (right)

Additional Features in v1.1

Aside from the privacy protection use cases above, we also included additional features in v1.1 of PeekingDuck. Firstly, our model.yolo_face node is able to distinguish between masked and unmasked faces, adding another use case to combat COVID-19, on top of our social distancing and group size checking use cases.

Face mask detection

We also responded to a feature request to show a list of all available nodes via the command-line interface. Running the command ‘peekingduck nodes’ shows the different types of nodes, their names, and URLs for more information, as depicted below. We find this feature useful as even we lose track of the nodes and names sometimes. Additionally, we used colours and bold fonts to allow info messages, warnings and errors to be shown more clearly in logs.

Output from ‘peekingduck nodes’ command

Furthermore, Apple started releasing Macs with their proprietary M1 ARM-based chip in late 2020, a significant change from the previous Intel processors. Recognising that there will be more users of M1 MacBooks over time, we successfully tested PeekingDuck on a few M1 Macs and have provided an installation guide.

Moving Forward

As CV continues to have new developments, we are committed to maintaining and adding new features to PeekingDuck over time to ensure that it stays relevant. You are welcome to use our Community page to suggest potential problems that could be solved by CV, and we will consider building nodes to solve them, if viable. In the meantime, do install the latest version of PeekingDuck to try it out!

Find Out More

To find out more about PeekingDuck and start using it, check out our documentation.

You are also welcome to join discussions and reach out to our team in our Community page.

A Sneak-Peek Into a Cool Machine Learning Algorithm: Random Forest!

(This article was contributed by the SUTD AISG Student Chapter)

In a traditional context, a random forest is defined as ‘A supervised machine learning algorithm that is constructed from decision tree algorithms’. In this article, we will attempt to better understand and explain the working principles of the random forest algorithm.

Random forest is essentially a machine learning algorithm that utilizes ‘ensemble learning’, which involves combining several classifiers (decision trees) to obtain the solution to a problem. This is commonly done through a technique called ‘bagging’, which helps to reduce variance to make the algorithm more accurate.
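As a rough illustration of the sampling behind bagging (bootstrap aggregating), each tree is given its own “bag”: a sample of the training records drawn with replacement, the same size as the original set. The sketch below uses only the Python standard library; the function names and the seed are our own choices for illustration, not part of any particular library.

```python
import random


def bootstrap_sample(records, rng):
    """Draw len(records) items with replacement: one 'bag' for one tree."""
    return [rng.choice(records) for _ in records]


def make_bags(records, n_trees, seed=0):
    """Create one bootstrap sample per tree, reproducibly."""
    rng = random.Random(seed)
    return [bootstrap_sample(records, rng) for _ in range(n_trees)]


# Three bags drawn from ten toy records; some records repeat, some are left out.
bags = make_bags(list(range(10)), n_trees=3)
```

Because each tree sees a slightly different sample, the errors of individual trees tend to differ, and combining their outputs reduces variance.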

A decision tree is a decision support model that uses a tree-like representation of decisions and their possible consequences. A decision tree usually has a root node and branches out into multiple nodes, ultimately ending at the leaf nodes.

Fig 1: Example of decision tree
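One simple way to picture this structure in code is as nested dictionaries, where each internal node tests one feature and each leaf holds an outcome. The tree below is a made-up toy example (the features, values and outcomes are hypothetical), and a real tree would be learned from data rather than written by hand.

```python
# A hand-written toy tree: the root tests "colour", one branch then tests "shape".
TREE = {
    "feature": "colour",
    "branches": {
        "red": {"leaf": "apple"},
        "yellow": {
            "feature": "shape",
            "branches": {
                "round": {"leaf": "apple"},
                "long": {"leaf": "banana"},
            },
        },
    },
}


def predict(tree, sample):
    """Walk from the root to a leaf, following the branch matching each feature."""
    node = tree
    while "leaf" not in node:
        node = node["branches"][sample[node["feature"]]]
    return node["leaf"]


print(predict(TREE, {"colour": "yellow", "shape": "long"}))
```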

Supervised learning is a branch of machine learning where the model learns to map an input to an output from labelled examples (input-output pairs). The model therefore has some idea of the kind of output that is expected for a given input. The main types of supervised learning are regression and classification.

Regression involves predicting continuous values, such as the fluctuation in stocks as a time series. Classification on the other hand is predicting distinct outcomes of input, such as detecting if an animal is a cat or a dog.

Random forest models are commonly used to solve problems of regression and classification.

A random forest essentially works by training several decision trees, each on a random sample of the training data (and, typically, a random subset of the features). At prediction time, the input flows through all the decision trees, and the final output of the random forest is the majority output of the decision trees (classification) or the average of all outputs from the decision trees (regression).

Here is an example of a simple classification problem to better understand this procedure:

Fig 2: Example of random forest

In this example, our input data is a description of a fruit (such as colour, size, shape, taste, etc.) that we would like to classify as an apple or a banana. This data is fed into ‘n’ decision trees, each of which classifies it. Once all the decision trees have returned a final output, the majority voting process is initiated, where the model determines the value that the majority of the decision trees returned. This value (in this case, apple) is the output of the random forest algorithm.
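The aggregation step described above can be sketched in a few lines. This is only an illustration of majority voting and averaging over per-tree outputs; the function names are ours, and the tree outputs are hard-coded rather than produced by trained trees.

```python
from collections import Counter
from statistics import mean


def forest_classify(tree_outputs):
    """Majority vote: return the class predicted by the most trees.

    Ties go to whichever class was seen first in tree_outputs.
    """
    return Counter(tree_outputs).most_common(1)[0][0]


def forest_regress(tree_outputs):
    """Regression: return the average of the trees' numeric predictions."""
    return mean(tree_outputs)


# Outputs from five hypothetical trees for one fruit description.
votes = ["apple", "banana", "apple", "apple", "banana"]
print(forest_classify(votes))
```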

Random forest is an efficient algorithm that can handle large datasets, since the individual trees can be trained independently of one another. It is also relatively simple to understand and typically provides greater accuracy than a single decision tree.

The random forest algorithm is actively used in several industries: in banking, to classify customers based on their banking patterns; in healthcare, to predict and diagnose conditions based on patients’ medical histories; in trading, to model stock patterns through regression; and in e-commerce, to predict customer preferences based on user consumption and cookie tracking patterns.

Written by:

Anirudh Shrinivason, Sahana Katragadda, Dhanush Kumar Manogar, Lai Pin Nean

The views expressed in this article belong to the SUTD AISG Student Chapter and may not represent those of AI Singapore.

Synergos in Action

We’ve launched Synergos, a Federated Learning platform AI Singapore has been working on.

The launch marks the start of a long journey. We invite you to join us on this journey. Let’s start by using it as a tool in your machine learning toolbox, and even contribute your code to the repo. Do share with us your feedback and suggestions for new features or areas we can improve.

Now, let’s see a case study of Synergos in action.


The Problem Statement

AI Singapore has been working with Seagate Technology, the global leader in mass-data storage solutions, from hard drives to storage-as-a-service and mass-data storage shuttles. Many of its customers deploy large fleets of data storage devices manufactured by Seagate. Those devices generate telemetry logs on a regular basis, along with health logs that help to monitor the health of the devices.

Seagate is developing a predictive analytics service for its customers, which is expected to forecast which devices are likely in need of servicing. This enables it to provide value-added service ensuring high availability of its customers’ services, right-sizing maintenance for undisrupted operation of their customers with low maintenance cost overhead.

Each customer keeps the logs of its whole fleet of devices locally. Moving data from different customers to a single location for training presents potential challenges in preserving data ownership, residency, and sovereignty. Therefore, to increase customer adoption of telemetry data predictive analytics, while achieving utility from the data, Seagate is collaborating with AI Singapore to apply Synergos in building the predictive maintenance model with Federated Learning.

This is a multi-phase collaboration. Before Federated Learning is deployed into production, Seagate would like to run a pilot to validate the efficacy of Federated Learning. In the current phase, the goal is to validate whether the models built with Synergos (i.e. federated training mode) can achieve the same level of performance as those built with different customers’ data being pooled together into a centralized location (i.e. centralized training mode).


The Data

In this phase, we are working with two datasets from Seagate. In keeping with Seagate’s uncompromising ethical standards when it comes to data, the raw log data was anonymized and in no way ties back to user data.

For each data source, the raw data is composed of the device fleets’ telemetry logs and their corresponding health logs. The data is collected daily over a 20-day period. The raw data is processed by Seagate to construct the data used in training.

  • Records: Each record corresponds to one device on a given day. A device would have multiple records since a device may generate telemetry logs on different days.
  • Features: Initially, there are a few hundred features in the telemetry data. After some feature engineering by Seagate (e.g. feature normalization, missing value imputation, etc.), about ⅓ of the features are eventually selected. The feature names are not exposed to the AI Singapore team, with only the feature data types provided. The feature names are represented by their indices.
  • Labels: Labels are derived from the health logs, which record the health status of the devices. There are three possible label values: 0, 1 and 2. Label 0 means that the device had no failure in the data period, while labels 1 and 2 indicate increasing recency to the device requiring servicing.

For Dataset 1, the distribution of labels is 0: 68.2% / 1: 15.4% / 2: 16.4%, while for Dataset 2 it is 0: 69.6% / 1: 22.3% / 2: 8.1%, as illustrated in the chart below. The datasets show signs of being non-IID, as the data sources have different class ratios. The numbers of records are also significantly different: Dataset 2 is about 35 times the size of Dataset 1. Seagate has confirmed that there is data heterogeneity, as their data sources apply different workloads to their devices.

When conducting the train-test split, 30% of the unique devices that ever failed (either class 1 or 2) and those that never failed (class 0) are kept in the test set, while the other 70% unique devices of both categories are kept in the train set. By doing so, the datasets were split in a stratified manner.
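The device-level split described above can be sketched as follows. This is our own minimal reconstruction using only the standard library, not the code used in the project: the function name, the input format (a mapping from device ID to the labels of its records) and the seed are assumptions. The key point is that whole devices, not individual records, are assigned to train or test, and that the 30% is taken separately from the ever-failed and never-failed groups.

```python
import random


def split_devices(device_labels, test_fraction=0.3, seed=0):
    """Split device IDs into (train, test) sets, stratified by failure status.

    device_labels: dict mapping device ID -> list of labels (0, 1 or 2),
                   one label per daily record of that device.
    """
    rng = random.Random(seed)
    failed = sorted(d for d, labels in device_labels.items()
                    if any(label in (1, 2) for label in labels))
    failed_set = set(failed)
    healthy = sorted(d for d in device_labels if d not in failed_set)

    train, test = set(), set()
    for group in (failed, healthy):
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.update(group[:n_test])    # 30% of the devices in this group
        train.update(group[n_test:])   # remaining 70%
    return train, test
```

All records of a device then follow that device into the train or test set, so no device contributes records to both.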


The Methodology

As discussed above, the main objective of the current phase is to validate whether the models built with Synergos in federated training mode can achieve the same level of performance as those built in centralized training mode.


Model Training in Centralized Mode

We first built models in centralized mode. Seagate has worked on a centralized Gradient Boosting Decision Tree (GBDT) model, a popular tree-based ensemble model. As support for non-neural-network models is still in development for Synergos, we focus on neural network models in the current phase of the collaboration, and built a neural network model in centralized mode as a benchmark.

In the centralized mode, an aggregated dataset is created by combining data from both datasets. The training data and testing data are aggregated separately after the stratified split is done.

We then ran several experiments in Polyaxon, which is a tool used in AI Singapore for experiment management. The best performing model is selected based on the weighted F1 scores across datasets, and the relative model architecture complexity and size to achieve the scores. It is denoted as NN9. The model architecture is shown below:

This model serves as a baseline when validating the efficacy of Federated Learning.


Model Training in Federated Mode

There are two data sources, where each of them is treated as one participant in the federated training. Training is coordinated across these two participants, by a trusted third party (TTP). The same train-test split strategy is applied to each party.

Synergos supports multiple federated aggregation algorithms, including FedAvg, FedProx, FedGKT, etc. In the current phase of the collaboration, we applied two of them, namely FedProx and FedGKT.

FedProx is an enhancement of FedAvg, which is usually seen as the most basic version of a federated aggregation algorithm. In FedAvg, different parties train a global model collectively, with a TTP coordinating the training across different parties. At each global training round t, a global model is sent to all parties. Each party performs local training on their own dataset, typically using mini-batch gradient descent, for E local epochs with B mini-batch size. After every E local epochs, each party sends the parameters from its most recently obtained model state to the TTP. The TTP then updates the global model by conducting a weighted average of the parameters received from multiple parties, with individual parties’ weights θ proportional to their number of records used in the local training. This process iterates until the global model converges or a prefixed number of global training rounds is reached. The diagram below gives a simplified illustration of the FedAvg aggregation process.
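The aggregation step at the TTP can be written out as a short sketch. This is a plain-Python illustration of the weighted average described above, not Synergos code; the function name is ours, and each party's model is simplified to a flat list of parameters.

```python
def fedavg(party_params, party_sizes):
    """Weighted average of parameter vectors, one vector per party.

    party_params: list of parameter lists, all of the same length.
    party_sizes:  number of local training records per party; each party's
                  weight is its share of the total number of records.
    """
    total = sum(party_sizes)
    n_params = len(party_params[0])
    return [
        sum(params[i] * size for params, size in zip(party_params, party_sizes)) / total
        for i in range(n_params)
    ]


# Two parties: the second has three times as many records, so it counts 3x.
global_params = fedavg([[1.0, 2.0], [3.0, 4.0]], party_sizes=[1, 3])
print(global_params)
```

With two data sources of very different sizes, as in this pilot, the larger party dominates the average.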

FedProx uses a similar aggregation mechanism to FedAvg. One key improvement FedProx has over FedAvg is that it adds a proximal term to the local training objective, which restricts the local updates to stay close to the latest global model and helps the federated training to converge faster. The proximal term is scaled by a hyperparameter µ, which is tuned during training.
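In the standard FedProx formulation, the proximal term is (µ/2)·‖w − w_global‖², added to each party's local loss, so its gradient is µ·(w − w_global). The sketch below shows one local gradient step with that extra pull towards the global model. It is an illustration only: the function name, the flat parameter lists and the learning rate are our assumptions, not Synergos code.

```python
def fedprox_local_step(w, grad, w_global, mu, lr):
    """One local gradient-descent step on loss + (mu / 2) * ||w - w_global||^2.

    w:        current local parameters (list of floats)
    grad:     gradient of the ordinary local loss at w
    w_global: latest global model parameters received from the TTP
    mu:       proximal coefficient; mu = 0 reduces to a plain FedAvg local step
    lr:       learning rate
    """
    return [
        wi - lr * (gi + mu * (wi - wgi))
        for wi, gi, wgi in zip(w, grad, w_global)
    ]
```

The larger µ is, the more strongly each party's parameters are held near the global model between aggregation rounds.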

Another federated aggregation algorithm used is FedGKT (Federated Group Knowledge Transfer). It was originally proposed to allow low-compute federated training of big CNN-based models with millions of parameters (e.g., ResNet 101, VGG 16/19, Inception, etc.) on resource-constrained edge devices (e.g., Raspberry Pi, Jetson Nano, etc.). The diagram below illustrates the training process of FedGKT.

FedGKT training (diagram adapted from original FedGKT paper)

Essentially, there is one model in FedGKT, split into two sub-models: each participating party trains a compact local sub-model (called A), and the TTP trains a larger sub-model (called B). Model A on each party consists of a feature extractor and a classifier, and is trained with the party’s local data only (called local training). After local training, all participating parties generate outputs of the same dimensions from the feature extractor, which are fed as input to model B at the TTP. The TTP then conducts further training of B by minimizing the gap between the ground truth and the soft labels (probabilistic predictions) from the classifier of A. When the TTP finishes its training of B, it sends its predicted soft labels back to the participating parties, who further train the classifier of A with only local data. This training also tries to minimize the gap between the ground truth and the soft labels predicted by B. The process iterates over multiple rounds until the model converges.

When the training finishes, the final model is a stacked combination of local feature extractor and the shared model B. One of the main benefits of FedGKT is that it enables edge devices to train large CNN models since the heavy compute is effectively shifted to the TTP, who usually has more compute power. Another benefit is model customization, in that different participating parties would have different local feature extractors which will be combined with the shared model B.

In this pilot, shifting heavy computation processes to the TTP is not the main motivation. FedGKT is chosen as one of the aggregation algorithms mainly because of the benefit of model customization. When training with FedGKT, the selected baseline model NN9 is split into two parts at layer L. The layers below L are trained as the feature extractor (with an additional softmax layer acting as the classifier) by the participating parties with local data, while those above L are trained by the TTP. L is a hyperparameter to be tuned. We could set L=1 or L=2, since NN9 is a relatively simple model with three hidden layers.
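The split itself is simple to picture. In the sketch below, the layer names are placeholders (the actual NN9 architecture is not reproduced here), and we assume that the first L hidden layers go to the parties while the remaining layers go to the TTP.

```python
def split_at_layer(layers, L):
    """Split an ordered list of layers into (party_layers, ttp_layers)."""
    if not 0 < L < len(layers):
        raise ValueError("L must leave at least one layer on each side")
    return layers[:L], layers[L:]


# Placeholder names for a network with three hidden layers and an output layer.
nn_layers = ["hidden_1", "hidden_2", "hidden_3", "output"]

for L in (1, 2):
    party_layers, ttp_layers = split_at_layer(nn_layers, L)
    print(L, party_layers, ttp_layers)
```

A smaller L keeps the local feature extractor lighter, while a larger L gives each party more layers of its own to customize.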


Evaluation and Comparison of Different Models

After models are trained in both centralized and federated mode, we compile the results and compare the performance.

There are three models: the centralized model (which serves as the baseline), the federated model trained with FedProx, and the federated model trained with FedGKT.

When evaluating the performance, we apply the models on the two datasets individually. We focus on the models’ performance on Class 1 and 2 (i.e. devices which failed within recency thresholds). We compare the performance achieved by the centralized model and federated models. The performance of the federated models is expected to be close to the performance of the centralized model. Performance metrics used include precision, recall, and F1.
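For reference, the per-class metrics can be computed as below. This is a generic, standard-library sketch of precision, recall and F1 for a single class, with toy labels made up for illustration; it is not the evaluation code used in the project.

```python
def class_metrics(y_true, y_pred, label):
    """Return (precision, recall, f1) for one class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy labels: 0 = no failure, 1 and 2 = the two failure classes.
y_true = [0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 0, 2, 2, 1]
for label in (1, 2):
    print(label, class_metrics(y_true, y_pred, label))
```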


Actions in Synergos

The following illustrates the setup of Synergos. Each party in action operates as a Docker container, allowing convenient and easy implementation. With all the necessary containers initialized for each party, a single human operator can orchestrate the entire federated process from a Jupyter notebook or GUI. Each customer runs a Synergos Worker container on its own compute resource. We set up Synergos to run a cluster of multiple federated grids. Each federated grid has one TTP, which coordinates the multiple Synergos Workers within the grid. A Director (running the Synergos Director container) orchestrates multiple TTPs. The Director leverages a Message Queue Exchange to facilitate parallel job running across multiple federated grids. The setup is illustrated below, with each terminal representing a running Docker container.

Users interact with Synergos via its GUI – Synergos Portal. There are two types of users, namely orchestrators and participants.

The orchestrator interacts with the Director, and defines the configuration of the federated training, i.e. the hierarchy of collaboration, project, experiment, and run.

A collaboration defines a coalition of parties agreeing to work together for a common goal (or problem statement). Within a collaboration, there may be multiple projects, each corresponding to a collection of data that different parties in the collaboration use. Under a project, there can be multiple experiments, each corresponding to one particular type of model to be trained, e.g. logistic regression, neural network, etc. Under each experiment, there are multiple runs, each using a different set of hyperparameters. In this case, the two datasets form a collaboration. Within this collaboration, one project has been defined, whose goal is to build the predictive maintenance model. Under this project, one experiment is defined, as we are only using the NN9 network. Under this experiment, there are multiple runs, each of which corresponds to a different hyperparameter setting, including the federated aggregation method used (e.g. FedProx and FedGKT).
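The hierarchy can be pictured as nested records. The sketch below uses plain Python dataclasses purely to illustrate the collaboration → project → experiment → run nesting; it is not the Synergos API, and every identifier and hyperparameter value in it is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Run:
    run_id: str
    hyperparameters: Dict[str, object]


@dataclass
class Experiment:
    expt_id: str
    model_type: str
    runs: List[Run] = field(default_factory=list)


@dataclass
class Project:
    project_id: str
    experiments: List[Experiment] = field(default_factory=list)


@dataclass
class Collaboration:
    collab_id: str
    parties: List[str]
    projects: List[Project] = field(default_factory=list)


# Hypothetical IDs and values mirroring the setup described in the text:
# one collaboration, one project, one experiment (NN9), several runs.
collab = Collaboration(
    collab_id="example_collab",
    parties=["dataset_1", "dataset_2"],
    projects=[Project(
        project_id="predictive_maintenance",
        experiments=[Experiment(
            expt_id="nn9",
            model_type="neural network",
            runs=[
                Run("run_1", {"algorithm": "FedProx", "mu": 0.1}),
                Run("run_2", {"algorithm": "FedGKT", "L": 1}),
            ],
        )],
    )],
)
```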

The interaction flow for the orchestrator is shown below.

With the configuration of federated training completed, the participants can then proceed to register themselves with the collaboration/project they want to contribute to. They also declare the compute resource and data they are going to use.

After both the orchestrator and participants provide their meta-data, the orchestrator would then start the federated training. No further actions are required from the participants. The orchestrator can also view the progress of the federated training and the status of various Synergos components in a Command Station Analytics Dashboard. Please refer to our user guide for a walkthrough of the steps to build federated models in Synergos.


The Outcome

The performance of the centralized model NN9 is as follows. This serves as the baseline when comparing with the federated models.

[Table: metrics for Class 1 and Class 2 (three columns per class); values missing from the source]

Performance of the federated model trained with FedProx:

[Table: metrics for Class 1 and Class 2 (three columns per class); values missing from the source]

Performance of the federated model trained with FedGKT:

[Table: metrics for Class 1 and Class 2 (three columns per class); values missing from the source]

For easy comparison, the difference in terms of F1 score between the federated models and the centralized model is calculated and shown in the table below. A negative value here means that the F1 achieved by the federated model is higher than that achieved by the baseline (i.e. the centralized model), which signifies that the federated model performs better than the baseline.
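The sign convention can be made explicit with a one-line helper. The function name and the numbers below are made up for illustration only and are not results from the pilot.

```python
def delta_f1(f1_centralized, f1_federated):
    """F1 difference: baseline (centralized) minus federated.

    Negative -> the federated model scored higher than the baseline.
    Positive -> the federated model scored lower than the baseline.
    """
    return f1_centralized - f1_federated


# Made-up scores: the federated model is better here, so the result is negative.
print(delta_f1(0.80, 0.85))
```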



[Table: F1 difference, Centralized vs. FedProx and Centralized vs. FedGKT, for Class 1 and Class 2; values missing from the source]

What is reported in the last three tables is the performance of the best performing federated models (FedProx or FedGKT). As shown in the last section, the Director in the Synergos Orchestration component has been used to tune the hyperparameters. In total, 86 models have been trained during the tuning process. The chart below shows the average performance of all the different models trained by the Director.



[Table: Centralized vs. FedProx and Centralized vs. FedGKT, for Class 1 and Class 2; values missing from the source]

It can be observed that the federated models trained with FedProx attain model performance comparable to that of the baseline model (ΔF1 is small), while maintaining the confidentiality of individual data sources’ data. The best performing model trained with FedGKT also manages to achieve a performance that is close to that of the baseline. Nevertheless, FedGKT achieves worse performance than FedProx for Class 2 across both datasets. The models trained with FedGKT also exhibit higher variance in performance. This could be because Class 2 is generally a smaller class (compared to Class 0 and 1), and the simple model trained locally is not able to extract meaningful features for this class.


Next Step

We have seen that the federated models do achieve a similar level of performance as that of the centralized baseline model, which serves the current phase’s objective. We have also seen how Synergos can be used to train and tune federated models with ease.

In their application of predictive analytics, Seagate originally used a Gradient Boosted Decision Tree (GBDT) model in their experiment. This highlights the case that machine learning in production is not restricted to deep neural network models. We are working on adding federated GBDT support in Synergos to extend the capabilities of the platform. The support of federated GBDT also goes beyond this collaboration. It would provide Synergos users with a greater variety of models, besides the current deep neural network-based models.


We’d like to give thanks to the Seagate Technology team (Hamza Jeljeli, Ed Yasutake, and Saravanan Nagarajan) who provided the use case and support with the launch of our Federated Learning platform Synergos.



Secure AI Engineering in AI Singapore

With more AI systems being deployed into production, it becomes critical to ensure that the systems are secure and trustworthy. Here in AI Singapore, the SecureAI team is dedicated to developing processes and tools to support the creation of secure and trustworthy AI systems.

As shared in the previous article, one of the key ingredients to robust AI systems is process. Currently, operationalizable process guidelines are missing to guide organizations in developing, verifying, deploying, and monitoring AI systems.

To fill this gap, the SecureAI team has worked on developing a set of guidelines that draws upon AI Singapore’s experience in delivering 100E projects, and consolidates knowledge and best practices from the larger AI community – notably from the Berryville Institute of Machine Learning (BIML) Architectural Risk Analysis (ARA) and Google’s ML test score paper.

In this article, we will share an overview of our findings and how we operationalized them in the organization.

Engineering AI Systems Securely

An AI system is a specific type of software system. The field of software engineering has a relatively well-established set of best practices for the development of software systems. In comparison, the domain of AI engineering is in its infancy and the best practices are constantly being updated and improved. 

The full life cycle of an AI system generally consists of the stages as shown in Figure 1.

Figure 1. Life cycle of an AI system.

The considerations for engineering an AI system can be grouped into one of the following four areas of focus: data, modelling, infrastructure, and monitoring. Each of these areas can pertain to one or more parts of the life cycle. The following are a selection of key considerations under each area, which we have identified to be important for the development of secure AI systems.


Data

Data is a key area where AI projects differ from traditional software projects. Traditional software systems have their logic coded in their source code, whereas AI systems rely on learning from the data provided. This means that any bias or compromise in the data can result in vastly different behaviors and unwanted outcomes in the AI system. Therefore, it is critical to ensure that the data used is trustworthy and reliable.

As data is arguably one of the most important components of an AI system, there are many other considerations in this category. These include, but are not limited to, checking for input-output feedback loops, proper representation of the problem space, data splitting methodology, avoiding unwanted bias from data processing, and ensuring privacy/anonymity.
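
To make the data splitting and representation points concrete, the following is a minimal sketch of a stratified split that keeps class proportions the same in the training and test sets, together with a helper for inspecting class balance. The function names, the toy labels, and the 80/20 split are assumptions made for illustration; they are not part of the SecureAI guidelines.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) with each class split in the same proportion."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # At least one test example per class (an assumption of this sketch).
        n_test = max(1, round(len(indices) * test_fraction))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def class_shares(labels):
    """Fraction of examples per class, useful for spotting imbalance."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


labels = ["cat"] * 80 + ["dog"] * 20  # hypothetical, imbalanced toy labels
train_idx, test_idx = stratified_split(labels)
print(class_shares([labels[i] for i in train_idx]))
print(class_shares([labels[i] for i in test_idx]))
```

Comparing the class shares of each split against the shares expected in the real operating environment is one simple way to document how representative the dataset is.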


Modelling

The model or algorithm is typically what people think of when it comes to AI systems. The model chosen needs to be suitable for the complexity of the problem. It is also important to identify and verify assumptions associated with the models.

Beyond the choice of algorithm, model development is a complex process in which many small decisions are made along the way, any of which can have a critical impact on the performance of the model. It is important to examine these choices systematically, for example by evaluating the sensitivity of the model to its hyperparameters and checking whether the metric used for the machine learning task is appropriate.
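
One way to examine hyperparameter sensitivity systematically is to sweep a single hyperparameter and summarise how much the validation score moves. The sketch below assumes a hypothetical `toy_validation_score` function standing in for "train the model, then score it on a validation set"; in a real project this would be replaced by the actual training and evaluation routine.

```python
import statistics


def sensitivity_report(evaluate, name, values):
    """Score the model at each candidate value and summarise the spread."""
    scores = {value: evaluate(**{name: value}) for value in values}
    return {
        "scores": scores,
        "best": max(scores, key=scores.get),
        "spread": max(scores.values()) - min(scores.values()),
        "stdev": statistics.pstdev(list(scores.values())),
    }


def toy_validation_score(learning_rate):
    """Hypothetical stand-in for 'train, then score on a validation set'."""
    return 1.0 - (learning_rate - 0.1) ** 2 * 20


report = sensitivity_report(
    toy_validation_score, "learning_rate", [0.01, 0.05, 0.1, 0.2, 0.5]
)
print(report["best"], report["spread"])
```

A large spread suggests that the model is fragile with respect to that hyperparameter and that the chosen value deserves a documented justification.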

Beyond basic functional requirements, an AI system can also be tested for non-functional requirements [1] such as fairness, robustness, and interpretability. Robustness testing, specifically, is an area of focus for the SecureAI team, and we will share our work in this area in much greater detail in subsequent articles of the series.
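
As a rough illustration of what a basic robustness check can look like, the sketch below measures how the accuracy of a classifier changes as zero-mean Gaussian noise of increasing strength is added to its inputs. The `threshold_model`, the synthetic inputs, and the noise levels are all hypothetical; this is not the SecureAI team's robustness testing method, which is covered in later articles.

```python
import random


def accuracy(predict, inputs, targets):
    """Fraction of inputs for which the prediction matches the target."""
    correct = sum(predict(x) == y for x, y in zip(inputs, targets))
    return correct / len(inputs)


def accuracy_under_noise(predict, inputs, targets, sigma, seed=0):
    """Accuracy after adding zero-mean Gaussian noise to every feature."""
    rng = random.Random(seed)
    noisy = [[v + rng.gauss(0.0, sigma) for v in x] for x in inputs]
    return accuracy(predict, noisy, targets)


def threshold_model(x):
    """Hypothetical stand-in for a trained classifier."""
    return int(x[0] + x[1] > 1.0)


data_rng = random.Random(42)
inputs = [[data_rng.random(), data_rng.random()] for _ in range(500)]
targets = [threshold_model(x) for x in inputs]  # clean labels from the toy model

for sigma in (0.0, 0.05, 0.2, 0.5):
    print(sigma, accuracy_under_noise(threshold_model, inputs, targets, sigma))
```

Plotting accuracy against the noise level gives a simple degradation curve that can be compared against the noise expected in the deployment environment.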


Infrastructure

Infrastructure supports the entire life cycle of the AI system. This covers not only training and testing, but also deployment and future enhancement of models. The infrastructure should facilitate the process of model training, model validation, and model rollback when needed.

It is important to have proper access control and versioning of the data, model, and code, for traceability, reproducibility, and security. The development and production environments should also be properly isolated.
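
A minimal sketch of the traceability idea follows: each release records a content hash of the data and model artefacts alongside the code version, so that a deployed model can be traced back to exactly what produced it. The manifest fields and the example byte strings are assumptions for illustration; in practice a dedicated model lifecycle management tool would usually handle this.

```python
import hashlib
import json


def fingerprint(payload: bytes) -> str:
    """Content hash: identical bytes always yield the same identifier."""
    return hashlib.sha256(payload).hexdigest()


def build_manifest(data: bytes, model: bytes, code_version: str) -> dict:
    """Record which data, model, and code produced a given release."""
    return {
        "data_sha256": fingerprint(data),
        "model_sha256": fingerprint(model),
        "code_version": code_version,  # e.g. a commit identifier (hypothetical)
    }


manifest = build_manifest(b"id,label\n1,cat\n", b"serialized-model-bytes", "abc1234")
print(json.dumps(manifest, indent=2))
```

Unlike timestamps, content hashes change whenever the underlying artefact changes, which makes silent modifications detectable.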


Monitoring

The performance of an AI system can change in unexpected ways over time, for reasons such as changing trends or the degradation of physical hardware, e.g. sensors that provide input data or computational devices that the model runs on. It is important to continually monitor the performance of the system to ensure that it meets requirements. The monitoring should automatically alert the relevant teams when performance deviates from expectations, so that the necessary actions can be taken promptly, e.g. retraining the model, updating dependencies, or maintaining hardware.
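
The alerting idea can be sketched as a rolling-window monitor that flags when the recent average of a metric falls below a threshold. The class name, the window size, the threshold, and the toy stream of per-prediction correctness values are assumptions for illustration only.

```python
from collections import deque


class PerformanceMonitor:
    """Track a rolling mean of a metric and flag drops below a threshold."""

    def __init__(self, threshold, window=50):
        self.threshold = threshold
        self.values = deque(maxlen=window)

    def record(self, value):
        """Add one observation; return True when an alert should fire."""
        self.values.append(value)
        window_full = len(self.values) == self.values.maxlen
        return window_full and self.rolling_mean() < self.threshold

    def rolling_mean(self):
        return sum(self.values) / len(self.values)


monitor = PerformanceMonitor(threshold=0.9, window=5)
stream = [1, 1, 1, 1, 1, 0, 1, 0, 0, 1]  # hypothetical per-prediction correctness
alerts = [i for i, correct in enumerate(stream) if monitor.record(correct)]
print(alerts)  # indices at which the rolling accuracy is below 0.9
```

Waiting for the window to fill before alerting avoids false alarms on the first few observations; a production setup would route the alert to the relevant team rather than collect indices.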

All four of these areas must be managed properly in order to ensure that the AI system is reliable and secure. This is not an exhaustive list but rather an introduction to the topic of secure AI engineering. For more in-depth discussion on the topic, interested readers may refer to the linked resources.

Operationalizing the Principles

To put the above principles into practice, the SecureAI team has developed the following process, which involves a knowledge-sharing session and a security review.

Knowledge Sharing

At the start of a project, the SecureAI team conducts a sharing session about the common risks faced during the development and deployment of AI systems. The target audience is AI practitioners, engineers, and project stakeholders from both AI Singapore and industry partners.

The primary goal of the session is to ensure that everybody involved with the project understands the importance and implications of AI risks and is aligned with the goal of minimizing them.

It also enables AI developers and engineers to proactively secure the AI system as they are developing it, and helps practitioners from a traditional cybersecurity background to understand the security implications of deploying AI in their systems.

Security Review

The project team is provided with a checklist of questions designed to help them systematically identify and mitigate potential risks in an AI system. Throughout the development process, the project team can refer to the checklist for guidance.

When the project team is ready, they can fill in the checklist with the details of their system design. Based on the responses, the SecureAI team provides an overall risk assessment and recommendations for mitigating potential risks. This process can be iterative in nature, to facilitate the development of more secure AI systems.

At the end of the project, the final version of the report will be handed over along with the project deliverables to the project sponsors.

Risk Control Checklist Examples

The questions in the checklist are organized into four sections that match the areas of focus described above (data, modelling, infrastructure, and monitoring). Sample questions and recommendations are shown in Tables 1 and 2, respectively.

Table 1. Example questions from the risk control checklist.

| Section | Question | Answer [Y/N/NA] | Elaboration [Please justify all answers, including ‘NA’] |
| --- | --- | --- | --- |
| Data | Is your dataset representative of the problem space? | | Please describe the problem space that the ML system aims to address. Please elaborate on how you have ensured that the distribution of the data is representative of the problem (e.g. data covers all intended operating conditions/target demographic, term frequency matches the natural distribution of the target corpus, classes are balanced). Please note down any constraints in obtaining a representative dataset, if any. |
| Modelling | Have you ensured that your model is sufficiently robust to noise in the inputs? | | Please elaborate on how the model was tested for robustness. |
| Infrastructure | Is your ML pipeline integration tested? | | Please elaborate on how you have ensured that your full ML pipeline is integration tested and how often (e.g. automated test that runs the entire pipeline – data prep, feature engineering, model training and verification, deployment to the production servicing system – using a small subset of data, in regular intervals or whenever changes are made to the code, model or server). |
| Monitoring | Will any degradations in model quality or computational performance be detected and reported for the deployed model? | | Please elaborate on how degradations of model performance in the production environment are detected and reported. |

Table 2. Examples of recommendations that may be provided to a project team.

| Checklist Section | Areas of Improvement | Recommendation |
| --- | --- | --- |
| Modelling | The ability to explain the model is relatively low due to the application of a deep learning model. | Post-hoc explainers, like LIME or Grad-CAM, could be applied. |
| Infrastructure | The data and model artefacts are manually versioned with timestamps. | It is suggested that a proper model lifecycle management tool is used. This would help to keep an inventory of different models and their corresponding performance and model stage transition (e.g. promoting a model to production stage or roll-back from production). |
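
The checklist format in Table 1 lends itself to a simple automated first pass before a human review. The sketch below represents each checklist row as a record and flags rows that are unanswered, lack the required justification, or were answered ‘N’. The `ChecklistItem` structure and the triage rules are assumptions made for illustration; the actual risk assessment is performed by the SecureAI team.

```python
from dataclasses import dataclass


@dataclass
class ChecklistItem:
    section: str
    question: str
    answer: str = ""        # expected: "Y", "N" or "NA"
    elaboration: str = ""   # justification, required for every answer


def review(items):
    """Return (question, issue) pairs that need follow-up."""
    findings = []
    for item in items:
        if item.answer not in {"Y", "N", "NA"}:
            findings.append((item.question, "unanswered"))
        elif not item.elaboration.strip():
            findings.append((item.question, "missing justification"))
        elif item.answer == "N":
            findings.append((item.question, "potential risk to mitigate"))
    return findings


items = [
    ChecklistItem("Data", "Is your dataset representative of the problem space?",
                  "Y", "Covers all intended operating conditions."),
    ChecklistItem("Modelling", "Is your model robust to noise in the inputs?", "N",
                  "Robustness has not been tested yet."),
    ChecklistItem("Infrastructure", "Is your ML pipeline integration tested?", "NA"),
    ChecklistItem("Monitoring", "Will degradations be detected and reported?"),
]
for question, issue in review(items):
    print(f"{issue}: {question}")
```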

Following this process allows us to have more confidence that the AI systems developed in AI Singapore are secure and trustworthy. This checklist is continually improved based on feedback and experience from executing projects.

Hopefully, this article has given the reader an idea of how we practice secure AI engineering in AI Singapore. In the subsequent articles of this series, we will be diving into a focus area of SecureAI as mentioned in the ‘modelling’ section above: robustness testing. Stay tuned to learn more about the topic and our work in this area!

[1] Machine Learning Testing: Survey, Landscapes and Horizons

Pitting TagUI Against the Big Boys

RPA (robotic process automation) refers to the automation of digital processes done by human workers, and has been one of the fastest-growing categories of enterprise software over the last few years. One of the leading commercial RPA software vendors, Automation Anywhere, organised a series of RPA challenges in August, providing real-life customer business scenarios for its users and users of other RPA software to solve.

AI Singapore joined in the fun by offering some attractive prizes for TagUI users who solved the challenges using the tool. The TagUI team took part as well. What is evident is that a free and open-source RPA tool like TagUI can solve these business scenarios just as easily, quickly, and reproducibly as commercial RPA software like Automation Anywhere and UiPath.

This can be validated by browsing the #BotGames hashtag on LinkedIn to see posts from users of different RPA tools. There are many posts under this hashtag; to see them all, sort by Recent. Posts made by Ken Soh, a product engineer at AI Singapore working on TagUI, consistently ranked amongst the #BotGames posts with the most engagement from the RPA community.

Week 1 – Customer onboarding filling website with data from Excel
Week 2 – Supply chain updating data from one website to another
Week 3 – HR data migration from desktop app and API to HRM
Week 4 – Accounting scanned invoice processing using OCR

Ken, like other TagUI users, solved the challenges using different flavours of TagUI, including the human language version, the Python version, and the Microsoft Word version (see the Week 3 example above). In fact, for Week 4, the toughest challenge, he was the first to post a solution amongst users of all RPA tools. Also, check out the fastest entries made by TagUI users in the TagUI community forum.

Thank you to all TagUI users who participated. It was a learning experience for everyone in the RPA community who took part in these RPA challenges, kindly created and hosted by Automation Anywhere. The following TagUI users had the fastest entries (in no particular order), each winning prizes worth USD 500 (AI Singapore LearnAI Premium Subscription + DataCamp Premium Annual Subscription).

François Blanc from France – Customer Satisfaction and Quality Manager at Schneider Electric
Abdulaziz Shaikh from India – Final Year Student at M.H. Saboo Siddik College Of Engineering
Nived N from India – RPA Influencer / Trainee RPA Developer at Tata Consultancy Services
Mirza Ahsan Baig from Saudi Arabia – Automation Engineer Intern at Selangor HR Development Centre
Daniel Correa de Castro Freitas from Brazil – RPA Developer at Infosys Consulting
Wei Soon Thia from Singapore – RPA / Data Science / Engineering Project Lead at SimplifyNext
Chee Huat Huang from Singapore – Human Capital Analytics Manager at SATS Limited

In particular, Wei Soon and Chee Huat opted to give their prizes to others in the community who might benefit more. A post-event lucky draw was held using a TagUI workflow to randomly select two people from those who had expressed interest in these AI/ML/DS learning subscriptions. Bibin P John from India and Nur Ashikin Binti Rohaime from Malaysia won the prizes.

Special shout-out to Daniel from Brazil. He shared solutions for all four weeks on his GitHub repository, using both the TagUI human language and Python versions. Check out his well-documented, high-quality solutions.

TagUI and its various ‘flavours’ are completely free and open-source. Go ahead and make a dent in the digital automation space using them as part of your digital toolbox. Create RPA solutions for your clients, your bosses, your colleagues, and even your loved ones 😄 (yes, you really can).

PS – The temporary turbo mode created for the challenge is now available as a permanent option. Users can now run at 10X faster than normal human user speed. A workflow that takes 1h to complete can now be done in 5 minutes. Imagine what RPA can do to help meet your deadlines!
