The Techsauce Global Summit, hosted from Bangkok, Thailand, took place virtually from 5 to 8 October this year. Among the hundred-odd talks distributed across thirteen themes, including AI & Data, AI Singapore’s Director of AI Innovation Laurence Liew participated in a panel discussion to share experiences in AI talent development and industry adoption together with Dr Putchong Uthayopas, Vice President for Digital Technology at Kasetsart University, and Dr Supan Tungjitkusolmun, President of CMKL University.
Right off the bat, our gracious and humble Thai counterparts acknowledged that Singapore is ahead of the curve as far as AI talent development and industry adoption go. Laurence gave a quick summary of what AI Singapore has achieved since it was set up in 2017. Of the one hundred industry projects originally planned, fifty-five have been approved and fourteen are already deployed in production. The project sponsors have been an equal mix of end users (incorporating AI as a competitive advantage) and solution providers (offering new services) across a diverse range of industries. On the talent development front, the AI Apprenticeship Programme, conceived to address the talent shortage, has also seen an explosion in applicants this year, with the demand not projected to taper off in the next two years.
In Thailand, several industry verticals have already been identified in its national AI strategy: food and agriculture, medical and healthcare, energy and environment, tourism, education, logistics, security and manufacturing. It is interesting to note that there is much in common with the Singapore experience despite major differences in our economic structures. For example, 40% of the Thai population is engaged in agriculture-related work, and tourism contributed between 10% and 20% of pre-COVID GDP by various estimates (compared to Singapore’s 4%).
Dr Putchong, a good friend of Laurence of many years, is an admirer of the Singapore model, passionately advocating for a structure similar to the one the Lion City has put in place, which empowers industry to drive the demand for AI applications and talent. Laurence is confident that despite its later start, Thailand, with a population more than ten times Singapore’s and a tremendous amount of talent and data, can be on par with Singapore in a year. A key part of its talent development strategy is the recently launched Super AI Engineer Development Program, which seeks to develop in its first phase 100 such top talents, together with 300 experts and 1,600 starters.
Beyond personal friendships between leaders, ideas to connect more AI professionals across nations were also floated during the discussion. In Singapore, the recently set up AI Professionals Association (AIP) might be one vehicle for such network building. Whatever form it takes, all the panelists agreed that this would be good for the whole Southeast Asian region. Personally, I think there is much mutual learning to be done, in view of the many common AI use cases encountered, and I eagerly look forward to it coming to fruition. Overall, this panel discussion gave me a good glimpse of the development of AI in our neighbouring country.
(The opinions expressed in this article are those of the author and do not necessarily reflect the official position of AI Singapore.)
The Federated Learning Lab is a major initiative within AI Singapore which seeks to bring to industry the benefits of collaborative machine learning while respecting data privacy. Huang Jie and Han Xiang (Han) are two AI apprentices from Batch 5 of the AI Apprenticeship Programme who shared with me their contributions to this effort and what they learned along the way.
Below is a transcript of the conversation [*].
Basil : Hi Huang Jie and Han. Great to have you guys here today.
Huang Jie : Thank you, glad to be here.
Han : Yes, thank you for having us.
Basil : Okay, before we begin, perhaps I would like to invite you guys to introduce yourselves to our listeners. Ladies first, let’s start with Huang Jie.
Huang Jie : For me, after obtaining my PhD in molecular dynamics, I had been working as a software developer for around 3 years. I got interested in machine learning last year and took online courses to learn more about it. Then I found the AIAP programme in AISG, thinking this was the best chance for me to get into this industry.
Han : So, for me, I got my bachelor’s in computer science in 2019. Unfortunately there weren’t any courses on machine learning or AI theory at my school, so I studied it on my own. I read papers, did courses, personal projects. AI Singapore is the first machine learning related job that I’ve done. It’s been an irreplaceable learning experience and I’m glad to have had it.
Basil : So, today we’re going to talk about federated learning. I first got to know about federated learning last year and I did some reading up on it. So, I’m not a total stranger to it. But for the benefit of our listeners, could you explain what federated learning is?
Han : Okay, yes, in super simple terms. Normal machine learning typically requires that you have all of your data together in one bin, one container. Now, imagine a scenario where you have a bunch of containers. Each container is owned by a different person, and none of those people really trust each other. Let’s say one container is owned by a big dog and another person’s bin is filled with beef. You can’t let the dog touch the beef or it will eat it, and this would be bad. And let’s say another bin contains secrets pertaining to the private lives of a bunch of people. You can’t let anyone see those secrets or you might have a bunch of lawsuits dumped on you, and the government might decide to slap you with some limiting regulations as punishment. Anyway, you now need to do machine learning on all of these separate containers whose contents you can’t see and definitely can’t reveal, and you can’t let the owners of the bins mess with any bins that are not theirs. So, it’s pretty challenging, yeah.
Huang Jie : Maybe I can give an example of an actual use case. Imagine you are a hospital owner, and you want to predict how likely it is that a patient will perish. You do not have many fatalities recorded. Maybe your hospital is a good one, or you just have very few electronic records. Predicting mortality is a hard job that is probably going to need lots of data. So your only option is to collaborate with a bunch of other hospitals. Of course, well-run hospitals are not going to reveal the details of their patient data to just anyone. The data has to be kept secret and confidential – it’s the sort of situation federated learning is intended for.
Basil : Interesting. And what are some other industries or applications where federated learning can also add value?
Huang Jie : Yes, actually, there are two types of industries where federated learning is particularly useful. First one would be finance. Machine learning in finance is a little bit infamous because financial data is typically a great simplification of extremely complex real world phenomena. For example, how would you go about predicting the stock market rising or falling, based on Donald Trump catching the coronavirus or tweeting something? How would you account for back-room dealings, pump and dump schemes, or the discovery of new resources in some area? The simplest solution is just to throw lots of data at it, as much as you can, and you can get extremely large amounts of data if you make use of what multiple financial institutions have collected. But because this is financial data, this has to be kept secret for reasons of security and competitive advantage. So, federated learning would allow financial organisations to collaborate with their competitors without revealing their data to any of them.
Han : The second industry would be IoT, the Internet of things. The issue with IoT is that it typically makes use of information collected from appliances and tools, which are a very fundamental part of people’s daily lives, and which are constantly collecting data. So, things like temperature readings of someone’s house, the amount of food in a fridge, the types of things they watch on TV, the topics of conversations they had with family members etc. Now, if someone had access to such data they could create an extremely accurate profile of lifestyle, psychology, ideology of the people associated with the data. So in a benign way, this could be used for lifestyle optimisation. You know, you are running out of a certain type of food so the fridge will automatically order the food … the house is going to adjust the temperature so it’s ideal for what you typically like at a certain point of the day, that sort of nice things. Maliciously, this could be used to determine the perfect time to, say, rob your house and kidnap your child. More seriously, it could be used to profile people according to some measures of social desirability. So let’s just say that it’s data that extremist groups with dreams of radically reshaping society would love to have. It is basically perfect and ubiquitous surveillance. Federated learning would allow IoT companies to collect and make use of this data, but it would reduce the possibility of that data being used for such nefarious purposes.
Basil : Yes, on a related note, I also recall a point made by technology thought leader Kai-Fu Lee in his 2018 book “AI Superpowers”. He mentioned three ingredients of successful AI implementation: big data, computing power and AI engineering talent. Now, once computing power and engineering talent reach a certain threshold, the quantity of data becomes decisive in determining the overall power and accuracy of an algorithm. I think this is a very pertinent point for Singapore, where our size limits the amount of data we have at our disposal. So, now, translating theory into practice, moving from concept to application, what are the technical challenges encountered when implementing a federated learning solution?
Han : Well, I think the main challenge in federated learning comes from what we call statistical and system heterogeneity. Statistical heterogeneity is a natural characteristic of data which comes from many different sources. Let’s take the mortality prediction task we had previously. We have two hospitals : one is in North America, the other is in Uganda. The data from both hospitals deals with the same task, which is mortality prediction. But it reflects completely different sets of socioeconomic circumstances and is influenced by many unique factors. They don’t really represent the same real world phenomena, and machine learning models trained on each of them can end up approximating functions that look and behave dramatically differently from each other. The second challenge I mentioned was system heterogeneity. You have a bunch of participants. Each participant has a different computer. Some of those computers are better, some are worse. Some participants will complete their tasks much faster than the others and then we’ll have to wait. Some participants’ computers will crash, disconnect or jam and leave everyone else waiting with no idea what’s going on with them. And so, it’s a pretty difficult problem of coordination and it’s still unsolved for federated learning. Like, how do you account for all of these issues of timing?
Huang Jie : On top of that, although federated learning has the benefit of allowing parties to collectively train a model without exposing their data, this is a double-edged sword. Since no data is exposed, in a mild case, parties might just take advantage of other parties’ better data, in terms of the model gradients, while contributing lousy gradients obtained from their own useless data. In more severe cases, a party might even perform data poisoning by contributing model gradients that are specifically and carefully manipulated to achieve their malicious goals. So, there is a need to evaluate how much each party’s data contributes to the final model and also how to distribute the reward fairly among all the parties.
Basil : Okay, now coming to AI Singapore. Could you tell us more about what AI Singapore is working on in federated learning?
Huang Jie : The federated learning team in AISG aims to build a federated learning platform that can be available to all the organisations in Singapore. Participants will be able to benefit from a better “model” obtained by collaboratively utilising data from all the parties, without sacrificing the privacy of their own data. In addition, each party can be rewarded by their corresponding contribution. The project started from our Batch 4 apprentices. They were the ones who made this happen. They have built a platform which has been internally released to AI Singapore engineers and partners to gather feedback. In addition, they also performed use case studies to demonstrate the applicability and versatility of the platform. So, Han and I are Batch 5. Our focus is mainly on how to properly incentivise the participants through accurate and fair contribution calculations.
Han : If I could add a little bit more about that, our goal was to find some sort of general solution to the contribution calculation problem, and a general solution doesn’t really exist… Well, there is a general solution, but it is pretty impractical and certainly not suitable for our use case. We started off by trying a whole bunch of new algorithms. We thought that we would try to calculate contribution in a certain special way and then use that method of calculating contribution as the basis for a new federated learning algorithm. So, we came up with a whole bunch of variations which we called FedMean, FedMomentum, FedDemocracy, FedDictatorships … all sorts of new things, none of which really worked well. Well, that’s not true – all of them worked well in certain situations, but failed in others…
Basil : Not universally, you mean?…
Han : Exactly. Not general solutions. It really wasn’t great. It was very inconsistent. We eventually realized that the problem is, with a standard federated learning algorithm, the contribution calculation is very hard because each party that trains the model is affected by information that comes from other parties, so at some point their individual contributions become tangled up. For example, let’s say you have two parties and they’ve been training a model for a certain amount of time, and Party #1 has received a model from Party #2 in the past. Now, it’s going to build its future model based on what Party #2 gave it. However, if the two parties keep exchanging information in this way, the information that they contribute becomes very intertwined, impossible to separate. You can’t just cut the model cleanly in half and say this part was contributed by Party #1 and this part was contributed by Party #2. They all interact in a very unpredictable way. So, how do we solve this fundamental challenge? We came up with an algorithm that we called FedSwarm. FedSwarm is an ensemble learning method, meaning that it basically relies on training a large number of very simple, small machine learning models and then using them all at the same time. Basically treating it kind of like a democracy in that each model votes for what is the correct answer, and then we can take the majority vote, or we can weigh the votes in some manner, but either way the models are collaboratively deciding what the right answer is. This is as opposed to standard machine learning, where one model makes a decision and you just follow it.
Skipping over all the minute technical details, we found that this method got us superior performance compared to the standard federated learning algorithms, and it also meant that we could easily get the contribution of each party in federated learning because each party is training models that never interact with anyone else’s models, the information never gets intertwined. So, you can just separate and say, oh, these models trained by this party get this number of answers correct and these models get another certain segment of answers correct – you can easily tell that they are responsible for different beneficial contributions.
Basil : Sounds like a lot of interesting work that you guys have done. I’m particularly intrigued by the attention paid to the contributions of individual parties, because you can have the most mathematically brilliant model training solution, but things will simply not go according to plan if you do not consider the so-called human factor. Let’s talk about the apprenticeship in general. How has it been for you guys?
Han : I think it’s been very interesting. For a very long time, I had honestly had very few people to talk to who wanted to do AI and machine learning. Most of my classmates were interested in software and network development, but not really in machine learning. So, I eventually decided that, interesting and important as those things are, they’re not really what I want to focus my career on. So getting into AI Singapore meant that I suddenly got to talk and interact with a lot of people who had the same goals and were interested in the same topics. So, the experience has been pretty wonderful. I’m quite glad to have the opportunity to be an apprentice. It really expanded my worldview and I’m quite grateful for that.
Huang Jie : I agree with what Han has said. On top of that, the apprenticeship actually feels quite stressful for me. But all the apprentices are talented and hardworking, so working with them inspires me a lot. Since we all come from different backgrounds, having the chance to discuss with all the people here, even just on general non-machine learning topics, is quite inspiring to me.
Basil : Good to hear that. Are there any particular learning points that you guys would like to mention, especially to listeners considering applying for the programme?
Han : Well, I think that the breadth and scope of the 100 Experiments projects at AI Singapore is pretty inspiring. I think that the real value of machine learning comes out when it’s applied to solve practical real world problems, and the 100E projects really focus on a lot of problem areas that many people might just not think about or consider, but which are necessary. So, in my case, I had never heard about federated learning. I never knew it was a thing. But I got the chance to work on it in depth, to really explore the theory and the technical side of it in great detail and experiment with it. And Huang Jie and I were able to write a paper on FedSwarm … how do I say? I’m not really sure how to summarise this, but I think we both definitely got a lot more out of the programme than we were expecting.
Huang Jie : Since Han has mentioned the technical side, maybe I can talk more about the non-technical sides. I have learned a lot from our mentor, Jianshu, who is the lead in the Federated Learning Lab, and also from my teammate, Han. They inspire me by being original and attending to details – Han is very excited about trying out new ideas and Jianshu is very careful about the theoretical background of the approach that we are taking. Now, AI – which is quite an overloaded term – is a rapidly advancing field which requires whoever is in it to keep up with the research frontier, but also at the same time to focus on the application of their research advancement. So, if I were to summarise, I would say just like what the Red Queen said, “It takes all the running you can do, to keep in the same place. If you want to get somewhere else, you must run at least twice as fast as that.”
Basil : I hope the experience will serve you guys well in the future. Thanks for being here today. Here’s wishing you guys all the best in the rest of the programme and beyond.
AI Singapore recently formed a team to build a platform for Federated Learning. This article is the first in a series that serves as a journal of our journey into the world of Federated Learning. It is not meant to be a survey of Federated Learning, which is itself a huge and active research area. Rather, we seek to capture our understanding of Federated Learning, why we feel it is important and what we plan to work on.
Why is Federated Learning needed?
We would like to start with the diagram below.
CRISP-DM (Cross-industry standard process for data mining) is an open standard process model that describes the common approaches used by data mining experts. There are also alternative methodologies, e.g. SEMMA by SAS and the more recent Team Data Science Process by Microsoft. Although it was conceived more than 20 years ago for the data mining community, the fundamental principles are still relevant for the AI/data science community at-large today: data remains the core, and there are some key stages through which projects are typically executed: business understanding, data acquisition and understanding, modelling (including data preparation/feature engineering, modelling, and evaluation), and deployment.
Data is Key
From a data management perspective, data acquisition typically starts with data discovery. Data discovery is the process of searching for existing data, internal or external to an organisation, that is available for modelling. With the rapid pace of digitalisation across most industries, more and more data is being generated, usually by many different systems. Furthermore, much of that data is generated without the intention of supporting specific downstream AI project needs in the first place, so it is often not easily discoverable. Hence the need for a data catalog system. This is one of the reasons driving the need for a common data platform in AI Singapore.
Suppose we discover some data which could be relevant for the problem statement. In many cases, the data is incomplete and needs to be augmented with additional information. For example, a model which predicts whether an insurance policy is likely to lapse is useful for the insurers. Nevertheless, data within the insurance domain alone would be insufficient to build a model with high accuracy. Lapse could be driven by factors that are not captured by the insurance companies. For example, a lapsed policy could be due to the fact that the policy holder has lost income. Banks could have such personal data that could be used to augment insurance data, if it could be shared.
On the other hand, there are data privacy concerns when it comes to cross-organisation personal data sharing. There is increasing awareness of the need to stay compliant with regulations such as the GDPR and PDPA when sharing data. Even if it can be ensured that the personal data sharing stays compliant, there are plenty of business/commercial considerations that do not give organisations enough justification/incentive to share the data they have acquired.
So the question now is how we can better utilise the data that is scattered across different systems in a privacy-preserving manner. We believe Federated Learning is a promising solution.
What is Federated Learning?
Federated learning is a machine learning technique that trains a model across multiple decentralised parties holding local data, without exchanging that data. It is different from conventional machine learning techniques where all the local datasets are uploaded to one centralised location.
Don’t be bogged down by this seemingly complex diagram. Here are the main steps involved:
A group of parties (with local data) come together and form a network, with the common goal to train a model together. The number of parties varies depending on the use case. They agree on the type of model to be trained.
The Trusted Third Party (TTP) acts as the coordinator (it does not contribute data). It sends this model to all the other participating parties. This model would serve as a baseline for each individual party to start training with only local data.
The TTP then aggregates the new learnings from the parties and continues to improve the shared model.
The new shared model is again sent back to the participating parties and the same cycle repeats again and again. With each iteration, the shared model maintained by the TTP gets better.
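The cycle above can be sketched in code. Below is a toy, simulated illustration (not the platform's actual implementation): three parties each hold private data, locally train a simple logistic regression model sent out by the coordinator, and the coordinator averages the returned weights in FedAvg style. Only model weights travel between parties; raw data never leaves its owner.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    # One party trains the shared model on its local data only
    # (plain logistic regression via gradient descent).
    w = weights.copy()
    for _ in range(epochs):
        preds = 1 / (1 + np.exp(-X @ w))
        grad = X.T @ (preds - y) / len(y)
        w -= lr * grad
    return w, len(y)

def aggregate(updates):
    # Coordinator-side step: average the party models,
    # weighted by each party's local sample count (FedAvg-style).
    total = sum(n for _, n in updates)
    return sum(n * w for w, n in updates) / total

# Simulated network: three parties, each with private local data
rng = np.random.default_rng(0)
parties = []
for _ in range(3):
    X = rng.normal(size=(100, 4))
    y = (X @ np.array([1.0, -2.0, 0.5, 0.0]) > 0).astype(float)
    parties.append((X, y))

shared = np.zeros(4)  # baseline model distributed by the coordinator
for _ in range(10):  # training rounds
    updates = [local_update(shared, X, y) for X, y in parties]
    shared = aggregate(updates)  # only weights travel, never raw data
```

In a real deployment, the local updates would run on each party's own infrastructure and only the weight vectors (or gradients) would be transmitted to the TTP.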
There are variations of Federated Learning, depending on how the different steps above are conducted. For example, the TTP role is not always needed to coordinate the model aggregation. One important variation arises from how data from the different parties “overlap”.
In the example above, where data from the insurance company can be augmented with data from the banks to build a better policy lapse model, the different parties have a big overlap in the user space, but a small overlap in the feature space. This scenario is called Vertical Federated Learning or Feature-based Federated Learning. In this scenario, typically only one party has the label y (or ground truth). In the insurance policy lapse example, only the insurance company has the label indicating whether a policy has indeed lapsed.
There is another scenario where different parties have a big overlap in the feature space, but small overlap in the user space. This scenario is called Horizontal Federated Learning or Sample-based Federated Learning.
In this illustration, Party A, B, and C have data with different IDs, while they share the same set of features x1, x2, and x3. Each of them also has the labels for the data it owns. The most famous example of a Horizontal Federated Learning application is Google’s Gboard. In this application, billions of mobile devices come together to train Gboard’s query suggestion model. Different devices collect the same set of features.
Horizontal and Vertical Federated Learning would require different treatments when aggregating learnings from different participants.
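To make the horizontal/vertical distinction concrete, here is a toy illustration using pandas DataFrames (the column names are made up for this example). In the horizontal case, parties share the feature schema but hold disjoint users and their own labels; in the vertical case, parties share users but hold disjoint features, with only one party holding the label.

```python
import pandas as pd

# Horizontal (sample-based) FL: same features, disjoint users,
# and every party holds the labels for its own rows.
party_a = pd.DataFrame({"id": [1, 2], "x1": [0.1, 0.4], "x2": [1.0, 0.2], "y": [0, 1]})
party_b = pd.DataFrame({"id": [3, 4], "x1": [0.7, 0.3], "x2": [0.5, 0.9], "y": [1, 0]})

# Vertical (feature-based) FL: shared users, disjoint features,
# and only one party (here, the insurer) holds the label.
insurer = pd.DataFrame({"id": [1, 2, 3], "premium": [200, 150, 300], "lapsed": [0, 1, 0]})
bank = pd.DataFrame({"id": [1, 2, 3], "income": [5000, 0, 7000]})
```

If centralisation were allowed, the horizontal case would amount to concatenating rows, while the vertical case would amount to joining tables on the shared ID – which is exactly what Federated Learning avoids doing with the raw data.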
What is AISG doing in this area?
Our goal is to build a platform to support Federated Learning. Some of the key features include (but are not limited to):
It should support both Horizontal and Vertical Federated Learning.
It should support most of the mainstream machine learning algorithms, such as logistic regression, tree-based algorithms, deep learning.
It should support multiple federated aggregation mechanisms. Whether the setting is Horizontal or Vertical, one of the key problems in Federated Learning is how to deal with the statistical heterogeneity in the data of the different participants. In the Federated Learning setting, it is quite common that the data owned by individual parties is generated by different processes. Such a data generation paradigm violates the independent and identically distributed (I.I.D.) assumptions frequently used in conventional machine learning techniques. How to effectively address this remains an active research topic. The most commonly used aggregation mechanism is FedAvg. It is known to have some drawbacks, and many improvements have been proposed, e.g. FedProx. Nevertheless, those proposed improvements have their own pros and cons too. Our platform would provide users options to choose different mechanisms.
It should provide a contribution and incentive mechanism. As shown above, one of the main benefits of Federated Learning is that it enables collaborative model training without the individual parties exposing their training data. But this is a double-edged sword. It also opens the door for the “free-rider”, i.e. a participant who tries to profit unilaterally by deliberately injecting dummy data into the training process. An extreme version of this scenario is when a participant launches a data poisoning attack. How to eliminate data poisoning attacks under a Federated Learning setting is still an active research area. As a first step, a contribution and incentive mechanism could help to identify potential free-riders so that the collective benefit of all the participants could be optimised.
It should support auto tuning of hyper-parameters.
It should be easy to deploy and use. The platform would support deployment via Docker Compose. It will also be integrated with lifecycle management platforms like MLflow to provide a one-stop management of the whole cycle of Federated Learning training from reproducibility to model registry to deployment.
It should be tightly integrated with our common data platform. We take a holistic view of data management and machine learning in AI Singapore. With this integration, users will be able to search/discover data contributed by other users which could be relevant for their Federated Learning model.
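One simple way to illustrate the contribution mechanism mentioned above is a leave-one-out score: retrain (or re-aggregate) the model without each party in turn and measure the resulting performance drop; removing a free-rider barely changes performance, so its score is near zero. This is a rough, hypothetical sketch of the idea only – not the platform's actual mechanism (Shapley-value methods generalise it), and the toy `train_fn`/`evaluate_fn` stand-ins below are invented for illustration.

```python
def leave_one_out_contributions(parties, train_fn, evaluate_fn):
    # Score each party by how much global performance drops
    # when that party is excluded from training.
    baseline = evaluate_fn(train_fn(parties))
    scores = {}
    for name in parties:
        rest = {k: v for k, v in parties.items() if k != name}
        scores[name] = baseline - evaluate_fn(train_fn(rest))
    return scores

# Toy stand-ins: "training" sums each party's data quality and
# "evaluation" returns that sum as a mock performance score.
party_data_quality = {"party_a": 3.0, "party_b": 2.0, "free_rider": 0.0}
scores = leave_one_out_contributions(
    party_data_quality,
    train_fn=lambda ps: sum(ps.values()),
    evaluate_fn=lambda model: model,
)
# Removing the free-rider leaves performance unchanged, so its score is 0.
```

In practice, each leave-one-out evaluation would involve a full federated training run, which is expensive; this cost is one reason contribution calculation remains an active research problem.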
There may be questions as to why AI Singapore is investing effort in building yet another Federated Learning platform, given that there are a handful of them out there, including TensorFlow Federated and FATE (Federated AI Technology Enabler). We see potential applications of Federated Learning when we collaborate with our industry partners to deliver our flagship 100E programme. During our search for suitable solutions, we saw some limitations of existing frameworks/platforms. To name a few:
Some of them only support Horizontal Federated Learning. We see applications of both Horizontal and Vertical Federated Learning in various use cases.
Most of the existing frameworks/platforms assume that all participants build local models of the same configuration. This may not be ideal, as some parties may achieve a more optimal local model with a configuration different from other parties’. A related technique is Split Learning. With Split Learning, different parties jointly train a neural network. Each party trains the first few layers locally, up to a specific layer known as the cut layer. The outputs at the cut layer are then sent to the TTP, which completes the rest of the training without seeing any party’s raw data. Split Learning can be configured to allow different parties to train different local network configurations. Split Learning will also be implemented as one of the supported federated aggregation mechanisms.
There is no explicit treatment of the “free-rider” problem. This problem has been extensively studied in the area of peer-to-peer systems, but not so much in Federated Learning. We feel that this is important to deal with in order to keep Federated Learning sustainable. We will design and implement a contribution and incentive mechanism as a first step to address this issue.
Most of them do not have end-to-end integration with other systems, like data catalog and lifecycle management. We feel that all these are important components that make an ML system work within a production setup.
The points mentioned above should not be taken to suggest that we feel our platform is superior to the existing ones. Quite the contrary: being a latecomer in this field, we have drawn a lot of inspiration from them. We have also been actively contributing back to some of the open source projects in the field, for example, the FATE community.
The area of Federated Learning is evolving rapidly. Building a platform from the ground up and applying it in projects will equip our apprentices with important skills for dealing with AI projects in a real-life environment which is getting more privacy-aware. It will also give our apprentices the opportunity to learn together how to avoid hidden technical debt.
In the subsequent articles in this series, you will hear more about how the platform is architected and designed, how certain important issues in Federated Learning are addressed, and about some use cases with the platform. Stay tuned!
The Federated Learning Series
AI Singapore’s Journey Into the World of Federated Learning (this article)
My name is Azmi and I’m a Senior AI Engineer at AI Singapore. Every day I work on interesting projects together with highly motivated, capable and, most importantly, awesome colleagues to solve AI problems. How did I get to be this lucky?
To answer that, I would have to take you back 5 years to the end of 2015; to one of the lowest points in my professional career. At that time I didn’t feel lucky at all.
I found myself retrenched for the first time in my working life.
Act 1: An Unexpected Ending
Let’s start with a little backstory.
After graduating from university with a degree in Computational Physics, I found myself working in one of Singapore’s life sciences research institutes. I was part of a programme to develop national capabilities in Grid Computing (the precursor to Cloud Computing) after which I continued as a Software Engineer to support biomedical research for a total of three-and-a-half years. I then decided to make a change and found myself working for an international company that provided Geophysical services to Oil & Gas companies. As it turned out, I eventually spent more than 10 years with this company, starting off as a Geophysicist whose job was to process huge volumes of seismic data that represent the subsurface structure of the Earth. These data volumes would then be used by the Oil companies to identify potential reservoirs for exploration.
A seismic data processing geophysicist will generate 3D image volumes using large quantities of data and significant computing resources. This volume is then used by Oil companies to identify geological structures and potential hydrocarbon reserves.
(Image courtesy of USGS)
I really enjoyed my work. I became competent enough to progress through the ranks, advancing to become a Technical Trainer. I was eventually appointed as the Regional Training Supervisor for APAC, where I was responsible for the training management of about 800 employees across the region. Somewhere through all of this, I even completed a Master's degree in Geoscience to deepen my knowledge in my field.
It was around 2014 when oil prices started to slide downwards from highs of more than USD100 per barrel. Prices came down to a point where it was no longer economical for Oil companies to invest in new fields. Any company that depended on the fortunes of the Oil industry was greatly affected, as many Singaporeans will probably remember from that time. The company I was working for was adversely affected and has experienced sustained difficulties ever since. Even today, the Oil & Gas industry still faces tough challenges, compounded by COVID-19, whose effects have rippled across the global economy. It was against this backdrop that the company had to retrench a number of its people at the end of 2015 in order to save costs. I have family and friends who have gone through retrenchment, but it feels different when it happens to you.
Act 2: Out of the Fire and into a Maze
To be honest, I probably didn’t handle retrenchment too well in the beginning. There was a whole range of emotions that I felt at the time which would probably require an entire article to express. Fortunately, I won’t do that to you here.
After getting over the initial shock, I had to assess my situation. I had just come out of a 10-year job with the same company. I didn't know how to look for a new job. The last time I applied for one, it was listed in the Ads section of the newspaper. I hadn't even updated my resume in years. Fortunately, there was support for this through various agencies such as E2i and WSG.
Initially, I did what many people in my shoes would do: look for a job as close as possible to what I had been doing, in both role and industry. I was fortunate to have received some retrenchment benefits, so at the time I felt I could hold out, hoping that the Oil and Gas industry would recover. The industry is well known for its cyclical nature, and the hope was that the downturn would be only temporary. However, this turned out to be blind optimism. One month soon became two, which then became three. Six months passed and I was not even close to finding a suitable job opportunity anywhere in Singapore, or even in the region. It became clear that the downturn was here to stay.
It was around this time that I began to look at what other job roles I could do. The online learning movement had started to gain traction, and it was then that I came across this 'thing' called Data Science. It caught my eye because it was similar in some aspects to being a Geophysicist: Seismic Data Processing is also about managing, processing and making sense of data. So, with doe-eyed enthusiasm, my plan became: "Let's do some of these online courses and then I'll be able to get a job as a Data Analyst. Easy peasy!"
It turned out that I was wrong and not for the last time in this journey.
With time on my hands, I completed a number of Data Science courses in quick succession over the next few months, including some well-known ones. Even with those 'certifications' in hand, I was still not getting any opportunities in my job search. In fact, I wasn't even getting interviews. I was extremely frustrated and my morale was at an all-time low. I felt I was missing some pieces of the job search puzzle, but I didn't know what they were.
So by this time, almost three quarters of 2016 had gone by with no silver lining in sight. This was not a fun year.
Act 3: The Fog Clears
I then heard from a friend about a Data Analytics training programme delivered by the Institute of Systems Science at NUS. It was run under WSG's Professional Conversion Programme (PCP), which targeted PMETs who wanted to transition into Data Analytics, and it included job placements and real-world practical experience. I felt that practical experience was the key piece missing from my Data Science learning. On paper, I seemed the perfect fit, and I applied with renewed enthusiasm. Once again I was brought down to earth with a rejection, owing to the limited job placements available (being accepted and employed by a company prior to starting was a key requirement of the programme) and the large number of applicants. Out of desperation, I persisted with NUS-ISS and was accepted into the same training programme, but directly as a paying student. The main difference between a PCP participant and myself was that I would not have a job at the end of the training.
The programme consisted of a month-long classroom-based training followed by a 6-month practicum at a company. It was during this period that I was more exposed to and began to be more interested in Machine Learning. For the project work, I was attached to HP where I built an anomaly detection model for one of their manufacturing processes.
I believe it was this practical work experience that became the key to landing a job in the new field, as it was a demonstrable outcome of all my learning in Data Science. After completing the programme, I got a job as a Data Engineer at a local Data & AI consultancy. This company was key to my initial development in the Data Science field. First, they encouraged me to work towards and attain technical certifications (Microsoft Certified Solutions Expert). Second, they allowed me to utilize my previous skill sets to gain exposure and experience in the industry by delivering public presentations and training courses, and by participating in trade shows. They also encouraged my interest in Machine Learning by putting me in the Data Science team and on ML-focused projects.
In mid-2019, I saw a role advertised at AI Singapore. I applied to join as an AI Engineer and I've been here ever since. I was struck by AI Singapore's 'Grow our own timber' philosophy. In the AIAP programme, I see apprentices going through journeys similar to the one I went through a few years earlier. As an Engineer and a Mentor, I believe it's my duty to help train these apprentices to transition into Data Science and become successful AI Engineers when they complete the programme.
Throughout this new career path that I'm forging, I've tried to find opportunities to incorporate the skills and knowledge I've accumulated over the years. For example, I was able to learn new programming languages, R and Python, relatively easily, and I am able to build robust, complicated ML systems thanks to the experience I gained as a Software Engineer early in my career. I've used my technical training experience not only to mentor Apprentices but also to develop new technical content for AISG and to make external presentations when called upon. I draw on my experience as a Geophysicist to handle technically challenging AI problems, manage large quantities of data, and engage customers so that we can solve the problem together. I was even able to apply my Geophysics background directly in one project that required some knowledge of signal processing and acoustic waves.
The past five years have been professionally challenging, triggered by an event that I hope people will never experience in their careers. However in some perverse way, I’m glad that it happened. I might not have been on this path and would not be writing this article otherwise. I also feel that I’ve grown mentally stronger because of it and will be more resilient to future hardships. Could I have done better? Sure, as they always say ‘hindsight is 20/20’. If I had a time machine, I would probably tell my younger self the following:
“Don’t stay in the bubble”: When I was working in the Oil and Gas industry, I wasn't paying any serious attention to anything outside of Geophysics and the industry. Had I widened my attention, I would have known about the rising trend of Data Science and Machine Learning. This could have accelerated my entry into the new domain.
There is a common axiom that you should never stop learning. It has never been more true. With all the free and affordable online resources available, there is no reason not to. The act of learning itself is also important: studies have shown that continual learning can help arrest cognitive decline as one grows older.
The growth of online learning has been a boon to many people. It has made new knowledge much more accessible. However, you need to realize that completing the online learning is only half the battle; you also need to develop the real-world practical skills that will allow you to apply this knowledge.
Finding a job is a skillset on its own. Learn to utilize career coaching services and recruitment agencies more effectively. Write better resumes. And pay more attention to building a better online professional profile e.g. LinkedIn.
Do an honest audit of your own competencies and identify those that can be transferred into the new domain.
Above all, do not forget your mental health. It is important to stay motivated and not lose hope. Don’t be embarrassed to seek help.
For my final thoughts: as I write this article in 2020, COVID-19 has profoundly changed the world that we know. Many people are facing personal and professional challenges as a result of this pandemic, directly or indirectly. I can't really speak to the personal challenges, but maybe I can provide some solace for someone going through professional challenges now. I hope you will read this and find some inspiration or motivation to keep fighting on. The future is uncertain, but if you persevere, there's always a chance you'll come out the other side better and stronger.
* Literary fans may recognize Tolkien’s The Hobbit references in the title and section headers. You were not imagining it!