
Grooming parents and teachers for AI for Kids (AI4K)™

AI for Kids (AI4K)™, a programme created by AI Singapore for children aged 10 to 12, aligns with our philosophy to “Grow Our Own Timber”. After we completed our public boot camps during the June school holidays in 2019, several Singapore primary schools approached us to help conduct additional AI4K boot camps in their schools.

We stepped up to help the schools develop their capabilities and resources so that, in future, they will have the means and flexibility to teach AI to their students at their own pace.

With Facebook returning for the second time as official AI4K™ programme sponsor, we kicked off Phase II in late October 2019 with the aim of training MOE school teachers and their ICT facilitators to be certified as AI4K Instructors. We also accepted parent volunteers for training – both those intending to join and those already in the schools’ Parent Support Groups – so that they can play a part in helping the schools scale the outreach of the boot camps.

An AI for Everyone™ (Parent Volunteers Edition) workshop was held on 26th October 2019 to announce Phase II and invite parents to apply.   Four schools also announced their support during the workshop:

  • Wellington Primary
  • Southview Primary
  • Yuneng Primary
  • St. Gabriel Primary

The first Train-the-Trainer (TTT) session was held in Yuneng Primary in early November with 15 teachers. Subsequent TTT sessions trained more teachers and parents who had applied through the workshop. Facebook hosted the second TTT session at their South Beach Tower office.

AI4K™ Master Trainer – Darren TAN

Train-the-Trainers sessions

Facebook’s South Beach Tower office 

As part of their training, trainees conduct or facilitate at least one public AI4K™ boot camp organised by AI Singapore before graduating. We arranged four boot camps for Phase II – on 9th and 14th December 2019, and 5th and 12th January 2020.

Public Bootcamps at Singapore National Libraries


AI4K™ Phase II was completed at the end of March 2020, and today we have certified AI4K instructor teams across Singapore in 22 primary schools.

Trained Teachers or ICT trainers:
• Ai Tong School
• Canberra Primary
• Chongfu School
• Kong Hwa School
• Nan Chiau Primary
• Pei Chun Public School
• Poi Ching School
• Qihua Primary
• Southview Primary
• St. Gabriel Primary School
• Tao Nan School
• Temasek Primary
• Waterway Primary
• Wellington Primary
• Yuneng Primary

Trained Parent Volunteers and/or Parent Support Groups:
• Frontier Primary School
• Jiemin Primary
• Ximin Primary
• Nanyang Primary
• Princess Elizabeth Primary
• Red Swastika School
• Tanjong Katong Primary

We sincerely acknowledge the tremendous support from all the schools and trainees in making AI4K™ Phase II a success, and especially our programme sponsor Facebook.

Thank you!

Some comments from the graduated participants:

“It was an eye opening experience to peek into the world of AI from the student’s point of view. Sessions were great fun and it gave me the opportunity to be a part of an amazing team to impart AI knowledge to the kids.”

Sri Rathi Priya
ICT Trainer, Yuneng Primary school
“The AI4K Bootcamp and the Train-the-trainer (TTT) course were great experiences for me. As a participant of the TTT course, I am thrilled for the opportunity to visit Facebook corporate office at South Beach Tower. Thanks for this arrangement! As a facilitator to the AI4K Bootcamp on 7/12, it’s heartening to see excitable kids having fun while contributing to Machine Learning — training the computer to recognise their face/hand gestures.”
Chang Cheng Liang
Teacher, Poi Ching school
“The children were having a great time during the boot camp. They participated actively and excitedly throughout the camp. I am truly impressed with their enthusiasm and overwhelming response. Their interest in and exposure to AI is unbelievably high. This is a very good start and we are looking forward to AISG conducting more camps for students with different levels of challenges.” “Once again, thank you AISG for the effort in kick-starting this exposure for children.”
Ooi Hoey Yee
Parent support group
“As an engineer with many years of experience, training the machine is not really critical for me. This programme is a fun way of learning, and for me it was not just teaching but learning with the kids. The kids’ knowledge, interest and imagination are really mesmerising.” “The kids, our future, are really bright!”

Parent volunteer
“The course was comprehensive and intense. It equipped me with the necessary knowledge to share about AI in school. Following up on the course, I implemented an AI curriculum for the Robotics CCA pupils and the pupils enjoyed the sessions very much, especially when they tried the machine learning website. I am sure they will go on to find out more about AI in the future.”
Tan Weihan
Teacher, Canberra Primary
“It was a wonderful experience to teach kids about AI. They seemed to enjoy the hands-on activities throughout the session at the AI4K Bootcamp. It is indeed a great platform to give young kids a kick start in building the basic foundation of AI.” “Thanks to the AI4K team.”

Parent support group
“I really enjoyed the TTT session as it was very educational and engaging, with the concept of AI explained to me very clearly. Most importantly, I got to play a part in leading the AI4K bootcamp and contributing to the development of young talents in AI.”

Wu Fan
Parent support group
“Amazing introductory AI4K course for both beginners and people like myself who were totally new to AI. An informative and easy-to-understand module which allowed me to know more about AI and its mechanisms. I loved how Trainer Darren made use of engaging videos, games and activities to make the course so lively – children will love them! I thoroughly enjoyed it, and the course content was exactly what I was looking for to fully maximise my learning.”
Agnes Poon
Parent support group
"The workshop provided clear visuals and hands-on experiences which helped us to gain a deeper appreciation of the fundamentals of AI."
Ernest Choon
Head of Department (ICT), Temasek Primary

Are “complex” ML Models Always Better?

After mentoring data professionals for a few years, I have noticed that there is always a handful of recent graduates (from universities or boot camps) who are caught in the mental trap of “always go for the most COMPLEX machine learning model!”

For instance, some of them would always go for stepwise regression (bi-directional elimination) to estimate customer lifetime value. When asked why they did not also test forward selection and backward elimination, the usual answer is, “I did not do so because I thought that stepwise regression is sufficient, given its complexity.”

So let me set this straight. When we are doing supervised learning, we have a target (Y) and several features (Xs). Our job as data scientists is to find the “hidden” relationship between the target – what we want to predict – and the individual features. I do not deny that complex machine learning models help us model complex relationships (a mixture of quadratic and cubic terms, “curly” decision boundaries, etc.), but at the end of the day we must remember that it is a “curve fitting” exercise, as mentioned by Judea Pearl in this article. The “curve” need not necessarily be curved; it can be a linear relationship (i.e. when X changes, Y also changes proportionally).

What I am saying is, we cannot discount a machine learning model just because it is not “complex” enough. We should only discount a machine learning model when it has been tested and does not give better “predictions” (i.e. does not model the relationship between the target and features accurately), or when it does not meet the business requirements – for instance, it lacks the transparency needed, or the cost of implementation is prohibitive.

In conclusion, as data scientists, we need to test all plausible machine learning algorithms, regardless of their “complexity” (usually defined by how complicated the underlying math is), as each class of machine learning model provides a different way to “curve fit” the hidden relationship between the features and the target.
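To make this testing mindset concrete, here is a minimal sketch (using scikit-learn on synthetic data, which stands in for a real dataset) that gives a simple linear model and a more complex ensemble the same fair test: cross-validated R². On a genuinely linear relationship, the simple model can come out ahead:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic data with a purely linear target, standing in for a real dataset
X, y = make_regression(n_samples=500, n_features=5, noise=10.0, random_state=42)

# Score both models with the same cross-validation procedure
simple_r2 = cross_val_score(LinearRegression(), X, y, cv=5, scoring="r2").mean()
complex_r2 = cross_val_score(
    RandomForestRegressor(n_estimators=100, random_state=42),
    X, y, cv=5, scoring="r2").mean()

print(f"Linear regression R^2: {simple_r2:.3f}")
print(f"Random forest R^2:     {complex_r2:.3f}")
```

The point is not that linear regression always wins, but that both models received the same test before either was discounted.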

Do keep this in mind at all times! Have fun in your data science learning journey! If the post has been useful, do share it.

The Most Important Data Science Tool for Market and Customer Segmentation

Use K-means and let AI advise you how many segments there (really) are.


Market and customer segmentation are among the most important tasks in any company. The segmentation will influence marketing and sales decisions, and potentially the survival of the company.

Surprisingly, despite the advances in machine learning, few marketers are using such technologies to augment their all-important market and customer segmentation efforts.

In this article, I will show you how to augment your segmentation analysis with a simple, yet powerful machine learning technique called K-means. Learning this will give you an edge over your competitors (and colleagues).

So what’s K-means?

K-means is a popular clustering algorithm for unsupervised machine learning. It groups similar data points into a predefined number of groups.

Let me explain each term for you:

  • Clustering: a machine learning technique for identifying and grouping similar data points (e.g. customers) together.
  • Unsupervised machine learning: you don’t need to provide labelled data to the algorithm on how to group the customers. It will scan through all information associated with each customer and learn the best way to group them together.
  • A predefined number of groups: you need to tell K-means how many groups to form. This is the only input needed from you.

Here is an analogy to the above concepts: Imagine you have some toys and without providing further instruction, you ask your kid to separate the toys into three groups. Your kid will play around and eventually find his own best way to form three groups of similar toys.
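To make the analogy concrete, here is a minimal sketch (using scikit-learn, with made-up toy points) of asking K-means for three groups:

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy "customers": two features each, forming three obvious blobs
points = np.array([
    [1, 1], [1, 2], [2, 1],   # blob A
    [8, 8], [8, 9], [9, 8],   # blob B
    [1, 9], [2, 9], [1, 8],   # blob C
])

# The only input we must provide is the number of groups
kmeans = KMeans(n_clusters=3, n_init=10, random_state=1)
labels = kmeans.fit_predict(points)

print(labels)  # points in the same blob share a label
```

Like the kid with the toys, the algorithm is given no instructions on how to group the points; it only knows that three groups are wanted.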


OK … so how does K-means work?

Let’s assume that you think there are 3 potential segments of customers.

K-means will initiate 3 points (i.e. centroids) at random locations and iteratively assign each data point to its nearest centroid. Each data point represents one customer, and customers closest to the same centroid end up in the same group.

The centroids’ locations are then adjusted automatically based on the customers allocated to them. By repeating this, K-means learns on its own to find customers with similar characteristics.
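For the curious, the assign-then-adjust loop can be sketched from scratch with NumPy (a simplified, batch-update illustration, not scikit-learn's actual implementation, and it assumes no cluster ends up empty):

```python
import numpy as np

def kmeans_step(points, centroids):
    """One assign-then-update iteration of the (batch) K-means loop."""
    # Assign each point to its nearest centroid
    distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    labels = distances.argmin(axis=1)
    # Move each centroid to the mean of the points assigned to it
    new_centroids = np.array([points[labels == k].mean(axis=0)
                              for k in range(len(centroids))])
    return labels, new_centroids

# Two obvious "customer" groups and two rough starting centroids
points = np.array([[0., 0.], [0., 1.], [10., 10.], [10., 11.]])
centroids = np.array([[1., 1.], [9., 9.]])
labels, centroids = kmeans_step(points, centroids)
```

Repeating the step until the centroids stop moving gives the final clusters.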

K-means identifying 3 clusters in a data set. Source: Wikipedia

What? That looks simple. I could do the grouping visually myself!

The 2-dimensional representation of customers above is a simplified form of visualising the data.

Each piece of information associated with a customer represents one dimension of the data. For instance, if you are plotting just the items and quantity purchased, that’s 2 dimensions. Once you consider additional information for each customer, such as country of residence and total spending, the complexity jumps to 4 dimensions!

Visualisation of different dimensions. Source: Wikipedia

It is hard for us to imagine grouping items together beyond 3-dimensional space, but not so for machine learning. This makes machine learning much more powerful than traditional methods in finding meaningful segments.

Machine learning can make sense of multiple dimensions beyond our imagination, find similar characteristics of customers based on their information, and group similar customers together.

That’s the beauty of it!

But how do I know what’s the optimal number of groups to form?

You can find the optimal number of groups by following these two principles:

  1. Customers in the same cluster should be close together (tight intra-cluster distance)
  2. Each different cluster of customers should be far from each other (far inter-cluster distance)

Here’s another way of interpreting the above principles:

  1. Birds of a feather flock together. They flock close to each other to find like-minded friends; the more like-minded they are, the closer they flock together.
  2. Different flocks do not come near each other. Each flock is proud of its unique identity; the more distinct the identity, the further it will distance itself from other flocks.

One method for finding the optimal number of groups is the Silhouette Score. It takes into consideration both the intra-cluster and inter-cluster distances and returns a score between -1 and 1; the higher the score, the better separated and more meaningful the clusters formed.

One of the most challenging aspects of using K-means is deciding how many clusters to form. This can be identified mathematically by using Silhouette Score.
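As a quick illustration (on toy data, not the retail dataset used later), the score for a few candidate cluster counts can be computed like this:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Toy data: two well-separated blobs of three points each
data = np.array([[1, 1], [1, 2], [2, 1], [8, 8], [8, 9], [9, 8]], dtype=float)

scores = {}
for k in (2, 3):
    labels = KMeans(n_clusters=k, n_init=10, random_state=1).fit_predict(data)
    scores[k] = round(silhouette_score(data, labels), 3)
    print("k={}: silhouette={}".format(k, scores[k]))
```

With two obvious blobs, k=2 scores higher than k=3, matching our intuition about the "right" number of groups.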

Great. Could you illustrate using K-means to segment an actual customer dataset?

I will illustrate using K-means to perform RFM (Recency, Frequency, and Monetary) customer segmentation. The data source is from an actual online retailer in the UK.

I have already pre-processed the data by performing the following steps:

  1. Extract most recent 1-year transactions data.
  2. Calculate the Recency of each customer by their latest transaction date.
  3. Calculate the Frequency of each customer by summing the number of invoices tagged to each customer.
  4. Calculate the Monetary Value of each customer by summing up their respective total spend.
import datetime as dt
import pandas as pd

# Calculate 1-year date range from latest data
end_date = df['Date'].max()

# Filter 1-year date range from original df
start_date = end_date - pd.to_timedelta(364, unit='d')
df_rfm = df[(df['Date'] >= start_date) & (df['Date'] <= end_date)]

# Create hypothetical snapshot date (one day after the latest transaction)
snapshot_date = end_date + dt.timedelta(days=1)

# Calculate Recency, Frequency and Monetary value for each customer
df_rfm = df_rfm.groupby(['CustomerID']).agg({
    'Date': lambda x: (snapshot_date - x.max()).days,
    'InvoiceNo': 'count',
    'TotalSum': 'sum'})

# Rename the columns
df_rfm.rename(columns={'Date': 'Recency',
                       'InvoiceNo': 'Frequency',
                       'TotalSum': 'MonetaryValue'}, inplace=True)

# Print top 5 rows
print(df_rfm.head())

Below is a snapshot of the RFM values of each customer that I created:

RFM value of each customer.

Anything else that I need to do before implementing K-means?

K-means gives the best result under the following conditions:

  1. Data’s distribution is not skewed (i.e. no long-tail distribution)
  2. Data is standardised (i.e. mean of 0 and standard deviation of 1)

Why? Recall that K-means groups similar customers together based on their distance from centroids.

The location of each data point on the graph is determined by considering all information associated with the specific customer. If any of the information is not on the same distance scale, K-means might not form meaningful clusters for you.

Machine learning means learning from data. To get the best result, you should prepare the data to make it easy for the machine to learn.

Here are the exact steps to prepare the data before using K-means :

  1. Plot distribution charts to check for skewness. If the data is skewed (i.e. has long-tail distribution), perform log transformation to reduce the skewness
  2. Scale and centre the data to have a mean of 0 and variance of 1
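The two steps above can be sketched as follows (a minimal example on made-up skewed values; for the real RFM table the same transforms would be applied column-wise):

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Made-up, heavily right-skewed values (e.g. spend amounts)
df = pd.DataFrame({"MonetaryValue": [5, 8, 12, 20, 55, 130, 900, 4000]})

# Step 1: log transform to reduce skewness (log1p handles zeros safely)
df_log = np.log1p(df)

# Step 2: centre and scale to mean 0, standard deviation 1
scaler = StandardScaler()
df_scaled = pd.DataFrame(scaler.fit_transform(df_log), columns=df.columns)

print(df_scaled.describe().round(2))
```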

I first checked for skewness by plotting a distribution plot of Recency, Frequency, and MonetaryValue:

Distribution Plots of RFM. All variables are heavily skewed.

I performed log transformations to reduce the skewness of each variable. Below are the distribution plots of RFM after log transformation:

Distribution Plots of RFM. The skewness is reduced after log transformation.

Once the skewness is reduced, I standardised the data by centering and scaling. Note all the variables now have a mean of 0 and a standard deviation of 1.

Basic statistics of RFM. All variables have mean of 0 and standard deviation of 1 after centring and scaling.

How about finding the optimal number of groups?

Once the data is prepared, the next step is to run iterations of K-means (usually up to 10 clusters) and calculate the Silhouette Score for each cluster count.

from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
import matplotlib.pyplot as plt
import seaborn as sns

def optimal_kmeans(dataset, start=2, end=11):
    '''
    Calculate the optimal number of clusters for k-means

    Parameters:
        dataset : dataframe. Dataset for k-means to fit
        start : int. Starting number of clusters to test
        end : int. Ending number of clusters to test (exclusive)

    Returns:
        Values and line plot of Silhouette Score.
    '''
    # Create empty lists to store values for plotting graphs
    n_clu = []
    km_ss = []

    # Create a for loop to find optimal n_clusters
    for n_clusters in range(start, end):

        # Create cluster labels
        kmeans = KMeans(n_clusters=n_clusters)
        labels = kmeans.fit_predict(dataset)

        # Calculate model performance
        silhouette_avg = round(silhouette_score(dataset, labels, random_state=1), 3)

        # Append score to lists
        km_ss.append(silhouette_avg)
        n_clu.append(n_clusters)

        change = '-' if n_clusters == start else round(
            km_ss[n_clusters - start] - km_ss[n_clusters - start - 1], 3)
        print("No. Clusters: {}, Silhouette Score: {}, Change from Previous Cluster: {}".format(
            n_clusters, silhouette_avg, change))

        # Plot graph at the end of loop
        if n_clusters == end - 1:
            plt.title('Silhouette Score')
            sns.pointplot(x=n_clu, y=km_ss)
            plt.savefig('silhouette_score.png', format='png', dpi=1000)

A higher Silhouette Score denotes the formation of better and more meaningful clusters; the result below shows the optimal number of clusters is four.

Silhouette Score of 2 to 10 clusters. The optimal number of clusters is 4.

Nonetheless, it is common practice to implement K-means clustering on the optimal cluster count ±1; here, that means 3, 4, and 5 clusters.

This gives a wider perspective and facilitates meaningful discussion with your stakeholders to determine the appropriate number of customer segments.

Perhaps there could be some market peculiarities and your stakeholders might decide to implement their marketing strategies on 5 clusters instead of the optimal 4 clusters identified.

What does the end result of K-means segmentation look like?

Now we are ready to run the data through K-means of 3, 4 and 5 clusters to segment our customers.

from sklearn.manifold import TSNE

def kmeans(df, clusters_number):
    '''
    Implement k-means clustering on dataset

    Parameters:
        df : dataframe. Dataset for k-means to fit.
        clusters_number : int. Number of clusters to form.

    Returns:
        Cluster results and t-SNE visualisation of clusters.
    '''
    kmeans = KMeans(n_clusters=clusters_number, random_state=1)
    kmeans.fit(df)

    # Extract cluster labels
    cluster_labels = kmeans.labels_

    # Create a cluster label column in original dataset
    df_new = df.assign(Cluster=cluster_labels)

    # Initialise TSNE
    model = TSNE(random_state=1)
    transformed = model.fit_transform(df)

    # Plot t-SNE
    plt.title('Flattened Graph of {} Clusters'.format(clusters_number))
    sns.scatterplot(x=transformed[:, 0], y=transformed[:, 1],
                    hue=cluster_labels, style=cluster_labels, palette="Set1")

    return df_new, cluster_labels

Below is the result of the customer segmentation:

Flattened (t-SNE) graphs of 3, 4 and 5 clusters.

Recall that each piece of information associated with a customer creates an additional dimension. The above image was obtained by flattening the three-dimensional graphs (created from Recency, Frequency, and MonetaryValue) into two-dimensional graphs for ease of visualisation.

This visualisation can give you a sense of how well the clusters are formed.

In case you are wondering, the technique for flattening a high-dimensional graph and visualising it in two dimensions is known as t-Distributed Stochastic Neighbor Embedding (t-SNE). You can read more on this if you are interested; a full explanation is beyond the scope of this article.

How do I make use of the segmentation results in my marketing?

By this stage, each customer in the dataset has been tagged with their respective group number. You can proceed to use any industry common practice to visualise the results.

Below is an example of using Snake Plot and Relative Importance of Attributes Chart to build personas of each cluster of the segmentation. Both are commonly used in the marketing industry for customer segmentation.

Snake Plot of 3, 4, 5 clusters formed using K-means.
Relative Importance Chart of 3, 4, and 5 clusters formed using K-means.

You can take this result and compare it against your original segmentation done using traditional methods. Is there any big difference?

It is good practice to perform a deep dive and understand why K-means thinks customers of a particular group belong together (yes, sadly K-means is unable to write us a marketing report on its segmentation decisions yet).

With this understanding, you could initiate discussion with relevant stakeholders to seek their opinion and get alignment on how to best segment the customers before launching the next big marketing campaign.

All the relevant codes for this article can be found at my repo.


K-means is a simple but powerful segmentation method. Anyone doing customer or market segmentation should use this to augment traditional methods. Otherwise, they risk becoming obsolete in the age of artificial intelligence.

If you are keen to learn more about Unsupervised Learning and Clustering Methods, AISG has a course for it.

Open Source Solutions at FOSSASIA Summit 2020

At the recent FOSSASIA Summit in March, AI Singapore contributed two speakers to share with the community our efforts to build open source solutions for artificial intelligence. Held against the backdrop of COVID-19, many of the sessions in this year’s event had to be moved online, and the organisers deserve full credit for coordinating, aggregating and delivering all the scheduled content despite the challenges of last-minute travel restrictions and strict social distancing measures.

To start off, one of our principal AI consultants, Tern Poh, gave an overview of AI Singapore’s mission and programmes and where our open source pre-built solutions (aka ‘AI Bricks’) fit within them.

If you are interested in zooming in on specific parts of Tern Poh’s presentation, below are some bookmarks.

  • 1:06 Tern Poh’s profile
  • 2:29 AI Singapore’s mission and programmes
  • 4:13 The pillars in AI Singapore
  • 6:51 The 100E and apprenticeship programmes
  • 9:45 Examples of industry projects in the 100E and apprenticeship programmes
  • 13:25 Introduction to AI Bricks
  • 15:32 TagUI RPA tool
  • 16:28 Golden Retriever information retrieval tool
  • 18:17 Corgi auto labeling tool
  • 19:08 Speech Lab voice-to-text engine
  • 21:27 The AI Makerspace website

Continuing from where Tern Poh left off, our RPA (robotic process automation) engineer, Yi Sheng, gave the audience a deep dive into the use of the TagUI tool, one of the AI Bricks. In a comprehensive coverage of the tool, he demonstrated its utility in various use cases and walked everyone through the code that makes task automation come alive.

You can find useful bookmarks of Yi Sheng’s sharing below.

  • 0:52 Demo of COVID-19 Update Temperature flow
  • 2:35 Demo of Forex Gmail flow
  • 4:36 Demo of Letter flow
  • 6:25 Code walkthrough of COVID-19 Update Temperature flow
  • 11:23 Code walkthrough of Forex Gmail flow
  • 16:05 Code walkthrough of Letter flow
  • 17:46 Installing TagUI
  • 21:00 Writing and deploying your first flow
  • 26:02 Navigating the documentation of TagUI
  • 34:07 Creating a more complex flow : 1-for-1 Deals

The AI Singapore team here will continue to do its best to contribute to open source solutions in the field of artificial intelligence.
