Towards Semantic-Aware Multimodal and Multilingual Deep Learning Systems for E-Commerce Applications

The project aims to address the scarcity of annotated data for low-resource languages and the constraints in learning complex semantics, which are key challenges faced by deep learning models that are applied in e-commerce where the environment is multi-lingual and multi-modal.

Target sector: Finance

Lead PI: Asst Prof Luu Anh Tuan (NTU)


  • Prof Cong Gao (NTU)
  • Prof Liu Yang (NTU)


  • Assoc Prof Chen Wenli (NIE)
  • Yan Shuicheng (SEA AI Lab)
  • Liu Qian (SEA AI Lab)

Host Institution: Nanyang Technological University (NTU)

Industry Partner: SEA Limited

Problem Scope

E-commerce is conducted in many languages across the globe. However, there is no deep learning model for e-commerce applications that works well in a multilingual environment. The key reason is the dearth of annotated training data for the languages of interest. As a result, when a model is trained on a high-resource language (e.g. English or Chinese), its performance drops significantly when it is applied to a low-resource language (e.g. Thai or Vietnamese), for which obtaining rich annotated data is time-consuming and expensive.

In addition, e-commerce is conducted in a multimodal environment that comprises not only text but also other forms of information such as images, video, and audio. Most current deep learning models for e-commerce are trained solely on text-based corpora and fail to map and discover the implicit relations among the objects in an image or video. This prevents the models from learning the complex semantics of multimodal inputs, so they fail to understand customer requirements and give wrong recommendations or answers.

Design Approach

We will develop novel deep learning models for e-commerce applications in multilingual and multimodal environments, comprising (i) a meta-learning framework that deals with limited annotated data in low-resource languages and possibly unseen languages, and (ii) a novel approach to incorporating multimodal information into current deep learning architectures.

We will focus on five low-resource languages in Southeast Asia (Bahasa Indonesia, Bahasa Malaysia, Vietnamese, Thai and Filipino), which are the most significant by population and together account for approximately 68% of industry partner SEA’s total quarterly active e-commerce users.

Potential Impact/Benefits to Target Sector

Success would contribute to transformative disruptions in the e-commerce sector by introducing a new generation of AI systems that can handle many different languages and that are more intelligent in understanding and discovering complex semantics within customer requirements and behaviours. These systems would help e-commerce companies in Singapore improve their services and systems (including chatbots, recommendation and sentiment analysis) to boost revenues across multiple markets.