The Platforms Engineering Group supports AI Innovation and other Pillars in AI Singapore and empowers our users to solve challenging problems through infrastructure, platforms and engineering.
The team comprises the InfraOps, DataOps, MLOps and SecureAI teams, which build internal software platforms that enable our users to create meaningful and robust solutions.
AI model training and inference consume substantial amounts of electricity. With the push towards more sustainable ways to train and operate AI models, the team is also exploring more energy-efficient methodologies for hosting them.
The team also provides training and mentorship for the AIAP apprentices and works closely with the 100E and Bricks teams to help AI engineers scale their AI models with best practices in MLOps, CI/CD pipelines and AI robustness.
Our teams include folks with extensive experience in High-Performance Computing, Big Data/Internet of Things, Infrastructure, Ops, Software Engineering, Data Engineering and Machine Learning.
Platforms Engineering Teams
The InfraOps team manages, operates and secures the on-prem and cloud infrastructure and internal software platforms that enable AI Singapore engineering teams.
The DataOps team looks after the data infrastructure and data processing pipelines. They help onboard, secure and decommission critical datasets from our collaborators and project sponsors throughout the lifecycle of the projects.
The MLOps team works closely with the 100E engineering teams to ensure that good software practices and tooling are adopted and to scale the ML training and deployment pipelines.
Our unique SecureAI team is dedicated to developing processes and tooling that support the building of secure and trustworthy AI solutions. They help ensure that robustness testing and coverage are addressed in the ML model training processes of our AI engineering teams.
On-Premise and Cloud Hybrid High-Performance Clusters for AI/ML workloads
Over 7,000 x86 and POWER CPUs serving infra, data and ML workloads.
0.5 PB of storage providing object store, NFS/PFS and file I/O services.
We leverage Google Cloud and Azure for the latest in cloud technology and infrastructure, including AI accelerators (A100 GPUs, Cloud TPUs, other xPUs and FPGAs).
32 NVIDIA V100 GPUs and 6 FPGAs for accelerated ML training and inference workloads.
10G Ethernet and 100G InfiniBand networks providing infra and cluster networking.
NUS-NSCC Innovation 4.0 Data Centre
AI Singapore collaborates with the National Supercomputing Centre (NSCC) and hosts Singapore’s largest AI supercomputer (as of 2022). We believe large-scale AI models require proven techniques from the High-Performance Computing (HPC) world, such as parallel computing, robust workload scheduling and checkpointing mechanisms for long-running training jobs.
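Checkpointing, for example, lets a multi-day training job survive node failures or scheduler preemption by periodically persisting its state and resuming from the last saved step. The sketch below is a minimal illustration in plain Python; the file name, placeholder training step and checkpoint interval are illustrative assumptions, not AI Singapore's actual tooling:

```python
import os
import pickle

CKPT_PATH = "checkpoint.pkl"  # illustrative path, not a real platform convention


def save_checkpoint(step, state, path=CKPT_PATH):
    """Atomically persist training state so a preempted job can resume."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump({"step": step, "state": state}, f)
    os.replace(tmp, path)  # atomic rename avoids half-written checkpoint files


def load_checkpoint(path=CKPT_PATH):
    """Return (step, state) from the last checkpoint, or a fresh start."""
    if not os.path.exists(path):
        return 0, {"loss": None}
    with open(path, "rb") as f:
        ckpt = pickle.load(f)
    return ckpt["step"], ckpt["state"]


def train(total_steps=10, ckpt_every=3):
    """Run (or resume) a toy training loop with periodic checkpoints."""
    step, state = load_checkpoint()           # resume if a checkpoint exists
    while step < total_steps:
        state["loss"] = 1.0 / (step + 1)      # placeholder for a real train step
        step += 1
        if step % ckpt_every == 0:
            save_checkpoint(step, state)
    save_checkpoint(step, state)              # final checkpoint at completion
    return step, state
```

If the job is killed between checkpoints, a restart re-enters `train` and continues from the last saved step instead of step zero. Real ML stacks apply the same pattern to model weights and optimizer state rather than a small dictionary.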