How do you predict the travel time distribution of a user while factoring in the effect of a traffic disruption on a nearby road? How do you keep track of users’ locations at scale and identify those with similar travel patterns? These are just two of the many research questions that Professor Cong Gao of Nanyang Technological University is addressing through his work on enriched geospatial data management and mining.
It has been more than 12 years since Prof Cong first started focusing on this area of research, and he continues to be fascinated by the challenges and potential surrounding the huge and ever-growing volumes of such data.
“With the proliferation of technologies such as smartphones, stationary sensors and satellites, a flood of geospatial data is becoming available. Such data is enriched by multiple additional sources or contexts such as social information, text, multimedia data and scientific measurements,” he said. “This huge amount of enriched geospatial data holds the key to new and possibly useful knowledge.”
Today, Prof Cong continues to push the envelope in spatial data mining, developing new techniques that can be applied to different types of enriched spatial data. Examples include enriched point data (such as points of interest), trajectory data and region data.
In the traffic example, deep generative models have been developed that use enriched geospatial data to predict travel time distributions, future travel speeds and the impact of a traffic accident on a nearby road.
For the user tracking scenario, Prof Cong and his research team used a novel probabilistic approach to model the spatial, temporal and activity aspects of human behaviour from the user’s historical mobility data. The model has been applied successfully to predict users’ locations, including a user’s next location, and to identify potential persons of interest.
Prof Cong has also proposed a similarity computation measure for trajectory data, which can be used to identify users with similar travel patterns. In the current COVID-19 pandemic, this ability to analyse spatial or spatio-temporal data is crucial for applications such as contact tracing and spread prediction.
In the area of enriched geospatial data management, Prof Cong’s work on spatial-textual indexing, published at Very Large Data Bases (VLDB) 2009, has been used as a benchmark for subsequent work on spatial-keyword query processing. The paper, which has attracted more than 500 citations, opened up a new sub-area of spatial database research that top researchers and international research groups have since built on. He has also built systems for managing both static and streaming enriched geospatial data to support various types of queries on such data.
Going forward, Prof Cong is looking to build a machine learning-driven database for enriched spatial data to support both data querying and more advanced data analytics. The new database system and accompanying techniques can be used to create value-added services based on large enriched spatial data, for various application domains such as the telecom industry, intelligent transportation and smart cities. “The ultimate goal,” he said, “is to invent enabling techniques to power the next generation of intelligent systems and unleash the enormous value of large enriched geospatial data.”