Symbolic Regression:
A Pathway To Interpretability Towards Automated Scientific Discovery

Tutorial at ACM KDD 2024, August 25, 10:00 am-1:00 pm

Abstract

Symbolic regression is a machine learning technique for learning mathematical equations directly from data. Mathematical equations capture both functional and causal relationships in the data. They are also simple, compact, generalizable, and interpretable models, which makes them the best candidates for i) learning inherently transparent models and ii) boosting scientific discovery. Symbolic regression has attracted growing interest over the last decade and, thanks to the enormous progress in deep learning over the last twenty years, is now tackled with a range of supervised and unsupervised deep learning approaches. This tutorial gives an overview of symbolic regression (problem definition, approaches, and key limitations), discusses why the physical sciences are beneficial to symbolic regression, and explores possible future directions in this research area.
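To make the idea of learning an equation from data concrete, here is a minimal sketch in Python. It is an illustration only, not a method covered in the tutorial: the candidate library `CANDIDATES` and the helpers `fit_scale` and `symbolic_regression` are hypothetical names, and the search is a brute-force scan over five hand-picked formulas. Practical symbolic regression methods search far larger expression spaces.

```python
import math

# Hypothetical candidate library (an assumption for this sketch):
# each entry maps a readable formula to a function of x.
CANDIDATES = {
    "x": lambda x: x,
    "x**2": lambda x: x * x,
    "x**3": lambda x: x ** 3,
    "sin(x)": math.sin,
    "exp(x)": math.exp,
}


def fit_scale(f, xs, ys):
    """Fit y ~ a * f(x) by least squares; return (a, mean squared error)."""
    fx = [f(x) for x in xs]
    denom = sum(v * v for v in fx)
    # Closed-form least-squares solution for a single coefficient.
    a = sum(v * y for v, y in zip(fx, ys)) / denom if denom else 0.0
    mse = sum((a * v - y) ** 2 for v, y in zip(fx, ys)) / len(xs)
    return a, mse


def symbolic_regression(xs, ys):
    """Return (formula, coefficient, mse) for the best-fitting candidate."""
    best = None
    for name, f in CANDIDATES.items():
        a, mse = fit_scale(f, xs, ys)
        if best is None or mse < best[2]:
            best = (name, a, mse)
    return best


# Synthetic data generated from y = 3 * x**2 (noise-free, for illustration).
xs = [0.5 * i for i in range(1, 11)]
ys = [3.0 * x * x for x in xs]

formula, a, mse = symbolic_regression(xs, ys)
print(f"y = {a:.3f} * {formula}  (mse={mse:.2e})")
```

On this synthetic data the search should select `x**2` with a coefficient close to 3. The output is a readable formula rather than a set of opaque weights, which is the interpretability argument made above.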

Support materials include a friendly introduction to and full review of symbolic regression, and an online living review.

Tutorial Outline

Why black-box models?
Explainable versus Interpretable AI
Natural sciences versus ML
Why Symbolic Regression?
Symbolic regression (SR):
1. Problem definition
2. Benchmark datasets
3. Key challenges
SR Approaches and applications

Presenters

Prof. Sanjay Chawla
He is a Senior Research Director and leads the Qatar Center for Artificial Intelligence at the Qatar Computing Research Institute (QCRI). His research spans diverse areas of data mining and machine learning, including the analysis of spatio-temporal data and anomaly detection. His work has received several best paper awards, most recently the award for the most influential paper of the last ten years at PAKDD 2021. Before joining QCRI, he was a Professor and Head of the School of Information Technologies at the University of Sydney, Australia. He was a PC Co-chair of ACM SIGKDD 2021.

Dr. Nour Makke
She is a researcher at the Qatar Center for Artificial Intelligence at the Qatar Computing Research Institute. Her research focuses on interpretable machine learning for scientific discovery in the natural sciences, in particular the physical sciences, and on the development and application of AI-based methods in these disciplines, with an emphasis on symbolic regression within the context of interpretable AI.