by Daniel Trugman (University of Nevada Reno), Tushar Mittal (Penn State University), Xuesong Ding (University of Texas at Austin)
Sep 13, 2023
A webinar series to gather community opinions on the best AI/ML strategies and necessary steps to accomplish SZ4D goals.
On August 2nd - 5th 2023, SZ4D hosted a three-day virtual workshop focused on how ideas and techniques from machine learning and artificial intelligence (ML/AI) could transform SZ4D science and initiatives spanning all three working groups - Faulting & Earthquake Cycles (FEC), Magmatic Drivers of Eruption (MDE), and Landscapes & Seascapes (L&S). The workshop consisted of three sessions (each 2.5 hours long), held on three consecutive days, and was well attended, with about 250 unique participants. These numbers indicate significant interest in this topic among the SZ4D community. Participants spanned a diverse range of scientific disciplines, levels of AI/ML experience, and career stages and included many new faces to the SZ4D community, all united by a common interest in research at the interface between data science and geoscience.
The motivation for the workshop came from the recognition that initial scientific planning and infrastructure efforts within SZ4D were focused on classical, physics-based modeling and inversion workflows. In contrast, the long-term vision for SZ4D with multiple sensor arrays and datasets would lead to an immense contribution of Big Data in the geosciences on a scale rarely achieved in practice. Thus, concerted efforts are needed to reach out to and mobilize the community of geodata scientists to position SZ4D to effectively utilize these big datasets that are frequently outside the realm of what is computationally feasible with traditional analysis approaches. In addition, the multi-sensor, multi-disciplinary datasets (including modeling results) that the SZ4D program envisions collecting would require new analysis approaches to answer the various science questions described in the implementation report. ML/AI methods can significantly help with addressing this challenge.
The three workshop sessions focused on three different themes with relevance to SZ4D data science: (1) Making Sense of Data with ML/AI, (2) Making Predictions with ML/AI, and (3) Facilitating Process-Based Models with ML/AI. During each session, three keynote speakers representing different Working Groups within SZ4D presented research ideas on cutting-edge ML/AI applications in the geosciences to inspire participants to see connections between these techniques and their own scientific domain expertise. The presentation slides for all talks are publicly available on the workshop page. The speakers covered a diverse range of topics, including challenges & best practices for data compilation and annotation (for supervised learning), new technical approaches using ML/AI to speed up forward modeling work and parameter inversion, the use of AI for data denoising and unsupervised clustering, and a few successful case studies using AI in seismic data processing and real-time hazard prediction. Following these keynote lectures, participants were grouped into breakout rooms to reflect on these talks, connect them to their research interests, and discuss future priorities and activities for SZ4D. Summaries of each breakout session were reported back to all attendees during the final synthesis period of each session. The concepts and techniques conveyed by keynote speakers resonated broadly with participants, who made insightful connections to their disciplines. As with any novel line of research, there will inevitably be challenges in adapting these workflows to current and future scientific questions and datasets relevant to SZ4D. Common challenges highlighted across the sessions included access to suitable labeled datasets, training models with sparse data, incorporating multiple data types, and ways to incorporate physical priors and physics-based models.
The strong participation in this workshop illustrates the importance of AI/ML to future SZ4D initiatives and its capability to draw broad support from the scientific community across the geosciences, including from outside the focused SZ4D ecosystem. Workshop participants voiced strong interest in future AI/ML activities, especially similar workshops, training modules, summer schools, and skill-building workshops that would expand the capacity for novel research for scientists of all career stages. The participants also expressed strong interest in SZ4D/MCS leading community efforts to develop easy-to-use and well-documented software tools (e.g., GUIs in Python) for some of the AI/ML techniques mentioned in the talks and discussion, making these methods more accessible by reducing the barrier to entry. Moreover, SZ4D could serve as a community repository for instructional tools, open datasets, and example workflows that scientists could learn from and adapt within their research domain. The success of the 2023 SZ4D AI/ML Virtual Workshop demonstrates how essential it will be for the SZ4D community to embrace and meet the Big Data challenges and discovery opportunities head-on.