Bin Yu, Chancellor’s Professor at the University of California, Berkeley, received an honorary doctorate from the Faculty of Business and Economics of the University of Lausanne (UNIL) in June 2021. The ceremony took place online (https://www.youtube.com/watch?v=1JRGPGhcYzU). Prof. Yu will give a seminar in person on 8 September 2022.
Title: The Predictability-computability-stability (PCS) framework for veridical data science with a case study to seek genetic drivers of a heart disease
Abstract:
"A.I. is like nuclear energy -- both promising and dangerous" -- Bill Gates, 2019.
Data science is a pillar of A.I. and has driven most recent cutting-edge discoveries in biomedical research and beyond. Human judgment calls are ubiquitous at every step of a data science life cycle, e.g., in choosing data cleaning methods, predictive algorithms and data perturbations. Such judgment calls are often responsible for the "dangers" of A.I. To maximally mitigate these dangers, we introduce a framework based on three core principles: Predictability, Computability and Stability (PCS), for a veridical (truthful) data science process. The PCS framework unifies and expands on the best practices of machine learning and statistics. PCS emphasizes reality checks through predictability and takes full account of the sources of uncertainty in the whole data science life cycle, including those from human judgment calls such as those in data curation/cleaning. PCS consists of a workflow and documentation and is supported by our software package v-flow (https://github.com/Yu-Group/veridical-flow).
We illustrate the usefulness of PCS in the development of iterative random forests (iRF) for predictable and stable non-linear interaction discovery (in collaboration with the Brown Lab at LBNL and Berkeley Statistics).
Finally, in pursuit of genetic drivers of a heart disease called hypertrophic cardiomyopathy (HCM), a CZ Biohub project in collaboration with the Ashley Lab at Stanford Medical School and others, we use iRF and UK Biobank data to recommend gene-gene interaction targets for knock-down experiments. We then analyze the experimental data to show exciting findings on genetic drivers of HCM.