Rigorous Validation & Data Science

Our validation program is centered on one principle: AI models should be trusted only when they are independently replicated, clinically relevant, and transparent about their limits. We use cross-cohort benchmarking, robustness testing, and reproducibility-focused evaluation to determine whether model performance holds across institutions, populations, and data modalities, rather than only in the training environment.

We recently secured support through the NIH Autism Data Science Initiative (ADSI) for the project "Validate ASD: Independent Multimodal Replication and Validation of Autism Data-Science Models." The project applies a two-track validation strategy that combines intact model testing with code-blinded replication. It integrates structured clinical data from Texas Children’s Hospital with genomic data from SPARK, metabolomics from BaBS, and environmental exposure models linked to placental neurodevelopmental biomarkers. The goal is to define where current autism models generalize, where recalibration is needed, and how fairness and reliability can be improved for real-world pediatric use.

This work is designed to produce high-impact validation reports and community-accessible tools that raise standards for autism model evaluation and clinical deployment. Representative validation-related publications include Jeong and Liu (2019), Raman et al. (2018), Yi and Liu (2011), and Wan et al. (2014).