UROP Proceeding 2024-25

School of Engineering
Department of Computer Science and Engineering

The Future of Medical Imaging: Advancements in Analysis through Vision Language and Large Models
Supervisor: CHEN Hao / CSE
Student: WEI Yuhan / COMP
Course: UROP 1100, Fall; UROP 2100, Summer

Training Medical Image Foundation Models (MIFMs) requires large-scale datasets, which are costly to acquire and annotate because of expensive imaging equipment and strict privacy regulations. Synthetic data offers a cost-effective alternative that avoids concerns over data privacy, manual annotation, and storage costs. Motivated by the assumption that anatomical structures can be modeled as a mixture of Gaussian distributions, we propose a novel MIFM framework, the Randomized Synthetic data Engine (RaSE). The framework synthesizes medical images by sampling from multiple Gaussian distributions, and the model learns to disentangle these distributions. Extensive experiments against MIFMs pre-trained on real data demonstrate that models pre-trained with RaSE exhibit superior generalization across diverse imaging modalities and downstream tasks.

The Future of Medical Imaging: Advancements in Analysis through Vision Language and Large Models
Supervisor: CHEN Hao / CSE
Student: ZHANG Luoyi / COSC
Course: UROP 1100, Fall; UROP 2100, Spring; UROP 3100, Summer

This report presents a study conducted during Summer 2025 that evaluates the performance and training dynamics of three state-of-the-art vision models. The work is structured into three sections:
1. Review of related works: An analysis of the innovations in the three models.
2. Experimental framework: A systematic study of hyperparameter sensitivity (learning rates, batch sizes) across model scales, giving insight into training convergence and stability.
3. Reflections: Empirical validation of an optimal training configuration that balances performance and computational cost, along with limitations and future directions.
This study not only provides a reproducible pipeline for medical image analysis but also highlights trade-offs between transformer-based and state space model (SSM)-based architectures.
