Robust computer vision with applications to microscopic image analysis
Shunxin Wang is a PhD student in the Department of Datamanagement & Biometrics. (Co)Promotors are prof.dr.ir. R.N.J. Veldhuis, prof.dr. C. Brune and dr. N. Strisciuglio from the Faculty of Electrical Engineering, Mathematics and Computer Science.
Computer vision models often struggle in real-world applications due to data distribution shifts caused by variations in imaging conditions, sensor errors, or stylistic differences. This performance drop is particularly concerning in high-stakes domains like healthcare and autonomous driving, motivating research into improving model robustness and generalization. This thesis explores model learning behavior and develops techniques to improve robustness against common corruptions such as noise and blur.
The research first reviews existing robustness methods, such as data augmentation, and finds that simply increasing model and dataset size does not effectively improve generalization. Instead, a novel frequency augmentation approach is proposed and combined with traditional augmentation methods to better handle real-world corruptions. Further analysis reveals that models often rely on frequency shortcuts: they base their predictions on small, specific sets of frequencies, which can harm out-of-distribution (OOD) performance. Methods are proposed to mitigate this learning bias without sacrificing in-distribution performance.
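As a rough illustration of what augmenting an image in the frequency domain can look like, the sketch below randomly rescales the amplitude spectrum of a grayscale image. It is not the method developed in the thesis: the function name `frequency_augment`, the `strength` parameter, and the uniform per-frequency scaling are all assumptions chosen for brevity.

```python
import numpy as np

def frequency_augment(image, strength=0.5, rng=None):
    """Randomly rescale the frequency components of a 2-D grayscale image.

    Hypothetical example, not the thesis method: each frequency is
    multiplied by a random real factor in [1 - strength, 1 + strength].
    """
    rng = np.random.default_rng() if rng is None else rng
    # Real-input 2-D FFT: returns the non-redundant half of the spectrum.
    spectrum = np.fft.rfft2(image)
    # One random real scale factor per frequency bin.
    scale = 1.0 + strength * rng.uniform(-1.0, 1.0, size=spectrum.shape)
    # Leave the DC component untouched so the mean intensity is preserved.
    scale[0, 0] = 1.0
    # Transform back to the spatial domain at the original image size.
    return np.fft.irfft2(spectrum * scale, s=image.shape)

# Usage on a random stand-in image (assumed data, for illustration only).
rng = np.random.default_rng(0)
img = rng.random((32, 32))
aug = frequency_augment(img, strength=0.5, rng=rng)
```

In a training pipeline, a transform of this kind would typically be applied alongside standard spatial augmentations, so that the model cannot rely on any single narrow band of frequencies staying intact.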
The study also addresses real-world challenges in microscopic image analysis, where blur degrades image quality. A disentangled representation learning technique separates cell structures from blur artifacts, enabling sharper image reconstructions. Additionally, a novel plant protoplast dataset is introduced to advance research in automated cell analysis for agricultural applications.
In conclusion, this work advances model robustness through innovative augmentation and representation learning techniques while uncovering critical insights into shortcut learning. Future directions include designing inherently robust architectures and training strategies to ensure reliable performance across diverse real-world scenarios.