Developing reliable AI models for clinical applications, especially in breast cancer, requires access to both clinical and imaging data. While The Cancer Imaging Archive (TCIA) offers a large collection of publicly available medical images and clinical data, these existing datasets are often too heterogeneous for direct use in AI model development. This study aimed to harmonize these datasets to create a unified resource for AI research in breast cancer.
The authors concluded that the RV-Cherry-Picker platform offers the largest publicly available, harmonized breast cancer dataset, providing a valuable resource for developing AI models aimed at improving clinical outcomes.
Key points:
- The proposed platform allows unified access to the largest, homogenized public imaging dataset for breast cancer.
- A methodology for the semantically enriched homogenization of public clinical data is presented.
- The platform lets users make a detailed selection of breast MRI data for the development of AI models.
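
To make the idea of homogenization followed by selection concrete, the sketch below maps records from two differently structured collections onto one shared schema and then filters them. It is only an illustration under stated assumptions, not the RV-Cherry-Picker implementation: the collection names, field names, value codes, and the `harmonize` and `select` helpers are all hypothetical and do not come from the article or from TCIA.

```python
# Hypothetical per-collection field mappings (not taken from the article or TCIA).
FIELD_MAP = {
    "collection_a": {"ER": "er_status", "Age": "age_years"},
    "collection_b": {"estrogen_receptor": "er_status", "age_at_dx": "age_years"},
}

# Hypothetical value codes mapped to one shared vocabulary per unified field.
VALUE_MAP = {
    "er_status": {
        "pos": "positive", "+": "positive", "1": "positive",
        "neg": "negative", "-": "negative", "0": "negative",
    },
}


def harmonize(record, collection):
    """Rename source fields to the shared schema and normalize their values."""
    out = {"source": collection}
    for raw_field, value in record.items():
        field = FIELD_MAP[collection].get(raw_field)
        if field is None:
            continue  # field has no counterpart in the shared schema
        text = str(value).strip().lower()
        if field in VALUE_MAP:
            out[field] = VALUE_MAP[field].get(text, "unknown")
        elif field == "age_years":
            out[field] = int(float(text))
    return out


def select(records, **criteria):
    """Keep only the harmonized records that match every given criterion."""
    return [r for r in records if all(r.get(k) == v for k, v in criteria.items())]


# Made-up example records, one tuple per (collection, raw record).
raw = [
    ("collection_a", {"ER": "Pos", "Age": "54", "Notes": "n/a"}),
    ("collection_b", {"estrogen_receptor": 0, "age_at_dx": 61.0}),
    ("collection_b", {"estrogen_receptor": "+", "age_at_dx": "47"}),
]

unified = [harmonize(rec, name) for name, rec in raw]
er_positive = select(unified, er_status="positive")
```

Once every record uses the same field names and value vocabulary, a single query such as `select(unified, er_status="positive")` works across all collections, which is the practical benefit a harmonized resource is meant to provide.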
Article: Public data homogenization for AI model development in breast cancer
Authors: Vassilis Kilintzis, Varvara Kalokyri, Haridimos Kondylakis, Smriti Joshi, Katerina Nikiforaki, Oliver Díaz, Karim Lekadir, Manolis Tsiknakis & Kostas Marias