Latest advances in research: Building a multimodal X-ray foundation model

Emanuele Valeriano

We are announcing novel AI research designed to transform healthcare analytics, medical image analysis, and radiological diagnosis, targeting a breadth of diagnostic challenges. GE HealthCare’s X-ray foundation model is built on an expansive dataset of 1.2 million anonymized X-ray images from diverse regions across the full body and powered by the most performant language models available. Through this research, GE HealthCare is exploring how these models can offer real-world practical value to healthcare professionals seeking efficient and reliable tools for diagnostics and data management by performing established tasks like segmentation and classification with improved accuracy.

Most of the healthcare industry’s data is unstructured. Data from medical images, notes, audio recordings, and device readings exists across multiple modalities, rendering it unusable by traditional analytics, business-insight tools, and even most machine learning algorithms. Traditional AI approaches in the industry, while effective within specific modalities, struggle with these diverse data types, which range from text and images to audio and video. Additionally, traditional approaches require vast amounts of domain-specific data and manual feature engineering for each disease state, leading to costly and resource-intensive development cycles. As a result, critical medical insights can be missed, and clinicians are forced to spend valuable time and resources sifting through data manually.

To address these challenges, we are pioneering the development of foundation models in healthcare. These sophisticated AI systems are fine-tuned for healthcare datasets, with the goal of enabling superior performance and adaptability across diverse applications. If fully developed, multimodal medical LLMs could become the foundation for new assistive technologies in professional medicine, medical research, and consumer applications. As with our past initiatives, we emphasize the critical need for a comprehensive evaluation of these technologies in collaboration with the medical community and the broader healthcare ecosystem.

In internal testing, our full-body X-ray model stands out even with limited training data or when faced with out-of-domain challenges, showcasing its robustness and generalizability. Remarkably, when we fine-tuned the model using only chest-specific training data, it still showed significant improvements on non-chest tasks, such as anatomy detection and lead-marker detection, outperforming existing chest-specialized pre-trained models in our experiments.

Our ongoing research is focused on rigorously comparing our X-ray model against the latest publicly available models. The initial results are compelling, suggesting the X-ray model could perform critical tasks like segmentation, classification, and visual localization with high accuracy.

Today’s announcement builds on GE HealthCare’s pioneering research in healthcare foundation models. Our ultrasound research model, SonoSAMTrack, combines a promptable foundation model for segmenting objects of interest on ultrasound images with a state-of-the-art contour-tracking model.

Advantages of Foundation Models

As outlined above, foundation models can overcome the limitations of traditional AI approaches by moving beyond narrow modalities and generalizing across diverse tasks. Foundation models can unlock significant advantages for healthcare professionals:

  1. Broad generalization: Foundation models excel at generalizing across diverse tasks and domains. By leveraging their pre-training on vast and varied datasets, these models can quickly adapt to new scenarios and applications. This broad applicability means a single model can potentially handle multiple tasks that previously required separate, specialized models.

  2. Minimal fine-tuning: One of the most significant advantages of foundation models is their ability to perform well on new tasks with minimal fine-tuning. Unlike traditional models that often require extensive retraining for each new application, foundation models can be adapted to new tasks quickly, often with just a small amount of task-specific data.

  3. Cost-effective and fast: The efficiency and scalability of foundation models can translate into significant cost savings and faster development times. By reducing the need for extensive data collection, labeling, and model development for each new task, developers can implement AI solutions more rapidly. In addition, foundation models can adapt to new tasks without the need to fully retrain the model from scratch, which could translate to lowered costs. This efficiency can allow for quicker iterations and more agile responses to evolving healthcare needs.

  4. Robustness across domains: Foundation models demonstrate remarkable robustness when faced with out-of-domain challenges. Their comprehensive pre-training allows them to maintain high performance even when applied to tasks or data distributions that differ from their original training set. This robustness can be particularly valuable in healthcare, where data can vary significantly across different institutions, populations, or medical specialties.

Foundation Models in Action

We used a dataset of X-ray images and corresponding radiology reports sourced from various imaging sites and regions, expanding beyond the traditional focus on chest X-rays. This curated dataset encompasses a wide range of anatomies, providers, manufacturers, demographics, and pathologies, aiming to enhance the generalizability and applicability of our models across diverse clinical scenarios. The training process for our foundation model also uses licensed datasets that are anonymized and compliant with privacy and healthcare requirements.

Potential applications of the foundation model include:

  1. Report generation: By fine-tuning our pre-trained models on specific datasets like IU-X-ray, we've achieved substantial gains in report quality compared to baseline models. Our full-body pre-trained model showed performance improvements on metrics that measure semantic similarity and summarization capability (e.g., CIDEr and ROUGE) when compared to other similar models. If successful, this advancement could streamline a process that is time-consuming and susceptible to human error, allowing clinicians to spend more time focused on patient care. (Figure: report generation example from X-ray images and a reference report.)

  2. Classification: We are evaluating our model's performance on a diverse set of downstream classification tasks, including disease diagnosis, gender identification, view detection, anatomical landmark localization, marker detection, and mirror-reflection identification. Using a linear-probing setup with a three-layer multilayer perceptron (MLP) as the classifier head, initial results reveal that our model demonstrated significant improvements in mean AUROC (mAUC) scores across various tasks.

  3. Grounding: We are testing the utility of our pre-trained model on grounding tasks, which involve locating the region in a medical image that corresponds to a textual phrase query. These tasks can potentially enable grounded medical interpretations, which are critical for the deployment of responsible and explainable AI. Our model, when fine-tuned on a dataset specialized for the chest region, achieved significant improvements over both baseline and chest-specialized pre-trained models on the mIoU metric, which measures the overlap between the predicted segmentation area and the ground truth. This advancement could enhance the interpretability of AI insights, allowing clinicians to visually pinpoint the exact areas within an image that the model is focusing on, helping to build trust in AI-driven outputs.
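The linear-probing evaluation described in the classification item can be sketched as follows. This is a minimal illustration, not GE HealthCare's actual pipeline: the embeddings, task names, and labels below are synthetic stand-ins, and the small MLP head only mirrors the idea of a three-layer classifier trained on top of a frozen backbone.

```python
# Hypothetical linear-probing sketch: frozen foundation-model embeddings feed a
# small MLP classifier head; performance is summarized as mean AUROC (mAUC).
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Stand-in for embeddings produced by a frozen X-ray foundation model.
n_images, embed_dim = 400, 64
embeddings = rng.normal(size=(n_images, embed_dim))

# Two hypothetical binary downstream tasks (e.g., view detection, marker
# detection); these synthetic labels are weakly correlated with the embeddings.
labels = {
    "view_detection": (embeddings[:, 0] + 0.5 * rng.normal(size=n_images) > 0).astype(int),
    "marker_detection": (embeddings[:, 1] + 0.5 * rng.normal(size=n_images) > 0).astype(int),
}

train, test = slice(0, 300), slice(300, None)
aucs = []
for task, y in labels.items():
    # MLP head trained on top of the embeddings; the backbone is never updated.
    head = MLPClassifier(hidden_layer_sizes=(32, 32), max_iter=500, random_state=0)
    head.fit(embeddings[train], y[train])
    aucs.append(roc_auc_score(y[test], head.predict_proba(embeddings[test])[:, 1]))

mauc = float(np.mean(aucs))
print(f"mAUC: {mauc:.3f}")
```

Because only the lightweight head is trained, this setup isolates the quality of the pre-trained representations rather than the capacity of the downstream classifier.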
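The mIoU (mean Intersection-over-Union) metric used to score grounding can be computed in a few lines. The masks below are toy examples, not model outputs; the sketch only shows how overlap between a predicted region and the ground truth is measured and averaged.

```python
# Minimal sketch of IoU / mIoU over binary segmentation masks.
import numpy as np

def iou(pred: np.ndarray, truth: np.ndarray) -> float:
    """IoU between two binary masks of the same shape."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:  # both masks empty: define IoU as 1.0 by convention
        return 1.0
    return float(np.logical_and(pred, truth).sum() / union)

def miou(pairs) -> float:
    """Mean IoU over (predicted, ground-truth) mask pairs."""
    return float(np.mean([iou(p, t) for p, t in pairs]))

# Toy example: a predicted 4x4 region shifted against the ground-truth region.
truth = np.zeros((10, 10), dtype=bool); truth[2:6, 2:6] = True
pred = np.zeros((10, 10), dtype=bool); pred[3:7, 3:7] = True
print(f"IoU:  {iou(pred, truth):.3f}")   # intersection 9, union 23 -> ~0.391
print(f"mIoU: {miou([(pred, truth)]):.3f}")
```

A higher mIoU means the regions the model highlights for a textual query line up more tightly with the clinician-annotated ground truth.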

By leveraging a diverse dataset encompassing various anatomies, demographics, and pathologies, our foundation model exhibits enhanced generalization capabilities, outperforming both baseline models and specialized chest X-ray pre-trained models. We've achieved improvements across several chest and non-chest classification tasks and demonstrated more robust performance, even in scenarios with limited labeled data availability.

The impact of these findings extends beyond immediate improvements in specific tasks. Foundation models have the potential to democratize access to advanced AI capabilities in healthcare. Their adaptability and scalability promise faster development cycles, reduced reliance on massive datasets for each new application, and ultimately, a more rapid and widespread adoption of AI-powered solutions in healthcare.

The journey toward fully realizing the potential of foundation models in healthcare is ongoing, but the progress we've made is significant. By addressing the unique challenges of healthcare data and harnessing the power of large-scale, diverse datasets, we're creating AI solutions that are not only more powerful and efficient but also more adaptable to the complex and ever-changing landscape of healthcare. Looking ahead, we will continue to evolve our model and develop new foundation models that accelerate the development of new applications, pushing the boundaries of what’s possible in healthcare.

Of paramount importance, our commitment to responsible AI remains unwavering. We recognize the profound impact AI can have on society, particularly in a field as critical as healthcare. Our foundation models are designed with robust oversight mechanisms to ensure responsible use, emphasizing explainability, fairness, and the mitigation of bias.

As we continue to innovate in this space, we invite healthcare leaders, technologists, and clinicians to join us in exploring the vast potential of these powerful tools. Together, we can shape a future where AI not only enhances our capabilities but also fundamentally improves the quality and accessibility of healthcare for all.

The model is being developed as a result of GE HealthCare’s strategic collaboration with Amazon Web Services. 


This article was written by Emanuele Valeriano, who is a Principal Technical Product Manager in the AI organization at GE HealthCare. His work focuses on foundation models to accelerate the development of AI in the healthcare sector. In his spare time, he enjoys cooking delicious meals and traveling the world.


Learn more about GE HealthCare's AI research here:

SonoSAM, pioneering research analysis of AI in ultrasound imaging

Machine learning models in healthcare: enhancing diagnostic AI models with smart data selection