Medicine, by its nature, is a multifaceted domain that requires the synthesis of information across various modalities. Medical generative vision-language models (VLMs) make a first step in this direction and promise many exciting clinical applications. However, existing models typically have to be fine-tuned on sizeable down-stream datasets, which poses a significant limitation as in many medical applications data is scarce, necessitating models that are capable of learning from few examples in real-time. Here we propose Med-Flamingo, a multimodal few-shot learner adapted to the medical domain. Based on OpenFlamingo-9B, we continue pre-training on paired and interleaved medical image-text data from publications and textbooks. Med-Flamingo ...
My thesis develops machine learning methods that exploit multimodal clinical data to improve medical...
Curated datasets for healthcare are often limited due to the need of human annotations from experts....
The prowess that makes few-shot learning desirable in medical image analysis is the efficient use of...
Medicine, by its nature, is a multifaceted domain that requires the synthesis of information across ...
Building models that can be rapidly adapted to novel tasks using only a handful of annotated example...
With the availability of large-scale, comprehensive, and general-purpose vision-language (VL) datase...
Recently a number of studies demonstrated impressive performance on diverse vision-language multimod...
The large-scale pre-trained vision language models (VLM) have shown remarkable domain transfer capab...
In this work, we investigate the usefulness of vision-language models (VLMs) and large language mode...
The common practice in developing computer-aided diagnosis (CAD) models based on transformer archite...
Medical visual question answering (VQA) is a challenging task that requires answering clinical quest...
While deep learning approaches have led to breakthroughs in many medical image analysis tasks, the a...
The project aims to build a medical artificial intelligence (AI) assistant which has the potential t...
Multi-modal foundation models are typically trained on millions of pairs of natural images and text ...
Medical images are playing an important role in the medical domain. A mature medical visual question...
My thesis develops machine learning methods that exploit multimodal clinical data to improve medical...
Curated datasets for healthcare are often limited due to the need of human annotations from experts....
The prowess that makes few-shot learning desirable in medical image analysis is the efficient use of...
Medicine, by its nature, is a multifaceted domain that requires the synthesis of information across ...
Building models that can be rapidly adapted to novel tasks using only a handful of annotated example...
With the availability of large-scale, comprehensive, and general-purpose vision-language (VL) datase...
Recently a number of studies demonstrated impressive performance on diverse vision-language multimod...
The large-scale pre-trained vision language models (VLM) have shown remarkable domain transfer capab...
In this work, we investigate the usefulness of vision-language models (VLMs) and large language mode...
The common practice in developing computer-aided diagnosis (CAD) models based on transformer archite...
Medical visual question answering (VQA) is a challenging task that requires answering clinical quest...
While deep learning approaches have led to breakthroughs in many medical image analysis tasks, the a...
The project aims to build a medical artificial intelligence (AI) assistant which has the potential t...
Multi-modal foundation models are typically trained on millions of pairs of natural images and text ...
Medical images are playing an important role in the medical domain. A mature medical visual question...
My thesis develops machine learning methods that exploit multimodal clinical data to improve medical...
Curated datasets for healthcare are often limited due to the need of human annotations from experts....
The prowess that makes few-shot learning desirable in medical image analysis is the efficient use of...