---
license: mit
datasets:
- vector-institute/open-pmc
metrics:
- accuracy
- f1
- recall
---

<div align="center">
  <img src="https://github.com/VectorInstitute/pmc-data-extraction/blob/0a969136344a07267bb558d01f3fe76b36b93e1a/media/open-pmc-pipeline.png?raw=true"
       alt="Open-PMC Pipeline"
       width="1000" />
</div>

<p align="center">
  <strong>Paper:</strong> <a href="http://arxiv.org/abs/2503.14377" target="_blank">arXiv</a>
  |
  <strong>Code:</strong> <a href="https://github.com/VectorInstitute/pmc-data-extraction" target="_blank">Open-PMC GitHub</a>
  |
  <strong>Dataset:</strong> <a href="https://huggingface.co/datasets/vector-institute/open-pmc" target="_blank">Hugging Face</a>
</p>

## Model Overview

This model is a checkpoint trained on the **Open-PMC** dataset. It uses a **Vision Transformer (ViT-B/16)** backbone to extract visual features and **PubMedBERT** to encode text. The model is trained with **contrastive learning**, using the **vanilla Info-NCE loss** to align image and text representations in a shared embedding space.

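For reference, here is a minimal sketch of the symmetric Info-NCE objective described above. The temperature value is illustrative, not the setting used in training.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(image_embeds: torch.Tensor,
                  text_embeds: torch.Tensor,
                  temperature: float = 0.07) -> torch.Tensor:
    """Vanilla Info-NCE over a batch of paired image/text embeddings.

    Row i of each tensor is a matched pair (positive); every other
    in-batch combination serves as a negative.
    """
    # Normalize so the dot product equals cosine similarity.
    image_embeds = F.normalize(image_embeds, dim=-1)
    text_embeds = F.normalize(text_embeds, dim=-1)

    # Pairwise similarity matrix of shape (batch, batch).
    logits = image_embeds @ text_embeds.t() / temperature

    # The i-th image matches the i-th text, so targets lie on the diagonal.
    targets = torch.arange(logits.size(0), device=logits.device)

    # Average the image-to-text and text-to-image cross-entropy terms.
    return (F.cross_entropy(logits, targets)
            + F.cross_entropy(logits.t(), targets)) / 2
```
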
## Model Architecture

- **Vision Backbone**: ViT-B/16 (pretrained on ImageNet)
- **Text Backbone**: PubMedBERT (pretrained on PubMed Central abstracts)
- **Training Objective**: Contrastive learning with the **Info-NCE loss**

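The sketch below shows one way these pieces can be wired together as a dual encoder. The Hugging Face model IDs, projection dimension, and [CLS]-token pooling are illustrative assumptions; see the Open-PMC repository for the exact configuration.

```python
import timm
import torch
import torch.nn as nn
from transformers import AutoModel

class BiomedicalCLIP(nn.Module):
    """Illustrative dual encoder pairing ViT-B/16 with PubMedBERT."""

    def __init__(self, proj_dim: int = 512):  # proj_dim is an assumption
        super().__init__()
        # ViT-B/16 vision backbone; num_classes=0 yields pooled features.
        self.vision = timm.create_model(
            "vit_base_patch16_224", pretrained=True, num_classes=0
        )
        # PubMedBERT text backbone (abstracts variant).
        self.text = AutoModel.from_pretrained(
            "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract"
        )
        # Linear heads projecting both modalities into a shared space.
        self.vision_proj = nn.Linear(self.vision.num_features, proj_dim)
        self.text_proj = nn.Linear(self.text.config.hidden_size, proj_dim)

    def encode_image(self, pixel_values: torch.Tensor) -> torch.Tensor:
        return self.vision_proj(self.vision(pixel_values))

    def encode_text(self, input_ids, attention_mask) -> torch.Tensor:
        out = self.text(input_ids=input_ids, attention_mask=attention_mask)
        # Pool with the [CLS] token representation.
        return self.text_proj(out.last_hidden_state[:, 0])
```
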
## Training Framework

The model was trained using the **mmlearn** framework, which is designed for multimodal learning. More information and the framework itself are available [here](https://github.com/VectorInstitute/mmlearn).

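Conceptually, a single training step combines the two sketches above as follows. mmlearn wraps this logic in its own task and data abstractions, so this is only an illustration; the optimizer choice and learning rate are placeholders.

```python
import torch

# Hypothetical wiring of the illustrative pieces defined above.
model = BiomedicalCLIP()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)  # placeholder lr

def training_step(images, input_ids, attention_mask):
    """One contrastive update on a batch of paired images and captions."""
    image_embeds = model.encode_image(images)
    text_embeds = model.encode_text(input_ids, attention_mask)
    loss = info_nce_loss(image_embeds, text_embeds)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```
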
## How to Use

Please visit our [GitHub repository](https://github.com/VectorInstitute/pmc-data-extraction) for instructions on how to run benchmarks with this checkpoint.

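In the meantime, the sketch below shows how a dual-encoder checkpoint of this kind is typically used for image-text matching. It reuses the illustrative `BiomedicalCLIP` class from above; the checkpoint filename and tokenizer ID are placeholders, not an official API.

```python
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer

# Placeholder path; reuses the illustrative BiomedicalCLIP class above.
model = BiomedicalCLIP()
state = torch.load("open_pmc_checkpoint.pt", map_location="cpu")
model.load_state_dict(state, strict=False)
model.eval()

tokenizer = AutoTokenizer.from_pretrained(
    "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract"
)
captions = ["chest X-ray showing pneumonia", "histology slide of liver tissue"]
batch = tokenizer(captions, padding=True, return_tensors="pt")

with torch.no_grad():
    text_embeds = model.encode_text(batch["input_ids"], batch["attention_mask"])
    # Stand-in for a batch of preprocessed 224x224 images.
    image_embeds = model.encode_image(torch.randn(2, 3, 224, 224))
    # Cosine similarity between every image and every caption.
    sims = F.normalize(image_embeds, dim=-1) @ F.normalize(text_embeds, dim=-1).t()

print(sims)
```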