---
license: apache-2.0
library_name: mlx
tags:
- image-classification
- vision
datasets:
- imagenet
- imagenet-1k
---

# Data2Vec-Vision (large-sized model, fine-tuned on ImageNet-1k)

![model image](https://raw.githubusercontent.com/patrickvonplaten/scientific_images/master/data2vec.png)

BEiT model pre-trained in a self-supervised fashion and fine-tuned on ImageNet-1k (1.2 million images, 1,000 classes) at a resolution of 224x224. It was introduced in the paper [data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language](https://arxiv.org/abs/2202.03555) by Alexei Baevski, Wei-Ning Hsu, Qiantong Xu, Arun Babu, Jiatao Gu, and Michael Auli, and first released in [this repository](https://github.com/facebookresearch/data2vec_vision/tree/main/beit).

## Usage

```python
from mlx_ssl.models import Data2VecVisionForImageClassification

model = Data2VecVisionForImageClassification.from_pretrained(
    "mlx-community/data2vec-vision-large-ft1k"
)
```
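
Because the model is fine-tuned on ImageNet-1k, a classification head of this kind produces one score (logit) per class, 1,000 in total. The snippet below is a minimal sketch of turning such a vector into top-k class probabilities. It is plain Python and does not call the model: how `mlx_ssl` exposes the logits is not covered here, so `logits` is a hand-made stand-in and `top_k_probs` is a hypothetical helper, not part of the library.

```python
import math


def top_k_probs(logits, k=5):
    """Return the k most likely (class_index, probability) pairs.

    `logits` is assumed to be a flat sequence of floats, one per class.
    """
    # Subtract the largest logit before exponentiating for numerical stability.
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank class indices from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in ranked[:k]]


# Stand-in for real model output: 1,000 logits with two favoured classes.
logits = [0.0] * 1000
logits[42] = 5.0
logits[7] = 8.0

top = top_k_probs(logits, k=3)
for class_index, prob in top:
    print(f"class {class_index}: {prob:.4f}")
```

With real output, the logits would first need converting to a Python list or similar sequence; the class indices can then be mapped to ImageNet-1k label names.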