food11-vit
This model is a fine-tuned version of google/vit-base-patch16-224 on the Food11 dataset.
Model description
A ViT-Base vision transformer fine-tuned to classify food images into 11 categories. Training used transfer learning from the pretrained checkpoint and was run with PyTorch Lightning.
Intended uses & limitations
This model is intended for food image classification tasks with a fixed set of 11 common food types. It may not generalize to out-of-distribution food images or fine-grained food variants.
Classes
- Bread
- Dairy product
- Dessert
- Egg
- Fried food
- Meat
- Noodles-Pasta
- Rice
- Seafood
- Soup
- Vegetable-Fruit
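A minimal sketch of turning the model's 11 raw output scores into class names. It assumes the output indices follow the order of the list above, which this card does not state; the `id2label` mapping in the model's config is the authoritative source. The names `CLASS_NAMES`, `decode_logits`, and `example_logits` are hypothetical and used only for illustration.

```python
import math

# Assumption: index order matches the class list in this card.
CLASS_NAMES = [
    "Bread", "Dairy product", "Dessert", "Egg", "Fried food", "Meat",
    "Noodles-Pasta", "Rice", "Seafood", "Soup", "Vegetable-Fruit",
]
ID2LABEL = dict(enumerate(CLASS_NAMES))


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_logits(logits, top_k=3):
    """Return the top_k (label, probability) pairs, highest probability first."""
    if len(logits) != len(CLASS_NAMES):
        raise ValueError(f"expected {len(CLASS_NAMES)} logits, got {len(logits)}")
    probs = softmax(logits)
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return [(ID2LABEL[i], p) for i, p in ranked[:top_k]]


# Made-up logits for illustration only, not real model output.
example_logits = [0.1, -1.2, 0.3, 0.0, 2.5, 1.1, -0.4, 0.2, 0.6, -0.8, 0.05]
for label, prob in decode_logits(example_logits):
    print(f"{label}: {prob:.3f}")
```

In practice the logits would come from a forward pass of the model on a preprocessed image; the decoding step is the same.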
Training and evaluation data
The model was trained on the training split of the Food11 dataset (9,866 images) and validated on the validation split (3,430 images). The test set was not used.
Training procedure
Training hyperparameters
The following hyperparameters were used:
- learning_rate: 2e-5
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: AdamW
- lr_scheduler_type: linear
- num_epochs: 5
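The hyperparameters above imply a step budget and a learning-rate curve, which can be sketched as follows. This assumes a single device, no gradient accumulation, the last partial batch kept, no warmup, and linear decay to zero; none of these are stated in the card. The 9,866 training images come from the data section. The function `linear_lr` is a hypothetical helper, not the training code.

```python
import math

TRAIN_IMAGES = 9866   # training split size reported in this card
BATCH_SIZE = 32
NUM_EPOCHS = 5
BASE_LR = 2e-5

# Assumption: the final partial batch is kept, so round up.
steps_per_epoch = math.ceil(TRAIN_IMAGES / BATCH_SIZE)
total_steps = steps_per_epoch * NUM_EPOCHS


def linear_lr(step, base_lr=BASE_LR, total=total_steps, warmup=0):
    """Linear warmup (optional) followed by linear decay to zero."""
    if warmup > 0 and step < warmup:
        return base_lr * step / warmup
    remaining = max(0, total - step)
    return base_lr * remaining / max(1, total - warmup)


print(steps_per_epoch, total_steps)
for epoch in range(NUM_EPOCHS + 1):
    step = epoch * steps_per_epoch
    print(f"step {step}: lr = {linear_lr(step):.2e}")
```

Under these assumptions each epoch takes 309 optimizer steps, which is consistent with the Step column in the training results if that counter is zero-based.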
Training results
| Epoch | Step | Training Loss | Validation Loss | Validation Accuracy |
|---|---|---|---|---|
| 1 | 308 | 1.2517 | 0.1991 | 0.9531 |
| 2 | 617 | 0.4728 | 0.1376 | 0.9621 |
| 3 | 926 | 0.2027 | 0.1281 | 0.9621 |
| 4 | 1235 | 0.2861 | 0.1395 | 0.9589 |
| 5 | 1544 | 0.2943 | 0.1223 | 0.9659 |
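As a rough reading aid, the sketch below picks the epoch with the lowest validation loss from the table and converts its accuracy into an approximate count of misclassified validation images. It assumes accuracy was computed over all 3,430 validation images; the counts are back-computed from rounded accuracies, so they are estimates rather than reported results.

```python
VAL_IMAGES = 3430  # validation split size reported in this card

# (epoch, step, training loss, validation loss, validation accuracy)
RESULTS = [
    (1, 308, 1.2517, 0.1991, 0.9531),
    (2, 617, 0.4728, 0.1376, 0.9621),
    (3, 926, 0.2027, 0.1281, 0.9621),
    (4, 1235, 0.2861, 0.1395, 0.9589),
    (5, 1544, 0.2943, 0.1223, 0.9659),
]

# Select the epoch with the lowest validation loss.
best = min(RESULTS, key=lambda row: row[3])
best_epoch, _, _, best_val_loss, best_val_acc = best

# Estimate correct and incorrect predictions from the rounded accuracy.
approx_correct = round(best_val_acc * VAL_IMAGES)
approx_errors = VAL_IMAGES - approx_correct

print(f"best epoch: {best_epoch} (val loss {best_val_loss}, val acc {best_val_acc})")
print(f"approx. misclassified validation images: {approx_errors} of {VAL_IMAGES}")
```

Selecting on validation loss and selecting on validation accuracy agree here; both point to the final epoch.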
Framework versions
- Transformers 4.39.3
- PyTorch 2.1.2
- Datasets 2.18.0
- Tokenizers 0.15.1