ASL2000: American Sign Language Recognition Model

Model Description

This is a pre-trained I3D (Inflated 3D ConvNet) model for American Sign Language recognition with 2000 classes.

Architecture: I3D (Inflated 3D ConvNet)
Dataset: WLASL (Word-Level American Sign Language)
Classes: 2000 ASL words
Performance:
- Top-1 Accuracy: 32.48%
- Top-5 Accuracy: 57.31%
- Top-10 Accuracy: 66.31%
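
Top-k accuracy counts a prediction as correct when the true class is among the k highest-scoring classes. The sketch below illustrates the metric in plain Python. The helper name `top_k_accuracy` and the toy scores are made up for illustration, and this is not the evaluation script that produced the figures above.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest scores.

    scores: list of per-class score lists, one per sample
    labels: list of integer class indices, one per sample
    """
    if len(scores) != len(labels) or not labels:
        raise ValueError("scores and labels must be non-empty and equal in length")
    hits = 0
    for sample_scores, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score
        ranked = sorted(range(len(sample_scores)),
                        key=lambda i: sample_scores[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy data: 3 samples, 4 classes (illustrative values only)
toy_scores = [
    [0.1, 0.6, 0.2, 0.1],   # true class 1 is ranked first
    [0.4, 0.3, 0.2, 0.1],   # true class 1 is ranked second
    [0.1, 0.2, 0.3, 0.4],   # true class 0 is ranked last
]
toy_labels = [1, 1, 0]
```

On this toy data, one of three samples is correct at k=1 and two of three at k=2, which is why top-5 and top-10 figures are always at least as high as top-1.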

Usage

import torch
from pytorch_i3d import InceptionI3d

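# Build I3D with a 400-class head so the pretrained checkpoint loads cleanly
model = InceptionI3d(400, in_channels=3)
model.load_state_dict(torch.load('weights/rgb_imagenet.pt', map_location='cpu'))

# Swap the classification head for 2000 ASL classes, then load the fine-tuned weights
model.replace_logits(2000)
model.load_state_dict(torch.load('asl2000_model.pt', map_location='cpu'))

# Switch to evaluation mode (disables dropout, uses batch-norm running statistics)
model.eval()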

# Run inference on video
# See inference.py for complete example
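
The snippet above stops before preprocessing. As a rough sketch of how a clip might be prepared, the hypothetical helper `frames_to_clip` below converts a stack of RGB frames with NumPy. It assumes the model expects input shaped (batch, channels, frames, height, width) with pixel values scaled to [-1, 1], a convention commonly used with I3D. Check inference.py for the actual pipeline, including frame sampling and resizing, which are not shown here.

```python
import numpy as np


def frames_to_clip(frames):
    """Convert uint8 RGB frames (T, H, W, 3) to a float32 clip (1, 3, T, H, W).

    Assumption: the model wants pixel values in [-1, 1]. Resizing and
    cropping to the network's input size are left out of this sketch.
    """
    frames = np.asarray(frames)
    if frames.ndim != 4 or frames.shape[-1] != 3:
        raise ValueError("expected frames with shape (T, H, W, 3)")
    clip = frames.astype(np.float32) / 255.0 * 2.0 - 1.0   # [0, 255] -> [-1, 1]
    clip = clip.transpose(3, 0, 1, 2)                      # (T, H, W, C) -> (C, T, H, W)
    # Add the batch dimension and make the memory layout contiguous
    return np.ascontiguousarray(clip[np.newaxis, ...])
```

The returned array can be wrapped with `torch.from_numpy(...)` and passed to `model` inside a `torch.no_grad()` block.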

Citation

@inproceedings{li2020word,
  title={Word-level Deep Sign Language Recognition from Video: A New Large-scale Dataset and Methods Comparison},
  author={Li, Dongxu and Rodriguez, Cristian and Yu, Xin and Li, Hongdong},
  booktitle={The IEEE Winter Conference on Applications of Computer Vision},
  pages={1459--1469},
  year={2020}
}

License

MIT License - Academic use only
