ASL2000: American Sign Language Recognition Model

Model Description

This is a pre-trained I3D (Inflated 3D ConvNet) model that recognizes 2,000 American Sign Language words from video.

  • Architecture: I3D (Inflated 3D ConvNet)
  • Dataset: WLASL (Word-Level American Sign Language)
  • Classes: 2000 ASL words
  • Performance:
    • Top-1 Accuracy: 32.48%
    • Top-5 Accuracy: 57.31%
    • Top-10 Accuracy: 66.31%
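
For readers unfamiliar with the metric, top-k accuracy counts a prediction as correct when the true class is among the k highest-scoring classes. The following is a minimal sketch of that computation using NumPy. The function name `top_k_accuracy` and the toy scores are illustrative assumptions, not part of this model's evaluation code.

```python
import numpy as np

def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes.

    scores: array of shape (num_samples, num_classes)
    labels: array of shape (num_samples,) with integer class indices
    """
    scores = np.asarray(scores)
    labels = np.asarray(labels)
    # Indices of the k largest scores per row (order inside the top k does not matter).
    top_k = np.argpartition(-scores, k - 1, axis=1)[:, :k]
    hits = (top_k == labels[:, None]).any(axis=1)
    return float(hits.mean())

# Toy data (made up for illustration): 4 samples, 5 classes.
scores = np.array([
    [0.10, 0.50, 0.20, 0.15, 0.05],  # true class 1 is ranked 1st
    [0.40, 0.30, 0.15, 0.10, 0.05],  # true class 1 is ranked 2nd
    [0.05, 0.10, 0.15, 0.30, 0.40],  # true class 2 is ranked 3rd
    [0.30, 0.25, 0.20, 0.15, 0.10],  # true class 4 is ranked 5th
])
labels = np.array([1, 1, 2, 4])

top1 = top_k_accuracy(scores, labels, 1)
top3 = top_k_accuracy(scores, labels, 3)
```

On this toy data, one of four samples is a top-1 hit and three of four are top-3 hits.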

Usage

import torch
from pytorch_i3d import InceptionI3d

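# Build I3D with its original 400-class head so the pre-trained checkpoint loads
model = InceptionI3d(400, in_channels=3)
model.load_state_dict(torch.load('weights/rgb_imagenet.pt', map_location='cpu'))
# Swap in a 2000-class head, then load the ASL weights
model.replace_logits(2000)
model.load_state_dict(torch.load('asl2000_model.pt', map_location='cpu'))
# Switch to evaluation mode for inference
model.eval()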

# Run inference on video
# See inference.py for a complete example
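
The snippet above stops short of the inference step, so here is a rough sketch of the surrounding data handling. It is not the contents of inference.py. It assumes the conventions commonly used with I3D: RGB frames center-cropped to 224x224, pixel values scaled to [-1, 1], input laid out as (batch, channels, time, height, width), and per-frame logits max-pooled over time. The helper names `preprocess_clip` and `top_k_classes` are hypothetical, and the exact preprocessing used to train this checkpoint may differ.

```python
import numpy as np

CROP = 224  # assumed input resolution; check inference.py for the real value

def preprocess_clip(frames):
    """Turn uint8 RGB frames (T, H, W, 3) into float32 (1, 3, T, 224, 224) in [-1, 1]."""
    frames = np.asarray(frames)
    if frames.ndim != 4 or frames.shape[3] != 3:
        raise ValueError("expected frames with shape (T, H, W, 3)")
    _, h, w, _ = frames.shape
    if h < CROP or w < CROP:
        raise ValueError("frames must be at least 224x224; resize them first")
    # Center crop.
    top, left = (h - CROP) // 2, (w - CROP) // 2
    cropped = frames[:, top:top + CROP, left:left + CROP, :]
    # Scale [0, 255] to [-1, 1].
    scaled = cropped.astype(np.float32) / 255.0 * 2.0 - 1.0
    # (T, H, W, C) -> (C, T, H, W), then add a batch axis.
    return scaled.transpose(3, 0, 1, 2)[None]

def top_k_classes(per_frame_logits, k=5):
    """Max-pool logits of shape (num_classes, T') over time; return the k best class indices."""
    clip_scores = np.asarray(per_frame_logits).max(axis=1)
    return np.argsort(clip_scores)[::-1][:k]

# Dummy clip: 16 black frames of size 256x256 (stand-in for decoded video).
clip = preprocess_clip(np.zeros((16, 256, 256, 3), dtype=np.uint8))
```

Under these assumptions, the array can be wrapped with `torch.from_numpy(clip)` and passed to `model` inside a `torch.no_grad()` block. If the model returns logits of shape (1, 2000, T'), the first batch element converted to NumPy can be given to `top_k_classes` to get candidate word indices.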

Citation

@inproceedings{li2020word,
  title={Word-level Deep Sign Language Recognition from Video: A New Large-scale Dataset and Methods Comparison},
  author={Li, Dongxu and Rodriguez, Cristian and Yu, Xin and Li, Hongdong},
  booktitle={The IEEE Winter Conference on Applications of Computer Vision},
  pages={1459--1469},
  year={2020}
}

License

MIT License - Academic use only
