	---
	library_name: transformers
	tags:
	- video
	- feature
	- face
	license: cc-by-nc-4.0
	base_model:
	- ControlNet/MARLIN
	pipeline_tag: feature-extraction
	---


	# MARLIN: Masked Autoencoder for facial video Representation LearnINg

	This repo is the official PyTorch implementation of the paper
	[MARLIN: Masked Autoencoder for facial video Representation LearnINg](https://openaccess.thecvf.com/content/CVPR2023/html/Cai_MARLIN_Masked_Autoencoder_for_Facial_Video_Representation_LearnINg_CVPR_2023_paper) (CVPR 2023) ([arXiv](https://arxiv.org/abs/2211.06627)).


	## Use `transformers` (HuggingFace) for Feature Extraction

	Requirements:
	- Python
	- PyTorch
	- transformers
	- einops

	Currently, the Hugging Face model only supports direct feature extraction. It does not perform any video pre-processing (e.g. face detection, cropping, or strided windowing), so the input must already be a face-cropped clip of the expected shape.
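
Because strided windowing is left to the caller, a longer video has to be cut into fixed-length clips before it reaches the model. The following is a minimal sketch of that step, assuming the video is already face-cropped and resized and is stored as a `(C, T, H, W)` tensor. The helper name `strided_windows` and the stride value are illustrative assumptions, not part of the MARLIN API.

```python
import torch


def strided_windows(video: torch.Tensor, clip_len: int = 16, stride: int = 16) -> torch.Tensor:
    """Split a (C, T, H, W) video into clips of shape (N, C, clip_len, H, W).

    Hypothetical helper: frames that do not fill a complete window at the
    end of the video are dropped.
    """
    if video.dim() != 4:
        raise ValueError("expected a (C, T, H, W) tensor")
    if video.shape[1] < clip_len:
        raise ValueError("video is shorter than one clip")
    # Slide a window over the time axis; unfold appends the window
    # dimension last, giving (C, N, H, W, clip_len).
    windows = video.unfold(1, clip_len, stride)
    # Reorder to (N, C, clip_len, H, W) so each clip matches (C, T, H, W).
    return windows.permute(1, 0, 4, 2, 3).contiguous()


# Random data standing in for an already face-cropped, resized video.
video = torch.rand(3, 40, 224, 224)
clips = strided_windows(video, clip_len=16, stride=8)
print(clips.shape)  # torch.Size([4, 3, 16, 224, 224])
```

Each row of `clips` then has the `(C, T, H, W)` layout of a single model input, and the leading dimension can serve as the batch dimension.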


	```python
	import torch
	from transformers import AutoModel

	model = AutoModel.from_pretrained(
	    "ControlNet/marlin_vit_large_ytf",  # or other variants
	    trust_remote_code=True,
	)
	tensor = torch.rand([1, 3, 16, 224, 224])  # (B, C, T, H, W)
	output = model(tensor)  # shape (1, 1568, D); the feature width D depends on the variant
	```
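
The call above returns one feature vector per token, i.e. a tensor of shape `(B, N, D)`. If a single vector per clip is needed, one common option is to average over the token dimension. The sketch below illustrates this on random data. The helper name `pool_tokens` and the small feature width are assumptions made for the example, and mean pooling is one possible choice rather than something this model card prescribes.

```python
import torch


def pool_tokens(tokens: torch.Tensor) -> torch.Tensor:
    """Average (B, N, D) token features over N to obtain (B, D) clip features."""
    if tokens.dim() != 3:
        raise ValueError("expected a (B, N, D) tensor")
    return tokens.mean(dim=1)


# Random stand-in for the model output; D=8 is a placeholder width,
# not the real feature width of any MARLIN variant.
tokens = torch.rand(2, 1568, 8)
clip_features = pool_tokens(tokens)
print(clip_features.shape)  # torch.Size([2, 8])
```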

	## License

	This project is released under the CC BY-NC 4.0 license. See [LICENSE](LICENSE) for details.

	## References
	If you find this work useful for your research, please consider citing it.
	```bibtex
	@inproceedings{cai2022marlin,
	title = {MARLIN: Masked Autoencoder for facial video Representation LearnINg},
	author = {Cai, Zhixi and Ghosh, Shreya and Stefanov, Kalin and Dhall, Abhinav and Cai, Jianfei and Rezatofighi, Hamid and Haffari, Reza and Hayat, Munawar},
	booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
	year = {2023},
	month = {June},
	pages = {1493-1504},
	doi = {10.1109/CVPR52729.2023.00150},
	publisher = {IEEE},
	}
	```