Sapiens2-5B-Surface

Per-pixel surface normal estimation: each pixel is assigned a 3-channel unit vector expressed in the camera frame.

This repository contains the 5B Surface Normal Estimation checkpoint, finetuned from the Sapiens2-5B pretrained backbone.

Model Details

Quick Start

Install the Sapiens2 repo (pip install -e .), download the checkpoint, and run the demo:

# 1. Download the checkpoint to $SAPIENS_CHECKPOINT_ROOT/normal/
#    (this example assumes SAPIENS_CHECKPOINT_ROOT=~/sapiens2_host)
hf download facebook/sapiens2-normal-5b sapiens2_5b_normal.safetensors \
    --local-dir ~/sapiens2_host/normal

# 2. Run the demo (edit INPUT, OUTPUT, and MODEL_NAME inside the script)
cd $SAPIENS_ROOT/sapiens/dense
./scripts/demo/normal.sh
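
Before running the demo, it can help to confirm that the download completed. The sketch below reads only the JSON header of the `.safetensors` file, so it lists tensor names and shapes without loading the weights into memory. It relies on the published safetensors layout (an 8-byte little-endian header length followed by a JSON header). The function names are illustrative and not part of the Sapiens2 repo, and the checkpoint path simply mirrors the `--local-dir` used above.

```python
import json
import struct
from pathlib import Path


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without reading tensor data."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the header length as a little-endian uint64.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return header


def count_parameters(header):
    """Sum the element counts of all tensors described in the header."""
    total = 0
    for info in header.values():
        n = 1
        for dim in info["shape"]:
            n *= dim
        total += n
    return total


if __name__ == "__main__":
    # Assumed location: the --local-dir from the download command above.
    ckpt = Path.home() / "sapiens2_host" / "normal" / "sapiens2_5b_normal.safetensors"
    if ckpt.exists():
        header = read_safetensors_header(ckpt)
        print(f"{len(header)} tensors, {count_parameters(header) / 1e9:.3f} B parameters")
    else:
        print(f"Checkpoint not found at {ckpt}")
```

The total counts every tensor in the file, including the task head, so it is not expected to match the backbone-only parameter count exactly.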

See the Surface Normal Estimation guide for details on inputs, outputs, and visualization options.
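
Since the model predicts a 3-channel unit vector per pixel, a common way to inspect the output is to re-normalize the prediction and map each component from [-1, 1] to an 8-bit color. The sketch below shows that widely used convention with NumPy. It assumes the prediction is available as an `(H, W, 3)` float array; the function names are hypothetical, and the demo script's own color mapping and axis conventions may differ.

```python
import numpy as np


def normalize_normals(normals, eps=1e-8):
    """Rescale an (H, W, 3) array so that every pixel is a unit vector."""
    norm = np.linalg.norm(normals, axis=-1, keepdims=True)
    # Guard against division by zero for degenerate (all-zero) pixels.
    return normals / np.maximum(norm, eps)


def normals_to_rgb(normals):
    """Map normals with components in [-1, 1] to uint8 RGB in [0, 255]."""
    unit = normalize_normals(normals)
    return np.round((unit + 1.0) * 0.5 * 255.0).astype(np.uint8)


if __name__ == "__main__":
    # Stand-in for a model prediction at the 1024 x 768 inference resolution.
    rng = np.random.default_rng(0)
    pred = rng.normal(size=(1024, 768, 3)).astype(np.float32)
    rgb = normals_to_rgb(pred)
    print(rgb.shape, rgb.dtype)
```

Under this mapping a normal pointing along the positive third axis renders as a light blue, which is why normal maps drawn this way tend to look blue or purple in flat regions.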

Model Card

| Field | Value |
|---|---|
| Architecture | Sapiens2 ViT backbone + Surface Normal Estimation head |
| Backbone parameters | 5.071 B |
| Backbone FLOPs | 15.722 T |
| Embedding dim | 2432 |
| Layers | 56 |
| Attention heads | 32 |
| Inference resolution | 1024 × 768 (H × W) |
| Patch size | 16 |
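
The figures in this table imply the size of the token grid the backbone processes. The arithmetic below is a sketch that assumes standard ViT patchification (non-overlapping square patches) and an embedding split evenly across attention heads. Neither detail is stated in this card, and any extra tokens (class or register tokens, for example) are not counted.

```python
def patch_grid(height, width, patch_size):
    """Return (rows, cols) of non-overlapping patches for an image size."""
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be divisible by the patch size")
    return height // patch_size, width // patch_size


# Values taken from the model card table.
HEIGHT, WIDTH = 1024, 768
PATCH_SIZE = 16
EMBED_DIM = 2432
NUM_HEADS = 32

rows, cols = patch_grid(HEIGHT, WIDTH, PATCH_SIZE)
num_patches = rows * cols              # patch tokens per image
head_dim = EMBED_DIM // NUM_HEADS      # per-head width, assuming an even split

print(f"patch grid: {rows} x {cols} = {num_patches} tokens")
print(f"per-head dimension: {head_dim}")
```

With these values the grid is 64 × 48, giving 3072 patch tokens per image, and each head would be 76 channels wide.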

Sapiens2-Surface Family

| Model | Params | FLOPs | Embed dim | Layers | Heads |
|---|---|---|---|---|---|
| Sapiens2-0.4B | 0.398 B | 1.260 T | 1024 | 24 | 16 |
| Sapiens2-0.8B | 0.818 B | 2.592 T | 1280 | 32 | 16 |
| Sapiens2-1B | 1.462 B | 4.715 T | 1536 | 40 | 24 |
| Sapiens2-5B (this) | 5.071 B | 15.722 T | 2432 | 56 | 32 |
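
When choosing a variant, it can be useful to see how cost scales across the family. The sketch below copies the table into a dictionary and reports each variant's parameter and FLOPs ratio relative to the smallest model. The dictionary layout and the `relative_cost` helper are illustrative only, and the per-head dimension assumes the embedding is split evenly across heads.

```python
# Values copied from the family table: parameters in billions, FLOPs in tera.
FAMILY = {
    "Sapiens2-0.4B": {"params_b": 0.398, "flops_t": 1.260, "embed_dim": 1024, "layers": 24, "heads": 16},
    "Sapiens2-0.8B": {"params_b": 0.818, "flops_t": 2.592, "embed_dim": 1280, "layers": 32, "heads": 16},
    "Sapiens2-1B": {"params_b": 1.462, "flops_t": 4.715, "embed_dim": 1536, "layers": 40, "heads": 24},
    "Sapiens2-5B": {"params_b": 5.071, "flops_t": 15.722, "embed_dim": 2432, "layers": 56, "heads": 32},
}


def relative_cost(name, baseline="Sapiens2-0.4B"):
    """Return (parameter ratio, FLOPs ratio) of a variant versus the baseline."""
    model, base = FAMILY[name], FAMILY[baseline]
    return model["params_b"] / base["params_b"], model["flops_t"] / base["flops_t"]


for name, cfg in FAMILY.items():
    params_ratio, flops_ratio = relative_cost(name)
    head_dim = cfg["embed_dim"] // cfg["heads"]
    print(f"{name}: {params_ratio:.1f}x params, {flops_ratio:.1f}x FLOPs, head dim {head_dim}")
```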

See the Sapiens2 Collection for all variants and other downstream task checkpoints.

Intended Use

  • Surface Normal Estimation on human-centric imagery
  • Research on human-centric vision

License

Released under the Sapiens2 License.

Citation

@article{khirodkarsapiens2,
  title={Sapiens2},
  author={Khirodkar, Rawal and Wen, He and Martinez, Julieta and Dong, Yuan and Su, Zhaoen and Saito, Shunsuke},
  journal={arXiv preprint arXiv:2604.21681},
  year={2026}
}