2.61 TB
33,190 files
Updated 9 days ago
| Name | Size |
| --- | --- |
| ScanRefer | |
| Scannet | |
| checkpoints | |
| objaverse_sonata_features | |
| precomputed_voxel_1_5 | |
| precomputed_voxel_1_5_eval | |
| 3rscan_sonata_feat.tar.gz | 49.7 GB |
| GPT_dataset_qwen25_final_test.json | 5.91 MB |
| GPT_dataset_qwen25_final_train.json | 68.3 MB |
| GPT_dataset_qwen3vl_final_test.json | 6.58 MB |
| GPT_dataset_qwen3vl_final_train.json | 79.3 MB |
| README.md | 3.76 kB |
| checkpoints | 469 MB |
| dense_captioning_train.json | 155 MB |
| dense_captioning_val.json | 40.2 MB |
| global_feat.tar.gz | 266 GB |
| grounding_multi3drefer_train.json | 975 MB |
| grounding_multi3drefer_val.json | 130 MB |
| grounding_multi3drefer_val_iou25.json | 130 MB |
| grounding_scanrefer_train.json | 1.63 GB |
| leo_proposal_mappings.tar.gz | 222 kB |
| leo_proposals.tar.gz | 50.4 MB |
| local_feat.tar.gz | 65.5 GB |
| multi3drefer_train.json | 198 MB |
| multi3drefer_val_gt.json | 51.5 MB |
| multi3drefer_val_samelabel.json | 14.9 MB |
| new_dense_captioning_train.json | 155 MB |
| new_grounding_multi3drefer_train.json | 1.02 GB |
| new_grounding_multi3drefer_val.json | 132 MB |
| new_grounding_multi3drefer_val_iou25.json | 132 MB |
| new_grounding_scanrefer_train.json | 1.71 GB |
| new_norep_dense_captioning_train.json | 31.1 MB |
| new_norep_grounding_multi3drefer_train.json | 205 MB |
| new_norep_grounding_scanrefer_train.json | 342 MB |
| new_scanrefer_val.json | 113 MB |
| new_scanrefer_val_iou25.json | 113 MB |
| object_captioning_sceneverse_scannet_train.json | 22.2 MB |
| object_captioning_sceneverse_scannet_train_filtered.json | 291 kB |
| object_captioning_sceneverse_scannet_val.json | 4.46 MB |
| object_features.tar.gz | 162 GB |
| qa_scanqa_train.json | 75.7 MB |
| qa_scanqa_val.json | 29.9 MB |
| qa_sqa3d_train.json | 38.6 MB |
| qa_sqa3d_val.json | 4.75 MB |
| qwen25vl7b_global_local_stage1_feat4_1_b64.tar.gz | 3.77 GB |
| qwen25vl7b_global_local_stage1_feat8_2.tar.gz | 470 MB |
| qwen25vl7b_global_local_stage1_feat8_2_b64.tar.gz | 3.78 GB |
| qwen25vl7b_stage2_stage2_feat4_1.tar.gz | 63.2 GB |
| scannet_gt_masks_sonata_feat.tar.gz | 80 GB |
| scannet_scene_masks.tar.gz | 26.9 MB |
| scannet_sonata_feat.tar.gz | 208 GB |
| scannetpp_centers.tar.gz | 218 kB |
| scanrefer_gt_masks.tar.gz | 38.7 MB |
| scanrefer_val.json | 111 MB |
| scanrefer_val_gt.json | 44.2 MB |
| scanrefer_val_iou25.json | 111 MB |
| scanrefer_val_samelabel.json | 10 MB |
| scene_captioning_3rscan_train.json | 20.2 MB |
| scene_captioning_scannet_train.json | 4.22 MB |
| scene_captioning_scannet_val.json | 896 kB |
| scene_mask.tar.gz | 10.9 MB |
| sonata_feat.tar.gz | 48.1 GB |
| stage2_mixed_train.json | 708 MB |
| stage2_voxel_1_5.tar.gz | 137 GB |
| train_mask.tar.gz | 218 kB |

README.md

Dataset Format

Each JSON file contains a list of data samples. Every sample uses the global/local token format, where <global> represents the broader context (a scene or an object) and <local> represents a specific part within it (an object or a component).

[
    {
        "conversations": [
            {
                "role": "user",
                "content": "Looking at the scene <global>, explain the appearance of the highlighted object <local> and where it is located."
            },
            {
                "role": "assistant",
                "content": "This object is a tall, narrow cabinet with a light wood-grain finish ..."
            }
        ],
        "global": [
            {
                "id": "036bce3393",
                "feat_path": "data/sonata_feat/036bce3393_down.npz",
                "sample_mask_path": "data/scene_mask/036bce3393_mask_32768.npy"
            }
        ],
        "local": [
            {
                "global_id": "036bce3393",
                "mask_path": "data/train_mask/036bce3393/036bce3393_part_1.npy"
            }
        ],
        "metadata": {
            "tasks": "dense_description",
            "level": "scene-object"
        }
    }
]
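
As a quick way to inspect the format above, the following is a minimal sketch that loads one of the JSON files and summarizes a sample. The helper names `load_samples` and `summarize` are illustrative and not part of the dataset tooling; the sketch assumes only the top-level list layout and the keys shown in the example.

```python
import json
from pathlib import Path


def load_samples(json_path):
    """Load one dataset JSON file and return its list of samples."""
    with Path(json_path).open("r", encoding="utf-8") as f:
        samples = json.load(f)
    # The README states that each file holds a list of samples.
    if not isinstance(samples, list):
        raise ValueError(f"{json_path}: expected a top-level JSON list")
    return samples


def summarize(sample):
    """Return a small overview of one sample (hypothetical helper)."""
    return {
        "n_turns": len(sample["conversations"]),
        "global_ids": [g["id"] for g in sample["global"]],
        "n_local": len(sample["local"]),
        "level": sample["metadata"]["level"],
    }
```

For example, `summarize(load_samples("GPT_dataset_qwen25_final_test.json")[0])` would give a one-line overview of the first test sample, assuming the file has been downloaded to the working directory.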

Fields

conversations

A list of message objects representing the dialogue. The model is trained to predict all assistant turns.

  • role: Either "user" or "assistant".
  • content: The message text. User messages contain <global> and <local> placeholders that will be replaced with point cloud embeddings during processing.

global

A list of global point cloud entries. Each <global> placeholder in the conversation maps to an entry here, matched in order of appearance.

  • id: A unique identifier for this global point cloud.
  • feat_path: Path to the .npz feature file containing feat_down (downsampled point features) and inverse (raw-to-downsampled index mapping).
  • sample_mask_path: Path to a .npy mask file used for sampling the global features. May be empty for object-level globals.
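
The fields above could be read along the following lines. This is a sketch under stated assumptions, not the dataset's own loader: the function names are hypothetical, `inverse[i]` is assumed to be the row of `feat_down` that raw point `i` maps to, and an empty `sample_mask_path` is treated as "no mask". How the sampling mask is applied to the features is not specified in this README, so the sketch only loads it.

```python
import numpy as np


def load_global_features(feat_path):
    """Read feat_down and inverse from a global .npz feature file."""
    with np.load(feat_path) as data:
        feat_down = data["feat_down"]  # downsampled point features
        inverse = data["inverse"]      # raw-to-downsampled index mapping
    return feat_down, inverse


def per_point_features(feat_down, inverse):
    """Expand downsampled features back to one row per raw point.

    Assumption: inverse[i] indexes the feat_down row for raw point i.
    """
    return feat_down[inverse]


def load_sample_mask(sample_mask_path):
    """Load the sampling mask, or return None when the path is empty."""
    if not sample_mask_path:
        return None
    return np.load(sample_mask_path)
```

Keeping the expansion through `inverse` in a separate function makes it easy to skip when only the downsampled features are needed.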

local

A list of local (masked) point cloud entries. Each <local> placeholder in the conversation maps to an entry here, matched in order of appearance.

  • global_id: References the id of the parent global entry that this local region belongs to.
  • mask_path: Path to a .npy mask file that selects the specific points within the global point cloud.
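
The order-based mapping between placeholders and entries can be sketched as follows. The function names are hypothetical, and the sketch assumes that placeholders are counted across all user messages in conversation order and that each `global_id` should match the `id` of some entry in `global`, as the field descriptions suggest.

```python
import re

PLACEHOLDER = re.compile(r"<global>|<local>")


def resolve_placeholders(sample):
    """Pair each placeholder in the user messages with its entry, in order."""
    global_entries = iter(sample["global"])
    local_entries = iter(sample["local"])
    resolved = []
    for message in sample["conversations"]:
        if message["role"] != "user":
            continue
        for match in PLACEHOLDER.finditer(message["content"]):
            token = match.group()
            source = global_entries if token == "<global>" else local_entries
            try:
                entry = next(source)
            except StopIteration:
                raise ValueError(f"more {token} placeholders than entries") from None
            resolved.append((token, entry))
    return resolved


def find_dangling_locals(sample):
    """Return local entries whose global_id matches no global entry."""
    known_ids = {g["id"] for g in sample["global"]}
    return [loc for loc in sample["local"] if loc["global_id"] not in known_ids]
```

Running both checks over a file before training is a cheap way to catch samples whose placeholders and entries are out of step.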

metadata

  • tasks: The task type (e.g., "dense_description").
  • level: The spatial hierarchy level, which determines the semantics of <global> and <local>:
| Level | <global> means | <local> means | Example prompt |
| --- | --- | --- | --- |
| scene-object | A 3D scene | An object in the scene | "Looking at the scene <global>, explain the appearance of the highlighted object <local>." |
| scene-subobject | A 3D scene | A sub-part of an object in the scene | "Describe the selected object <local> in scene <global> and its surroundings in detail." |
| object-subobject | An individual object | A component of that object | "Describe the selected component <local> of the object <global>, including its shape, proportions, material, color, and exact position." |

Data Distribution

| Level | Train | Test |
| --- | --- | --- |
| scene-object | 26,453 (40.8%) | 3,089 (57.0%) |
| scene-subobject | 9,046 (13.9%) | 1,002 (18.5%) |
| object-subobject | 29,415 (45.3%) | 1,329 (24.5%) |
| Total | 64,914 | 5,420 |
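
The per-level counts above can be recomputed from a loaded file with a short sketch like the one below. `level_distribution` is an illustrative name, and the only assumption is that every sample carries `metadata.level` as in the format example.

```python
from collections import Counter


def level_distribution(samples):
    """Count samples per metadata level and compute each level's share in percent."""
    counts = Counter(sample["metadata"]["level"] for sample in samples)
    total = sum(counts.values())
    return {
        level: (count, round(100.0 * count / total, 1))
        for level, count in counts.items()
    }
```

Applied to the training split, the result should match the Train column if the file follows the distribution reported here.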

Files

  • GPT_dataset_qwen25_final_train.json — Training set (Qwen2.5 format)
  • GPT_dataset_qwen25_final_test.json — Test set (Qwen2.5 format)
  • GPT_dataset_qwen3vl_final_train.json — Training set (Qwen3-VL format)
  • GPT_dataset_qwen3vl_final_test.json — Test set (Qwen3-VL format)
Last updated: May 6