---
tags:
- curb ramp detection
- accessibility
license: mit
datasets:
- projectsidewalk/rampnet-dataset
base_model:
- timm/convnextv2_base.fcmae_ft_in22k_in1k_384
pipeline_tag: object-detection
---

**RampNet** is a two-stage pipeline that addresses the scarcity of curb ramp detection datasets by using government location data to automatically generate over 210,000 annotated Google Street View panoramas. This new dataset is then used to train a state-of-the-art curb ramp detection model that significantly outperforms previous efforts. In this repo, we provide the checkpoint for our curb ramp detection model.

**Usage:**

*For a step-by-step walkthrough, see our [Google Colab notebook](https://colab.research.google.com/drive/1TOtScud5ac2McXJmg1n_YkOoZBchdn3w?usp=sharing), which includes a visualization in addition to the code below.*

```py
import torch
from transformers import AutoModel
from PIL import Image
import numpy as np
from torchvision import transforms
from skimage.feature import peak_local_max

IMAGE_PATH = "example.jpg"
DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# trust_remote_code=True lets transformers load the custom model class from the repo
model = AutoModel.from_pretrained("projectsidewalk/rampnet-model", trust_remote_code=True).to(DEVICE).eval()

# Resize to 2048x4096 (height x width) and normalize with ImageNet mean/std
preprocess = transforms.Compose([
    transforms.Resize((2048, 4096), interpolation=transforms.InterpolationMode.BILINEAR),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

img = Image.open(IMAGE_PATH).convert("RGB")
img_tensor = preprocess(img).unsqueeze(0).to(DEVICE)

# Run inference without tracking gradients
with torch.no_grad():
    heatmap = model(img_tensor).squeeze().cpu().numpy()

# Keep local maxima above 0.5; peaks are (row, col) pairs in heatmap space
peaks = peak_local_max(np.clip(heatmap, 0, 1), min_distance=10, threshold_abs=0.5)

# Rescale peaks to (x, y) pixel coordinates in the original image
scale_w = img.width / heatmap.shape[1]
scale_h = img.height / heatmap.shape[0]
coordinates = [(int(c * scale_w), int(r * scale_h)) for r, c in peaks]

# Coordinates of detected curb ramps
print(coordinates)
```
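
If you also want the heatmap value at each peak, the rescaling step in the snippet above can be wrapped in a small helper. This is a minimal sketch and not part of the RampNet codebase: `peaks_to_detections` is a hypothetical name, and treating the heatmap value as a confidence score is an assumption. It runs on a tiny synthetic heatmap, so no model or image is needed.

```py
def peaks_to_detections(heatmap, peaks, image_size):
    """Map (row, col) heatmap peaks to (x, y, score) tuples in image pixels.

    `heatmap` can be a 2D NumPy array or a nested list; `image_size` is
    (width, height) of the original image. Hypothetical helper, for illustration.
    """
    img_w, img_h = image_size
    hm_h, hm_w = len(heatmap), len(heatmap[0])
    scale_w = img_w / hm_w
    scale_h = img_h / hm_h
    detections = [
        (int(c * scale_w), int(r * scale_h), float(heatmap[r][c]))
        for r, c in peaks
    ]
    # Highest heatmap value first (assumed here to act as a confidence score)
    return sorted(detections, key=lambda d: d[2], reverse=True)


# Synthetic 4x8 heatmap with two peaks, standing in for the model output
heatmap = [[0.0] * 8 for _ in range(4)]
heatmap[1][2] = 0.9
heatmap[3][6] = 0.6
peaks = [(3, 6), (1, 2)]

detections = peaks_to_detections(heatmap, peaks, image_size=(16, 8))
print(detections)  # [(4, 2, 0.9), (12, 6, 0.6)]
```

With the real pipeline, you would pass the `heatmap` and `peaks` arrays from the usage snippet and `(img.width, img.height)` as `image_size`.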



If you see an error like the one below, update your `transformers` library to a newer version.

```
>>> model = AutoModel.from_pretrained("projectsidewalk/rampnet-model", trust_remote_code=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/gscratch/scrubbed/jsomeara/envs/sidewalk-validator-ai/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
    model_class = add_generation_mixin_to_remote_model(model_class)
  File "/gscratch/scrubbed/jsomeara/envs/sidewalk-validator-ai/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 726, in add_generation_mixin_to_remote_model
    has_custom_generate = "GenerationMixin" not in str(getattr(model_class, "generate"))
AttributeError: type object 'KeypointModel' has no attribute 'generate'
>>>
```
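
Before upgrading, it can help to confirm which `transformers` version is actually installed in the active environment. The sketch below uses only the standard library; `installed_version` is a hypothetical helper name. The version that resolves the error is not stated here, so the script only reports what it finds and suggests the generic upgrade command.

```py
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("transformers")
if version is None:
    print("transformers is not installed; try: pip install -U transformers")
else:
    print(f"transformers {version} found; if the error above appears, try: pip install -U transformers")
```

After upgrading, restart the Python session (or the Colab runtime) so the new version is the one that gets imported.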

**Citation:**

```bibtex
@inproceedings{omeara2025rampnet,
  author    = {John S. O'Meara and Jared Hwang and Zeyu Wang and Michael Saugstad and Jon E. Froehlich},
  title     = {{RampNet: A Two-Stage Pipeline for Bootstrapping Curb Ramp Detection in Streetscape Images from Open Government Metadata}},
  booktitle = {{ICCV'25 Workshop on Vision Foundation Models and Generative AI for Accessibility: Challenges and Opportunities (ICCV 2025 Workshop)}},
  year      = {2025},
  doi       = {https://doi.org/10.48550/arXiv.2508.09415},
  url       = {https://cv4a11y.github.io/ICCV2025/index.html},
  note      = {DOI: forthcoming}
}
```