---
license: mit
tags:
- object-detection
- computer-vision
- feet
- pytorch
- faster-rcnn
library_name: torchvision
pipeline_tag: object-detection
model-index:
- name: Faster R-CNN Foot Detector
  results: []
---
# 🦶 Faster R-CNN Foot Detection Model

by [Tony Assi](https://www.tonyassi.com/)

This model detects **feet or shoes** in an image using a fine-tuned [Faster R-CNN](https://pytorch.org/vision/stable/models/generated/torchvision.models.detection.fasterrcnn_resnet50_fpn.html) model from Torchvision.
It was trained on a small custom dataset of foot annotations and is intended as a starting point for foot/shoe detection in street, fashion, or movement-based applications.

Web demo: [tonyassi/foot-detection](https://huggingface.co/spaces/tonyassi/foot-detection)

---
## 🧠 Model Details
- Base model: `fasterrcnn_resnet50_fpn` (pretrained on COCO)
- Fine-tuned on 40 images of feet/shoes
- Class labels:
  - `1`: foot/shoe
- Bounding box outputs with confidence scores
- Optimized for CPU (but works with MPS and CUDA)
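
The device note above can be turned into a small selection rule. The sketch below is only an illustration: `pick_device` is a hypothetical helper that is not part of this repository, and the preference order (CUDA, then MPS, then CPU) is an assumption. It takes the availability flags as plain booleans so it runs without PyTorch; with PyTorch installed, they would come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return a device string, preferring CUDA, then MPS, then CPU (assumed order)."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


# Hypothetical usage with PyTorch installed:
#   import torch
#   name = pick_device(torch.cuda.is_available(), torch.backends.mps.is_available())
#   foot_detection = FootDetection(name)
print(pick_device(cuda_available=False, mps_available=True))  # prints "mps"
```
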
---
## ⚡️ Quick Start
To download this repository:
```bash
git clone https://github.com/tonyassi/FootDetection.git
cd FootDetection
```
Install:
```bash
pip install -r requirements.txt
```
Usage:
```python
from FootDetection import FootDetection
from PIL import Image
# Initialize model (first run will auto-download weights)
foot_detection = FootDetection("cpu") # "cuda" for GPU or "mps" for Apple Silicon
# Load image
img = Image.open("image.jpg").convert("RGB")
# Run detection
results = foot_detection.detect(img, threshold=0.1)
print(results)
# Draw boxes
img_with_boxes = foot_detection.draw_boxes(img)
img_with_boxes.show()
img_with_boxes.save("annotated_image.jpg")
```
---
## 📦 Usage
```bash
pip install torch torchvision pillow huggingface_hub
```
```python
import os
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from PIL import Image, ImageDraw
from torchvision.transforms import functional as F
from huggingface_hub import hf_hub_download

# ===== CONFIG =====
device = torch.device("cpu")  # or "mps" if stable
checkpoint_dir = "checkpoints"
checkpoint_file = "fasterrcnn_foot.pth"
local_path = os.path.join(checkpoint_dir, checkpoint_file)

# ===== Ensure Checkpoint Exists =====
if not os.path.exists(local_path):
    os.makedirs(checkpoint_dir, exist_ok=True)
    print("Downloading model from Hugging Face...")
    local_path = hf_hub_download(
        repo_id="tonyassi/foot-detection",
        filename=checkpoint_file,
        local_dir=checkpoint_dir
    )

# ===== Load Model =====
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
in_features = model.roi_heads.box_predictor.cls_score.in_features
# Two classes: background (0) and foot/shoe (1)
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, 2)
model.load_state_dict(torch.load(local_path, map_location=device))
model.to(device)
model.eval()


# ===== Function: Foot Detection =====
def foot_detection(image, threshold=0.1):
    """Takes a PIL image, returns bounding boxes + scores above threshold"""
    image_tensor = F.to_tensor(image).unsqueeze(0).to(device)
    with torch.no_grad():
        outputs = model(image_tensor)[0]
    boxes = []
    scores = []
    for box, score in zip(outputs["boxes"], outputs["scores"]):
        if score >= threshold:
            boxes.append(box.tolist())
            scores.append(score.item())
    return {
        "boxes": boxes,
        "scores": scores
    }


# ===== Function: Draw Bounding Boxes =====
def draw_bounding_box(image, detection):
    """Draws boxes and scores on a copy of the image"""
    image_copy = image.copy()
    draw = ImageDraw.Draw(image_copy)
    for box, score in zip(detection["boxes"], detection["scores"]):
        x0, y0, x1, y1 = box
        draw.rectangle([x0, y0, x1, y1], outline="red", width=3)
        draw.text((x0, y0), f"{score:.2f}", fill="red")
    return image_copy


# ==== Load and prepare image ====
image_path = "test.jpg"  # replace with your image path
image = Image.open(image_path).convert("RGB")

# ==== Run detection ====
detections = foot_detection(image, threshold=0.3)

# ==== Draw results ====
result_image = draw_bounding_box(image, detections)
result_image.show()  # or result_image.save("output.jpg")
```
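
The `foot_detection` function above returns a dict with `boxes` (floating-point `[x0, y0, x1, y1]` lists) and `scores`. As a minimal sketch of post-processing, assuming you want to crop the best detections out of the image, the hypothetical helper `to_crop_boxes` below (not part of this repository) keeps the `top_k` highest-scoring boxes, rounds them outward to whole pixels, and clamps them to the image bounds. The default `top_k=2` is an assumption (one box per foot), and the sample detection values are made up for illustration.

```python
import math


def to_crop_boxes(detection, image_size, top_k=2):
    """Return up to top_k integer (left, upper, right, lower) boxes, best score first."""
    width, height = image_size
    # Pair each score with its box and sort by score, highest first.
    ranked = sorted(
        zip(detection["scores"], detection["boxes"]),
        key=lambda pair: pair[0],
        reverse=True,
    )
    crops = []
    for _, (x0, y0, x1, y1) in ranked[:top_k]:
        # Round outward so the crop fully contains the predicted box,
        # then clamp to the image bounds.
        left = max(0, math.floor(x0))
        upper = max(0, math.floor(y0))
        right = min(width, math.ceil(x1))
        lower = min(height, math.ceil(y1))
        if right > left and lower > upper:  # skip empty boxes
            crops.append((left, upper, right, lower))
    return crops


# Made-up detections in the same format that foot_detection() returns.
sample = {
    "boxes": [[10.4, 20.6, 110.2, 220.9], [-5.0, 30.0, 50.5, 700.0], [1.0, 1.0, 2.0, 2.0]],
    "scores": [0.42, 0.91, 0.15],
}
crop_boxes = to_crop_boxes(sample, image_size=(640, 480))
print(crop_boxes)  # [(0, 30, 51, 480), (10, 20, 111, 221)]
```

Each returned tuple uses the `(left, upper, right, lower)` order that Pillow's `Image.crop` expects, so with a real image you could call `image.crop(box)` for each `box` in `to_crop_boxes(detections, image.size)`.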