YOLOv12 Grayscale - Cavity Detection Model

A specialized YOLOv12 model modified to work with single-channel (grayscale) images for cavity detection. This model has been converted from the original triple-input (9-channel) architecture to grayscale (1-channel) input, making it memory-efficient and suitable for medical imaging applications.

🎯 Model Overview

  • Architecture: YOLOv12-n (Nano) with grayscale input
  • Input Channels: 1 (grayscale) instead of 3 (RGB)
  • Task: Object Detection (Cavity Detection)
  • Classes: 1 class (Cavity)
  • Image Size: 640x640
  • Parameters: ~2.5M
  • Framework: Ultralytics YOLOv12

🔥 Key Features

  • ✅ Grayscale Input: Works with single-channel images (1/3 memory usage vs RGB)
  • ✅ Medical Imaging Optimized: Designed for dental X-ray analysis
  • ✅ Fast Inference: Optimized for real-time detection
  • ✅ Easy Integration: Compatible with Ultralytics ecosystem
  • ✅ Automatic Conversion: Converts RGB images to grayscale automatically

📊 Dataset

Training was performed on a custom cavity detection dataset:

  • Dataset Size: 98 train, 12 validation, 12 test images
  • Classes: 1 (Cavity)
  • Format: YOLO format annotations
  • Image Type: Dental X-ray images (grayscale)

Dataset Structure

dataset/
├── train/
│   ├── images/          # 98 training images
│   └── labels/          # YOLO format annotations
├── valid/
│   ├── images/          # 12 validation images
│   └── labels/          # YOLO format annotations
└── test/
    ├── images/          # 12 test images
    └── labels/          # YOLO format annotations
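
Before training, it can help to confirm that a local copy matches this layout. The following is a minimal sketch, not part of the original project: the helper names `check_split` and `check_dataset` and the list of image extensions are assumptions made for illustration. It counts the images in each split and lists images that have no matching label file (which Ultralytics typically treats as background images, so a missing label is not necessarily an error).

```python
from pathlib import Path

# Assumed image extensions; adjust to match the actual dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def check_split(split_dir):
    """Return (image_count, stems of images that have no label file)."""
    split_dir = Path(split_dir)
    images = sorted(
        p for p in (split_dir / "images").iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES
    )
    label_stems = {p.stem for p in (split_dir / "labels").glob("*.txt")}
    missing = [p.stem for p in images if p.stem not in label_stems]
    return len(images), missing


def check_dataset(root):
    """Check the train/valid/test splits under root; absent splits are skipped."""
    report = {}
    for split in ("train", "valid", "test"):
        split_dir = Path(root) / split
        if (split_dir / "images").is_dir() and (split_dir / "labels").is_dir():
            report[split] = check_split(split_dir)
    return report
```

Calling `check_dataset("dataset")` on the structure above should report 98, 12, and 12 images for the three splits if the copy is complete.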

Label Format

YOLO format: <class_id> <x_center> <y_center> <width> <height>

Example:

0 0.5234 0.4123 0.1234 0.0987
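
All four box values are normalized to the image size, so they have to be scaled back to pixels before drawing or comparing boxes. The sketch below shows that conversion for the example line above; the function name `yolo_to_pixel_box` is hypothetical, and the 640x640 image size is taken from the model overview.

```python
def yolo_to_pixel_box(line, img_w, img_h):
    """Convert one YOLO label line to (class_id, x1, y1, x2, y2) in pixels."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}")
    class_id = int(parts[0])
    xc, yc, w, h = (float(v) for v in parts[1:])
    # The label stores the box center and size; corners are center +/- half size.
    x1 = (xc - w / 2) * img_w
    y1 = (yc - h / 2) * img_h
    x2 = (xc + w / 2) * img_w
    y2 = (yc + h / 2) * img_h
    return class_id, x1, y1, x2, y2


# The example label from above, on a 640x640 image
print(yolo_to_pixel_box("0 0.5234 0.4123 0.1234 0.0987", 640, 640))
```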

🚀 Quick Start

Installation

pip install ultralytics opencv-python numpy torch torchvision

Inference

from ultralytics import YOLO

# Load the trained model
model = YOLO('best.pt')

# Predict on a grayscale image
results = model.predict(
    source='path/to/xray/image.jpg',
    conf=0.25,      # Confidence threshold
    save=True,      # Save results
    imgsz=640       # Image size
)

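# Process results
for result in results:
    for box in result.boxes:
        cls = int(box.cls[0])                    # Class index (0 = Cavity)
        conf = float(box.conf[0])                # Confidence score
        x1, y1, x2, y2 = box.xyxy[0].tolist()    # Box corners in pixels
        print(f"{result.names[cls]} detected - Confidence: {conf:.2%}, "
              f"Box: ({x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f})")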
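
To judge how well a predicted box matches an annotated one, a common measure is intersection over union (IoU). The following is a small illustrative helper, not part of the model or the Ultralytics API; it assumes both boxes are given as `(x1, y1, x2, y2)` in pixels, the same layout as `box.xyxy` above.

```python
def iou_xyxy(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped to zero when the boxes are disjoint
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = max(0.0, ax2 - ax1) * max(0.0, ay2 - ay1)
    area_b = max(0.0, bx2 - bx1) * max(0.0, by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

An IoU of 1.0 means the boxes coincide and 0.0 means they do not overlap; which threshold counts as a correct detection is a choice left to the evaluation protocol.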

Batch Inference

# Predict on multiple images
results = model.predict(
    source='path/to/images/folder/',
    save=True,
    conf=0.25
)

🎓 Training

Dataset Configuration

Create dataset.yaml:

path: /path/to/dataset

train: train/images
val: valid/images
test: test/images

nc: 1
names:
  0: Cavity

Training Command

# Using Python API
from ultralytics import YOLO

model = YOLO('ultralytics/cfg/models/v12/yolov12_triple.yaml')

results = model.train(
    data='dataset.yaml',
    epochs=100,
    imgsz=640,
    batch=16,
    device=0,
    patience=50,
    optimizer='AdamW',
    lr0=0.001
)

Training Parameters

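| Parameter | Value      | Description               |
|-----------|------------|---------------------------|
| epochs    | 100-300    | Number of training epochs |
| batch     | 8-32       | Batch size                |
| imgsz     | 640        | Input image size          |
| device    | 0 or 'cpu' | GPU device or CPU         |
| lr0       | 0.001      | Initial learning rate     |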

📈 Model Performance

  • Training Time: ~30 minutes (on CPU)
  • Memory Usage: Input tensors take 1/3 of the memory of RGB input
  • Inference Speed: Real-time capable

🔧 Model Architecture

Modified from YOLOv12 triple-input architecture:

Original vs Modified

| Feature        | Original (Triple) | Modified (Grayscale) |
|----------------|-------------------|----------------------|
| Input Channels | 9                 | 1                    |
| First Layer    | TripleInputConv   | Conv                 |
| Memory Usage   | 3x baseline       | 1/3x baseline        |
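
The memory ratios follow from the channel count if the baseline is taken to be a 3-channel RGB input. The arithmetic below is a sketch under two assumptions not stated above: inputs are stored as float32 (4 bytes per value), and only the input tensor is counted, not weights or intermediate activations.

```python
def input_tensor_bytes(channels, height=640, width=640, bytes_per_value=4, batch=1):
    """Size in bytes of a batch x channels x height x width input tensor."""
    return batch * channels * height * width * bytes_per_value


for name, channels in (("grayscale", 1), ("RGB", 3), ("triple input", 9)):
    mib = input_tensor_bytes(channels) / 2**20
    print(f"{name:>12}: {channels} channel(s), {mib:.2f} MiB per image")
```

At 640x640 this gives about 1.56 MiB for grayscale, 4.69 MiB for RGB, and 14.06 MiB for the 9-channel input, which matches the 1/3x and 3x figures relative to RGB.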

Key Modifications

  1. Changed input layer from 9 channels to 1 channel
  2. Modified data loader for automatic RGB→grayscale conversion
  3. Updated augmentation pipeline for single-channel images
  4. Disabled HSV augmentations (not applicable to grayscale)
  5. Optimized mosaic and letterbox augmentations
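
The modified data loader itself is not shown here. As a rough illustration of what the RGB→grayscale step in item 2 could look like, the sketch below converts an image to a single-channel tensor of shape (1, 1, H, W). The function name `to_grayscale_tensor` is hypothetical, the BT.601 luma weights are one common choice rather than a confirmed detail of this model, and the code assumes RGB channel order (images loaded with OpenCV are BGR and would need their channels reversed first).

```python
import numpy as np


def to_grayscale_tensor(image):
    """Convert an HxW or HxWx3 uint8 image to float32 of shape (1, 1, H, W) in [0, 1]."""
    image = np.asarray(image)
    if image.ndim == 3 and image.shape[2] == 3:
        # BT.601 luma weights, assuming RGB channel order
        weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
        gray = image.astype(np.float32) @ weights
    elif image.ndim == 2:
        # Already single-channel
        gray = image.astype(np.float32)
    else:
        raise ValueError(f"unsupported image shape {image.shape}")
    # Scale to [0, 1] and add batch and channel dimensions
    return (gray / 255.0)[None, None, :, :]
```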

🎨 Data Augmentation

Optimized for grayscale images:

  • Geometric: Rotation (±10°), Translation (±10%), Scaling (50-150%), Horizontal flip
  • Photometric: Brightness adjustment (±40%), Blur
  • Advanced: Mosaic augmentation, Copy-paste, Random erasing

Note: HSV augmentations are disabled for grayscale.
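
Two of these augmentations are simple enough to sketch directly. The code below is illustrative only and does not reproduce the Ultralytics pipeline: the function names are hypothetical, the image is a plain list of rows of 8-bit gray values, and the ±40% brightness range is interpreted as a multiplicative factor between 0.6 and 1.4, which is an assumption.

```python
import random


def hflip_label(label):
    """Mirror a YOLO label (class_id, xc, yc, w, h) for a horizontally flipped image."""
    class_id, xc, yc, w, h = label
    # Only the normalized x center changes; width and height are unaffected.
    return (class_id, 1.0 - xc, yc, w, h)


def adjust_brightness(pixels, factor):
    """Scale 8-bit gray values by factor and clip the result to [0, 255]."""
    return [[min(255, max(0, round(v * factor))) for v in row] for row in pixels]


def random_brightness(pixels, max_delta=0.4, rng=random):
    """Apply a random brightness factor drawn from [1 - max_delta, 1 + max_delta]."""
    factor = 1.0 + rng.uniform(-max_delta, max_delta)
    return adjust_brightness(pixels, factor)
```

Brightness changes leave the labels untouched, while geometric changes such as the flip must be applied to the image and its labels together.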

💾 Export Model

from ultralytics import YOLO

model = YOLO('best.pt')

# Export to ONNX
model.export(format='onnx', imgsz=640)

# Export to TensorRT
model.export(format='engine', imgsz=640, half=True)

πŸ“ Citation

@misc{yolov12-grayscale-cavity,
  title={YOLOv12 Grayscale for Cavity Detection},
  author={Suphawut},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/suphawutq56789/yolov12-grayscale-cavity}}
}

📄 License

This model is released under the AGPL-3.0 license.

πŸ™ Acknowledgments

  • Built on Ultralytics YOLO
  • Modified from Triple-Input YOLO architecture
  • Trained for dental cavity detection

Built with ❤️ using Ultralytics YOLOv12
