YOLOv12 Grayscale - Cavity Detection Model

A specialized YOLOv12 model modified to accept single-channel (grayscale) images for cavity detection. It was converted from a triple-input (9-channel) architecture to a 1-channel input, which reduces input memory and suits medical imaging applications such as dental X-rays.

🎯 Model Overview

Architecture: YOLOv12-n (Nano) with grayscale input
Input Channels: 1 (grayscale) instead of 3 (RGB)
Task: Object Detection (Cavity Detection)
Classes: 1 class (Cavity)
Image Size: 640x640
Parameters: ~2.5M
Framework: Ultralytics YOLOv12

🔥 Key Features

✅ Grayscale Input: Works with single-channel images (the input tensor is 1/3 the size of an RGB input)
✅ Medical Imaging Optimized: Designed for dental X-ray analysis
✅ Fast Inference: Optimized for real-time detection
✅ Easy Integration: Compatible with Ultralytics ecosystem
✅ Automatic Conversion: Converts RGB images to grayscale automatically
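
As an illustration of the automatic conversion listed above, here is a minimal sketch of an RGB to grayscale step. It assumes the ITU-R BT.601 luma weights (the same weights OpenCV applies for `cv2.COLOR_RGB2GRAY`); the helper name `rgb_to_gray` is hypothetical, and the model's own data loader may convert differently.

```python
import numpy as np

# BT.601 luma weights for R, G, B (an assumption, not confirmed as this model's method)
LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114], dtype=np.float32)

def rgb_to_gray(image: np.ndarray) -> np.ndarray:
    """Return an (H, W, 1) grayscale image from an (H, W, 3) RGB or (H, W) gray input."""
    if image.ndim == 2:                          # already single-channel, add channel axis
        return image[..., np.newaxis]
    if image.ndim == 3 and image.shape[2] == 1:  # already (H, W, 1)
        return image
    gray = image[..., :3].astype(np.float32) @ LUMA_WEIGHTS
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)[..., np.newaxis]

rgb = np.zeros((640, 640, 3), dtype=np.uint8)
rgb[..., 1] = 255                                # pure green test image
gray = rgb_to_gray(rgb)
print(gray.shape, gray.dtype)
```

The trailing channel axis is kept so the result has the same (H, W, C) layout as an RGB image, just with C = 1.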

📊 Dataset

Training was performed on a custom cavity detection dataset:

Dataset Size: 98 train, 12 validation, 12 test images
Classes: 1 (Cavity)
Format: YOLO format annotations
Image Type: Dental X-ray images (grayscale)

Dataset Structure

dataset/
├── train/
│   ├── images/          # 98 training images
│   └── labels/          # YOLO format annotations
├── valid/
│   ├── images/          # 12 validation images
│   └── labels/          # YOLO format annotations
└── test/
    ├── images/          # 12 test images
    └── labels/          # YOLO format annotations

Label Format

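YOLO format: <class_id> <x_center> <y_center> <width> <height>

All four box values are normalized to the range 0-1: x_center and width are fractions of the image width, y_center and height are fractions of the image height.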

Example:

0 0.5234 0.4123 0.1234 0.0987
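
To make the label format concrete, the following sketch converts one label line into pixel corner coordinates. The helper name `yolo_to_xyxy` is hypothetical, and the 640x640 image size is assumed from the model's input size rather than from any particular dataset image.

```python
def yolo_to_xyxy(line: str, img_w: int, img_h: int):
    """Convert one YOLO label line to (class_id, x1, y1, x2, y2) in pixels."""
    class_id, xc, yc, w, h = line.split()
    # Scale normalized centre/size values by the image dimensions
    xc, w = float(xc) * img_w, float(w) * img_w
    yc, h = float(yc) * img_h, float(h) * img_h
    # Centre/size -> top-left and bottom-right corners
    return int(class_id), xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2

label = "0 0.5234 0.4123 0.1234 0.0987"
cls_id, x1, y1, x2, y2 = yolo_to_xyxy(label, img_w=640, img_h=640)
print(cls_id, round(x1), round(y1), round(x2), round(y2))
```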

🚀 Quick Start

Installation

pip install ultralytics opencv-python numpy torch torchvision

Inference

from ultralytics import YOLO

# Load the trained model
model = YOLO('best.pt')

# Predict on a grayscale image
results = model.predict(
    source='path/to/xray/image.jpg',
    conf=0.25,      # Confidence threshold
    save=True,      # Save results
    imgsz=640       # Image size
)

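# Process results
for result in results:
    for box in result.boxes:
        cls = int(box.cls[0])                    # class index (0 = Cavity)
        conf = float(box.conf[0])                # confidence score in [0, 1]
        x1, y1, x2, y2 = box.xyxy[0].tolist()    # box corners in pixels
        print(f"{result.names[cls]} detected - Confidence: {conf:.2%}, "
              f"box: ({x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f})")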

Batch Inference

# Predict on every image in a folder (reuses the `model` loaded in the Inference snippet)
results = model.predict(
    source='path/to/images/folder/',
    save=True,
    conf=0.25
)

🎓 Training

Dataset Configuration

Create dataset.yaml:

path: /path/to/dataset

train: train/images
val: valid/images
test: test/images

nc: 1
names:
  0: Cavity
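
Before training, it can help to confirm that the folder layout matches this configuration. The sketch below counts images per split and lists images with no matching label file. The function names and the set of image extensions are assumptions made for illustration; note that YOLO-style training generally treats an image without a label file as a background image, so a missing label is not necessarily an error.

```python
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp"}  # assumed extensions

def check_split(root, split):
    """Return (image_count, names_of_images_without_a_label_file) for one split."""
    image_dir = Path(root) / split / "images"
    label_dir = Path(root) / split / "labels"
    images = sorted(p for p in image_dir.iterdir() if p.suffix.lower() in IMAGE_EXTS)
    missing = [p.name for p in images if not (label_dir / f"{p.stem}.txt").exists()]
    return len(images), missing

def check_dataset(root, splits=("train", "valid", "test")):
    """Map each split name to its (image_count, missing_labels) result."""
    return {split: check_split(root, split) for split in splits}
```

Calling `check_dataset('/path/to/dataset')` on the dataset described above should report 98, 12, and 12 images for the train, valid, and test splits.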

Training Command

# Using Python API
from ultralytics import YOLO

model = YOLO('ultralytics/cfg/models/v12/yolov12_triple.yaml')

results = model.train(
    data='dataset.yaml',
    epochs=100,
    imgsz=640,
    batch=16,
    device=0,
    patience=50,
    optimizer='AdamW',
    lr0=0.001
)

Training Parameters

| Parameter | Value | Description |
|-----------|-------|-------------|
| `epochs` | 100-300 | Number of training epochs |
| `batch` | 8-32 | Batch size |
| `imgsz` | 640 | Input image size |
| `device` | 0 or 'cpu' | GPU device or CPU |
| `lr0` | 0.001 | Initial learning rate |

📈 Model Performance

Training Time: ~30 minutes (on CPU)
Memory Usage: 1/3 of RGB model
Inference Speed: Real-time capable

🔧 Model Architecture

Modified from YOLOv12 triple-input architecture:

Original vs Modified

| Feature | Original (Triple) | Modified (Grayscale) |
|---------|-------------------|----------------------|
| Input Channels | 9 | 1 |
| First Layer | TripleInputConv | Conv |
| Memory Usage | 3x baseline | 1/3x baseline |
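
The memory ratios in this comparison follow from the channel count if the 3-channel RGB input is taken as the baseline. The sketch below works through the arithmetic for the input tensor only, assuming float32 values and a single 640x640 image; it says nothing about weights or intermediate activations, which do not scale the same way. The function name is hypothetical.

```python
def input_tensor_bytes(channels, height=640, width=640, batch=1, bytes_per_value=4):
    """Bytes needed for one input batch (float32 by default)."""
    return batch * channels * height * width * bytes_per_value

for name, channels in [("grayscale", 1), ("RGB", 3), ("triple-input", 9)]:
    size_mib = input_tensor_bytes(channels) / 2**20
    print(f"{name:>12}: {size_mib:.2f} MiB")
```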

Key Modifications

Changed input layer from 9 channels to 1 channel
Modified data loader for automatic RGB→grayscale conversion
Updated augmentation pipeline for single-channel images
Disabled HSV augmentations (not applicable to grayscale)
Optimized mosaic and letterbox augmentations
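
For readers unfamiliar with letterboxing, the sketch below shows the general idea: scale the image to fit the target square while preserving aspect ratio, then pad the remainder. This illustrates the generic technique only, not this model's optimized implementation; the function name `letterbox_params` is hypothetical.

```python
def letterbox_params(img_w, img_h, target=640):
    """Return (new_w, new_h, pad_left, pad_top, pad_right, pad_bottom)."""
    # Largest uniform scale at which both sides still fit inside the target square
    scale = min(target / img_w, target / img_h)
    new_w, new_h = round(img_w * scale), round(img_h * scale)
    # Split the leftover space as evenly as possible between the two sides
    pad_w, pad_h = target - new_w, target - new_h
    pad_left, pad_top = pad_w // 2, pad_h // 2
    return new_w, new_h, pad_left, pad_top, pad_w - pad_left, pad_h - pad_top

print(letterbox_params(1280, 720))
```

For a grayscale image the padding is a single intensity value rather than an RGB triplet.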

🎨 Data Augmentation

Optimized for grayscale images:

Geometric: Rotation (±10°), Translation (±10%), Scaling (50-150%), Horizontal flip
Photometric: Brightness adjustment (±40%), Blur
Advanced: Mosaic augmentation, Copy-paste, Random erasing

Note: HSV augmentations are disabled for grayscale.
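
Some of the settings above correspond to standard Ultralytics training arguments. The mapping below is a hypothetical sketch, not the author's actual configuration: only the rotation, translation, scaling, and HSV values come from this README, while `fliplr` and `mosaic` use Ultralytics defaults. Brightness, blur, copy-paste, and random erasing are left out because their settings are not given here.

```python
# Hypothetical mapping of the listed augmentations onto Ultralytics train arguments
grayscale_aug_args = {
    "degrees": 10.0,    # rotation within ±10°
    "translate": 0.1,   # translation up to ±10% of image size
    "scale": 0.5,       # scale gain of ±50%, i.e. 50-150%
    "fliplr": 0.5,      # horizontal flip probability (Ultralytics default, not stated above)
    "mosaic": 1.0,      # mosaic probability (Ultralytics default, not stated above)
    "hsv_h": 0.0,       # HSV hue jitter disabled
    "hsv_s": 0.0,       # HSV saturation jitter disabled
    "hsv_v": 0.0,       # HSV value jitter disabled
}

hsv_disabled = all(grayscale_aug_args[key] == 0.0 for key in ("hsv_h", "hsv_s", "hsv_v"))
print("HSV disabled:", hsv_disabled)
```

Such a dictionary could be passed to training as `model.train(data='dataset.yaml', **grayscale_aug_args)`.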

💾 Export Model

from ultralytics import YOLO

model = YOLO('best.pt')

# Export to ONNX
model.export(format='onnx', imgsz=640)

# Export to TensorRT
model.export(format='engine', imgsz=640, half=True)

📝 Citation

@misc{yolov12-grayscale-cavity,
  title={YOLOv12 Grayscale for Cavity Detection},
  author={Suphawut},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/suphawutq56789/yolov12-grayscale-cavity}}
}

📄 License

This model is released under the AGPL-3.0 license.

🙏 Acknowledgments

Built on Ultralytics YOLO
Modified from Triple-Input YOLO architecture
Trained for dental cavity detection

Built with ❤️ using Ultralytics YOLOv12
