---
base_model:
- Ultralytics/YOLO11
pipeline_tag: object-detection
library_name: ultralytics
metrics:
- mAP50
- mAP50-95
- accuracy50
- precision
- recall
- f1
model-index:
- name: MacPaw/yolov11l-ui-elements-detection
  results:
  - task:
      type: object-detection
    metrics:
    - type: accuracy
      value: 0.65359
      name: accuracy@0.5
  - task:
      type: object-detection
    metrics:
    - type: precision
      value: 0.49055
      name: precision
  - task:
      type: object-detection
    metrics:
    - type: recall
      value: 0.43433
      name: recall
  - task:
      type: object-detection
    metrics:
    - type: f1
      value: 0.43776
      name: f1
  - task:
      type: object-detection
    metrics:
    - type: map
      value: 0.46644
      name: mAP@0.5
  - task:
      type: object-detection
    metrics:
    - type: map
      value: 0.31295
      name: mAP@0.5-0.95
datasets:
- MacPaw/Screen2AX-Element
license: agpl-3.0
---

# YOLOv11l - UI Elements Detection

This model is a fine-tuned version of [`Ultralytics/YOLO11`](https://huggingface.co/Ultralytics/YOLO11), trained to detect **UI elements** in macOS application screenshots.

It is part of the **Screen2AX** project, a research effort focused on generating accessibility metadata using computer vision.

---

## Task Overview

- **Task:** Object Detection
- **Target:** Individual UI elements
- **Supported Labels:**
  ```
  ['AXButton', 'AXDisclosureTriangle', 'AXImage', 'AXLink', 'AXTextArea']
  ```

This model detects common interactive components typically surfaced in accessibility trees on macOS.
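
As a small illustration of working with these labels, the sketch below keeps only detections whose label is in a chosen subset. The record layout (a dict with `label` and `confidence` keys) and the helper name `filter_by_label` are assumptions made for this example; they are not part of the model or of the Ultralytics API.

```python
# Labels as listed in this model card.
SUPPORTED_LABELS = ("AXButton", "AXDisclosureTriangle", "AXImage", "AXLink", "AXTextArea")


def filter_by_label(detections, wanted, min_confidence=0.0):
    """Keep detections whose label is in `wanted` and whose confidence is high enough.

    `detections` is a list of dicts with "label" and "confidence" keys
    (a hypothetical record format used only for this sketch).
    """
    unknown = set(wanted) - set(SUPPORTED_LABELS)
    if unknown:
        raise ValueError(f"Unsupported labels: {sorted(unknown)}")
    return [
        d for d in detections
        if d["label"] in wanted and d["confidence"] >= min_confidence
    ]


# Made-up detections, for illustration only.
sample = [
    {"label": "AXButton", "confidence": 0.91},
    {"label": "AXImage", "confidence": 0.40},
    {"label": "AXLink", "confidence": 0.75},
]
clickable = filter_by_label(sample, {"AXButton", "AXLink"}, min_confidence=0.5)
print([d["label"] for d in clickable])
```

Rejecting labels outside the supported set up front makes a typo such as `AXbutton` fail loudly instead of silently returning an empty list.
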

---

## Dataset

- Training data: [`MacPaw/Screen2AX-Element`](https://huggingface.co/datasets/MacPaw/Screen2AX-Element)

---

## How to Use

### Install Dependencies

```bash
pip install huggingface_hub ultralytics
```

### Load the Model and Run Predictions

```python
from huggingface_hub import hf_hub_download
from ultralytics import YOLO

# Download the model
model_path = hf_hub_download(
    repo_id="MacPaw/yolov11l-ui-elements-detection",
    filename="ui-elements-detection.pt",
)

# Load and run prediction
model = YOLO(model_path)
results = model.predict("/path/to/your/image")

# Display result
results[0].show()
```
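
To work with the detections programmatically rather than only displaying them, the boxes can be flattened into plain Python records. The sketch below is one way to do that. `to_records` is a hypothetical helper, and it assumes its inputs come from the Ultralytics result object as `results[0].boxes.xyxy.tolist()`, `results[0].boxes.conf.tolist()`, `results[0].boxes.cls.tolist()`, and `results[0].names`. The values in the example are stand-ins, and the index-to-label mapping shown is illustrative rather than the model's actual class order.

```python
def to_records(xyxy, conf, cls, names):
    """Combine box coordinates, confidences, and class ids into a list of dicts."""
    records = []
    for (x1, y1, x2, y2), score, class_id in zip(xyxy, conf, cls):
        records.append({
            "label": names[int(class_id)],  # class ids arrive as floats
            "confidence": float(score),
            "box": (x1, y1, x2, y2),        # pixel coordinates: left, top, right, bottom
        })
    # Highest-confidence detections first.
    return sorted(records, key=lambda r: r["confidence"], reverse=True)


# Stand-in values; with a real prediction these would come from results[0].
example_names = {0: "AXButton", 1: "AXImage"}  # illustrative mapping only
records = to_records(
    xyxy=[[10.0, 20.0, 110.0, 60.0], [0.0, 0.0, 32.0, 32.0]],
    conf=[0.62, 0.88],
    cls=[0.0, 1.0],
    names=example_names,
)
for r in records:
    print(r["label"], round(r["confidence"], 2), r["box"])
```

With a real prediction, the call would look like `to_records(b.xyxy.tolist(), b.conf.tolist(), b.cls.tolist(), results[0].names)`, where `b = results[0].boxes`.
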

---

## License

This model is licensed under the **GNU Affero General Public License v3.0 (AGPL-3.0)**, as inherited from the original YOLOv11 base model.

---

## Related Projects

- [Screen2AX Project](https://github.com/MacPaw/Screen2AX)
- [Screen2AX HuggingFace Collection](https://huggingface.co/collections/MacPaw/screen2ax-687dfe564d50f163020378b8)
- [YOLOv11l - UI Groups Detection](https://huggingface.co/MacPaw/yolov11l-ui-groups-detection)

---

## Citation

If you use this model in your research, please cite the Screen2AX paper:

```bibtex
@misc{muryn2025screen2axvisionbasedapproachautomatic,
      title={Screen2AX: Vision-Based Approach for Automatic macOS Accessibility Generation},
      author={Viktor Muryn and Marta Sumyk and Mariya Hirna and Sofiya Garkot and Maksym Shamrai},
      year={2025},
      eprint={2507.16704},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2507.16704},
}
```

---

## MacPaw Research

Learn more at [https://research.macpaw.com](https://research.macpaw.com)