---
base_model:
- Ultralytics/YOLO11
pipeline_tag: object-detection
library_name: ultralytics
tags:
- yolov11
- ultralytics
- yolo
- vision
- object-detection
- pytorch
- ui
datasets:
- MacPaw/Screen2AX-Group
license: agpl-3.0
---
# YOLOv11l - UI Groups Detection
This model is a fine-tuned version of [`Ultralytics/YOLO11`](https://huggingface.co/Ultralytics/YOLO11), trained to detect **UI groups** (e.g., toolbars, tab groups) in macOS application screenshots.
It is part of the **Screen2AX** project, a research-driven effort to generate accessibility metadata for macOS applications using vision-based techniques.
---
## Task Overview
- **Task:** Object Detection
- **Target:** macOS UI groups
- **Supported Label(s):**
```
['AXGroup']
```
This model detects higher-level UI groupings that are commonly used to structure accessible interfaces (e.g., `AXGroup`, `AXTabGroup`, `AXToolbar`).
---
## Dataset
- **Training data:** [`MacPaw/Screen2AX-Group`](https://huggingface.co/datasets/MacPaw/Screen2AX-Group)
---
## How to Use
### Install Dependencies
```bash
pip install huggingface_hub ultralytics
```
### Load the Model and Run Predictions
```python
from huggingface_hub import hf_hub_download
from ultralytics import YOLO
# Download the model from the Hugging Face Hub
model_path = hf_hub_download(
    repo_id="MacPaw/yolov11l-ui-groups-detection",
    filename="ui-groups-detection.pt",
)
# Load and run prediction
model = YOLO(model_path)
results = model.predict("/path/to/your/image")
# Visualize or process results
results[0].show()
```
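To process the detections instead of only displaying them, the predictions can be flattened into plain Python records. The snippet below is a minimal sketch, not part of the Screen2AX code: `detections_to_records` and its `min_conf` threshold are hypothetical names, and the function assumes a result object that exposes `boxes.xyxy`, `boxes.conf`, `boxes.cls`, and `names`, as Ultralytics results do. A small stand-in object is used so the example runs without downloading the model.

```python
from types import SimpleNamespace


def detections_to_records(result, min_conf=0.25):
    """Flatten one Ultralytics-style result into a list of dicts.

    `min_conf` is an illustrative threshold, not a recommended value.
    """
    boxes = result.boxes
    records = []
    for xyxy, conf, cls_id in zip(
        boxes.xyxy.tolist(), boxes.conf.tolist(), boxes.cls.tolist()
    ):
        if conf < min_conf:
            continue  # skip low-confidence detections
        x1, y1, x2, y2 = xyxy
        records.append(
            {
                "label": result.names[int(cls_id)],  # class id -> label name
                "confidence": float(conf),
                "box": (x1, y1, x2, y2),  # pixel coordinates, top-left / bottom-right
            }
        )
    return records


class _Rows(list):
    """Stand-in for a tensor: only provides .tolist()."""

    def tolist(self):
        return list(self)


# Hypothetical stand-in for results[0]; the values are made up for illustration.
fake_result = SimpleNamespace(
    names={0: "AXGroup"},
    boxes=SimpleNamespace(
        xyxy=_Rows([[10.0, 20.0, 110.0, 60.0], [0.0, 0.0, 5.0, 5.0]]),
        conf=_Rows([0.91, 0.10]),
        cls=_Rows([0.0, 0.0]),
    ),
)

records = detections_to_records(fake_result)
for record in records:
    print(record)
```

With the real model, the same helper would be called as `detections_to_records(results[0])` after `model.predict(...)`.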
---
## License
This model is licensed under the **GNU Affero General Public License v3.0 (AGPL-3.0)**, inherited from the original YOLOv11 base model.
---
## Related Projects
- [Screen2AX Project](https://github.com/MacPaw/Screen2AX)
- [Screen2AX HuggingFace Collection](https://huggingface.co/collections/MacPaw/screen2ax-687dfe564d50f163020378b8)
- [YOLOv11l - UI Elements Detection](https://huggingface.co/MacPaw/yolov11l-ui-elements-detection)
---
## Citation
If you use this model, please cite the Screen2AX paper:
```bibtex
@misc{muryn2025screen2axvisionbasedapproachautomatic,
      title={Screen2AX: Vision-Based Approach for Automatic macOS Accessibility Generation},
      author={Viktor Muryn and Marta Sumyk and Mariya Hirna and Sofiya Garkot and Maksym Shamrai},
      year={2025},
      eprint={2507.16704},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2507.16704},
}
```
---
## MacPaw Research
Learn more at [https://research.macpaw.com](https://research.macpaw.com)