---
base_model:
- Ultralytics/YOLO11
pipeline_tag: object-detection
library_name: ultralytics
tags:
- yolov11
- ultralytics
- yolo
- vision
- object-detection
- pytorch
- ui
datasets:
- MacPaw/Screen2AX-Group
license: agpl-3.0
---

# πŸ” YOLOv11l β€” UI Groups Detection

This model is a fine-tuned version of [`Ultralytics/YOLO11`](https://huggingface.co/Ultralytics/YOLO11), trained to detect **UI groups** (e.g., toolbars, tab groups) in macOS application screenshots.

It is part of the **Screen2AX** project, a research-driven effort to generate accessibility metadata for macOS applications using vision-based techniques.

---

## 🧠 Task Overview

- **Task:** Object Detection
- **Target:** macOS UI groups
- **Supported Label(s):**  
  ```
  ['AXGroup']
  ```

This model detects higher-level UI groupings that are commonly used to structure accessible interfaces (such as those exposed via the `AXGroup`, `AXTabGroup`, or `AXToolbar` accessibility roles). All detections are reported under the single `AXGroup` label.

---

## πŸ—‚ Dataset

- **Training data:** [`MacPaw/Screen2AX-Group`](https://huggingface.co/datasets/MacPaw/Screen2AX-Group)

---

## πŸš€ How to Use

### πŸ”§ Install Dependencies

```bash
pip install huggingface_hub ultralytics
```

### πŸ§ͺ Load the Model and Run Predictions

```python
from huggingface_hub import hf_hub_download
from ultralytics import YOLO

# Download the model from the Hugging Face Hub
model_path = hf_hub_download(
    repo_id="MacPaw/yolov11l-ui-groups-detection",
    filename="ui-groups-detection.pt"
)

# Load and run prediction
model = YOLO(model_path)
results = model.predict("/path/to/your/image")

# Visualize or process results
results[0].show()
```
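Each detection in `results[0].boxes` exposes a bounding box, a confidence score, and a class id (standard Ultralytics API). A minimal sketch of post-processing the detections with a confidence threshold; the helper name and dummy values below are illustrative, not part of this model:

```python
def filter_detections(detections, min_conf=0.5):
    """Keep (label, confidence, [x1, y1, x2, y2]) triples above the threshold."""
    return [d for d in detections if d[1] >= min_conf]

# With a loaded model, detections can be collected from a prediction like so
# (`results` comes from model.predict, `model.names` maps class ids to labels):
# detections = [
#     (model.names[int(box.cls)], float(box.conf), box.xyxy[0].tolist())
#     for box in results[0].boxes
# ]

# Dummy example values for illustration:
detections = [
    ("AXGroup", 0.91, [10, 20, 200, 120]),
    ("AXGroup", 0.32, [5, 5, 50, 40]),
]
print(filter_detections(detections))  # keeps only the 0.91 detection
```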

---

## πŸ“œ License

This model is licensed under the **GNU Affero General Public License v3.0 (AGPL-3.0)**, inherited from the original YOLOv11 base model.

---

## πŸ”— Related Projects

- [Screen2AX Project](https://github.com/MacPaw/Screen2AX)
- [Screen2AX HuggingFace Collection](https://huggingface.co/collections/MacPaw/screen2ax-687dfe564d50f163020378b8)
- [YOLOv11l β€” UI Elements Detection](https://huggingface.co/MacPaw/yolov11l-ui-elements-detection)

---

## ✍️ Citation

If you use this model, please cite the Screen2AX paper:

```bibtex
@misc{muryn2025screen2axvisionbasedapproachautomatic,
      title={Screen2AX: Vision-Based Approach for Automatic macOS Accessibility Generation}, 
      author={Viktor Muryn and Marta Sumyk and Mariya Hirna and Sofiya Garkot and Maksym Shamrai},
      year={2025},
      eprint={2507.16704},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2507.16704}, 
}
```

---

## 🌐 MacPaw Research

Learn more at [https://research.macpaw.com](https://research.macpaw.com)