metadata

base_model:
  - Ultralytics/YOLO11
pipeline_tag: object-detection
library_name: ultralytics
metrics:
  - mAP50
  - mAP50-95
  - accuracy50
  - precision
  - recall
  - f1
model-index:
  - name: MacPaw/yolov11l-ui-elements-detection
    results:
      - task:
          type: object-detection
        metrics:
          - type: accuracy
            value: 0.65359
            name: accuracy@0.5
      - task:
          type: object-detection
        metrics:
          - type: precision
            value: 0.49055
            name: precision
      - task:
          type: object-detection
        metrics:
          - type: recall
            value: 0.43433
            name: recall
      - task:
          type: object-detection
        metrics:
          - type: f1
            value: 0.43776
            name: f1
      - task:
          type: object-detection
        metrics:
          - type: map
            value: 0.46644
            name: mAP@0.5
      - task:
          type: object-detection
        metrics:
          - type: map
            value: 0.31295
            name: mAP@0.5-0.95
datasets:
  - MacPaw/Screen2AX-Element
license: agpl-3.0

🔍 YOLOv11l — UI Elements Detection

This model is a fine-tuned version of Ultralytics/YOLO11, trained to detect UI elements in macOS application screenshots.

It is part of the Screen2AX project — a research effort focused on generating accessibility metadata using computer vision.

🧠 Task Overview

Task: Object Detection
Target: Individual UI elements

Supported Labels:

['AXButton', 'AXDisclosureTriangle', 'AXImage', 'AXLink', 'AXTextArea']

This model detects common UI components, most of them interactive, that macOS typically exposes in its accessibility tree.

🗂 Dataset

Training data: MacPaw/Screen2AX-Element

🚀 How to Use

🔧 Install Dependencies

pip install huggingface_hub ultralytics

🧪 Load the Model and Run Predictions

from huggingface_hub import hf_hub_download
from ultralytics import YOLO

# Download the model
model_path = hf_hub_download(
    repo_id="MacPaw/yolov11l-ui-elements-detection",
    filename="ui-elements-detection.pt",
)

# Load and run prediction
model = YOLO(model_path)
results = model.predict("/path/to/your/image")

# Display result
results[0].show()
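
To work with the detections programmatically rather than only displaying them, the raw boxes can be turned into plain records. The sketch below is one possible approach, not part of the model or of Ultralytics: `detections_to_records`, `min_conf`, and `keep_labels` are hypothetical names introduced here for illustration, and the default threshold of 0.25 is an assumption you should tune for your screenshots.

```python
from typing import Dict, List, Optional, Sequence


def detections_to_records(
    boxes_xyxy: Sequence[Sequence[float]],
    class_ids: Sequence[float],
    confidences: Sequence[float],
    names: Dict[int, str],
    min_conf: float = 0.25,
    keep_labels: Optional[Sequence[str]] = None,
) -> List[dict]:
    """Convert raw detection arrays into a list of plain dicts.

    Hypothetical helper for illustration; it only reshapes data
    and does not call the model.
    """
    records = []
    for box, cls_id, conf in zip(boxes_xyxy, class_ids, confidences):
        label = names[int(cls_id)]  # class ids may arrive as floats
        if conf < min_conf:
            continue  # drop low-confidence detections
        if keep_labels is not None and label not in keep_labels:
            continue  # keep only the requested labels
        x1, y1, x2, y2 = (float(v) for v in box)
        records.append(
            {
                "label": label,
                "confidence": float(conf),
                "box_xyxy": (x1, y1, x2, y2),
            }
        )
    # Highest-confidence detections first
    records.sort(key=lambda r: r["confidence"], reverse=True)
    return records


# Usage with the `results` object from the snippet above
# (label names are read from the result, not hard-coded):
# r = results[0]
# records = detections_to_records(
#     r.boxes.xyxy.tolist(),
#     r.boxes.cls.tolist(),
#     r.boxes.conf.tolist(),
#     r.names,
#     keep_labels=["AXButton", "AXLink"],
# )
```

Passing plain lists keeps the helper independent of the tensor backend, so the records can be serialized to JSON or compared against an accessibility tree without extra conversion.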

📜 License

This model is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0), inherited from the Ultralytics YOLO11 base model.

🔗 Related Projects

✍️ Citation

If you use this model in your research, please cite the Screen2AX paper:

@misc{muryn2025screen2axvisionbasedapproachautomatic,
      title={Screen2AX: Vision-Based Approach for Automatic macOS Accessibility Generation}, 
      author={Viktor Muryn and Marta Sumyk and Mariya Hirna and Sofiya Garkot and Maksym Shamrai},
      year={2025},
      eprint={2507.16704},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2507.16704}, 
}

🌐 MacPaw Research

Learn more at https://research.macpaw.com