---
license: mit
library_name: BEVANet
tags:
- image-segmentation
- real-time-semantic-segmentation
- real-time
- semantic-segmentation
- computer-vision
- pytorch
datasets:
- Cityscapes
metrics:
- mIoU
- FPS
---

# BEVANet: Bilateral Efficient Visual Attention Network for Real-Time Semantic Segmentation (ICIP 2025 spotlight)

[![arXiv paper](https://img.shields.io/badge/arXiv%20paper-2508.07300-red.svg?logo=arXiv)](https://arxiv.org/abs/2508.07300) [![ICIP paper](https://img.shields.io/badge/IEEE%20paper-ICIP25%20spotlight-blue.svg?logo=IEEE)](https://ieeexplore.ieee.org/document/11084676) [![GitHub Code](https://img.shields.io/badge/Code-GitHub-black.svg?logo=github)](https://github.com/maomao0819/BEVANet) [![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)

[Ping-Mao Huang](https://www.linkedin.com/in/maomao0819/), I-Tien Chao, Ping-Chia Huang, Jia-Wei Liao, Yung-Yu Chuang

National Taiwan University
## Model Description

- **Task**: Real-Time Semantic Segmentation
- **Dataset**: Cityscapes
- **Model**: BEVANet

BEVANet semantic segmentation model trained on the Cityscapes dataset. It achieves 81.0% mIoU at 32.8 FPS on an RTX 3090.

## Performance

- **mIoU**: 81.0%
- **FPS**: 32.8 (RTX 3090)
- **Parameters**: 58.6M

## Usage

```python
from models.BEVANet import BEVANet_SEG

# Load the model from a specific branch
model = BEVANet_SEG.from_pretrained(
    "maomao0819/BEVANet",
    revision="main"
)

# For the main branch, the revision argument can be omitted
# model = BEVANet_SEG.from_pretrained("maomao0819/BEVANet")
```

## All Model Variants

| Model | Branch | Dataset | Task | Performance | FPS | Parameters |
|-------|--------|---------|------|-------------|-----|------------|
| BEVANet-S | `imagenet-bevanet-s` | ImageNet | Classification | 71.1% Top-1 | 198.6 | 16.3M |
| BEVANet | `imagenet-bevanet` | ImageNet | Classification | 76.3% Top-1 | 82.3 | 57.4M |
| BEVANet-S | `cityscapes-bevanet-s` | Cityscapes | Segmentation | 78.2% mIoU | 70.0 | 15.2M |
| BEVANet | `main` | Cityscapes | Segmentation | 81.0% mIoU | 32.8 | 58.6M |
| BEVANet-S | `camvid-bevanet-s` | CamVid | Segmentation | 83.1% mIoU | 79.4 | 15.2M |
| BEVANet | `ade20k-bevanet` | ADE20K | Segmentation | 39.8% mIoU | 73.3 | 58.9M |

## Citation

If you use this model, please cite:

```bibtex
@inproceedings{huang2025bevanet,
  title={{BEVANet}: Bilateral Efficient Visual Attention Network for Real-Time Semantic Segmentation},
  author={Huang, Ping-Mao and Chao, I-Tien and Huang, Ping-Chia and Liao, Jia-Wei and Chuang, Yung-Yu},
  booktitle={2025 IEEE International Conference on Image Processing (ICIP)},
  pages={2778--2783},
  year={2025},
  organization={IEEE}
}
```
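
## Example: Choosing a Branch

The variants in the All Model Variants table each live on their own branch, so loading one means passing the matching branch name as `revision`. The sketch below is one possible way to look that name up. The `VARIANT_BRANCHES` dictionary and the `branch_for` helper are hypothetical and not part of the BEVANet repository; only the branch names are taken from the table.

```python
# Branch names copied from the "All Model Variants" table.
# VARIANT_BRANCHES and branch_for are illustrative helpers, not part of BEVANet.
VARIANT_BRANCHES = {
    ("BEVANet-S", "ImageNet"): "imagenet-bevanet-s",
    ("BEVANet", "ImageNet"): "imagenet-bevanet",
    ("BEVANet-S", "Cityscapes"): "cityscapes-bevanet-s",
    ("BEVANet", "Cityscapes"): "main",
    ("BEVANet-S", "CamVid"): "camvid-bevanet-s",
    ("BEVANet", "ADE20K"): "ade20k-bevanet",
}


def branch_for(model: str, dataset: str) -> str:
    """Return the branch that hosts the checkpoint for a (model, dataset) pair."""
    try:
        return VARIANT_BRANCHES[(model, dataset)]
    except KeyError:
        available = ", ".join(f"{m}/{d}" for m, d in VARIANT_BRANCHES)
        raise ValueError(
            f"No checkpoint listed for {model} on {dataset}; available: {available}"
        ) from None


revision = branch_for("BEVANet-S", "Cityscapes")
print(revision)

# Passing the result on, assuming the BEVANet repository is on the import path:
# from models.BEVANet import BEVANet_SEG
# model = BEVANet_SEG.from_pretrained("maomao0819/BEVANet", revision=revision)
```

The Usage section only shows `BEVANet_SEG`, so whether that class also loads the ImageNet classification branches is not confirmed here; check the GitHub repository before relying on it for those checkpoints.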