| --- |
| language: |
| - en |
| license: apache-2.0 |
| tags: |
| - computer-vision |
| - image-matching |
| - overlap-detection |
| - feature-extraction |
| datasets: |
| - SSSSphinx/SCoDe |
| --- |
| |
| # SCoDe: Scale-aware Co-visible Region Detection for Image Matching |
|
|
| <div align="center"> |
|
|
| [Paper](https://www.sciencedirect.com/science/article/abs/pii/S0924271625003260) |
| [DOI](https://doi.org/10.1016/j.isprsjprs.2025.08.015) |
| [Project Page](https://xupan.top/Projects/scode) |
| [Code](https://github.com/SSSSphinx/SCoDe) |
|
|
| </div> |
|
|
| ## Overview |
|
|
| SCoDe is a scale-aware co-visible region detection model designed for robust image matching. It detects overlapping regions between image pairs while being invariant to scale variations, making it particularly effective for structure-from-motion and 3D reconstruction tasks. |
|
|
| This model is built upon the CCOE (Co-visible region detection with Overlap Estimation) architecture and has been trained on the MegaDepth dataset. |
|
|
| ## Model Details |
|
|
| - **Architecture**: CCOE-based transformer with multi-scale attention |
| - **Backbone**: ResNet-50 |
| - **Input Size**: 1024×1024 (configurable) |
| - **Training Dataset**: MegaDepth |
| - **Framework**: PyTorch |
|
|
| ### Key Features |
|
|
| - Scale-aware overlap region detection |
| - Rotation-invariant matching capabilities |
| - End-to-end trainable pipeline |
| - Compatible with various feature extractors (SIFT, SuperPoint, D2-Net, R2D2, DISK) |
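
To illustrate how a detected co-visible region can sit in front of any of these extractors, the sketch below crops each image to its region, resizes both crops to a common size, and maps keypoints back to original pixel coordinates. The `(x1, y1, x2, y2)` box format, the helper names, and the 640-pixel target are assumptions made for illustration, not the repository's API.

```python
def resize_factor(box, target_long_side):
    """Factor that scales the longer side of `box` to `target_long_side`."""
    x1, y1, x2, y2 = box
    return target_long_side / max(x2 - x1, y2 - y1)


def to_original(keypoint, box, factor):
    """Map a keypoint from the resized crop back to full-image pixels."""
    x, y = keypoint
    return box[0] + x / factor, box[1] + y / factor


# Hypothetical co-visible boxes: the region spans 800 px in image A
# but only 200 px in image B.
box_a = (100, 50, 900, 650)
box_b = (400, 300, 600, 450)

factor_a = resize_factor(box_a, 640)
factor_b = resize_factor(box_b, 640)
# After cropping and resizing, both regions are 640 px on their long side,
# so the extractor no longer sees the scale gap between them.
scale_gap = factor_b / factor_a

# A keypoint detected at (320, 240) in the resized crop of image B,
# expressed in the pixel coordinates of the original image B.
kp_in_b = to_original((320, 240), box_b, factor_b)
```

Matching then runs on the two resized crops, and `to_original` is applied to every keypoint before pose estimation so that the camera intrinsics of the full images still apply.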
|
|
| ## Usage |
|
|
| ### Installation |
|
|
| ```bash |
| pip install torch torchvision |
| git clone https://github.com/SSSSphinx/SCoDe.git |
| cd SCoDe |
| pip install -r requirements.txt |
| ``` |
|
|
| ### Quick Start |
|
|
| ```python |
| import torch |
| from src.config.default import get_cfg_defaults |
| from src.model import CCOE |
| |
| # Load configuration |
| cfg = get_cfg_defaults() |
| cfg.merge_from_file('configs/scode_config.py') |
| |
| # Initialize model |
| device = 'cuda' if torch.cuda.is_available() else 'cpu' |
| model = CCOE(cfg.CCOE).eval().to(device) |
| |
| # Load pre-trained weights |
| model.load_state_dict(torch.load('weights/scode.pth', map_location=device)) |
| |
| # Model is ready for inference |
| with torch.no_grad(): |
|     # Random tensors stand in for a preprocessed image pair |
|     image1 = torch.randn(1, 3, 1024, 1024, device=device) |
|     image2 = torch.randn(1, 3, 1024, 1024, device=device) |
|     output = model({'image1': image1, 'image2': image2}) |
| ``` |
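
The Quick Start feeds random 1024×1024 tensors. Real images are usually resized first, so a box predicted at model resolution has to be mapped back to the original image. The helper below is a minimal sketch of that step. It assumes the model returns pixel-space `(x1, y1, x2, y2)` boxes and that images were stretched to the input size without padding; neither assumption is confirmed by this README, and `rescale_box` is a hypothetical name.

```python
def clamp(value, upper):
    """Limit `value` to the range [0, upper]."""
    return max(0.0, min(float(upper), value))


def rescale_box(box, model_size, orig_size):
    """Map (x1, y1, x2, y2) from model-input pixels to original-image pixels.

    `model_size` and `orig_size` are (width, height) tuples.
    """
    sx = orig_size[0] / model_size[0]
    sy = orig_size[1] / model_size[1]
    x1, y1, x2, y2 = box
    # Clamp so a slightly over-sized prediction never leaves the image.
    return (
        clamp(x1 * sx, orig_size[0]),
        clamp(y1 * sy, orig_size[1]),
        clamp(x2 * sx, orig_size[0]),
        clamp(y2 * sy, orig_size[1]),
    )


# A box predicted on the 1024x1024 input, mapped back to a 1600x1200 photo.
box_orig = rescale_box((256, 256, 768, 1100), (1024, 1024), (1600, 1200))
```

If the preprocessing pads instead of stretching, the padding offset would need to be subtracted before scaling.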
|
|
| ### Training |
|
|
| ```bash |
| # Single GPU training |
| python train_scode.py --num_workers 4 --epoch 15 --batch_size 4 --validation --learning_rate 1e-5 |
| |
| # Multi-GPU distributed training (4 GPUs) |
| python -m torch.distributed.launch --nproc_per_node 4 --master_port=29501 train_scode.py \ |
| --num_workers 4 --epoch 15 --batch_size 4 --validation --learning_rate 1e-5 |
| ``` |
|
|
| ### Evaluation |
|
|
| #### Rotation Invariance Evaluation |
| ```bash |
| python rot_inv_eval.py \ |
| --extractors superpoint d2net r2d2 disk \ |
| --image_pairs path/to/image/pairs \ |
| --output_dir outputs/scode_rot_eval |
| ``` |
|
|
| #### Pose Estimation Evaluation |
| ```bash |
| python eval_pose_estimation.py \ |
| --results_dir outputs/megadepth_results \ |
| --dataset megadepth |
| ``` |
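
Pose estimation on MegaDepth is commonly summarized as the area under the cumulative pose-error curve (AUC) at thresholds such as 5°, 10°, and 20°. Whether `eval_pose_estimation.py` reports exactly this metric is an assumption. The sketch below shows the usual computation in plain Python so that such numbers are easier to interpret; `pose_auc` is an illustrative name.

```python
from bisect import bisect_left


def pose_auc(errors, thresholds):
    """Area under the recall-vs-error curve, normalized per threshold.

    `errors` holds one pose error in degrees per image pair (commonly the
    larger of the rotation and translation angular errors).
    """
    if not errors:
        return [0.0 for _ in thresholds]
    n = len(errors)
    errs = [0.0] + sorted(errors)
    recall = [0.0] + [(i + 1) / n for i in range(n)]
    aucs = []
    for t in thresholds:
        last = bisect_left(errs, t)  # number of curve points with error < t
        e = errs[:last] + [t]
        r = recall[:last] + [recall[last - 1]]
        # Trapezoidal integration of recall over error, up to the threshold.
        area = sum(
            (e[i + 1] - e[i]) * (r[i] + r[i + 1]) / 2 for i in range(len(e) - 1)
        )
        aucs.append(area / t)
    return aucs


# Illustrative errors in degrees for three image pairs (not real results).
aucs = pose_auc([2.0, 4.0, 30.0], thresholds=[5, 10, 20])
```

An AUC of 1.0 at a threshold means every pair had zero error; 0.0 means no pair fell below the threshold.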
|
|
| #### Radar Evaluation |
| ```bash |
| python eval_radar.py \ |
| --results_dir outputs/radar_results |
| ``` |
|
|
| ## Configuration |
|
|
| Main configuration files: |
| - [`configs/scode_config.py`](configs/scode_config.py) - SCoDe model configuration |
| - [`src/config/default.py`](src/config/default.py) - Default configuration template |
|
|
| ### Key Parameters |
|
|
| ```python |
| # Training |
| cfg.DATASET.TRAIN.IMAGE_SIZE = [1024, 1024] |
| cfg.DATASET.TRAIN.BATCH_SIZE = 4 |
| cfg.DATASET.TRAIN.PAIRS_LENGTH = 128000 |
| |
| # Validation |
| cfg.DATASET.VAL.IMAGE_SIZE = [1024, 1024] |
| |
| # Model |
| cfg.CCOE.BACKBONE.NUM_LAYERS = 50 |
| cfg.CCOE.BACKBONE.STRIDE = 32 |
| cfg.CCOE.CCA.DEPTH = [2, 2, 2, 2] |
| cfg.CCOE.CCA.NUM_HEADS = [8, 8, 8, 8] |
| ``` |
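
A few quantities follow directly from these values and are useful when planning a run. The arithmetic below is a sketch that assumes `STRIDE` is the backbone's total downsampling factor, `PAIRS_LENGTH` is the number of training pairs per epoch, and `--batch_size` is per GPU. Check the training script before relying on these interpretations.

```python
image_size = (1024, 1024)   # cfg.DATASET.TRAIN.IMAGE_SIZE
stride = 32                 # cfg.CCOE.BACKBONE.STRIDE (assumed total downsampling)
pairs_per_epoch = 128000    # cfg.DATASET.TRAIN.PAIRS_LENGTH (assumed per epoch)
batch_size = 4              # --batch_size (assumed per GPU)
num_gpus = 4                # as in the multi-GPU training command
epochs = 15                 # --epoch

# Spatial size of the coarsest backbone feature map.
feat_h, feat_w = image_size[0] // stride, image_size[1] // stride
positions_per_image = feat_h * feat_w

# Optimizer steps implied by the data-parallel setup.
effective_batch = batch_size * num_gpus
steps_per_epoch = pairs_per_epoch // effective_batch
total_steps = steps_per_epoch * epochs

print(f"feature map: {feat_h}x{feat_w} ({positions_per_image} positions per image)")
print(f"effective batch: {effective_batch}, steps per epoch: {steps_per_epoch}")
print(f"total steps over {epochs} epochs: {total_steps}")
```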
|
|
| ## Dataset |
|
|
| The model is trained on the [MegaDepth](https://github.com/zhengqili/MegaDepth) dataset with scale-aware pair generation. |
|
|
| Dataset preparation: |
| ```bash |
| python dataset_preparation.py \ |
| --base_path dataset/megadepth/MegaDepth \ |
| --num_per_scene 5000 |
| ``` |
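
The `--num_per_scene` flag suggests that pair generation is capped per scene so that large scenes do not dominate training. The script's actual sampling logic is not shown here. The following is only a sketch of one way to apply such a cap reproducibly; `cap_pairs_per_scene` and the toy scene IDs are hypothetical.

```python
import random


def cap_pairs_per_scene(pairs_by_scene, num_per_scene, seed=0):
    """Keep at most `num_per_scene` randomly chosen pairs from each scene."""
    rng = random.Random(seed)  # fixed seed keeps the subset reproducible
    capped = {}
    # Iterate in sorted order so the result does not depend on dict ordering.
    for scene, pairs in sorted(pairs_by_scene.items()):
        if len(pairs) > num_per_scene:
            capped[scene] = rng.sample(pairs, num_per_scene)
        else:
            capped[scene] = list(pairs)
    return capped


# Toy input: scene ID -> list of (image_a, image_b) index pairs.
pairs_by_scene = {
    "scene_a": [(i, i + 1) for i in range(10)],
    "scene_b": [(0, 1), (1, 2)],
}
capped = cap_pairs_per_scene(pairs_by_scene, num_per_scene=5)
```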
|
|
| Validation pairs are automatically generated and evaluated during training. |
|
|
| ## Model Performance |
|
|
| SCoDe demonstrates strong performance on: |
| - **Rotation Invariance**: Robust to image rotations across the full 360° range |
| - **Scale Invariance**: Effective across multiple image scales |
| - **Pose Estimation**: Improved camera pose estimation on MegaDepth benchmark |
| - **Feature Matching**: Enhanced matching accuracy with various feature extractors |
|
|
| ## Supported Feature Extractors |
|
|
| The model can be combined with the following feature extractors: |
| - SIFT (with brute-force matcher) |
| - SuperPoint (with NN matcher) |
| - D2-Net |
| - R2D2 |
| - DISK |
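
For extractors paired with a nearest-neighbor (NN) matcher, a mutual check is a common way to discard ambiguous matches: a pair is kept only if each descriptor is the other's nearest neighbor. The sketch below is a brute-force, pure-Python illustration of that idea, not the matcher shipped in the repository; `mutual_nn_match` is a hypothetical name.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(query, candidates):
    """Index of the candidate descriptor closest to `query`."""
    return min(
        range(len(candidates)),
        key=lambda j: squared_distance(query, candidates[j]),
    )


def mutual_nn_match(desc_a, desc_b):
    """Return (i, j) pairs where desc_a[i] and desc_b[j] pick each other."""
    if not desc_a or not desc_b:
        return []
    a_to_b = [nearest(d, desc_b) for d in desc_a]
    b_to_a = [nearest(d, desc_a) for d in desc_b]
    return [(i, j) for i, j in enumerate(a_to_b) if b_to_a[j] == i]


# Toy 2-D descriptors; real descriptors have 128 or more dimensions.
desc_a = [(0.0, 1.0), (1.0, 0.0), (0.9, 0.1)]
desc_b = [(1.0, 0.05), (0.0, 0.9)]
matches = mutual_nn_match(desc_a, desc_b)
```

Here `desc_a[2]` also prefers `desc_b[0]`, but `desc_b[0]` prefers `desc_a[1]`, so that one-sided match is dropped.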
|
|
| ## Citation |
|
|
| If you find this project useful in your research, please cite our paper: |
|
|
| ```bibtex |
| @article{pan2025scale, |
| title={Scale-aware co-visible region detection for image matching}, |
| author={Pan, Xu and Xia, Zimin and Zheng, Xianwei}, |
| journal={ISPRS Journal of Photogrammetry and Remote Sensing}, |
| volume={229}, |
| pages={122--137}, |
| year={2025}, |
| publisher={Elsevier} |
| } |
| ``` |
|
|
| ## License |
|
|
| This project is licensed under the Apache License 2.0. See the LICENSE file for details. |
|
|
| ## Acknowledgments |
|
|
| - [MegaDepth](https://github.com/zhengqili/MegaDepth) - Dataset and benchmarks |
| - [OETR](https://github.com/TencentYoutuResearch/ImageMatching-OETR) - Model initialization strategies |
| - PyTorch team for the excellent framework |
|
|
| ## Contact |
|
|
| For questions or issues, please visit the [GitHub repository](https://github.com/SSSSphinx/SCoDe) or contact the authors. |
|
|
| --- |
|
|
| **Paper**: [Scale-aware Co-visible Region Detection for Image Matching](https://www.sciencedirect.com/science/article/abs/pii/S0924271625003260) |
| **Project Page**: [https://xupan.top/Projects/scode](https://xupan.top/Projects/scode) |