Tltly2013 committed on
Commit
a1e61eb
·
1 Parent(s): 850f48c

update preview checkpoint

README.md CHANGED
@@ -5,64 +5,67 @@ tags:
  - JarvisIR
  - weights
  description: |
- This is the weights repository for CVPR 2025 JarvisIR paper.
- Contains all pretrained model weights used in the paper.
+ This repository contains the official weights for the CVPR 2025 paper "JarvisIR: Elevating Autonomous Driving Perception with Intelligent Image Restoration".
  ---
 
  # JarvisIR: Elevating Autonomous Driving Perception with Intelligent Image Restoration
 
  ## Model Description
 
- JarvisIR is a novel vision-language model (VLM) based intelligent image restoration system designed for autonomous driving perception under adverse weather conditions. The system uses a VLM as a central controller to dynamically coordinate multiple expert restoration models for handling complex weather degradations including rain, fog, night scenes, and snow.
+ JarvisIR is a novel system that leverages a Vision-Language Model (VLM) to intelligently restore images for autonomous driving perception in adverse weather. It acts as a central controller, dynamically coordinating multiple expert restoration models to tackle complex degradations such as rain, fog, low-light, and snow.
 
  ## Key Features
 
- - **VLM-based Controller**: First framework to use vision-language models for controlling image restoration workflows
- - **Multi-Expert Coordination**: Dynamic scheduling of specialized restoration models (denoising, super-resolution, deraining, etc.)
- - **Weather-Adaptive**: Handles multiple weather degradations: night/low-light, rain, fog, snow scenarios
- - **Two-Stage Training**: Supervised Fine-Tuning (SFT) + Mixed-Rank Reward-based Human Feedback (MRRHF) alignment
+ - **VLM Controller**: The first framework to employ a Vision-Language Model for orchestrating image restoration workflows.
+ - **Multi-Expert Coordination**: Dynamically schedules specialized restoration models for tasks like denoising, super-resolution, and deraining.
+ - **Adaptive Restoration**: Effectively handles a wide range of adverse weather conditions, including night/low-light, rain, fog, and snow.
+ - **Advanced Training Strategy**: Utilizes a two-stage process of Supervised Fine-Tuning (SFT) followed by alignment with Mixed-Rank Reward-based Human Feedback (MRRHF).
 
  ## Model Architecture
 
- The system consists of:
- 1. **VLM Controller**: Based on LLaVA-v1.5-7B for task planning and model selection
- 2. **Expert Models**: Specialized restoration networks for different degradation types
- 3. **Reward Models**: Multiple IQA models for quality assessment and alignment
+ The system comprises three core components:
+
+ 1. **VLM Controller**: A LLaVA-v1.5-7B model serves as the core for task planning and expert model selection.
+ 2. **Expert Models**: A suite of specialized networks, each tailored for a specific restoration task (e.g., deraining, defogging).
+ 3. **Reward Models**: A set of Image Quality Assessment (IQA) models that provide feedback for quality assessment and alignment during training.
 
  ## Training Data
 
- - **CleanBench-Synthetic**: 150K synthetic degraded images with annotations
- - **CleanBench-Real**: 80K real-world adverse weather images for alignment training
- - **Coverage**: Four main weather scenarios (night, rain, fog, snow) with multiple degradation combinations
+ JarvisIR was trained on a large-scale, comprehensive dataset:
+
+ - **CleanBench-Synthetic**: A dataset of 150,000 synthetically degraded images with corresponding annotations.
+ - **CleanBench-Real**: A collection of 80,000 real-world images captured in adverse weather, used for alignment training.
+ - **Comprehensive Coverage**: The data covers four primary weather scenarios (night, rain, fog, snow) with various combinations of degradations.
 
  ## Performance
 
- - **50% average improvement** in perception metrics on CleanBench-Real compared to existing all-in-one methods
- - Superior performance across all weather conditions tested
- - Enhanced robustness and generalization to real-world scenarios
+ - Achieves a **50% average improvement** in perception metrics on the CleanBench-Real dataset compared to state-of-the-art all-in-one methods.
+ - Demonstrates superior performance across all tested weather conditions.
+ - Exhibits enhanced robustness and generalization capabilities in real-world driving scenarios.
 
  ## Intended Use
 
- **Primary Applications:**
- - Autonomous driving perception systems
- - Multi-weather image restoration pipelines
- - Research in vision-language model applications
-
+ **Primary Use Cases:**
+ - Enhancing perception systems in autonomous vehicles.
+ - Building robust, multi-weather image restoration pipelines.
+ - Advancing research into the applications of Vision-Language Models in image processing.
 
  ## Model Checkpoints
 
- This repository contains weights for:
- - `jarvisir`: Model after supervised fine-tuning and MRRHF alignment stage
- - `expert-tools/`: Individual specialist restoration model weights
-
+ This repository provides the following model weights:
+ - `pretrained`: The complete model after both Supervised Fine-Tuning and MRRHF alignment stages.
+ - `agent-tools/`: The weights for each individual expert restoration model.
 
  ## Citation
 
+ If you find JarvisIR useful in your research, please cite our paper:
+
  ```bibtex
- @inproceedings{jarvisir2025,
- title={JarvisIR: Elevating Autonomous Driving Perception with Intelligent Image Restoration},
- author={Lin, Yunlong and Lin, Zixu and Chen, Haoyu and Pan, Panwang and Li, Chenxin and Chen, Sixiang and Kairun, Wen and Jin, Yeying and Li, Wenbo and Ding, Xinghao},
- booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
+ @inproceedings{lin2025jarvisir,
+ title={Jarvisir: Elevating autonomous driving perception with intelligent image restoration},
+ author={Lin, Yunlong and Lin, Zixu and Chen, Haoyu and Pan, Panwang and Li, Chenxin and Chen, Sixiang and Wen, Kairun and Jin, Yeying and Li, Wenbo and Ding, Xinghao},
+ booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
+ pages={22369--22380},
  year={2025}
  }
  ```
@@ -75,5 +78,4 @@ This repository contains weights for:
 
  ## Acknowledgments
 
- This work advances the field of intelligent image restoration by combining vision-language models with expert system coordination, specifically targeting autonomous driving applications under challenging weather conditions.
-
+ This work contributes to the advancement of intelligent image restoration by integrating Vision-Language Models with expert system coordination.
config.json → pretrained/preview/config.json RENAMED
File without changes
generation_config.json → pretrained/preview/generation_config.json RENAMED
File without changes
preprocessor_config.json → pretrained/preview/preprocessor_config.json RENAMED
File without changes
processor_config.json → pretrained/preview/processor_config.json RENAMED
File without changes
pytorch_model.bin.index.json → pretrained/preview/pytorch_model.bin.index.json RENAMED
File without changes
special_tokens_map.json → pretrained/preview/special_tokens_map.json RENAMED
File without changes
tokenizer_config.json → pretrained/preview/tokenizer_config.json RENAMED
File without changes
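
The renames in this commit move the preview checkpoint's configuration and tokenizer files from the repository root into `pretrained/preview/`, so code that read them from the root would likely need its paths updated. The following is a minimal Python sketch of that mapping. The helper name `preview_path` and the constants are hypothetical and do not come from the JarvisIR repository; only the file names and the target directory are taken from the rename list.

```python
from pathlib import PurePosixPath

# Files this commit moved out of the repository root (copied from the rename list).
MOVED_FILES = [
    "config.json",
    "generation_config.json",
    "preprocessor_config.json",
    "processor_config.json",
    "pytorch_model.bin.index.json",
    "special_tokens_map.json",
    "tokenizer_config.json",
]

# Directory the files were moved into, relative to the repository root.
PREVIEW_DIR = PurePosixPath("pretrained/preview")


def preview_path(filename: str) -> str:
    """Return the post-commit repository path of a file that used to sit at the root.

    Hypothetical helper for illustration only; raises KeyError for any file
    that is not part of this commit's rename list.
    """
    if filename not in MOVED_FILES:
        raise KeyError(f"{filename!r} was not renamed in this commit")
    return str(PREVIEW_DIR / filename)


if __name__ == "__main__":
    for name in MOVED_FILES:
        print(f"{name} -> {preview_path(name)}")
```

Loaders that accept a `subfolder` argument, such as `from_pretrained` in `transformers`, could instead be pointed at `pretrained/preview`; whether the JarvisIR code does this is not stated on this page.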