update preview checkpoint
- README.md +34 -32
- config.json → pretrained/preview/config.json +0 -0
- generation_config.json → pretrained/preview/generation_config.json +0 -0
- preprocessor_config.json → pretrained/preview/preprocessor_config.json +0 -0
- processor_config.json → pretrained/preview/processor_config.json +0 -0
- pytorch_model.bin.index.json → pretrained/preview/pytorch_model.bin.index.json +0 -0
- special_tokens_map.json → pretrained/preview/special_tokens_map.json +0 -0
- tokenizer_config.json → pretrained/preview/tokenizer_config.json +0 -0
README.md
CHANGED

tags:
- JarvisIR
- weights
description: |
  This repository contains the official weights for the CVPR 2025 paper "JarvisIR: Elevating Autonomous Driving Perception with Intelligent Image Restoration".
---

# JarvisIR: Elevating Autonomous Driving Perception with Intelligent Image Restoration

## Model Description

JarvisIR is a novel system that leverages a Vision-Language Model (VLM) to intelligently restore images for autonomous driving perception in adverse weather. It acts as a central controller, dynamically coordinating multiple expert restoration models to tackle complex degradations such as rain, fog, low-light, and snow.

## Key Features

- **VLM Controller**: The first framework to employ a Vision-Language Model for orchestrating image restoration workflows.
- **Multi-Expert Coordination**: Dynamically schedules specialized restoration models for tasks like denoising, super-resolution, and deraining.
- **Adaptive Restoration**: Effectively handles a wide range of adverse weather conditions, including night/low-light, rain, fog, and snow.
- **Advanced Training Strategy**: Utilizes a two-stage process of Supervised Fine-Tuning (SFT) followed by alignment with Mixed-Rank Reward-based Human Feedback (MRRHF).

## Model Architecture

The system comprises three core components:

1. **VLM Controller**: A LLaVA-v1.5-7B model serves as the core for task planning and expert model selection.
2. **Expert Models**: A suite of specialized networks, each tailored for a specific restoration task (e.g., deraining, defogging).
3. **Reward Models**: A set of Image Quality Assessment (IQA) models that provide feedback for quality assessment and alignment during training.
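
How these components interact at inference time can be summarized with a short, illustrative sketch; the function and class names below are hypothetical placeholders, not the actual JarvisIR API.

```python
# Illustrative sketch only: the controller, expert, and reward interfaces are
# hypothetical placeholders, not the code shipped with these weights.
from typing import Callable, Dict, List
from PIL import Image

def restore(image: Image.Image,
            controller: Callable[[Image.Image], List[str]],
            experts: Dict[str, Callable[[Image.Image], Image.Image]],
            reward: Callable[[Image.Image], float]) -> Image.Image:
    """Plan a restoration sequence, apply the experts, and score the result."""
    # 1. The VLM controller inspects the degraded frame and plans a tool
    #    sequence, e.g. ["derain", "denoise"] for a rainy night-time image.
    plan = controller(image)

    # 2. Each selected expert model is applied in the planned order.
    restored = image
    for task in plan:
        restored = experts[task](restored)

    # 3. IQA-based reward models score the output; during training this score
    #    is the feedback signal used for MRRHF alignment.
    quality = reward(restored)
    print(f"plan={plan}, IQA score={quality:.3f}")
    return restored
```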

## Training Data

JarvisIR was trained on a large-scale, comprehensive dataset:

- **CleanBench-Synthetic**: A dataset of 150,000 synthetically degraded images with corresponding annotations.
- **CleanBench-Real**: A collection of 80,000 real-world images captured in adverse weather, used for alignment training.
- **Comprehensive Coverage**: The data covers four primary weather scenarios (night, rain, fog, snow) with various combinations of degradations.

## Performance

- Achieves a **50% average improvement** in perception metrics on the CleanBench-Real dataset compared to state-of-the-art all-in-one methods.
- Demonstrates superior performance across all tested weather conditions.
- Exhibits enhanced robustness and generalization capabilities in real-world driving scenarios.

## Intended Use

**Primary Use Cases:**

- Enhancing perception systems in autonomous vehicles.
- Building robust, multi-weather image restoration pipelines.
- Advancing research into the applications of Vision-Language Models in image processing.

## Model Checkpoints

This repository provides the following model weights:

- `pretrained/`: The complete model after both Supervised Fine-Tuning and MRRHF alignment stages.
- `agent-tools/`: The weights for each individual expert restoration model.
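
The configuration files moved in this commit live under `pretrained/preview/`. As a rough guide, a minimal loading sketch could look like the following; it assumes the checkpoint follows the Hugging Face LLaVA format and uses a placeholder repo id, so adapt it to the official JarvisIR codebase as needed.

```python
# Hedged sketch: load the preview VLM controller with Hugging Face transformers.
# The repo id is a placeholder; whether LlavaForConditionalGeneration is the
# right class depends on the architecture declared in config.json.
from transformers import AutoProcessor, LlavaForConditionalGeneration

repo = "<this-repo-id-or-local-clone>"  # placeholder, replace with the real path

processor = AutoProcessor.from_pretrained(repo, subfolder="pretrained/preview")
model = LlavaForConditionalGeneration.from_pretrained(
    repo,
    subfolder="pretrained/preview",
    device_map="auto",  # requires accelerate; drop for CPU-only loading
)
```

The expert restoration networks under `agent-tools/` are separate models and are not loaded by this snippet.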

## Citation

If you find JarvisIR useful in your research, please cite our paper:

```bibtex
@inproceedings{lin2025jarvisir,
  title={Jarvisir: Elevating autonomous driving perception with intelligent image restoration},
  author={Lin, Yunlong and Lin, Zixu and Chen, Haoyu and Pan, Panwang and Li, Chenxin and Chen, Sixiang and Wen, Kairun and Jin, Yeying and Li, Wenbo and Ding, Xinghao},
  booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
  pages={22369--22380},
  year={2025}
}
```

## Acknowledgments

This work contributes to the advancement of intelligent image restoration by integrating Vision-Language Models with expert system coordination.

config.json → pretrained/preview/config.json (renamed, file without changes)
generation_config.json → pretrained/preview/generation_config.json (renamed, file without changes)
preprocessor_config.json → pretrained/preview/preprocessor_config.json (renamed, file without changes)
processor_config.json → pretrained/preview/processor_config.json (renamed, file without changes)
pytorch_model.bin.index.json → pretrained/preview/pytorch_model.bin.index.json (renamed, file without changes)
special_tokens_map.json → pretrained/preview/special_tokens_map.json (renamed, file without changes)
tokenizer_config.json → pretrained/preview/tokenizer_config.json (renamed, file without changes)