update preview checkpoint
- README.md +34 -32
- config.json → pretrained/preview/config.json +0 -0
- generation_config.json → pretrained/preview/generation_config.json +0 -0
- preprocessor_config.json → pretrained/preview/preprocessor_config.json +0 -0
- processor_config.json → pretrained/preview/processor_config.json +0 -0
- pytorch_model.bin.index.json → pretrained/preview/pytorch_model.bin.index.json +0 -0
- special_tokens_map.json → pretrained/preview/special_tokens_map.json +0 -0
- tokenizer_config.json → pretrained/preview/tokenizer_config.json +0 -0
README.md
CHANGED

tags:
- JarvisIR
- weights
description: |
  This repository contains the official weights for the CVPR 2025 paper "JarvisIR: Elevating Autonomous Driving Perception with Intelligent Image Restoration".
---

# JarvisIR: Elevating Autonomous Driving Perception with Intelligent Image Restoration

## Model Description

JarvisIR is a novel system that leverages a Vision-Language Model (VLM) to intelligently restore images for autonomous driving perception in adverse weather. It acts as a central controller, dynamically coordinating multiple expert restoration models to tackle complex degradations such as rain, fog, low-light, and snow.

## Key Features

- **VLM Controller**: The first framework to employ a Vision-Language Model for orchestrating image restoration workflows.
- **Multi-Expert Coordination**: Dynamically schedules specialized restoration models for tasks like denoising, super-resolution, and deraining.
- **Adaptive Restoration**: Effectively handles a wide range of adverse weather conditions, including night/low-light, rain, fog, and snow.
- **Advanced Training Strategy**: Utilizes a two-stage process of Supervised Fine-Tuning (SFT) followed by alignment with Mixed-Rank Reward-based Human Feedback (MRRHF).

## Model Architecture

The system comprises three core components:

1. **VLM Controller**: A LLaVA-v1.5-7B model serves as the core for task planning and expert model selection.
2. **Expert Models**: A suite of specialized networks, each tailored for a specific restoration task (e.g., deraining, defogging).
3. **Reward Models**: A set of Image Quality Assessment (IQA) models that provide feedback for quality assessment and alignment during training.
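
How these components interact at inference time can be summarized with a short, illustrative sketch; the function and class names below are hypothetical placeholders, not the actual JarvisIR API.

```python
# Illustrative sketch only: the controller, expert, and reward interfaces are
# hypothetical placeholders, not the code shipped with these weights.
from typing import Callable, Dict, List
from PIL import Image

def restore(image: Image.Image,
            controller: Callable[[Image.Image], List[str]],
            experts: Dict[str, Callable[[Image.Image], Image.Image]],
            reward: Callable[[Image.Image], float]) -> Image.Image:
    """Plan a restoration sequence, apply the experts, and score the result."""
    # 1. The VLM controller inspects the degraded frame and plans a tool
    #    sequence, e.g. ["derain", "denoise"] for a rainy night-time image.
    plan = controller(image)

    # 2. Each selected expert model is applied in the planned order.
    restored = image
    for task in plan:
        restored = experts[task](restored)

    # 3. IQA-based reward models score the output; during training this score
    #    is the feedback signal used for MRRHF alignment.
    quality = reward(restored)
    print(f"plan={plan}, IQA score={quality:.3f}")
    return restored
```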

## Training Data

JarvisIR was trained on a large-scale, comprehensive dataset:

- **CleanBench-Synthetic**: A dataset of 150,000 synthetically degraded images with corresponding annotations.
- **CleanBench-Real**: A collection of 80,000 real-world images captured in adverse weather, used for alignment training.
- **Comprehensive Coverage**: The data covers four primary weather scenarios (night, rain, fog, snow) with various combinations of degradations.

## Performance

- Achieves a **50% average improvement** in perception metrics on the CleanBench-Real dataset compared to state-of-the-art all-in-one methods.
- Demonstrates superior performance across all tested weather conditions.
- Exhibits enhanced robustness and generalization capabilities in real-world driving scenarios.

## Intended Use

**Primary Use Cases:**

- Enhancing perception systems in autonomous vehicles.
- Building robust, multi-weather image restoration pipelines.
- Advancing research into the applications of Vision-Language Models in image processing.

## Model Checkpoints

This repository provides the following model weights:

- `pretrained/`: The complete model after both Supervised Fine-Tuning and MRRHF alignment stages.
- `agent-tools/`: The weights for each individual expert restoration model.
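
The configuration files moved in this commit live under `pretrained/preview/`. As a rough guide, a minimal loading sketch could look like the following; it assumes the checkpoint follows the Hugging Face LLaVA format and uses a placeholder repo id, so adapt it to the official JarvisIR codebase as needed.

```python
# Hedged sketch: load the preview VLM controller with Hugging Face transformers.
# The repo id is a placeholder; whether LlavaForConditionalGeneration is the
# right class depends on the architecture declared in config.json.
from transformers import AutoProcessor, LlavaForConditionalGeneration

repo = "<this-repo-id-or-local-clone>"  # placeholder, replace with the real path

processor = AutoProcessor.from_pretrained(repo, subfolder="pretrained/preview")
model = LlavaForConditionalGeneration.from_pretrained(
    repo,
    subfolder="pretrained/preview",
    device_map="auto",  # requires accelerate; drop for CPU-only loading
)
```

The expert restoration networks under `agent-tools/` are separate models and are not loaded by this snippet.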

## Citation

If you find JarvisIR useful in your research, please cite our paper:

```bibtex
@inproceedings{lin2025jarvisir,
  title={Jarvisir: Elevating autonomous driving perception with intelligent image restoration},
  author={Lin, Yunlong and Lin, Zixu and Chen, Haoyu and Pan, Panwang and Li, Chenxin and Chen, Sixiang and Wen, Kairun and Jin, Yeying and Li, Wenbo and Ding, Xinghao},
  booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
  pages={22369--22380},
  year={2025}
}
```

## Acknowledgments

This work contributes to the advancement of intelligent image restoration by integrating Vision-Language Models with expert system coordination.

config.json → pretrained/preview/config.json (renamed, file without changes)
generation_config.json → pretrained/preview/generation_config.json (renamed, file without changes)
preprocessor_config.json → pretrained/preview/preprocessor_config.json (renamed, file without changes)
processor_config.json → pretrained/preview/processor_config.json (renamed, file without changes)
pytorch_model.bin.index.json → pretrained/preview/pytorch_model.bin.index.json (renamed, file without changes)
special_tokens_map.json → pretrained/preview/special_tokens_map.json (renamed, file without changes)
tokenizer_config.json → pretrained/preview/tokenizer_config.json (renamed, file without changes)