Improve model card: Add metadata, tags, and clarify title

#1 opened by nielsr

This PR improves the model card for *Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning of Vision Language Models* by adding:

  • `pipeline_tag: image-text-to-text`: correctly categorizes the model as a Vision-Language Model that takes image and text inputs and generates text outputs, making it discoverable under the relevant pipeline filter on the Hugging Face Hub.
  • `library_name: transformers`: enables the automated "Use in Transformers" widget, giving users an easy way to get started with the model. Evidence from config.json and tokenizer_config.json confirms compatibility with the transformers library, specifically for Qwen2-VL models.
  • Additional descriptive tags: `qwen2`, `visual-reasoning`, and `reinforcement-learning` are added to improve discoverability (a sketch of the resulting metadata header follows this list).
  • Clarified model card title: the main title is updated from "Reason-RFT CoT Dateset" to "Reason-RFT Model Checkpoints" to more accurately reflect that the repository primarily hosts model checkpoints, as indicated in the current model card's subtitle.
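
For reference, the metadata header after this change would look roughly as follows. This is a minimal sketch showing only the fields discussed above; any other fields already present in the card (for example a license entry) are assumed to remain as they are.

```yaml
---
# Sketch of the updated model card front matter.
# Only the fields discussed in this PR are shown; other existing fields are unchanged.
pipeline_tag: image-text-to-text
library_name: transformers
tags:
  - qwen2
  - visual-reasoning
  - reinforcement-learning
---
```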

The existing rich content, including links to the paper (arXiv), project page, GitHub repository, and usage instructions, remains untouched.

Please review and merge this PR if everything looks good.

