Soeren Moeller Christensen
Very cool use of RL for rerankers!!
Wanted to share an alternative approach I built in October 2024, when no information was available on multimodal rerankers (https://github.com/huggingface/transformers/pull/34086).
I trained a multimodal reranker based on Qwen2-VL (later retrained on Qwen2.5-VL) and experimented with the classification layer a bit! We used it as a binary classifier for relevance rather than for actual ranking, though.
First I tried the LM-head style that you also used here, which I was inspired to try by this blog post: https://www.lighton.ai/lighton-blogs/monoqwen-vision. Basically, I just trained it with focal loss on the "yes"/"no" tokens.
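The LM-head style can be sketched roughly like this (a minimal sketch with random tensors standing in for the VLM's outputs; the token IDs and function names are hypothetical, not from the actual model):

```python
import torch
import torch.nn.functional as F

# Hypothetical vocab ids for the "yes" / "no" decision tokens.
YES_ID, NO_ID = 9693, 2152

def focal_loss(logits_2, labels, gamma=2.0):
    # logits_2: (batch, 2) logits over ["no", "yes"]; labels: (batch,) in {0, 1}.
    log_p = F.log_softmax(logits_2, dim=-1)
    log_pt = log_p.gather(1, labels.unsqueeze(1)).squeeze(1)  # log-prob of true class
    pt = log_pt.exp()
    # Focal loss down-weights easy examples via the (1 - pt)^gamma factor.
    return (-((1 - pt) ** gamma) * log_pt).mean()

def lm_head_relevance_loss(last_hidden, lm_head_weight, labels):
    # last_hidden: (batch, hidden) hidden state at the final position.
    # Full-vocab projection shown for clarity, even though only two logits are used.
    vocab_logits = last_hidden @ lm_head_weight.T        # (batch, vocab)
    logits_2 = vocab_logits[:, [NO_ID, YES_ID]]          # keep only the decision tokens
    return focal_loss(logits_2, labels)

# Toy usage with random tensors in place of the real model's states.
hidden, vocab = 64, 32000
h = torch.randn(8, hidden)
W = torch.randn(vocab, hidden)
y = torch.randint(0, 2, (8,))
loss = lm_head_relevance_loss(h, W, y)
```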
Then I tried replacing the LM head with a binary classifier head, also trained with focal loss.
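The binary-head variant replaces the vocab projection with a single relevance logit and uses the sigmoid form of focal loss (again a sketch with made-up sizes, not the actual training code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinaryRelevanceHead(nn.Module):
    """Tiny head producing one relevance logit per query-document pair."""
    def __init__(self, hidden_size: int):
        super().__init__()
        self.proj = nn.Linear(hidden_size, 1)

    def forward(self, last_hidden):
        return self.proj(last_hidden).squeeze(-1)  # (batch,)

def binary_focal_loss(logits, labels, gamma=2.0):
    # Sigmoid focal loss: BCE scaled by (1 - pt)^gamma to focus on hard pairs.
    p = torch.sigmoid(logits)
    pt = torch.where(labels.bool(), p, 1 - p)  # prob assigned to the true class
    bce = F.binary_cross_entropy_with_logits(
        logits, labels.float(), reduction="none"
    )
    return ((1 - pt) ** gamma * bce).mean()

# Toy usage.
head = BinaryRelevanceHead(64)
h = torch.randn(8, 64)
y = torch.randint(0, 2, (8,))
loss = binary_focal_loss(head(h), y)
```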
The binary classifier head was cheaper to train and run inference on, because we didn't have to materialize the whole vocabulary just to get the two decision tokens (although you could use the slicing trick you also wrote about here). It was faster, and performance was slightly better (5% higher F1, if I remember correctly).
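For reference, the slicing trick mentioned above amounts to indexing the two decision-token rows out of the LM-head weight before the projection, so only a (batch, 2) tensor is ever materialized (token IDs hypothetical):

```python
import torch

YES_ID, NO_ID = 9693, 2152   # hypothetical ids for "yes" / "no"
hidden, vocab = 64, 32000
W = torch.randn(vocab, hidden)   # full LM-head weight matrix
h = torch.randn(8, hidden)       # final-position hidden states

# Naive: project to the full vocab, then pick two columns.
full = (h @ W.T)[:, [NO_ID, YES_ID]]      # materializes (8, vocab) first

# Slicing trick: project only through the two relevant weight rows.
sliced = h @ W[[NO_ID, YES_ID]].T         # only (8, 2) ever materialized
```

Both give the same logits; the sliced version just skips the (batch, vocab) intermediate.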