Catastrophic Model Failure
The model consistently collapses and becomes unusable, or drifts into tangents entirely irrelevant to the user's question.
A question asking for a biography of a music artist leads to emoji spam and model collapse.
A question about an obscure UK town leads to overexplanation and an entirely irrelevant tangent.
My basic point is that unless this is fixed, or someone (somehow, given the size) releases a finetune, this model is very much not ready for any production workload.
Looks like a chat template problem to me
Hi Kiera, thanks for reporting the issue. I was trying to reproduce this from ZenMux chat and several other entry points, but was not successful.
Would you mind sharing which inference provider you are using? (It looks like OpenRouter, AFAICT.) We also don't seem to see your second query; it would help if we could have that too.
Let's isolate the issue to see whether it is model-related or environment-related. Best.
I use OpenRouter and have no trouble reproducing the problems. My assumption was also that this model was not finetuned very well. Just throw anything at it on OpenRouter. This isn't a great prompt, but it should be good enough:
"produce a script to compile all spreadsheets into a common spreadsheet using openpyxl. assume the number of columns is the same across all spreadsheets"
Here's the end of the output:
I used the Ling-1T ZenMux API with Roo Code. I set context window size to 128000 and everything else to default (e.g. I didn't change temperature).
In "Ask" mode, I entered the prompt same as the original post:
"Give me a biography on avicii"
It asked me if I wanted to "Provide general biography" or "Search external sources". I chose the first one, and got this:
Avicii (Tim Bergling) Biography
Full Name: Tim Bergling
Born: September 8, 1989, in Stockholm, Sweden
Died: April 20, 2018, in Muscat, Oman (aged 28)
Profession: DJ, record producer, remixer, songwriter
Career Highlights:
Rose to prominence in 2011 with the single "Levels," which became an international hit and signature EDM anthem.
Released debut album True (2013), blending electronic music with country, folk, and bluegrass elements; included hits "Wake Me Up" and "Hey Brother."
Follow-up album Stories (2015) featured collaborations with artists like Chris Martin (Coldplay) and Ryan Tedder (OneRepublic).
Known for pioneering the "big room house" subgenre and influencing mainstream pop-EDM fusion.
Retired from touring in 2016 due to health issues (acute pancreatitis, partly linked to heavy alcohol use) but continued producing music.
Legacy:
Posthumous EP Avīci (01) released in 2017.
Final studio album TIM released in 2019, completed by his family and collaborators.
The Tim Bergling Foundation established to support mental health awareness and suicide prevention in the music industry.
Inducted into the Swedish Music Hall of Fame in 2020.
Awards & Recognition:
2× Grammy nominee
Multiple EDM Awards, including Top DJ (Billboard, 2013–2015)
Ranked among DJ Mag’s Top 100 DJs multiple times (peaked at #3 in 2012, 2013)
Influence:
Avicii’s melodic, emotionally resonant style bridged electronic dance music and mainstream pop, inspiring a generation of producers. His struggles with mental health and industry pressures brought attention to artist well-being.
So that looks just fine.
I also tried the "convert bash to python" prompt that @StephenGenusa posted, and it produced a proper Python script (untested, but it looked reasonable). It also added the dependencies to requirements.txt and gave a good explanation of the script.
Perhaps it is an issue with OpenRouter's template or something along those lines. It doesn't appear to be an issue with the model itself, AFAICT.
@jondurbin can you please have a look at this?
https://chutes.ai/app/chute/2da3c94d-c58c-5c1f-a65f-b2acd105603b?tab=chat Seems pretty coherent here when using Chutes directly. Let me reach out to OpenRouter to see if they are changing the chat template or something before it gets to us.
We (OpenRouter) will work with Chutes to investigate. We weren't applying any chat template on our end. Thanks for flagging folks.
The model was pulled while we take a look; hopefully it's something trivial and easy to fix. Thanks for the report, apologies for the inconvenience.
https://chutes.ai/app/chute/2da3c94d-c58c-5c1f-a65f-b2acd105603b?tab=chat If anyone wants to give it a test run, it's back up for free on chutes at the moment to get more feedback. I think the issue may have been that the FP8 version (via llm-compressor) didn't have the YaRN scaling applied when performing calibration. I regenerated the FP8 using 10k samples from ultrachat 200k, stratified by the total context size, with rope scaling applied during calibration.
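(For anyone following along: YaRN scaling is the rope-scaling method declared in a model's `config.json`, and the theory above is that it wasn't in effect while the FP8 calibration ran. A sketch of what such an entry looks like — the `factor` and base-context values below are illustrative, not Ling-1T's actual settings:)

```json
"rope_scaling": {
  "rope_type": "yarn",
  "factor": 4.0,
  "original_max_position_embeddings": 32768
}
```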
In some casual one-off testing on the chat interface it seems to be working well, but please ping me ASAP if any issues arise!
@jondurbin , @pingToven and @RichardBian
I created an account on Chutes and was able to test all of the above prompts, plus a recent prompt of my own asking the AI to help with prompting an AI to make qualitative judgements. Wow. Everything was perfect! I am back to being really excited about this model!
By way of comparison, yesterday I asked Grok to do "Deep Research" on research involving AI prompting for qualitative judgements, and I only kind-of sort-of got some of what I was looking for. The response from Ling-1T was more concise yet packed with information. Very impressed!
Edit: never mind, the provider just got taken offline anyway (;