ckpt-600 (merged)

Self-contained snapshot of macminix/qwen3_voice_design_t1 with the ckpt-600 LoRA adapter folded into the Talker weights. No PEFT layers at runtime: load directly with Qwen3TTSModel.from_pretrained.
Quick start

```python
from qwen_tts import Qwen3TTSModel

wrap = Qwen3TTSModel.from_pretrained("<this-repo-id>")
wavs, sr = wrap.generate_voice_design(
    text="Hello, this is a test.",
    instruct="A young adult female speaker speaks calmly at a normal pace.",
    language="english",
    temperature=0.9, top_p=1.0, top_k=50,
    repetition_penalty=1.05, max_new_tokens=600,
)
```
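generate_voice_design returns waveforms plus a sample rate; a minimal sketch for writing one result to disk with the standard-library wave module (this assumes the wrapper returns mono float audio in [-1, 1], which is not documented here):

```python
import wave
import numpy as np

def save_wav(path, samples, sr):
    """Write a mono float waveform (assumed in [-1, 1]) as 16-bit PCM."""
    pcm = (np.clip(np.asarray(samples), -1.0, 1.0) * 32767).astype(np.int16)
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)   # 16-bit samples
        f.setframerate(sr)
        f.writeframes(pcm.tobytes())

# e.g. save_wav("out.wav", wavs[0], sr) after the quick start above
```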
Repository layout

Same shape as the upstream macminix repo:

- config.json, model.safetensors – Qwen3-TTS Talker + Code Predictor (merged)
- speech_tokenizer/ – 12.5 fps × 16-codebook neural codec (unchanged)
- tokenizer.*, vocab.json, merges.txt, added_tokens.json, special_tokens_map.json – Qwen2 BPE tokenizer
- generation_config.json, preprocessor_config.json
- vocence_config.yaml, chute_config.yml – runtime + Chutes deploy hints
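After downloading a snapshot, a quick sanity check can confirm the files above are present. A hypothetical helper (the file list is taken from the layout description; it is not part of the repo):

```python
from pathlib import Path

# Entries from the "Repository layout" list above.
EXPECTED = [
    "config.json", "model.safetensors", "speech_tokenizer",
    "vocab.json", "merges.txt", "added_tokens.json",
    "special_tokens_map.json", "generation_config.json",
    "preprocessor_config.json", "vocence_config.yaml", "chute_config.yml",
]

def missing_files(repo_dir):
    """Return the expected entries that are absent from a local snapshot."""
    root = Path(repo_dir)
    return [name for name in EXPECTED if not (root / name).exists()]
```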
You can drop your own miner.py into this repo (same contract as macminix's: class Miner with __init__(path_hf_repo: Path), warmup(), and generate_wav(instruction, text) → (np.ndarray, int)); the standard Vocence chute wrapper will load this model unchanged.
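The miner contract above can be sketched as a skeleton class. This is a hypothetical implementation, not the real macminix miner: the generation arguments mirror the quick start, and the model import is deferred to warmup() so constructing a Miner stays cheap:

```python
from pathlib import Path

import numpy as np

class Miner:
    """Sketch of the Vocence miner contract described above."""

    def __init__(self, path_hf_repo: Path):
        self.path = Path(path_hf_repo)
        self.model = None  # loaded lazily in warmup()

    def warmup(self):
        # Deferred import: assumed API from the quick start above.
        from qwen_tts import Qwen3TTSModel
        self.model = Qwen3TTSModel.from_pretrained(str(self.path))

    def generate_wav(self, instruction, text):
        """Return (np.ndarray, int) as the contract requires."""
        wavs, sr = self.model.generate_voice_design(
            text=text,
            instruct=instruction,
            language="english",
        )
        return np.asarray(wavs[0]), sr
```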
Provenance
See merge_info.json for the exact base path, adapter
path, LoRA hyperparameters, and merge timestamp.
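A small helper can dump that provenance record from a local snapshot. This sketch avoids assuming any particular key names, since the exact schema of merge_info.json is not documented here:

```python
import json
from pathlib import Path

def show_merge_info(repo_dir):
    """Load and pretty-print merge_info.json from a local snapshot."""
    info = json.loads((Path(repo_dir) / "merge_info.json").read_text())
    for key, value in info.items():
        print(f"{key}: {value}")
    return info
```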
Model tree for handsometiger0202/f-llm

- Base model: Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign
- Adapter: macminix/qwen3_voice_design_t1