arxiv:2509.19186

Improving Test-Time Performance of RVQ-based Neural Codecs

Published on Sep 23

Authors:

Abstract

An encoding algorithm improves synthesis quality in RVQ-based neural audio codecs by reducing quantization errors through better code selection.

AI-generated summary

The residual vector quantization (RVQ) technique plays a central role in recent advances in neural audio codecs. These models effectively synthesize high-fidelity audio from a limited number of codes due to the hierarchical structure among quantization levels. In this paper, we propose an encoding algorithm to further enhance the synthesis quality of RVQ-based neural codecs at test-time. Firstly, we point out the suboptimal nature of quantized vectors generated by conventional methods. We demonstrate that quantization error can be mitigated by selecting a different set of codes. Subsequently, we present our encoding algorithm, designed to identify a set of discrete codes that achieve a lower quantization error. We then apply the proposed method to pre-trained models and evaluate its efficacy using diverse metrics. Our experimental findings validate that our method not only reduces quantization errors, but also improves synthesis quality.

View arXiv page View PDF Add to collection

Community

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

Upvote

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2509.19186 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2509.19186 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2509.19186 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.