[2511.20224] DuoTok: Source-Aware Dual-Track Tokenization for

[2511.20224] DuoTok: Source-Aware Dual-Track Tokenization for Multi-Track Music Language Modeling

arXiv - AI April 02, 2026 3 min read

About this article

Abstract page for arXiv paper 2511.20224: DuoTok: Source-Aware Dual-Track Tokenization for Multi-Track Music Language Modeling

Computer Science > Sound arXiv:2511.20224 (cs) [Submitted on 25 Nov 2025 (v1), last revised 1 Apr 2026 (this version, v2)] Title:DuoTok: Source-Aware Dual-Track Tokenization for Multi-Track Music Language Modeling Authors:Rui Lin, Zhiyue Wu, Jiahe Le, Kangdi Wang, Weixiong Chen, Junyu Dai, Tao Jiang View a PDF of the paper titled DuoTok: Source-Aware Dual-Track Tokenization for Multi-Track Music Language Modeling, by Rui Lin and 6 other authors View PDF HTML (experimental) Abstract:Audio tokenization bridges continuous waveforms and multi-track music language models. In dual-track modeling, tokens should preserve three properties at once: high-fidelity reconstruction, strong predictability under a language model, and cross-track correspondence. We introduce DuoTok, a source-aware dual-track tokenizer that addresses this trade-off through staged disentanglement. DuoTok first pretrains a semantic encoder, then regularizes it with multi-task supervision, freezes the encoder, and applies hard dual-codebook routing while keeping auxiliary objectives on quantized codes. A diffusion decoder reconstructs high-frequency details, allowing tokens to focus on structured information for sequence modeling. On standard benchmarks, DuoTok achieves a favorable predictability-fidelity trade-off, reaching the lowest cnBPT while maintaining competitive reconstruction at 0.75 kbps. Under a held-constant dual-track language modeling protocol, enBPT also improves, indicating gains beyond codeboo...

Originally published on April 02, 2026. Curated by AI News.

Llms

[2512.02966] Lumos: Let there be Language Model System Certification

Abstract page for arXiv paper 2512.02966: Lumos: Let there be Language Model System Certification

arXiv - AI · 4 min · about 2 hours ago

Llms

[2602.00750] Bypassing Prompt Injection Detectors through Evasive Injections

Abstract page for arXiv paper 2602.00750: Bypassing Prompt Injection Detectors through Evasive Injections

arXiv - AI · 4 min · about 2 hours ago

Llms

[2511.08225] Benchmarking Educational LLMs with Analytics: A Case Study on Gender Bias in Feedback

Abstract page for arXiv paper 2511.08225: Benchmarking Educational LLMs with Analytics: A Case Study on Gender Bias in Feedback

arXiv - AI · 4 min · about 2 hours ago

Llms

[2506.09354] "Is This Really a Human Peer Supporter?": Misalignments Between Peer Supporters and Experts in LLM-Supported Interactions

Abstract page for arXiv paper 2506.09354: "Is This Really a Human Peer Supporter?": Misalignments Between Peer Supporters and Experts in ...

arXiv - AI · 4 min · about 2 hours ago

[2511.20224] DuoTok: Source-Aware Dual-Track Tokenization for Multi-Track Music Language Modeling

About this article

Related Articles

[2512.02966] Lumos: Let there be Language Model System Certification

[2602.00750] Bypassing Prompt Injection Detectors through Evasive Injections

[2511.08225] Benchmarking Educational LLMs with Analytics: A Case Study on Gender Bias in Feedback

[2506.09354] "Is This Really a Human Peer Supporter?": Misalignments Between Peer Supporters and Experts in LLM-Supported Interactions

No comments

Stay updated with AI News