[2602.19816] Depth-Structured Music Recurrence: Budgeted Recurrent Attention for Full-Piece Symbolic Music Modeling
Summary
The paper presents Depth-Structured Music Recurrence (DSMR), a recurrent long-context Transformer for symbolic music modeling that extends context beyond fixed-length excerpts through segment-level recurrence while budgeting recurrent memory across layers.
Why It Matters
As music generation increasingly relies on AI, efficient modeling techniques like DSMR are crucial for enabling high-quality compositions on resource-constrained devices. This research addresses the challenge of long-context modeling in music, making it relevant for developers and researchers in AI and music technology.
Key Takeaways
- DSMR enhances long-context modeling for symbolic music generation.
- It employs a budgeted recurrent attention mechanism to optimize memory usage.
- The model is trained in a single pass, mirroring a musician's experience.
- Depth-wise horizon allocations improve efficiency without sacrificing performance.
- Experiments show DSMR's effectiveness on the MAESTRO dataset.
Computer Science > Sound
arXiv:2602.19816 (cs) [Submitted on 23 Feb 2026]
Title: Depth-Structured Music Recurrence: Budgeted Recurrent Attention for Full-Piece Symbolic Music Modeling
Authors: Yungang Yi
Abstract: Long-context modeling is essential for symbolic music generation, since motif repetition and developmental variation can span thousands of musical events. However, practical composition and performance workflows frequently rely on resource-limited devices (e.g., electronic instruments and portable computers), making heavy memory and attention computation difficult to deploy. We introduce Depth-Structured Music Recurrence (DSMR), a recurrent long-context Transformer for full-piece symbolic music modeling that extends context beyond fixed-length excerpts via segment-level recurrence with detached cross-segment states, featuring a layer-wise memory-horizon schedule that budgets recurrent KV states across depth. DSMR is trained in a single left-to-right pass over each complete composition, akin to how a musician experiences it from beginning to end, while carrying recurrent cross-segment states forward. Within this recurrent framework, we systematically study how depth-wise horizon allocations affect optimization, best-checkpoint perplexity, and efficiency. By allocating different history-window...
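The two mechanisms named in the abstract can be illustrated with a minimal sketch: a depth-wise horizon schedule that splits a total KV-state budget across layers, and a per-layer rolling memory that carries detached states across segments. This is a hypothetical illustration under assumed choices, not the paper's implementation: the names `horizon_schedule` and `SegmentRecurrentCache`, and the geometric growth of horizons with depth, are assumptions; the paper studies multiple allocations and does not commit to any single schedule here.

```python
from collections import deque

def horizon_schedule(num_layers, total_budget, growth=2.0):
    """Hypothetical depth-wise allocation: give deeper layers
    geometrically larger history windows, scaled so the per-layer
    horizons roughly sum to the total recurrent-KV budget."""
    weights = [growth ** layer for layer in range(num_layers)]
    scale = total_budget / sum(weights)
    return [max(1, round(w * scale)) for w in weights]

class SegmentRecurrentCache:
    """Per-layer rolling memory carried across segments. Entries
    stand in for detached cross-segment states: stored as plain
    values, so no gradient would flow back into earlier segments."""
    def __init__(self, horizons):
        # deque(maxlen=h) evicts the oldest states automatically,
        # enforcing each layer's memory horizon.
        self.memories = [deque(maxlen=h) for h in horizons]

    def step(self, layer, segment_states):
        # Return the memory visible to this layer for the current
        # segment, then append the new segment's states.
        context = list(self.memories[layer])
        self.memories[layer].extend(segment_states)
        return context
```

With 4 layers and a budget of 120 states, `horizon_schedule(4, 120)` yields horizons `[8, 16, 32, 64]`: shallow layers keep short local history while deep layers retain long-range context, matching the budgeted, depth-structured allocation the abstract describes.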