[2601.09220] From Hawkes Processes to Attention: Time-Modulated Mechanisms for Event Sequences

arXiv - Machine Learning

About this article

Computer Science > Machine Learning
arXiv:2601.09220 (cs) [Submitted on 14 Jan 2026 (v1), last revised 24 Mar 2026 (this version, v2)]

Title: From Hawkes Processes to Attention: Time-Modulated Mechanisms for Event Sequences
Authors: Xinzi Tan, Kejian Zhang, Junhan Yu, Doudou Zhou

Abstract: Marked Temporal Point Processes (MTPPs) arise naturally in medical, social, commercial, and financial domains. However, existing Transformer-based methods mostly inject temporal information only through positional encodings, relying on shared or parametric decay structures, which limits their ability to capture heterogeneous and type-specific temporal effects. Motivated by this observation, we derive a novel attention operator for MTPPs, called Hawkes Attention, from multivariate Hawkes process theory: learnable per-type neural kernels modulate the query, key, and value projections, replacing the corresponding components of standard attention. This design lets Hawkes Attention unify event timing and content interaction, learning both time-dependent behavior and type-specific excitation patterns from the data. Experimental results show that our method outperforms the baselines. Beyond general MTPP modeling, our attention mechanism can also be easily applied to...
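For context, the abstract's starting point is the multivariate Hawkes process, whose conditional intensity for event type k has the standard form λ_k(t) = μ_k + Σ_{t_i < t} φ_{k, k_i}(t - t_i), where μ_k is a base rate and φ_{k, k'} is an excitation kernel describing how past events of type k' raise the intensity of future type-k events; the learnable per-type kernels mentioned in the abstract play the role of φ.

The sketch below is a hedged, minimal illustration (not the paper's architecture) of how event-type-specific decay kernels could modulate attention over a timestamped sequence. It is written in PyTorch; the class name TimeModulatedAttention, the exponential kernel form, the single attention head, and the choice to index decay rates by the query's event type are all assumptions made for the example, since the abstract only states that per-type kernels modulate the query, key, and value projections.

```python
# Hedged sketch of time-modulated attention for event sequences.
# Everything here (names, kernel form, indexing by query type) is an
# assumption for illustration; it is not the paper's Hawkes Attention.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TimeModulatedAttention(nn.Module):
    """Single-head attention whose weights are rescaled by a learnable,
    event-type-specific exponential decay over inter-event times."""

    def __init__(self, num_types: int, d_model: int):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        # One learnable decay rate per event type (kept positive via softplus).
        self.decay_raw = nn.Parameter(torch.zeros(num_types))
        self.scale = d_model ** -0.5

    def forward(self, x, types, times):
        # x:     (B, L, d_model) event embeddings
        # types: (B, L) integer event-type ids in [0, num_types)
        # times: (B, L) event timestamps, non-decreasing along L
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        scores = torch.matmul(q, k.transpose(-1, -2)) * self.scale    # (B, L, L)

        # Elapsed time from each past event j to each query event i.
        dt = times.unsqueeze(-1) - times.unsqueeze(-2)                # (B, L, L)
        decay = F.softplus(self.decay_raw)[types]                     # (B, L)
        kernel = torch.exp(-decay.unsqueeze(-1) * dt.clamp(min=0.0))  # (B, L, L)

        # Causal mask: each event attends only to itself and earlier events.
        causal = torch.tril(torch.ones_like(scores, dtype=torch.bool))
        scores = scores.masked_fill(~causal, float("-inf"))
        attn = torch.softmax(scores, dim=-1) * kernel
        attn = attn / attn.sum(dim=-1, keepdim=True).clamp(min=1e-9)
        return torch.matmul(attn, v)


if __name__ == "__main__":
    # Quick shape check on random data.
    B, L, K, D = 2, 5, 3, 16
    model = TimeModulatedAttention(num_types=K, d_model=D)
    x = torch.randn(B, L, D)
    types = torch.randint(0, K, (B, L))
    times = torch.sort(torch.rand(B, L), dim=-1).values
    print(model(x, types, times).shape)  # torch.Size([2, 5, 16])
```

In this toy version the exponential kernel simply down-weights attention to older events at a rate learned per event type, and renormalizing the kernel-weighted attention keeps each row summing to one.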

Originally published on March 25, 2026. Curated by AI News.

Related Articles

Machine Learning

[P] ML project (XGBoost + Databricks + MLflow) — how to talk about “production issues” in interviews?

Hey all, I recently built an end-to-end fraud detection project using a large banking dataset: Trained an XGBoost model Used Databricks f...

Reddit - Machine Learning · 1 min ·
Machine Learning

[D] The memory chip market lost tens of billions over a paper this community would have understood in 10 minutes

TurboQuant was teased recently and tens of billions gone from memory chip market in 48 hours but anyone in this community who read the pa...

Reddit - Machine Learning · 1 min ·
Machine Learning

Copilot is ‘for entertainment purposes only,’ according to Microsoft’s terms of use | TechCrunch

AI skeptics aren’t the only ones warning users not to unthinkingly trust models’ outputs — that’s what the AI companies say themselves in...

TechCrunch - AI · 3 min ·
Machine Learning

[P] Fused MoE Dispatch in Pure Triton: Beating CUDA-Optimized Megablocks at Inference Batch Sizes

I built a fused MoE dispatch kernel in pure Triton that handles the full forward pass for Mixture-of-Experts models. No CUDA, no vendor-s...

Reddit - Machine Learning · 1 min ·
