Educational PyTorch repo for distributed training from scratch: DP, FSDP, TP, FSDP+TP, and PP
I put together a small educational repo that implements distributed training parallelism from scratch in PyTorch: https://github.com/shre...
ML algorithms, training, and inference
AMD’s AI director just analyzed 6,852 Claude Code sessions, 234,760 tool calls, and 17,871 thinking blocks. Her conclusion: “Claude canno...
Code of Project: https://github.com/paulo101977/notebooks-rl/tree/main/re_requiem I’ve been working on training an agent to play a segmen...
Abstract page for arXiv paper 2508.02330: A Compression Based Classification Framework Using Symbolic Dynamics of Chaotic Maps
Abstract page for arXiv paper 2507.21037: When Brain Foundation Model Meets Cauchy-Schwarz Divergence: A New Framework for Cross-Subject ...
Abstract page for arXiv paper 2507.07580: COALA: Numerically Stable and Efficient Framework for Context-Aware Low-Rank Approximation
Abstract page for arXiv paper 2506.06482: TimeRecipe: A Time-Series Forecasting Recipe via Benchmarking Module Level Effectiveness
Abstract page for arXiv paper 2506.06303: Reward Is Enough: LLMs Are In-Context Reinforcement Learners
Abstract page for arXiv paper 2506.04831: EHR2Path: Scalable Modeling of Longitudinal Patient Pathways from Multimodal Electronic Health ...
Abstract page for arXiv paper 2505.22785: Navigating the Latent Space Dynamics of Neural Models
Abstract page for arXiv paper 2505.16950: Bottlenecked Transformers: Periodic KV Cache Consolidation for Generalised Reasoning
Abstract page for arXiv paper 2505.15516: Explainable embeddings with Distance Explainer
Abstract page for arXiv paper 2502.01521: Symmetry-Guided Memory Augmentation for Efficient Locomotion Learning
Abstract page for arXiv paper 2409.11847: An efficient wavelet-based physics-informed neural network for multiscale problems
Abstract page for arXiv paper 2406.01969: Multiway Multislice PHATE: Visualizing Hidden Dynamics of RNNs through Training
Abstract page for arXiv paper 2210.11039: Entire Space Counterfactual Learning for Reliable Content Recommendations
Abstract page for arXiv paper 2603.24567: Trust Region Constrained Bayesian Optimization with Penalized Constraint Handling
Abstract page for arXiv paper 2603.24481: Multi-Agent Reasoning with Consistency Verification Improves Uncertainty Calibration in Medical...
Abstract page for arXiv paper 2603.24436: Enes Causal Discovery
Abstract page for arXiv paper 2603.24472: Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs?
Abstract page for arXiv paper 2603.24477: Composer 2 Technical Report
Abstract page for arXiv paper 2603.24400: Neural Network Models for Contextual Regression
Abstract page for arXiv paper 2603.24396: Exploring How Fair Model Representations Relate to Fair Recommendations