[2501.02949] MSA-CNN: A Lightweight Multi-Scale CNN with Attention for Sleep Stage Classification


arXiv - Machine Learning 4 min read


Computer Science > Machine Learning
arXiv:2501.02949 (cs)
[Submitted on 6 Jan 2025 (v1), last revised 24 Mar 2026 (this version, v2)]

Title: MSA-CNN: A Lightweight Multi-Scale CNN with Attention for Sleep Stage Classification
Authors: Stephan Goerttler, Yucheng Wang, Emadeldeen Eldele, Min Wu, Fei He

Abstract: Recent advancements in machine learning-based signal analysis, coupled with open data initiatives, have fuelled efforts in automatic sleep stage classification. Despite the proliferation of classification models, few have prioritised reducing model complexity, which is a crucial factor for practical applications. In this work, we introduce the Multi-Scale and Attention Convolutional Neural Network (MSA-CNN), a lightweight architecture featuring as few as ~10,000 parameters. MSA-CNN leverages a novel multi-scale module employing complementary pooling to eliminate redundant filter parameters and dense convolutions. Model complexity is further reduced by separating temporal and spatial feature extraction and using cost-effective global spatial convolutions. This separation of tasks not only reduces model complexity but also mirrors the approach used by human experts in sleep stage scoring. We evaluated both small and large configurations of MSA-CNN against nine state-of-the-art baseline models across three public...
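The parameter savings the abstract attributes to the multi-scale module can be illustrated with a back-of-the-envelope comparison: pooling the input at several rates is parameter-free, so a single small kernel can be reused at every scale instead of maintaining a dense bank of progressively larger kernels. The sketch below is an assumption-laden illustration of that idea only; the kernel sizes, pooling rates, and channel counts are hypothetical and not taken from the paper.

```python
# Illustrative parameter count: a dense multi-scale convolution bank
# (one branch per kernel size) versus pooling at several rates and
# reusing one small kernel at every scale. All sizes are hypothetical.

def conv1d_params(kernel_size: int, in_ch: int, out_ch: int, bias: bool = True) -> int:
    """Number of learnable parameters in a single 1-D convolution layer."""
    return kernel_size * in_ch * out_ch + (out_ch if bias else 0)

in_ch, out_ch = 1, 16

# Dense bank: separate kernels of size 3, 9, and 27, one per scale.
dense_kernels = [3, 9, 27]
dense_total = sum(conv1d_params(k, in_ch, out_ch) for k in dense_kernels)

# Pooling-based module: pool at rates 1, 3, 9 (pooling has no parameters),
# then apply the same size-3 kernel at each pooled scale.
shared_total = conv1d_params(3, in_ch, out_ch)

print(f"dense multi-scale bank: {dense_total} parameters")   # 672
print(f"pooling + shared kernel: {shared_total} parameters")  # 64
```

Under these toy settings the shared-kernel variant covers the same three receptive-field scales with roughly a tenth of the parameters, which is the kind of reduction the abstract's "complementary pooling" design targets.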

Originally published on March 25, 2026. Curated by AI News.

