Data Science

Data analysis, statistics, and data engineering

Top This Week

UMKC Announces New Master of Science in Artificial Intelligence
AI Infrastructure

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min

Accelerating science with AI and simulations
Machine Learning

MIT Professor Rafael Gómez-Bombarelli discusses the transformative potential of AI in scientific research, emphasizing its role in materi...

AI News - General · 10 min

[2603.13793] GhanaNLP Parallel Corpora: Comprehensive Multilingual Resources for Low-Resource Ghanaian Languages
NLP

Abstract page for arXiv paper 2603.13793: GhanaNLP Parallel Corpora: Comprehensive Multilingual Resources for Low-Resource Ghanaian Langu...

arXiv - AI · 4 min

All Content

[2602.22236] CrossLLM-Mamba: Multimodal State Space Fusion of LLMs for RNA Interaction Prediction
LLMs

The article presents CrossLLM-Mamba, a novel framework for RNA interaction prediction that utilizes multimodal state space fusion of larg...

arXiv - Machine Learning · 4 min

[2602.22299] Decoding the Hook: A Multimodal LLM Framework for Analyzing the Hooking Period of Video Ads
LLMs

This article presents a framework using multimodal large language models (MLLMs) to analyze the 'hooking period' of video ads, focusing o...

arXiv - Machine Learning · 4 min

[2602.22223] SQaLe: A Large Text-to-SQL Corpus Grounded in Real Schemas
LLMs

The paper introduces SQaLe, a large-scale text-to-SQL dataset designed to enhance the development of models that convert natural language...

arXiv - Machine Learning · 3 min

[2602.22279] Learning to reconstruct from saturated data: audio declipping and high-dynamic range imaging
Machine Learning

This paper presents a novel approach to reconstruct audio and images from clipped measurements using self-supervised learning, addressing...

arXiv - AI · 3 min

[2602.23360] Model Agreement via Anchoring
Machine Learning

The paper presents a method for reducing model disagreement in machine learning by using an anchoring technique, demonstrating its effect...

arXiv - AI · 4 min

[2602.23358] A Dataset is Worth 1 MB
Machine Learning

The paper presents PLADA, a novel method for efficient dataset transmission in machine learning, significantly reducing payload size whil...

arXiv - Machine Learning · 4 min

[2602.23349] FlashOptim: Optimizers for Memory Efficient Training
Machine Learning

FlashOptim introduces innovative optimizers that significantly reduce memory usage in neural network training, enhancing efficiency witho...

arXiv - AI · 4 min

[2602.22263] CryoNet.Refine: A One-step Diffusion Model for Rapid Refinement of Structural Models with Cryo-EM Density Map Restraints
Machine Learning

CryoNet.Refine introduces a one-step diffusion model for efficiently refining structural models using cryo-EM density maps, offering a si...

arXiv - AI · 4 min

[2602.22258] Poisoned Acoustics
Machine Learning

The paper 'Poisoned Acoustics' explores training-data poisoning attacks on deep neural networks, demonstrating significant vulnerabilitie...

arXiv - AI · 3 min

[2602.23341] Mean Estimation from Coarse Data: Characterizations and Efficient Algorithms
Machine Learning

This article presents efficient algorithms for estimating the mean from coarse data, addressing key questions in Gaussian mean estimation...

arXiv - Machine Learning · 4 min

[2602.23336] Differentiable Zero-One Loss via Hypersimplex Projections
Machine Learning

This paper presents a novel differentiable approximation to the zero-one loss, enhancing gradient-based optimization in machine learning ...

arXiv - Machine Learning · 3 min

[2602.23305] A Proper Scoring Rule for Virtual Staining
Machine Learning

The paper introduces a novel scoring rule for evaluating generative virtual staining models in high-throughput screening, emphasizing the...

arXiv - Machine Learning · 3 min

[2602.23303] Inferential Mechanics Part 1: Causal Mechanistic Theories of Machine Learning in Chemical Biology with Implications
Machine Learning

This article introduces 'Inferential Mechanics,' a framework combining causal theories with machine learning in chemical biology, address...

arXiv - Machine Learning · 4 min

[2602.22247] Multi-Dimensional Spectral Geometry of Biological Knowledge in Single-Cell Transformer Representations
LLMs

This article explores how single-cell foundation models like scGPT encode biological knowledge through high-dimensional gene representati...

arXiv - Machine Learning · 4 min

[2602.23296] Conformalized Neural Networks for Federated Uncertainty Quantification under Dual Heterogeneity
Machine Learning

This article presents FedWQ-CP, a novel approach to federated uncertainty quantification that addresses dual heterogeneity in data and mo...

arXiv - Machine Learning · 4 min

[2602.23280] Physics Informed Viscous Value Representations
NLP

This paper presents a novel approach to offline goal-conditioned reinforcement learning by introducing a physics-informed regularization ...

arXiv - Machine Learning · 3 min

[2602.23219] Takeuchi's Information Criteria as Generalization Measures for DNNs Close to NTK Regime
Machine Learning

This paper investigates Takeuchi's Information Criterion (TIC) as a measure for generalization in deep neural networks (DNNs) near the ne...

arXiv - Machine Learning · 4 min

[2602.22237] Optimized Disaster Recovery for Distributed Storage Systems: Lightweight Metadata Architectures to Overcome Cryptographic Hashing Bottleneck
NLP

This paper presents a novel approach to disaster recovery in distributed storage systems, addressing the limitations of cryptographic has...

arXiv - AI · 3 min

[2602.22235] Unsupervised Denoising of Diffusion-Weighted Images with Bias and Variance Corrected Noise Modeling
Machine Learning

This article presents a novel approach for unsupervised denoising of diffusion-weighted images (dMRI) by addressing noise bias and varian...

arXiv - AI · 4 min

[2602.23188] Efficient Real-Time Adaptation of ROMs for Unsteady Flows Using Data Assimilation
Machine Learning

This article presents a novel retraining strategy for Reduced Order Models (ROMs) that enhances real-time adaptation for unsteady flows u...

arXiv - Machine Learning · 4 min