Data Science

Data analysis, statistics, and data engineering

Top This Week

UMKC Announces New Master of Science in Artificial Intelligence
AI Infrastructure

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min ·
[2603.16629] MLLM-based Textual Explanations for Face Comparison
LLMs

Abstract page for arXiv paper 2603.16629: MLLM-based Textual Explanations for Face Comparison

arXiv - AI · 4 min ·
[2603.14267] DiFlowDubber: Discrete Flow Matching for Automated Video Dubbing via Cross-Modal Alignment and Synchronization
Machine Learning

Abstract page for arXiv paper 2603.14267: DiFlowDubber: Discrete Flow Matching for Automated Video Dubbing via Cross-Modal Alignment and ...

arXiv - AI · 4 min ·

All Content

[2603.02702] FinTexTS: Financial Text-Paired Time-Series Dataset via Semantic-Based and Multi-Level Pairing
NLP

Abstract page for arXiv paper 2603.02702: FinTexTS: Financial Text-Paired Time-Series Dataset via Semantic-Based and Multi-Level Pairing

arXiv - Machine Learning · 4 min ·
[2603.02237] Concept Heterogeneity-aware Representation Steering
LLMs

Abstract page for arXiv paper 2603.02237: Concept Heterogeneity-aware Representation Steering

arXiv - AI · 4 min ·
[2603.02239] Engineering Reasoning and Instruction (ERI) Benchmark: A Large Taxonomy-driven Dataset for Foundation Models and Agents
LLMs

Abstract page for arXiv paper 2603.02239: Engineering Reasoning and Instruction (ERI) Benchmark: A Large Taxonomy-driven Dataset for Foun...

arXiv - AI · 4 min ·
[2603.02221] MedFeat: Model-Aware and Explainability-Driven Feature Engineering with LLMs for Clinical Tabular Prediction
LLMs

Abstract page for arXiv paper 2603.02221: MedFeat: Model-Aware and Explainability-Driven Feature Engineering with LLMs for Clinical Tabul...

arXiv - AI · 4 min ·
[2603.02215] RxnNano: Training Compact LLMs for Chemical Reaction and Retrosynthesis Prediction via Hierarchical Curriculum Learning
LLMs

Abstract page for arXiv paper 2603.02215: RxnNano: Training Compact LLMs for Chemical Reaction and Retrosynthesis Prediction via Hierarchi...

arXiv - AI · 4 min ·
LLMs

[D] Quantified analysis of 2,218 Gary Marcus claims - two independent LLM pipelines, scored against evidence

Built a dataset scoring every testable claim from Marcus's 474 Substack posts. Two pipelines (Claude Opus 4.6 and ChatGPT Codex) analyzed...

Reddit - Machine Learning · 1 min ·
Machine Learning

[P] I trained Qwen2.5-1.5b with RLVR (GRPO) vs SFT and compared benchmark performance

Hello everyone. I trained Qwen2.5-1.5b-Instruct with both RLVR and SFT on the GSM8K dataset and compared the results across GSM8K and MAT...

Reddit - Machine Learning · 1 min ·
[2510.18516] Decoding Dynamic Visual Experience from Calcium Imaging via Cell-Pattern-Aware Pretraining
Machine Learning

Abstract page for arXiv paper 2510.18516: Decoding Dynamic Visual Experience from Calcium Imaging via Cell-Pattern-Aware Pretraining

arXiv - Machine Learning · 3 min ·
[2510.00504] A universal compression theory for lottery ticket hypothesis and neural scaling laws
Machine Learning

Abstract page for arXiv paper 2510.00504: A universal compression theory for lottery ticket hypothesis and neural scaling laws

arXiv - Machine Learning · 4 min ·
[2507.21783] Domain Generalization and Adaptation in Intensive Care with Anchor Regression
Machine Learning

Abstract page for arXiv paper 2507.21783: Domain Generalization and Adaptation in Intensive Care with Anchor Regression

arXiv - Machine Learning · 4 min ·
[2506.05639] FictionalQA: A Dataset for Studying Memorization and Knowledge Acquisition
LLMs

Abstract page for arXiv paper 2506.05639: FictionalQA: A Dataset for Studying Memorization and Knowledge Acquisition

arXiv - Machine Learning · 3 min ·
[2503.01441] A Randomized Linearly Convergent Frank-Wolfe-type Method for Smooth Convex Minimization over the Spectrahedron
Machine Learning

Abstract page for arXiv paper 2503.01441: A Randomized Linearly Convergent Frank-Wolfe-type Method for Smooth Convex Minimization over th...

arXiv - Machine Learning · 3 min ·
[2504.08428] Standardization of Weighted Ranking Correlation Coefficients
Machine Learning

Abstract page for arXiv paper 2504.08428: Standardization of Weighted Ranking Correlation Coefficients

arXiv - Machine Learning · 4 min ·
[2503.17592] A Benchmark Dataset for Machine Learning Surrogates of Pore-Scale CO2-Water Interaction
Machine Learning

Abstract page for arXiv paper 2503.17592: A Benchmark Dataset for Machine Learning Surrogates of Pore-Scale CO2-Water Interaction

arXiv - Machine Learning · 3 min ·
[2406.04098] A Large-Scale Neutral Comparison Study of Survival Models on Low-Dimensional Data
Machine Learning

Abstract page for arXiv paper 2406.04098: A Large-Scale Neutral Comparison Study of Survival Models on Low-Dimensional Data

arXiv - Machine Learning · 4 min ·
[2602.02734] WAXAL: A Large-Scale Multilingual African Language Speech Corpus
Data Science

Abstract page for arXiv paper 2602.02734: WAXAL: A Large-Scale Multilingual African Language Speech Corpus

arXiv - AI · 4 min ·
[2602.01701] Beyond Single-Modal Analytics: A Framework for Integrating Heterogeneous LLM-Based Query Systems for Multi-Modal Data
LLMs

Abstract page for arXiv paper 2602.01701: Beyond Single-Modal Analytics: A Framework for Integrating Heterogeneous LLM-Based Query System...

arXiv - AI · 4 min ·
[2512.08333] Robust Finetuning of Vision-Language-Action Robot Policies via Parameter Merging
Machine Learning

Abstract page for arXiv paper 2512.08333: Robust Finetuning of Vision-Language-Action Robot Policies via Parameter Merging

arXiv - AI · 4 min ·
[2511.10985] When Data is the Algorithm: A Systematic Study and Curation of Preference Optimization Datasets
LLMs

Abstract page for arXiv paper 2511.10985: When Data is the Algorithm: A Systematic Study and Curation of Preference Optimization Datasets

arXiv - AI · 4 min ·
[2511.05522] AIRMap: AI-Generated Radio Maps for Wireless Digital Twins
Machine Learning

Abstract page for arXiv paper 2511.05522: AIRMap: AI-Generated Radio Maps for Wireless Digital Twins

arXiv - AI · 4 min ·