Data Science

Data analysis, statistics, and data engineering

Top This Week

[2603.13793] GhanaNLP Parallel Corpora: Comprehensive Multilingual Resources for Low-Resource Ghanaian Languages
NLP

arXiv - AI · 4 min

[2602.08482] CLEAR: A Knowledge-Centric Vessel Trajectory Analysis Platform
LLMs

arXiv - AI · 3 min

[2512.17396] RadImageNet-VQA: A Large-Scale CT and MRI Dataset for Radiologic Visual Question Answering
Data Science

arXiv - AI · 3 min

All Content

[2602.23277] Zeroth-Order Stackelberg Control in Combinatorial Congestion Games
Machine Learning

This article presents the ZO-Stackelberg method for optimizing network parameters in combinatorial congestion games, enhancing efficiency...

arXiv - Machine Learning · 3 min

[2602.22828] TCM-DiffRAG: Personalized Syndrome Differentiation Reasoning Method for Traditional Chinese Medicine based on Knowledge Graph and Chain of Thought
LLMs

The article presents TCM-DiffRAG, a novel reasoning framework for Traditional Chinese Medicine (TCM) that enhances diagnosis through know...

arXiv - AI · 4 min

[2602.23132] From Agnostic to Specific: Latent Preference Diffusion for Multi-Behavior Sequential Recommendation
Machine Learning

This paper presents FatsMB, a novel framework for Multi-Behavior Sequential Recommendation (MBSR) that enhances user preference modeling ...

arXiv - Machine Learning · 4 min

[2602.23079] Assessing Deanonymization Risks with Stylometry-Assisted LLM Agent
LLMs

This article introduces a novel LLM agent designed to assess and mitigate deanonymization risks in textual data using a method called SAL...

arXiv - Machine Learning · 3 min

[2602.23061] MoDora: Tree-Based Semi-Structured Document Analysis System
NLP

MoDora is a novel LLM-powered system designed for analyzing semi-structured documents, addressing challenges in information retrieval and...

arXiv - Machine Learning · 4 min

[2602.22740] AMLRIS: Alignment-aware Masked Learning for Referring Image Segmentation
Machine Learning

The paper presents AMLRIS, a novel training strategy for Referring Image Segmentation (RIS) that enhances object segmentation through ali...

arXiv - AI · 3 min

[2602.23023] Low-degree Lower bounds for clustering in moderate dimension
Machine Learning

This paper explores the clustering of points from a mixture of isotropic Gaussians in moderate dimensions, establishing new polynomial lo...

arXiv - Machine Learning · 3 min

[2602.23013] SubspaceAD: Training-Free Few-Shot Anomaly Detection via Subspace Modeling
LLMs

The paper introduces SubspaceAD, a training-free method for few-shot anomaly detection that utilizes subspace modeling to achieve state-o...

arXiv - Machine Learning · 4 min

[2602.23012] Sequential Regression for Continuous Value Prediction using Residual Quantization
Machine Learning

This article presents a novel approach to continuous value prediction using a residual quantization framework, enhancing prediction accur...

arXiv - Machine Learning · 4 min

[2602.22710] Same Words, Different Judgments: Modality Effects on Preference Alignment
AI Safety

This study explores how modality affects preference alignment in AI systems, comparing human and synthetic evaluations of audio and text ...

arXiv - AI · 3 min

[2602.22985] Kernel Integrated $R^2$: A Measure of Dependence
Machine Learning

The paper introduces Kernel Integrated $R^2$, a novel statistical measure of dependence that enhances the integrated $R^2$ by utilizing r...

arXiv - Machine Learning · 4 min

[2602.23006] Regular Fourier Features for Nonstationary Gaussian Processes
Machine Learning

The paper presents a novel approach using regular Fourier features to simulate nonstationary Gaussian processes, addressing limitations o...

arXiv - Machine Learning · 3 min

[2602.22683] SUPERGLASSES: Benchmarking Vision Language Models as Intelligent Agents for AI Smart Glasses
LLMs

The paper introduces SUPERGLASSES, a benchmark for evaluating Vision Language Models (VLMs) in AI smart glasses, addressing the limitatio...

arXiv - AI · 4 min

[2602.22913] SIGMA: A Semantic-Grounded Instruction-Driven Generative Multi-Task Recommender at AliExpress
LLMs

The paper presents SIGMA, a novel generative multi-task recommender system developed for AliExpress, utilizing semantic grounding and ins...

arXiv - Machine Learning · 3 min

[2602.22903] PSQE: A Theoretical-Practical Approach to Pseudo Seed Quality Enhancement for Unsupervised MMEA
LLMs

The paper presents PSQE, a method for enhancing pseudo seed quality in unsupervised multimodal entity alignment, addressing challenges in...

arXiv - Machine Learning · 4 min

[2602.22895] SPD Learn: A Geometric Deep Learning Python Library for Neural Decoding Through Trivialization
Machine Learning

SPD Learn is a new Python library designed for geometric deep learning, specifically for neural decoding using symmetric positive definit...

arXiv - Machine Learning · 3 min

[2602.22732] Generative Recommendation for Large-Scale Advertising
LLMs

This paper introduces GR4AD, a generative recommendation system designed for large-scale advertising, enhancing ad revenue through innova...

arXiv - Machine Learning · 4 min

[2602.22699] DPSQL+: A Differentially Private SQL Library with a Minimum Frequency Rule
Machine Learning

DPSQL+ is a new SQL library designed to enhance data privacy by enforcing differential privacy and a minimum frequency rule, ensuring sen...

arXiv - Machine Learning · 4 min

[2602.22568] Quality-Aware Robust Multi-View Clustering for Heterogeneous Observation Noise
Machine Learning

The paper presents Quality-Aware Robust Multi-View Clustering (QARMVC), a novel framework addressing the challenges of heterogeneous obse...

arXiv - AI · 4 min

[2602.22618] Advancing accelerator virtual beam diagnostics through latent evolution modeling: an integrated solution to forward, inverse, tuning, and UQ problems
Machine Learning

This article presents a novel hybrid machine learning framework, Latent Evolution Model (LEM), for advancing virtual beam diagnostics in ...

arXiv - Machine Learning · 4 min