AI Safety & Ethics

Alignment, bias, regulation, and responsible AI

Top This Week

[2511.21331] The More, the Merrier: Contrastive Fusion for Higher-Order Multimodal Alignment
Machine Learning

arXiv - AI · 4 min ·
[2509.22367] What Is The Political Content in LLMs' Pre- and Post-Training Data?
LLMs

arXiv - AI · 4 min ·
[2507.22264] SmartCLIP: Modular Vision-language Alignment with Identification Guarantees
Machine Learning

arXiv - AI · 4 min ·

All Content

[2509.24803] TimeOmni-1: Incentivizing Complex Reasoning with Time Series in Large Language Models
LLMs

The paper introduces TimeOmni-1, a model designed to enhance complex reasoning with time series data in large language models, addressing...

arXiv - AI · 4 min ·
[2502.01160] Scalable Precise Computation of Shannon Entropy
Machine Learning

This paper presents a scalable tool, PSE, for precise computation of Shannon entropy, optimizing the process to enhance efficiency in qua...

arXiv - AI · 4 min ·
[2411.06624] A Review of Fairness and A Practical Guide to Selecting Context-Appropriate Fairness Metrics in Machine Learning
Machine Learning

This article reviews fairness in machine learning, emphasizing the need for context-appropriate fairness metrics and providing a flowchar...

arXiv - AI · 4 min ·
[2602.16708] Policy Compiler for Secure Agentic Systems
LLMs

The article presents PCAS, a Policy Compiler designed to enforce complex authorization policies in LLM-based agents, improving compliance...

arXiv - AI · 4 min ·
[2602.16703] Measuring Mid-2025 LLM-Assistance on Novice Performance in Biology
LLMs

This study evaluates the impact of large language models (LLMs) on novice performance in biology laboratory tasks, revealing modest benef...

arXiv - AI · 4 min ·
[2602.16660] Align Once, Benefit Multilingually: Enforcing Multilingual Consistency for LLM Safety Alignment
LLMs

The paper presents a method for enhancing multilingual safety alignment in large language models (LLMs) using a resource-efficient Multi-...

arXiv - Machine Learning · 4 min ·
[2602.16601] Error Propagation and Model Collapse in Diffusion Models: A Theoretical Study
Machine Learning

This theoretical study examines error propagation and model collapse in diffusion models, highlighting how recursive training on syntheti...

arXiv - Machine Learning · 3 min ·
[2602.16610] Who can we trust? LLM-as-a-jury for Comparative Assessment
LLMs

The paper explores the reliability of large language models (LLMs) as evaluators in natural language generation tasks, proposing a new mo...

arXiv - Machine Learning · 3 min ·
[2602.16608] Explainable AI: Context-Aware Layer-Wise Integrated Gradients for Explaining Transformer Models
Machine Learning

The paper presents the Context-Aware Layer-Wise Integrated Gradients (CA-LIG) framework, enhancing explainability in Transformer models b...

arXiv - Machine Learning · 4 min ·
[2602.16520] Recursive language models for jailbreak detection: a procedural defense for tool-augmented agents
LLMs

The paper presents RLM-JB, a framework utilizing Recursive Language Models for detecting jailbreak prompts in large language models, enha...

arXiv - AI · 3 min ·
[2602.16343] How to Label Resynthesized Audio: The Dual Role of Neural Audio Codecs in Audio Deepfake Detection
Machine Learning

This article explores the dual role of neural audio codecs in labeling resynthesized audio for deepfake detection, highlighting the impac...

arXiv - Machine Learning · 3 min ·
[2602.16346] Helpful to a Fault: Measuring Illicit Assistance in Multi-Turn, Multilingual LLM Agents
LLMs

This article presents STING, a framework for evaluating illicit assistance in multi-turn, multilingual LLM agents, highlighting the chall...

arXiv - Machine Learning · 4 min ·
[2602.16467] IndicEval: A Bilingual Indian Educational Evaluation Framework for Large Language Models
LLMs

IndicEval introduces a bilingual evaluation framework for large language models, assessing their performance on real examination question...

arXiv - AI · 4 min ·
[2602.16444] RoboGene: Boosting VLA Pre-training via Diversity-Driven Agentic Framework for Real-World Task Generation
Machine Learning

RoboGene introduces a framework for automating the generation of diverse, physically plausible robotic manipulation tasks, addressing the...

arXiv - AI · 4 min ·
[2602.16144] Missing-by-Design: Certifiable Modality Deletion for Revocable Multimodal Sentiment Analysis
NLP

The paper presents Missing-by-Design (MBD), a framework for revocable multimodal sentiment analysis that enhances privacy compliance by a...

arXiv - Machine Learning · 3 min ·
[2602.16309] The Weight of a Bit: EMFI Sensitivity Analysis of Embedded Deep Learning Models
Machine Learning

This article investigates the impact of different number representations on the vulnerability of embedded deep learning models to electro...

arXiv - AI · 3 min ·
[2602.16307] Generative AI Usage of University Students: Navigating Between Education and Business
Generative AI

This study explores the use of generative AI by university students balancing education and work, highlighting its benefits and challenges.

arXiv - AI · 3 min ·
[2602.16241] Are LLMs Ready to Replace Bangla Annotators?
LLMs

This article evaluates the effectiveness of Large Language Models (LLMs) as annotators for Bangla hate speech, revealing significant bias...

arXiv - AI · 3 min ·
[2602.16098] Collaborative Zone-Adaptive Zero-Day Intrusion Detection for IoBT
Machine Learning

The paper presents a novel Zone-Adaptive Intrusion Detection framework for the Internet of Battlefield Things (IoBT), addressing the chal...

arXiv - Machine Learning · 4 min ·
[2602.16090] Examining Fast Radiative Feedbacks Using Machine-Learning Weather Emulators
Machine Learning

This article explores the use of machine-learning weather emulators to analyze fast radiative feedbacks in the climate system, focusing o...

arXiv - Machine Learning · 4 min ·