AI Safety & Ethics

Alignment, bias, regulation, and responsible AI

Top This Week

Machine Learning

[D] I had an idea, would love your thoughts

What happens if, while training an AI during pre-training, we make it such that if it shows "misaligned behaviour" then we just reduce like ...

Reddit - Machine Learning · 1 min · about 7 hours ago

Machine Learning

I had an idea, would love your thoughts

What happens if, while training an AI during pre-training, we make it such that if it shows "misaligned behaviour" then we just reduce like ...

Reddit - Artificial Intelligence · 1 min · about 7 hours ago

AI Safety

Newsom signs executive order requiring AI companies to have safety, privacy guardrails

submitted by /u/Fcking_Chuck

Reddit - Artificial Intelligence · 1 min · about 12 hours ago

All Content

LLMs

[2511.21104] BRIDGE: Building Representations In Domain Guided Program Synthesis

The paper presents BRIDGE, a framework for improving program synthesis through structured prompting, enhancing correctness and efficiency...

arXiv - Machine Learning · 4 min · about 1 month ago

NLP

[2512.02435] Efficient Cross-Domain Offline Reinforcement Learning with Dynamics- and Value-Aligned Data Filtering

This paper presents a novel framework for cross-domain offline reinforcement learning, introducing a method that filters data based on bo...

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2511.07922] SERL: Self-Examining Reinforcement Learning on Open-Domain

The paper introduces Self-Examining Reinforcement Learning (SERL), a novel framework that enhances the performance of large language mode...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2510.10625] ImpMIA: Leveraging Implicit Bias for Membership Inference Attack

The paper introduces ImpMIA, a novel Membership Inference Attack that leverages implicit bias in neural networks to identify training sam...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2507.12652] Federated Learning in Offline and Online EMG Decoding: A Privacy and Performance Perspective

This article explores the application of federated learning (FL) in offline and online EMG decoding, addressing privacy and performance c...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2411.09847] Towards a Fairer Non-negative Matrix Factorization

This article presents a novel approach to Non-negative Matrix Factorization (NMF) aimed at improving fairness in machine learning algorit...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.22115] Slice and Explain: Logic-Based Explanations for Neural Networks through Domain Slicing

The paper presents a novel approach called 'Slice and Explain,' which utilizes domain slicing to enhance the efficiency of logic-based ex...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2602.22083] Coarsening Bias from Variable Discretization in Causal Functionals

This paper discusses the coarsening bias introduced by discretizing continuous variables in causal functionals, proposing a bias-reduced ...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2602.21957] Learning to Collaborate via Structures: Cluster-Guided Item Alignment for Federated Recommendation

The paper presents CGFedRec, a novel framework for federated recommendation that enhances collaboration by using cluster-guided item alig...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.21873] GFPL: Generative Federated Prototype Learning for Resource-Constrained and Data-Imbalanced Vision Task

The GFPL framework enhances federated learning by addressing data imbalance and communication overhead in resource-constrained vision tas...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.21721] Private and Robust Contribution Evaluation in Federated Learning

This paper presents novel methods for evaluating contributions in federated learning while ensuring privacy and robustness, addressing vu...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.21509] Fair Model-based Clustering

The paper presents Fair Model-based Clustering (FMC), a new algorithm that enhances fairness in clustering by ensuring the proportion of ...

arXiv - Machine Learning · 3 min · about 1 month ago

AI Safety

[2602.21272] Counterdiabatic Hamiltonian Monte Carlo

The paper introduces Counterdiabatic Hamiltonian Monte Carlo (CHMC), an advanced sampling method that improves the efficiency of Hamilton...

arXiv - Machine Learning · 3 min · about 1 month ago

LLMs

[2602.21262] Under the Influence: Quantifying Persuasion and Vigilance in Large Language Models

This paper investigates the interplay between persuasion and vigilance in Large Language Models (LLMs), revealing that these capacities a...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.21252] INTACT: Intent-Aware Representation Learning for Cryptographic Traffic Violation Detection

The paper introduces INTACT, a novel framework for detecting cryptographic traffic violations by modeling violations as conditional const...

arXiv - Machine Learning · 3 min · about 1 month ago

NLP

[2602.21212] Disaster Question Answering with LoRA Efficiency and Accurate End Position

This paper presents a disaster-focused question answering system optimized for Japanese disaster scenarios, achieving high accuracy with ...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.21961] Robustness in sparse artificial neural networks trained with adaptive topology

This paper explores the robustness of sparse artificial neural networks with adaptive topology, demonstrating their competitive performan...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2602.21928] Learning Unknown Interdependencies for Decentralized Root Cause Analysis in Nonlinear Dynamical Systems

This paper presents a novel federated learning methodology for decentralized root cause analysis in nonlinear dynamical systems, addressi...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.21844] JSAM: Privacy Straggler-Resilient Joint Client Selection and Incentive Mechanism Design in Differentially Private Federated Learning

The paper presents JSAM, a framework for optimizing client selection and privacy compensation in differentially private federated learnin...

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2602.21750] From Words to Amino Acids: Does the Curse of Depth Persist?

This paper explores the depth inefficiency in protein language models (PLMs), revealing that later layers contribute less to output predi...

arXiv - Machine Learning · 4 min · about 1 month ago
