AI Safety & Ethics

Alignment, bias, regulation, and responsible AI

Top This Week

AI Safety

Bias in AI: Examples and 6 Ways to Fix it in 2026

AI bias is an anomaly in the output of ML algorithms due to prejudiced assumptions. Explore types of AI bias, examples, how to reduce bia...

AI Events · 36 min · less than a minute ago

LLMs

[R] I built a benchmark that catches LLMs breaking physics laws

I got tired of LLMs confidently giving wrong physics answers, so I built a benchmark that generates adversarial physics questions and gra...

Reddit - Machine Learning · 1 min · about 6 hours ago

Machine Learning

We need to teach AI the essence of being human to reduce the risk of misalignment

One part of the alignment problem is that AI does not genuinely understand what it's like to live in the world, even though it can descri...

Reddit - Artificial Intelligence · 1 min · about 24 hours ago

All Content

Machine Learning

[2603.22721] HyFI: Hyperbolic Feature Interpolation for Brain-Vision Alignment

Abstract page for arXiv paper 2603.22721: HyFI: Hyperbolic Feature Interpolation for Brain-Vision Alignment

arXiv - AI · 4 min · 4 days ago

Machine Learning

[2603.22322] AEGIS: An Operational Infrastructure for Post-Market Governance of Adaptive Medical AI Under US and EU Regulations

Abstract page for arXiv paper 2603.22322: AEGIS: An Operational Infrastructure for Post-Market Governance of Adaptive Medical AI Under US...

arXiv - AI · 4 min · 4 days ago

AI Safety

[2603.22314] Enhancing AI-Based Tropical Cyclone Track and Intensity Forecasting via Systematic Bias Correction

Abstract page for arXiv paper 2603.22314: Enhancing AI-Based Tropical Cyclone Track and Intensity Forecasting via Systematic Bias Correction

arXiv - AI · 4 min · 4 days ago

LLMs

[2603.22305] CN-Buzz2Portfolio: A Chinese-Market Dataset and Benchmark for LLM-Based Macro and Sector Asset Allocation from Daily Trending Financial News

Abstract page for arXiv paper 2603.22305: CN-Buzz2Portfolio: A Chinese-Market Dataset and Benchmark for LLM-Based Macro and Sector Asset ...

arXiv - AI · 4 min · 4 days ago

LLMs

I mapped how Reddit actually talks about AI safety: 6,374 posts, 23 clusters, some surprising patterns

I collected Reddit posts between Jan 29 - Mar 1, 2026 using 40 keyword-based search terms ("AI safety", "AI alignment", "EU AI Act", "AI ...

Reddit - Artificial Intelligence · 1 min · 5 days ago

NLP

What if your AI agent could fix its own hallucinations without being told what's wrong?

Every autonomous AI agent has three problems: it contradicts itself, it can't decide, and it says things confidently that aren't true. Cu...

Reddit - Artificial Intelligence · 1 min · 5 days ago

AI Safety

Algorithmic Gaslighting: A Formal Legal Template to Fight AI Safety Pivots That Cause Psychological Harm

TL;DR: Stop the AI "Emotional Whiplash" A documented design flaw can cause users to experience emotional distress when an AI abruptly swi...

Reddit - Artificial Intelligence · 1 min · 5 days ago

AI Safety

[2603.18640] A Theoretical Comparison of No-U-Turn Sampler Variants: Necessary and Sufficient Convergence Conditions and Mixing Time Analysis under Gaussian Targets

Abstract page for arXiv paper 2603.18640: A Theoretical Comparison of No-U-Turn Sampler Variants: Necessary and Sufficient Convergence Co...

arXiv - Machine Learning · 4 min · 5 days ago

LLMs

[2602.10273] Power-SMC: Low-Latency Sequence-Level Power Sampling for Training-Free LLM Reasoning

Abstract page for arXiv paper 2602.10273: Power-SMC: Low-Latency Sequence-Level Power Sampling for Training-Free LLM Reasoning

arXiv - Machine Learning · 4 min · 5 days ago

Machine Learning

[2505.19731] Proximal Point Nash Learning from Human Feedback

Abstract page for arXiv paper 2505.19731: Proximal Point Nash Learning from Human Feedback

arXiv - Machine Learning · 4 min · 5 days ago

LLMs

[2603.08104] Invisible Safety Threat: Malicious Finetuning for LLM via Steganography

Abstract page for arXiv paper 2603.08104: Invisible Safety Threat: Malicious Finetuning for LLM via Steganography

arXiv - Machine Learning · 4 min · 5 days ago

LLMs

[2512.19735] Improving Fairness of Large Language Model-Based ICU Mortality Prediction via Case-Based Prompting

Abstract page for arXiv paper 2512.19735: Improving Fairness of Large Language Model-Based ICU Mortality Prediction via Case-Based Prompting

arXiv - Machine Learning · 4 min · 5 days ago

Machine Learning

[2512.03290] ASPEN: An Adaptive Spectral Physics-Enabled Network for Ginzburg-Landau Dynamics

Abstract page for arXiv paper 2512.03290: ASPEN: An Adaptive Spectral Physics-Enabled Network for Ginzburg-Landau Dynamics

arXiv - Machine Learning · 4 min · 5 days ago

Machine Learning

[2501.16562] C-HDNet: Hyperdimensional Computing for Causal Effect Estimation from Observational Data Under Network Interference

Abstract page for arXiv paper 2501.16562: C-HDNet: Hyperdimensional Computing for Causal Effect Estimation from Observational Data Under ...

arXiv - Machine Learning · 4 min · 5 days ago

Machine Learning

[2603.21749] Model selection in hybrid quantum neural networks with applications to quantum transformer architectures

Abstract page for arXiv paper 2603.21749: Model selection in hybrid quantum neural networks with applications to quantum transformer arch...

arXiv - Machine Learning · 3 min · 5 days ago

AI Safety

[2603.21647] FedCVU: Federated Learning for Cross-View Video Understanding

Abstract page for arXiv paper 2603.21647: FedCVU: Federated Learning for Cross-View Video Understanding

arXiv - Machine Learning · 3 min · 5 days ago

AI Safety

[2603.21639] Engineering Distributed Governance for Regional Prosperity: A Socio-Technical Framework for Mitigating Under-Vibrancy via Human Data Engines

Abstract page for arXiv paper 2603.21639: Engineering Distributed Governance for Regional Prosperity: A Socio-Technical Framework for Mit...

arXiv - Machine Learning · 4 min · 5 days ago

AI Safety

[2603.21487] GaussianSSC: Triplane-Guided Directional Gaussian Fields for 3D Semantic Completion

Abstract page for arXiv paper 2603.21487: GaussianSSC: Triplane-Guided Directional Gaussian Fields for 3D Semantic Completion

arXiv - Machine Learning · 3 min · 5 days ago

Computer Vision

[2603.21377] HamVision: Hamiltonian Dynamics as Inductive Bias for Medical Image Analysis

Abstract page for arXiv paper 2603.21377: HamVision: Hamiltonian Dynamics as Inductive Bias for Medical Image Analysis

arXiv - Machine Learning · 4 min · 5 days ago

NLP

[2603.21042] Statistical Learning for Latent Embedding Alignment with Application to Brain Encoding and Decoding

Abstract page for arXiv paper 2603.21042: Statistical Learning for Latent Embedding Alignment with Application to Brain Encoding and Deco...

arXiv - Machine Learning · 3 min · 5 days ago
