AI Safety & Ethics

Alignment, bias, regulation, and responsible AI

Top This Week

LLMs

This Is Not Hacking. This Is Structured Intelligence.

Watch me demonstrate everything I've been talking about—live, in real time. The Setup: Maestro University AI enrollment system Standard c...

Reddit - Artificial Intelligence · 1 min

AI Safety

When Agentic AI Browsers Outrun Governance

Agentic AI browsers introduce new enterprise risk. Learn how AI governance helps leaders assess exposure, oversight gaps, and safe adopti...

AI Tools & Products · 14 min

LLMs

Von Hammerstein’s Ghost: What a Prussian General’s Officer Typology Can Teach Us About AI Misalignment

Greetings all - I've posted mostly in r/claudecode and r/aigamedev a couple of times previously. Working with CC for personal projects re...

Reddit - Artificial Intelligence · 1 min

All Content

LLMs

[2602.22903] PSQE: A Theoretical-Practical Approach to Pseudo Seed Quality Enhancement for Unsupervised MMEA

The paper presents PSQE, a method for enhancing pseudo seed quality in unsupervised multimodal entity alignment, addressing challenges in...

arXiv - Machine Learning · 4 min

Computer Vision

[2602.22621] CGSA: Class-Guided Slot-Aware Adaptation for Source-Free Object Detection

The paper presents CGSA, a novel framework for Source-Free Domain Adaptive Object Detection that integrates object-centric learning to en...

arXiv - AI · 3 min

Machine Learning

[2602.22699] DPSQL+: A Differentially Private SQL Library with a Minimum Frequency Rule

DPSQL+ is a new SQL library designed to enhance data privacy by enforcing differential privacy and a minimum frequency rule, ensuring sen...

arXiv - Machine Learning · 4 min

Machine Learning

[2602.22570] Guidance Matters: Rethinking the Evaluation Pitfall for Text-to-Image Generation

The paper discusses the evaluation challenges in text-to-image generation, focusing on classifier-free guidance (CFG) and proposing a new...

arXiv - AI · 4 min

Machine Learning

[2602.22631] TorchLean: Formalizing Neural Networks in Lean

TorchLean is a framework that formalizes neural networks within the Lean 4 theorem prover, enabling precise semantics for execution and v...

arXiv - Machine Learning · 4 min

LLMs

[2602.22564] Addressing Climate Action Misperceptions with Generative AI

This study explores how a personalized large language model (LLM) can correct climate action misperceptions among climate-concerned indiv...

arXiv - AI · 3 min

Machine Learning

[2602.22609] EvolveGen: Algorithmic Level Hardware Model Checking Benchmark Generation through Reinforcement Learning

EvolveGen introduces a novel framework for generating hardware model checking benchmarks using reinforcement learning, addressing the ben...

arXiv - Machine Learning · 4 min

Machine Learning

[2602.22488] Explainability-Aware Evaluation of Transfer Learning Models for IoT DDoS Detection Under Resource Constraints

This article evaluates transfer learning models for IoT DDoS detection, focusing on explainability and resource constraints. It analyzes ...

arXiv - AI · 3 min

LLMs

[2602.22481] Sydney Telling Fables on AI and Humans: A Corpus Tracing Memetic Transfer of Persona between LLMs

This article explores the relationship between AI and humans through the lens of large language models (LLMs), focusing on the Sydney per...

arXiv - AI · 4 min

LLMs

[2602.22450] Silent Egress: When Implicit Prompt Injection Makes LLM Agents Leak Without a Trace

The paper discusses the security risks posed by implicit prompt injection in large language model (LLM) agents, demonstrating how adversa...

arXiv - AI · 4 min

NLP

[2602.22449] A Fusion of context-aware based BanglaBERT and Two-Layer Stacked LSTM Framework for Multi-Label Cyberbullying Detection

This paper presents a novel framework combining BanglaBERT and a two-layer stacked LSTM for effective multi-label cyberbullying detection...

arXiv - Machine Learning · 4 min

LLMs

[2602.22427] HubScan: Detecting Hubness Poisoning in Retrieval-Augmented Generation Systems

The paper presents HubScan, a tool designed to detect hubness poisoning in Retrieval-Augmented Generation systems, addressing a critical ...

arXiv - AI · 4 min

NLP

[2602.22282] Differentially Private Truncation of Unbounded Data via Public Second Moments

This paper presents a novel approach to differentially private data truncation using public second moments, enhancing privacy without com...

arXiv - Machine Learning · 4 min

LLMs

[2602.22246] Self-Purification Mitigates Backdoors in Multimodal Diffusion Language Models

This article presents a framework called DiSP (Diffusion Self-Purification) to mitigate backdoor attacks in Multimodal Diffusion Language...

arXiv - Machine Learning · 4 min

LLMs

[2602.22347] Enabling clinical use of foundation models in histopathology

This article discusses the application of foundation models in histopathology, highlighting a novel approach that improves robustness and...

arXiv - AI · 4 min

LLMs

[2602.22236] CrossLLM-Mamba: Multimodal State Space Fusion of LLMs for RNA Interaction Prediction

The article presents CrossLLM-Mamba, a novel framework for RNA interaction prediction that utilizes multimodal state space fusion of larg...

arXiv - Machine Learning · 4 min

LLMs

[2602.23353] SOTAlign: Semi-Supervised Alignment of Unimodal Vision and Language Models via Optimal Transport

The paper introduces SOTAlign, a semi-supervised framework for aligning unimodal vision and language models using minimal paired data and...

arXiv - AI · 4 min

Machine Learning

[2602.22258] Poisoned Acoustics

The paper 'Poisoned Acoustics' explores training-data poisoning attacks on deep neural networks, demonstrating significant vulnerabilitie...

arXiv - AI · 3 min

Machine Learning

[2602.23336] Differentiable Zero-One Loss via Hypersimplex Projections

This paper presents a novel differentiable approximation to the zero-one loss, enhancing gradient-based optimization in machine learning ...

arXiv - Machine Learning · 3 min

Machine Learning

[2602.23296] Conformalized Neural Networks for Federated Uncertainty Quantification under Dual Heterogeneity

This article presents FedWQ-CP, a novel approach to federated uncertainty quantification that addresses dual heterogeneity in data and mo...

arXiv - Machine Learning · 4 min