The state of AI safety in four fake graphs
Submitted by /u/tekz
Alignment, bias, regulation, and responsible AI
Submitted by /u/tekz
arXiv:2603.14267 - DiFlowDubber: Discrete Flow Matching for Automated Video Dubbing via Cross-Modal Alignment and ...
arXiv:2601.22440 - AI and My Values: User Perceptions of LLMs' Ability to Extract, Embody, and Explain Human Value...
arXiv:2503.15477 - What Makes a Reward Model a Good Teacher? An Optimization Perspective
arXiv:2601.22664 - Real-Time Aligned Reward Model beyond Semantics
arXiv:2510.10285 - Reallocating Attention Across Layers to Reduce Multimodal Hallucination
arXiv:2509.24159 - RE-PO: Robust Enhanced Policy Optimization as a General Framework for LLM Alignment
arXiv:2507.19364 - Integrating LLM in Agent-Based Social Simulation: Opportunities and Challenges
arXiv:2602.23518 - Uncovering Physical Drivers of Dark Matter Halo Structures with Auxiliary-Variable-Guided Gener...
arXiv:2602.24245 - Chunk-wise Attention Transducers for Fast and Accurate Streaming Speech-to-Text
arXiv:2602.24014 - Interpretable Debiasing of Vision-Language Models for Social Fairness
arXiv:2602.23971 - Ask don't tell: Reducing sycophancy in large language models
arXiv:2602.23887 - Uncovering sustainable personal care ingredient combinations using scientific modelling
arXiv:2602.23947 - Hierarchical Concept-based Interpretable Models
arXiv:2602.23652 - 3D Modality-Aware Pre-training for Vision-Language Model in MRI Multi-organ Abnormality Detection
arXiv:2602.23638 - FedRot-LoRA: Mitigating Rotational Misalignment in Federated LoRA
arXiv:2602.23636 - FlexGuard: Continuous Risk Scoring for Strictness-Adaptive LLM Content Moderation
arXiv:2602.23588 - Hyperdimensional Cross-Modal Alignment of Frozen Language and Image Models for Efficient Image ...
arXiv:2602.23580 - BRIDGE the Gap: Mitigating Bias Amplification in Automated Scoring of English Language Learners...
arXiv:2602.23507 - Sample Size Calculations for Developing Clinical Prediction Models: Overview and pmsims R package
arXiv:2602.23447 - SALIENT: Frequency-Aware Paired Diffusion for Controllable Long-Tail CT Detection
arXiv:2602.23378 - Now You See Me: Designing Responsible AI Dashboards for Early-Stage Health Innovation
arXiv:2602.23605 - SleepLM: Natural-Language Intelligence for Human Sleep