Why are we blindly trusting AI companies with our data?
Lately I’ve been seeing a story floating around that really made me pause. Apparently, there were claims that the US government asked Ant...
GPT, Claude, Gemini, and other LLMs
Artificial intelligence is transforming every corner of industry, and television is no exception. Major networks in Korea have recently a...
Abstract page for arXiv paper 2603.16629: MLLM-based Textual Explanations for Face Comparison
Abstract page for arXiv paper 2505.22318: Flying Pigs, FaR and Beyond: Evaluating LLM Reasoning in Counterfactual Worlds
Abstract page for arXiv paper 2504.16956: GeneMamba: An Efficient and Effective Foundation Model on Single Cell Data
Abstract page for arXiv paper 2412.08686: LatentQA: Teaching LLMs to Decode Activations Into Natural Language
Abstract page for arXiv paper 2410.12164: Table-LLM-Specialist: Language Model Specialists for Tables using Iterative Generator-Validator...
Abstract page for arXiv paper 2504.07396: Automating quantum feature map design via large language models
Abstract page for arXiv paper 2502.01969: Mitigating Object Hallucinations in Large Vision-Language Models via Attention Calibration
Abstract page for arXiv paper 2306.05036: Mapping the Challenges of HCI: An Application and Evaluation of ChatGPT for Mining Insights at ...
Abstract page for arXiv paper 2601.12138: DriveSafe: A Hierarchical Risk Taxonomy for Safety-Critical LLM-Based Driving Assistants
Abstract page for arXiv paper 2511.22076: Hybrid Stackelberg Game and Diffusion-based Auction for Two-tier Agentic AI Task Offloading in ...
Abstract page for arXiv paper 2601.18858: Representational Homomorphism Predicts and Improves Compositional Generalization In Transformer...
Abstract page for arXiv paper 2510.05318: BIRD-INTERACT: Re-imagining Text-to-SQL Evaluation for Large Language Models via Lens of Dynami...
Abstract page for arXiv paper 2510.00415: Towards Self-Evolving Benchmarks: Synthesizing Agent Trajectories via Test-Time Exploration und...
Abstract page for arXiv paper 2603.23501: MedObvious: Exposing the Medical Moravec's Paradox in VLMs via Clinical Triage
Abstract page for arXiv paper 2603.23485: Failure of contextual invariance in gender inference with large language models
Abstract page for arXiv paper 2603.23482: ReqFusion: A Multi-Provider Framework for Automated PEGS Analysis Across Software Domains
Abstract page for arXiv paper 2510.16051: GUIrilla: A Scalable Framework for Automated Desktop UI Exploration
Abstract page for arXiv paper 2603.23447: 3DCity-LLM: Empowering Multi-modality Large Language Models for 3D City-scale Perception and Un...
Abstract page for arXiv paper 2603.23443: Evaluating LLM-Based Test Generation Under Software Evolution
Abstract page for arXiv paper 2603.23322: Leveraging LLMs and Social Media to Understand User Perception of Smartphone-Based Earthquake E...
Abstract page for arXiv paper 2507.00026: RedTopic: Toward Topic-Diverse Red Teaming of Large Language Models