[2603.25112] Do LLMs Know What They Know? Measuring Metacognitive Efficiency with Signal Detection Theory
Abstract page for arXiv paper 2603.25112: Do LLMs Know What They Know? Measuring Metacognitive Efficiency with Signal Detection Theory
Open weights models, datasets, and frameworks
Abstract page for arXiv paper 2603.24772: Evaluating Fine-Tuned LLM Model For Medical Transcription With Small Low-Resource Languages Val...
Abstract page for arXiv paper 2603.25325: How Pruning Reshapes Features: Sparse Autoencoder Analysis of Weight-Pruned Language Models
This article details the process of training an AI to play Street Fighter 6 using imitation learning, showcasing both the gameplay and te...
An AI bot learns to play Mario from scratch using reinforcement learning, starting with no prior knowledge and improving through trial an...
The discussion explores the future of home-rolled language models versus those developed by large labs, emphasizing the potential for ope...
Ollama 0.17 has been released, featuring enhancements to the OpenClaw onboarding process, aimed at improving user experience and accessib...
The article discusses the potential for developing an on-device contextual intelligence engine for Android, inspired by Apple's intellige...
Antaris-suite 3.0 is an open-source tool designed for AI agents, offering zero-dependency memory management, routing, and context handlin...
The article discusses the creation of Sentinel, an open-source LLM gateway built in Rust to address common issues faced with LLMs in prod...
The article introduces 'optimize_anything', an open-source API designed to optimize various text artifacts, including code and prompts, b...
Vipune is a minimal memory layer for AI agents that allows for semantic memory storage and retrieval without requiring network dependenci...
GGML and its project llama.cpp join Hugging Face to enhance local AI development, ensuring continued open-source progress and community s...
A Reddit user shares their frustrating experience of troubleshooting a Gradio app, only to discover that a missing line in the .env file ...
The article introduces Sentinel, an open-source LLM gateway in Rust designed to streamline interactions with multiple LLM APIs, focusing ...
The article presents 'genriesz', an open-source Python package designed for automatic debiased machine learning using generalized Riesz r...
The paper presents SoftDTW-CUDA-Torch, an open-source PyTorch library that enhances Soft Dynamic Time Warping (SoftDTW) by improving memo...
MolmoSpaces introduces a large-scale open ecosystem designed for benchmarking robot navigation and manipulation, featuring over 230k dive...
Makimus-AI is a free, open-source local app that enables users to search their image libraries using natural language queries, functionin...
This article presents an interactive knowledge graph mapping the lineage of transformer papers, illustrating the connections between key ...
Wizwand, an alternative to Papers with Code, has launched its second version, addressing dataset inconsistencies and improving leaderboard a...
This article discusses how to train AI models using Unsloth and Hugging Face Jobs, highlighting the benefits of faster training and lower...
This article discusses the development of an open-source credit card fraud detection system utilizing Random Forest to address class imba...