Open Source AI

Open weights models, datasets, and frameworks

Top This Week

[2603.25112] Do LLMs Know What They Know? Measuring Metacognitive Efficiency with Signal Detection Theory
LLMs

Abstract page for arXiv paper 2603.25112: Do LLMs Know What They Know? Measuring Metacognitive Efficiency with Signal Detection Theory

arXiv - AI · 4 min
[2603.24772] Evaluating Fine-Tuned LLM Model For Medical Transcription With Small Low-Resource Languages Validated Dataset
LLMs

Abstract page for arXiv paper 2603.24772: Evaluating Fine-Tuned LLM Model For Medical Transcription With Small Low-Resource Languages Val...

arXiv - Machine Learning · 4 min
[2603.25325] How Pruning Reshapes Features: Sparse Autoencoder Analysis of Weight-Pruned Language Models
LLMs

Abstract page for arXiv paper 2603.25325: How Pruning Reshapes Features: Sparse Autoencoder Analysis of Weight-Pruned Language Models

arXiv - Machine Learning · 4 min

All Content

[2603.25112] Do LLMs Know What They Know? Measuring Metacognitive Efficiency with Signal Detection Theory
LLMs

Abstract page for arXiv paper 2603.25112: Do LLMs Know What They Know? Measuring Metacognitive Efficiency with Signal Detection Theory

arXiv - AI · 4 min
[2603.24772] Evaluating Fine-Tuned LLM Model For Medical Transcription With Small Low-Resource Languages Validated Dataset
LLMs

Abstract page for arXiv paper 2603.24772: Evaluating Fine-Tuned LLM Model For Medical Transcription With Small Low-Resource Languages Val...

arXiv - Machine Learning · 4 min
[2603.25325] How Pruning Reshapes Features: Sparse Autoencoder Analysis of Weight-Pruned Language Models
LLMs

Abstract page for arXiv paper 2603.25325: How Pruning Reshapes Features: Sparse Autoencoder Analysis of Weight-Pruned Language Models

arXiv - Machine Learning · 4 min
[D] Why evaluating only final outputs is misleading for local LLM agents
LLMs

Been running local agents with Ollama + LangChain lately and noticed something kind of uncomfortable — you can get a completely correct f...

Reddit - Machine Learning · 1 min
Mistral releases a new open-source model for speech generation | TechCrunch
LLMs

Mistral's new speech model can run on a smartwatch or a smartphone.

TechCrunch - AI · 4 min
[2410.12164] Table-LLM-Specialist: Language Model Specialists for Tables using Iterative Generator-Validator Fine-tuning
LLMs

Abstract page for arXiv paper 2410.12164: Table-LLM-Specialist: Language Model Specialists for Tables using Iterative Generator-Validator...

arXiv - Machine Learning · 4 min
[2603.23308] Curriculum-Driven 3D CT Report Generation via Language-Free Visual Grafting and Zone-Constrained Compression
LLMs

Abstract page for arXiv paper 2603.23308: Curriculum-Driven 3D CT Report Generation via Language-Free Visual Grafting and Zone-Constraine...

arXiv - AI · 4 min
[2603.22287] Founder effects shape the evolutionary dynamics of multimodality in open LLM families
LLMs

Abstract page for arXiv paper 2603.22287: Founder effects shape the evolutionary dynamics of multimodality in open LLM families

arXiv - AI · 4 min
[2603.22339] Problems with Chinchilla Approach 2: Systematic Biases in IsoFLOP Parabola Fits
LLMs

Abstract page for arXiv paper 2603.22339: Problems with Chinchilla Approach 2: Systematic Biases in IsoFLOP Parabola Fits

arXiv - Machine Learning · 4 min
[2603.17074] PRISM: Demystifying Retention and Interaction in Mid-Training
LLMs

Abstract page for arXiv paper 2603.17074: PRISM: Demystifying Retention and Interaction in Mid-Training

arXiv - Machine Learning · 4 min
[2603.20854] SozKZ: Training Efficient Small Language Models for Kazakh from Scratch
LLMs

Abstract page for arXiv paper 2603.20854: SozKZ: Training Efficient Small Language Models for Kazakh from Scratch

arXiv - AI · 3 min
[2603.20531] Epistemic Observability in Language Models
LLMs

Abstract page for arXiv paper 2603.20531: Epistemic Observability in Language Models

arXiv - Machine Learning · 4 min
[2603.20514] Evaluating Large Language Models on Historical Health Crisis Knowledge in Resource-Limited Settings: A Hybrid Multi-Metric Study
LLMs

Abstract page for arXiv paper 2603.20514: Evaluating Large Language Models on Historical Health Crisis Knowledge in Resource-Limited Sett...

arXiv - AI · 3 min
A New Framework for Evaluation of Voice Agents (EVA)
Open Source AI

A blog post by ServiceNow-AI on Hugging Face

Hugging Face Blog · 11 min
Xiaomi's MiMo models are making the AI pricing conversation uncomfortable
LLMs

MiMo-V2-Flash is open source, scores 73.4% on SWE-Bench (#1 among open source models), and costs $0.10 per million input tokens. That's c...

Reddit - Artificial Intelligence · 1 min
[2507.18014] Predictive Scaling Laws for Efficient GRPO Training of Large Reasoning Models
LLMs

Abstract page for arXiv paper 2507.18014: Predictive Scaling Laws for Efficient GRPO Training of Large Reasoning Models

arXiv - Machine Learning · 3 min
[2603.19265] When the Pure Reasoner Meets the Impossible Object: Analytic vs. Synthetic Fine-Tuning and the Suppression of Genesis in Language Models
LLMs

Abstract page for arXiv paper 2603.19265: When the Pure Reasoner Meets the Impossible Object: Analytic vs. Synthetic Fine-Tuning and the ...

arXiv - AI · 4 min
[2603.19253] A comprehensive study of LLM-based argument classification: from Llama through DeepSeek to GPT-5.2
LLMs

Abstract page for arXiv paper 2603.19253: A comprehensive study of LLM-based argument classification: from Llama through DeepSeek to GPT-5.2

arXiv - AI · 4 min
[P] Inferencing Llama3.2-1B-Instruct on 3xMac Minis M4 with Data Parallelism using allToall architecture! | smolcluster
LLMs

Here's another sneak-peek into inference of Llama3.2-1B-Instruct model, on 3xMac Mini 16 gigs each M4 with smolcluster! Today's the demo ...

Reddit - Machine Learning · 1 min
I am a painter with work at MoMA and the Met. I just published 50 years of my work as an open AI dataset. Here is what I learned.
Open Source AI

I am a painter with work at MoMA and the Met. I just published 50 years of my work as an open AI dataset. Here is what I learned. I have ...

Reddit - Artificial Intelligence · 1 min

