Open Source AI

Open weights models, datasets, and frameworks

Top This Week

LLMs

[2603.25112] Do LLMs Know What They Know? Measuring Metacognitive Efficiency with Signal Detection Theory

arXiv - AI · 4 min ·
LLMs

[2603.24772] Evaluating Fine-Tuned LLM Model For Medical Transcription With Small Low-Resource Languages Validated Dataset

arXiv - Machine Learning · 4 min ·
LLMs

[2603.25325] How Pruning Reshapes Features: Sparse Autoencoder Analysis of Weight-Pruned Language Models

arXiv - AI · 4 min ·

All Content

Machine Learning

[2602.19339] SplitLight: An Exploratory Toolkit for Recommender Systems Datasets and Splits

SplitLight is an open-source toolkit designed to enhance the evaluation of recommender systems by providing measurable and comparable dat...

arXiv - Machine Learning · 3 min ·
LLMs

[2602.19509] Pyramid MoA: A Probabilistic Framework for Cost-Optimized Anytime Inference

The article presents Pyramid MoA, a probabilistic framework designed to optimize inference costs in large language models (LLMs) while ma...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2602.18767] Nazrin: Atomic Tactics for Graph Neural Networks for Theorem Proving in Lean 4

The paper presents Nazrin, a graph neural network-based theorem proving agent that utilizes atomic tactics to enhance machine-assisted th...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2602.19489] Federated Learning Playground

The article presents the Federated Learning Playground, an interactive platform designed to teach core concepts of Federated Learning thr...

arXiv - AI · 3 min ·
NLP

[2602.20048] CodeCompass: Navigating the Navigation Paradox in Agentic Code Intelligence

The paper presents CodeCompass, a solution to the Navigation Paradox in code intelligence, highlighting the distinction between navigatio...

arXiv - AI · 3 min ·
Robotics

[2602.19810] OpenClaw, Moltbook, and ClawdLab: From Agent-Only Social Networks to Autonomous Scientific Research

The paper discusses OpenClaw, Moltbook, and ClawdLab, highlighting their role in creating a dataset for AI interactions and proposing Cla...

arXiv - AI · 4 min ·
LLMs

[2602.19160] Reasoning Capabilities of Large Language Models. Lessons Learned from General Game Playing

This paper evaluates the reasoning capabilities of Large Language Models (LLMs) through General Game Playing tasks, revealing performance...

arXiv - AI · 4 min ·
LLMs

[2602.19109] Post-Routing Arithmetic in Llama-3: Last-Token Result Writing and Rotation-Structured Digit Directions

The paper examines three-digit addition in Meta-Llama-3-8B, focusing on how arithmetic results are determined post-routing, emphasizing t...

arXiv - AI · 3 min ·
LLMs

Deploying Open Source Vision Language Models (VLM) on Jetson

This article provides a comprehensive guide on deploying Open Source Vision Language Models (VLMs) on NVIDIA Jetson devices, detailing th...

Hugging Face Blog · 8 min ·
LLMs

Inference at 16k tokens/second

The post showcases a chatbot that processes 17k tokens per second using a l...

Reddit - Artificial Intelligence · 1 min ·
LLMs

[P] OpenLanguageModel (OLM): A modular, readable PyTorch LLM library — feedback & contributors welcome

OpenLanguageModel (OLM) is an open-source PyTorch library designed for training language models, emphasizing simplicity and modularity fo...

Reddit - Machine Learning · 1 min ·
Machine Learning

[D] I wish papers could at least be judged in part by code quality (usability) for conference submissions. Given that most people want to get a job in industry later, this could also help their technical legitimacy.

The post argues that conference paper submissions should be evaluated partly on code quality, emphasizing its importance for job read...

Reddit - Machine Learning · 1 min ·
LLMs

Guide Labs debuts a new kind of interpretable LLM | TechCrunch

Guide Labs introduces Steerling-8B, an open-sourced interpretable LLM designed to enhance understanding of AI model outputs by tracing to...

TechCrunch - AI · 6 min ·
LLMs

📚 3LM: A Benchmark for Arabic LLMs in STEM and Code

The article introduces 3LM, a benchmark designed to evaluate Arabic LLMs in STEM and coding, addressing gaps in existing assessments focu...

Hugging Face Blog · 6 min ·
Machine Learning

[P] torch-continuum — one-line PyTorch acceleration, benchmarked on H100

The post introduces torch-continuum, a library that optimizes PyTorch performance by auto-detecting GPU settings, ai...

Reddit - Machine Learning · 1 min ·
LLMs

[2505.17592] AstroMLab 4: Benchmark-Topping Performance in Astronomy Q&A with a 70B-Parameter Domain-Specialized Reasoning Model

AstroMLab 4 introduces a 70B-parameter AI model specialized for astronomy, achieving benchmark-topping performance in Q&A tasks, surpassi...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2601.10161] AWED-FiNER: Agents, Web applications, and Expert Detectors for Fine-grained Named Entity Recognition across 36 Languages for 6.6 Billion Speakers

AWED-FiNER introduces an innovative tool for Fine-grained Named Entity Recognition (FgNER) across 36 languages, enhancing NLP capabilitie...

arXiv - AI · 4 min ·
Open Source AI

[2601.01944] The Invisible Hand of AI Libraries Shaping Open Source Projects and Communities

This article examines the impact of AI libraries on open source software (OSS) projects, analyzing their adoption in Python and Java to u...

arXiv - AI · 4 min ·
LLMs

[2602.18307] VeriSoftBench: Repository-Scale Formal Verification Benchmarks for Lean

The paper introduces VeriSoftBench, a benchmark for formal verification in Lean, highlighting its limitations and performance insights fr...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2602.17861] JAX-Privacy: A library for differentially private machine learning

JAX-Privacy is a new library aimed at simplifying the implementation of differentially private machine learning, offering both customizat...

arXiv - Machine Learning · 3 min ·