[2603.25112] Do LLMs Know What They Know? Measuring Metacognitive Efficiency with Signal Detection Theory
Abstract page for arXiv paper 2603.25112: Do LLMs Know What They Know? Measuring Metacognitive Efficiency with Signal Detection Theory
Abstract page for arXiv paper 2603.24772: Evaluating Fine-Tuned LLM Model For Medical Transcription With Small Low-Resource Languages Val...
Abstract page for arXiv paper 2603.25325: How Pruning Reshapes Features: Sparse Autoencoder Analysis of Weight-Pruned Language Models
SplitLight is an open-source toolkit designed to enhance the evaluation of recommender systems by providing measurable and comparable dat...
The article presents Pyramid MoA, a probabilistic framework designed to optimize inference costs in large language models (LLMs) while ma...
The paper presents Nazrin, a graph neural network-based theorem proving agent that utilizes atomic tactics to enhance machine-assisted th...
The article presents the Federated Learning Playground, an interactive platform designed to teach core concepts of Federated Learning thr...
The paper presents CodeCompass, a solution to the Navigation Paradox in code intelligence, highlighting the distinction between navigatio...
The paper discusses OpenClaw, Moltbook, and ClawdLab, highlighting their role in creating a dataset for AI interactions and proposing Cla...
This paper evaluates the reasoning capabilities of Large Language Models (LLMs) through General Game Playing tasks, revealing performance...
The paper examines three-digit addition in Meta-Llama-3-8B, focusing on how arithmetic results are determined post-routing, emphasizing t...
This article provides a comprehensive guide on deploying Open Source Vision Language Models (VLMs) on NVIDIA Jetson devices, detailing th...
The article discusses a remarkable achievement in AI inference speed, showcasing a chatbot that processes 17k tokens per second using a l...
OpenLanguageModel (OLM) is an open-source PyTorch library designed for training language models, emphasizing simplicity and modularity fo...
The article discusses the need for evaluating conference paper submissions based on code quality, emphasizing its importance for job read...
Guide Labs introduces Steerling-8B, an open-sourced interpretable LLM designed to enhance understanding of AI model outputs by tracing to...
The article introduces 3LM, a benchmark designed to evaluate Arabic LLMs in STEM and coding, addressing gaps in existing assessments focu...
The article discusses the development of torch-continuum, a library that optimizes PyTorch performance by auto-detecting GPU settings, ai...
AstroMLab 4 introduces a 70B-parameter AI model specialized for astronomy, achieving benchmark-topping performance in Q&A tasks, surpassi...
AWED-FiNER introduces an innovative tool for Fine-grained Named Entity Recognition (FgNER) across 36 languages, enhancing NLP capabilitie...
This article examines the impact of AI libraries on open source software (OSS) projects, analyzing their adoption in Python and Java to u...
The paper introduces VeriSoftBench, a benchmark for formal verification in Lean, highlighting its limitations and performance insights fr...
JAX-Privacy is a new library aimed at simplifying the implementation of differentially private machine learning, offering both customizat...