Top Open Source AI This Week
The most engaging open source AI content from this week, curated by AI News.
1
[P] Inferencing Llama3.2-1B-Instruct on 3xMac Minis M4 with Data Parallelism using allToall architecture! | smolcluster
Here's another sneak-peek into inference of Llama3.2-1B-Instruct model, on 3xMac Mini 16 gigs each M4 with smolcluster! Today's the demo for my Data Parallelism implementation using allToall archit...
Reddit - Machine Learning · 5 days ago
2
Xiaomi's MiMo models are making the AI pricing conversation uncomfortable
MiMo-V2-Flash is open source, scores 73.4% on SWE-Bench (#1 among open source models), and costs $0.10 per million input tokens. That's comparable to Claude Sonnet at 3.5% of the price. MiMo-V2-Pro...
Reddit - Artificial Intelligence · 4 days ago
3
Build a Domain-Specific Embedding Model in Under a Day
A Blog post by NVIDIA on Hugging Face
Hugging Face Blog · 7 days ago
4
What's New in Mellea 0.4.0 + Granite Libraries Release
A Blog post by IBM Granite on Hugging Face
Hugging Face Blog · 7 days ago
5
[2603.22339] Problems with Chinchilla Approach 2: Systematic Biases in IsoFLOP Parabola Fits
Abstract page for arXiv paper 2603.22339: Problems with Chinchilla Approach 2: Systematic Biases in IsoFLOP Parabola Fits
arXiv - Machine Learning · 2 days ago
6
[2603.22287] Founder effects shape the evolutionary dynamics of multimodality in open LLM families
Abstract page for arXiv paper 2603.22287: Founder effects shape the evolutionary dynamics of multimodality in open LLM families
arXiv - AI · 2 days ago
7
[D] Single-artist longitudinal fine art dataset spanning 5 decades now on Hugging Face — potential applications in style evolution, figure representation, and ethical training data
I am a figurative artist based in New York with work in the collections of the Metropolitan Museum of Art, MoMA, SFMOMA, and the British Museum. I recently published my catalog raisonne as an open ...
Reddit - Machine Learning · 5 days ago
8
I am a painter with work at MoMA and the Met. I just published 50 years of my work as an open AI dataset. Here is what I learned.
I have been making figurative art since the 1970s. Oil on canvas, w...
Reddit - Artificial Intelligence · 5 days ago
9
[2603.19253] A comprehensive study of LLM-based argument classification: from Llama through DeepSeek to GPT-5.2
Abstract page for arXiv paper 2603.19253: A comprehensive study of LLM-based argument classification: from Llama through DeepSeek to GPT-5.2
arXiv - AI · 4 days ago
10
[2603.19265] When the Pure Reasoner Meets the Impossible Object: Analytic vs. Synthetic Fine-Tuning and the Suppression of Genesis in Language Models
Abstract page for arXiv paper 2603.19265: When the Pure Reasoner Meets the Impossible Object: Analytic vs. Synthetic Fine-Tuning and the Suppression of Genesis in Language Models
arXiv - AI · 4 days ago
11
[2507.18014] Predictive Scaling Laws for Efficient GRPO Training of Large Reasoning Models
Abstract page for arXiv paper 2507.18014: Predictive Scaling Laws for Efficient GRPO Training of Large Reasoning Models
arXiv - Machine Learning · 4 days ago
12
A New Framework for Evaluation of Voice Agents (EVA)
A Blog post by ServiceNow-AI on Hugging Face
Hugging Face Blog · 3 days ago
13
[2603.17074] PRISM: Demystifying Retention and Interaction in Mid-Training
Abstract page for arXiv paper 2603.17074: PRISM: Demystifying Retention and Interaction in Mid-Training
arXiv - Machine Learning · 3 days ago
14
[2603.20531] Epistemic Observability in Language Models
Abstract page for arXiv paper 2603.20531: Epistemic Observability in Language Models
arXiv - Machine Learning · 3 days ago
15
[2603.23308] Curriculum-Driven 3D CT Report Generation via Language-Free Visual Grafting and Zone-Constrained Compression
Abstract page for arXiv paper 2603.23308: Curriculum-Driven 3D CT Report Generation via Language-Free Visual Grafting and Zone-Constrained Compression
arXiv - AI · 2 days ago
16
[2603.20514] Evaluating Large Language Models on Historical Health Crisis Knowledge in Resource-Limited Settings: A Hybrid Multi-Metric Study
Abstract page for arXiv paper 2603.20514: Evaluating Large Language Models on Historical Health Crisis Knowledge in Resource-Limited Settings: A Hybrid Multi-Metric Study
arXiv - AI · 3 days ago
17
[2603.20854] SozKZ: Training Efficient Small Language Models for Kazakh from Scratch
Abstract page for arXiv paper 2603.20854: SozKZ: Training Efficient Small Language Models for Kazakh from Scratch
arXiv - AI · 3 days ago
18
[2410.12164] Table-LLM-Specialist: Language Model Specialists for Tables using Iterative Generator-Validator Fine-tuning
Abstract page for arXiv paper 2410.12164: Table-LLM-Specialist: Language Model Specialists for Tables using Iterative Generator-Validator Fine-tuning
arXiv - Machine Learning · 2 days ago
19
Mistral releases a new open-source model for speech generation | TechCrunch
Mistral's new speech model can run on a smartwatch or a smartphone.
TechCrunch - AI · 1 day ago
20
[D] Why evaluating only final outputs is misleading for local LLM agents
Been running local agents with Ollama + LangChain lately and noticed something kind of uncomfortable — you can get a completely correct final answer while the agent is doing absolute nonsense inter...
Reddit - Machine Learning · about 17 hours ago