Top Open Source AI This Week

The most engaging open source AI content from this week, curated by AI News.

  1.

    [P] Inferencing Llama3.2-1B-Instruct on 3xMac Minis M4 with Data Parallelism using allToall architecture! | smolcluster

    Here's another sneak-peek into inference of Llama3.2-1B-Instruct model, on 3xMac Mini 16 gigs each M4 with smolcluster! Today's the demo for my Data Parallelism implementation using allToall archit...

    Reddit - Machine Learning · 5 days ago
  2.

    Xiaomi's MiMo models are making the AI pricing conversation uncomfortable

    MiMo-V2-Flash is open source, scores 73.4% on SWE-Bench (#1 among open source models), and costs $0.10 per million input tokens. That's comparable to Claude Sonnet at 3.5% of the price. MiMo-V2-Pro...

    Reddit - Artificial Intelligence · 4 days ago
  3.

    Build a Domain-Specific Embedding Model in Under a Day

    A Blog post by NVIDIA on Hugging Face

    Hugging Face Blog · 7 days ago
  4.

    What's New in Mellea 0.4.0 + Granite Libraries Release

    A Blog post by IBM Granite on Hugging Face

    Hugging Face Blog · 7 days ago
  5.

    [2603.22339] Problems with Chinchilla Approach 2: Systematic Biases in IsoFLOP Parabola Fits

    Abstract page for arXiv paper 2603.22339: Problems with Chinchilla Approach 2: Systematic Biases in IsoFLOP Parabola Fits

    arXiv - Machine Learning · 2 days ago
  6.

    [2603.22287] Founder effects shape the evolutionary dynamics of multimodality in open LLM families

    Abstract page for arXiv paper 2603.22287: Founder effects shape the evolutionary dynamics of multimodality in open LLM families

    arXiv - AI · 2 days ago
  7.

    [D] Single-artist longitudinal fine art dataset spanning 5 decades now on Hugging Face — potential applications in style evolution, figure representation, and ethical training data

    I am a figurative artist based in New York with work in the collections of the Metropolitan Museum of Art, MoMA, SFMOMA, and the British Museum. I recently published my catalog raisonne as an open ...

    Reddit - Machine Learning · 5 days ago
  8.

    I am a painter with work at MoMA and the Met. I just published 50 years of my work as an open AI dataset. Here is what I learned.

    I have been making figurative art since the 1970s. Oil on canvas, w...

    Reddit - Artificial Intelligence · 5 days ago
  9.

    [2603.19253] A comprehensive study of LLM-based argument classification: from Llama through DeepSeek to GPT-5.2

    Abstract page for arXiv paper 2603.19253: A comprehensive study of LLM-based argument classification: from Llama through DeepSeek to GPT-5.2

    arXiv - AI · 4 days ago
  10.

    [2603.19265] When the Pure Reasoner Meets the Impossible Object: Analytic vs. Synthetic Fine-Tuning and the Suppression of Genesis in Language Models

    Abstract page for arXiv paper 2603.19265: When the Pure Reasoner Meets the Impossible Object: Analytic vs. Synthetic Fine-Tuning and the Suppression of Genesis in Language Models

    arXiv - AI · 4 days ago
  11.

    [2507.18014] Predictive Scaling Laws for Efficient GRPO Training of Large Reasoning Models

    Abstract page for arXiv paper 2507.18014: Predictive Scaling Laws for Efficient GRPO Training of Large Reasoning Models

    arXiv - Machine Learning · 4 days ago
  12.

    A New Framework for Evaluation of Voice Agents (EVA)

    A Blog post by ServiceNow-AI on Hugging Face

    Hugging Face Blog · 3 days ago
  13.

    [2603.17074] PRISM: Demystifying Retention and Interaction in Mid-Training

    Abstract page for arXiv paper 2603.17074: PRISM: Demystifying Retention and Interaction in Mid-Training

    arXiv - Machine Learning · 3 days ago
  14.

    [2603.20531] Epistemic Observability in Language Models

    Abstract page for arXiv paper 2603.20531: Epistemic Observability in Language Models

    arXiv - Machine Learning · 3 days ago
  15.

    [2603.23308] Curriculum-Driven 3D CT Report Generation via Language-Free Visual Grafting and Zone-Constrained Compression

    Abstract page for arXiv paper 2603.23308: Curriculum-Driven 3D CT Report Generation via Language-Free Visual Grafting and Zone-Constrained Compression

    arXiv - AI · 2 days ago
  16.

    [2603.20514] Evaluating Large Language Models on Historical Health Crisis Knowledge in Resource-Limited Settings: A Hybrid Multi-Metric Study

    Abstract page for arXiv paper 2603.20514: Evaluating Large Language Models on Historical Health Crisis Knowledge in Resource-Limited Settings: A Hybrid Multi-Metric Study

    arXiv - AI · 3 days ago
  17.

    [2603.20854] SozKZ: Training Efficient Small Language Models for Kazakh from Scratch

    Abstract page for arXiv paper 2603.20854: SozKZ: Training Efficient Small Language Models for Kazakh from Scratch

    arXiv - AI · 3 days ago
  18.

    [2410.12164] Table-LLM-Specialist: Language Model Specialists for Tables using Iterative Generator-Validator Fine-tuning

    Abstract page for arXiv paper 2410.12164: Table-LLM-Specialist: Language Model Specialists for Tables using Iterative Generator-Validator Fine-tuning

    arXiv - Machine Learning · 2 days ago
  19.

    Mistral releases a new open-source model for speech generation | TechCrunch

    Mistral's new speech model can run on a smartwatch or a smartphone.

    TechCrunch - AI · 1 day ago
  20.

    [D] Why evaluating only final outputs is misleading for local LLM agents

    Been running local agents with Ollama + LangChain lately and noticed something kind of uncomfortable — you can get a completely correct final answer while the agent is doing absolute nonsense inter...

    Reddit - Machine Learning · about 17 hours ago
