SyGra: The One-Stop Framework for Building Data for LLMs and SLMs
A blog post by ServiceNow-AI on Hugging Face
Published September 22, 2025

Authors: Bidyapati Pradhan, Vipul Mittal, Amit Kumar Saha, Surajit Dasgupta (ServiceNow-AI)

When we think about building a model, whether a Large Language Model (LLM) or a Small Language Model (SLM), the first thing we need is data. While a vast amount of open data is available, it rarely comes in the exact format required to train or align models. In practice, the raw data often isn't enough: we need data that is more structured, domain-specific, complex, or aligned with the task at hand.

Let's look at some common situations:

- **Complex Scenarios Missing.** You start with a simple dataset, but the model fails on advanced reasoning tasks. How do you generate more complex datasets to strengthen performance?
- **Knowledge Base to Q&A.** You already have a knowledge base, but it's not in Q&A format. How can you transform it into a usable question-answering dataset?
- **From SFT to DPO.** You've prepared a supervised fine-tuning (SFT) dataset, but now you want to align your model using Direct Preference Optimization (DPO). How can you generate preference pairs?
- **Depth of Questions.** You have a Q&A dataset, but the questions are shallow. How can you create in-depth, multi-turn, or reasoning-heavy questions?
- **Domain-Specific Mid-Training.** You...
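To make the SFT-to-DPO scenario concrete, here is a minimal sketch of how SFT-style prompts with multiple candidate responses can be turned into preference pairs in the standard `{prompt, chosen, rejected}` format that DPO trainers expect. This is an illustrative example, not SyGra's API: the `build_dpo_pairs` helper and the toy length-based scorer are assumptions made here for demonstration; a real pipeline would rank candidates with a reward model or an LLM judge.

```python
def build_dpo_pairs(records, score):
    """Convert SFT-style records into DPO preference pairs.

    Each record is a dict: {"prompt": str, "candidates": [str, ...]}.
    `score` ranks the candidate responses; the highest-scoring one
    becomes "chosen" and the lowest-scoring one becomes "rejected".
    (Hypothetical helper for illustration, not part of SyGra.)
    """
    pairs = []
    for rec in records:
        ranked = sorted(rec["candidates"], key=score, reverse=True)
        # Skip prompts without at least two candidates or without a
        # clear preference between best and worst.
        if len(ranked) < 2 or score(ranked[0]) == score(ranked[-1]):
            continue
        pairs.append({
            "prompt": rec["prompt"],
            "chosen": ranked[0],
            "rejected": ranked[-1],
        })
    return pairs


# Toy scorer: prefer longer, more detailed answers. A real pipeline
# would substitute a reward model or LLM-as-judge score here.
records = [{
    "prompt": "Explain Direct Preference Optimization.",
    "candidates": [
        "DPO aligns a model directly on preference pairs, "
        "avoiding a separate reward-model RL loop.",
        "It's an alignment method.",
    ],
}]
print(build_dpo_pairs(records, len))
```

The resulting list of `{prompt, chosen, rejected}` dicts is the shape consumed by common preference-tuning trainers, so the same conversion step applies regardless of which model or scorer produced the candidates.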