[2602.19237] Evaluating SAP RPT-1 for Enterprise Business Process Prediction: In-Context Learning vs. Traditional Machine Learning on Structured SAP Data
Summary
This article evaluates SAP's RPT-1 model for enterprise business process prediction, comparing its performance against traditional machine learning methods on structured SAP data.
Why It Matters
As businesses increasingly rely on data-driven decision-making, understanding the effectiveness of new machine learning models like SAP RPT-1 is crucial. This evaluation provides insights into how in-context learning can enhance predictive accuracy, potentially transforming enterprise operations and workflows.
Key Takeaways
- RPT-1 reaches 91-96% of tuned GBDT accuracy without any training examples.
- A performance crossover occurs at roughly 75-100 context rows, beyond which RPT-1 outperforms traditional methods.
- A hybrid workflow is proposed: use RPT-1 for rapid screening and train GBDT selectively for improved accuracy.
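The hybrid workflow in the last takeaway can be sketched as a simple routing decision: accept the zero-shot RPT-1 prediction when it clears an accuracy bar, and fall back to training a task-specific GBDT otherwise. This is a minimal illustrative sketch, not the paper's implementation; the names `zero_shot_score`, `train_gbdt_score`, and the `THRESHOLD` value are all assumptions.

```python
# Hypothetical sketch of the hybrid workflow: screen each task with the
# zero-shot in-context model first, and train a GBDT only where the
# zero-shot score falls short of a business threshold.

THRESHOLD = 0.90  # illustrative minimum acceptable score

def hybrid_workflow(tasks, zero_shot_score, train_gbdt_score):
    """For each task, keep the zero-shot result if it is good enough;
    otherwise fall back to a task-specific trained GBDT."""
    decisions = {}
    for task in tasks:
        score = zero_shot_score(task)
        if score >= THRESHOLD:
            decisions[task] = ("rpt1_zero_shot", score)
        else:
            decisions[task] = ("tuned_gbdt", train_gbdt_score(task))
    return decisions

# Toy usage with stubbed scoring functions (not real model calls):
scores = {"demand_forecast": 0.93, "risk_classification": 0.85}
result = hybrid_workflow(
    scores,
    zero_shot_score=scores.get,
    train_gbdt_score=lambda t: scores[t] + 0.05,  # pretend tuning helps
)
```

The point of the sketch is the selectivity: GBDT training cost is only paid for the tasks where the rapid zero-shot screen is insufficient.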
Computer Science > Machine Learning
arXiv:2602.19237 (cs)
[Submitted on 22 Feb 2026]
Title: Evaluating SAP RPT-1 for Enterprise Business Process Prediction: In-Context Learning vs. Traditional Machine Learning on Structured SAP Data
Authors: Amit Lal (Microsoft Corporation)
Abstract: Tabular foundation models aim to make machine learning accessible for enterprise data without task-specific training. This paper presents the first independent evaluation of SAP's Retrieval Pretrained Transformer (RPT-1) from a practitioner perspective. RPT-1 is a compact 64.6 MB model pretrained on 1.34 TB of structured data across 3.1 million tables. We benchmark it against tuned gradient-boosted decision trees (XGBoost, LightGBM, CatBoost) on three SAP business scenarios: demand forecasting across SD/MM/PP modules, predictive data integrity in BC/MM/QM, and financial risk classification in FI/CO/AR. Across five-fold cross-validation on datasets ranging from 2,500 to 3,200 rows, RPT-1 reaches 91-96% of tuned GBDT accuracy without any training examples. The classification gap is modest at 3.6-4.1 percentage points on AUC-ROC, though regression tasks show wider gaps of 8.9-11.1 percentage points on R-squared. An interesting finding is a crossover at roughly 75-100 context rows where...
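The evaluation protocol mentioned in the abstract (five-fold cross-validation over a few thousand rows) can be sketched in plain Python. This is only an illustration of the fold-splitting and score-averaging mechanics; the per-fold scoring function below is a stand-in, whereas the paper scores tuned GBDTs (XGBoost, LightGBM, CatBoost) and RPT-1 on real SAP data.

```python
# Minimal sketch of a five-fold cross-validation loop: partition row
# indices into 5 folds, hold one fold out per round, and average a
# per-fold metric over the held-out folds.

def kfold_indices(n_rows, k=5):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_rows))
        yield train, test
        start += size

def cross_validate(n_rows, score_fold, k=5):
    """Average the per-fold score over all k held-out folds."""
    scores = [score_fold(train, test) for train, test in kfold_indices(n_rows, k)]
    return sum(scores) / len(scores)

# Toy usage: a dummy metric that just reports the held-out fold fraction,
# on a dataset the size of the paper's smallest (2,500 rows).
avg = cross_validate(2500, lambda train, test: len(test) / 2500)
```

In practice one would shuffle (or stratify) the indices before splitting; contiguous folds are used here only to keep the sketch short.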