[2603.03280] How to Peel with a Knife: Aligning Fine-Grained Manipulation with Human Preference

arXiv - Machine Learning March 04, 2026 4 min read

About this article

Abstract page for arXiv paper 2603.03280: How to Peel with a Knife: Aligning Fine-Grained Manipulation with Human Preference

Computer Science > Robotics

arXiv:2603.03280 (cs)

[Submitted on 3 Mar 2026]

Title: How to Peel with a Knife: Aligning Fine-Grained Manipulation with Human Preference

Authors: Toru Lin, Shuying Deng, Zhao-Heng Yin, Pieter Abbeel, Jitendra Malik

Abstract: Many essential manipulation tasks - such as food preparation, surgery, and craftsmanship - remain intractable for autonomous robots. These tasks are characterized not only by contact-rich, force-sensitive dynamics, but also by their "implicit" success criteria: unlike pick-and-place, task quality in these domains is continuous and subjective (e.g., how well a potato is peeled), making quantitative evaluation and reward engineering difficult. We present a learning framework for such tasks, using peeling with a knife as a representative example. Our approach follows a two-stage pipeline: first, we learn a robust initial policy via force-aware data collection and imitation learning, enabling generalization across object variations; second, we refine the policy through preference-based finetuning using a learned reward model that combines quantitative task metrics with qualitative human feedback, aligning policy behavior with human notions of task quality. Using only 50-200 peeling trajectories, our system achieves over 90% average success rates on challengin...
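
The abstract does not say what form the learned reward model takes. As a rough illustration of the general idea of learning a reward from preferences, the sketch below assumes a linear reward over two hypothetical per-trajectory features (a quantitative metric and a human-rated score) and fits it with a Bradley-Terry pairwise loss, which is a common choice in preference learning and not confirmed as the authors' method. All names, features, and numbers are made up for illustration.

```python
import math

def reward(weights, features):
    """Linear reward: weighted sum of per-trajectory features (an assumption)."""
    return sum(w * f for w, f in zip(weights, features))

def fit_preference_reward(pairs, n_features, lr=0.5, epochs=200):
    """Fit reward weights with the Bradley-Terry logistic loss.

    pairs: list of (preferred_features, rejected_features) tuples.
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        grad = [0.0] * n_features
        for good, bad in pairs:
            # P(good preferred over bad) = sigmoid(r(good) - r(bad))
            margin = reward(weights, good) - reward(weights, bad)
            p = 1.0 / (1.0 + math.exp(-margin))
            # Accumulate the gradient of -log(p) with respect to each weight
            for i in range(n_features):
                grad[i] -= (1.0 - p) * (good[i] - bad[i])
        # Plain gradient descent step on the mean loss
        weights = [w - lr * g / len(pairs) for w, g in zip(weights, grad)]
    return weights

# Hypothetical features per trajectory: [fraction_of_skin_removed, human_smoothness_score]
pairs = [
    ([0.90, 0.8], [0.5, 0.4]),
    ([0.80, 0.9], [0.7, 0.2]),
    ([0.95, 0.6], [0.6, 0.5]),
]
weights = fit_preference_reward(pairs, n_features=2)

# After fitting, each preferred trajectory should score higher than its rejected pair.
for good, bad in pairs:
    print(reward(weights, good) > reward(weights, bad))
```

In a pipeline like the one described, a reward of this kind would then score new rollouts during finetuning; how the paper actually does this is not stated in the excerpt above.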

Originally published on March 04, 2026. Curated by AI News.
