[2603.03280] How to Peel with a Knife: Aligning Fine-Grained Manipulation with Human Preference

Computer Science > Robotics
arXiv:2603.03280 (cs)
[Submitted on 3 Mar 2026]

Title: How to Peel with a Knife: Aligning Fine-Grained Manipulation with Human Preference
Authors: Toru Lin, Shuying Deng, Zhao-Heng Yin, Pieter Abbeel, Jitendra Malik

Abstract: Many essential manipulation tasks - such as food preparation, surgery, and craftsmanship - remain intractable for autonomous robots. These tasks are characterized not only by contact-rich, force-sensitive dynamics, but also by their "implicit" success criteria: unlike pick-and-place, task quality in these domains is continuous and subjective (e.g. how well a potato is peeled), making quantitative evaluation and reward engineering difficult. We present a learning framework for such tasks, using peeling with a knife as a representative example. Our approach follows a two-stage pipeline: first, we learn a robust initial policy via force-aware data collection and imitation learning, enabling generalization across object variations; second, we refine the policy through preference-based finetuning using a learned reward model that combines quantitative task metrics with qualitative human feedback, aligning policy behavior with human notions of task quality. Using only 50-200 peeling trajectories, our system achieves over 90% average success rates on challengin...
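
To make the second stage of the pipeline more concrete, here is a minimal sketch of how a reward model might be fit from pairwise human preferences over quantitative task metrics. It is not the authors' method: the abstract does not specify the reward model's form, so this sketch assumes a linear reward trained with the standard Bradley-Terry preference likelihood. The two metrics (fraction of skin removed, fraction of flesh wasted), the sample values, and all function names are hypothetical and chosen only for illustration.

```python
import math

def reward(weights, features):
    """Linear reward: weighted sum of per-trajectory metrics (an assumption)."""
    return sum(w * f for w, f in zip(weights, features))

def preference_prob(weights, preferred, rejected):
    """Bradley-Terry probability that `preferred` is ranked above `rejected`."""
    diff = reward(weights, preferred) - reward(weights, rejected)
    return 1.0 / (1.0 + math.exp(-diff))

def fit_reward(pairs, n_features, lr=0.5, epochs=200):
    """Fit weights by stochastic gradient ascent on the preference log-likelihood."""
    weights = [0.0] * n_features
    for _ in range(epochs):
        for preferred, rejected in pairs:
            p = preference_prob(weights, preferred, rejected)
            # d/dw log p = (1 - p) * (features_preferred - features_rejected)
            for i in range(n_features):
                weights[i] += lr * (1.0 - p) * (preferred[i] - rejected[i])
    return weights

# Hypothetical data: each trajectory is summarized as
# (fraction of skin removed, fraction of flesh wasted).
# In each pair, the first trajectory is the one a human rater preferred.
pairs = [
    ((0.90, 0.10), (0.50, 0.10)),
    ((0.80, 0.05), (0.80, 0.30)),
    ((0.95, 0.10), (0.60, 0.40)),
]

weights = fit_reward(pairs, n_features=2)
print("learned weights:", weights)
```

With these made-up preferences, the fitted weight on skin removed comes out positive and the weight on flesh wasted comes out negative, so the learned reward scores trajectories the way the raters ranked them. A reward of this kind could then serve as the signal for preference-based finetuning of the initial imitation policy.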

Originally published on March 04, 2026. Curated by AI News.
