Learning-Based Segmentation & Grasping on Fetch

Implemented PointNet++ and AnyGrasp for object segmentation and grasp detection on the Fetch robot.

Category
Robotic Perception and Manipulation

Services
Grasping Transparent, Opaque and Deformable Objects

Client
DeepRob | University of Michigan

Year
2025

Abstract

We present a real-world grasping pipeline for the Fetch mobile manipulator that combines PointNet++ object segmentation, AnyGrasp grasp detection, depth completion for transparent objects, and MoveIt motion planning on RGB-D data from a Kinect sensor. The system is deployed and validated on real hardware and evaluated on rigid, deformable, and transparent objects in both isolated and cluttered scenes.


Introduction

Robotic grasping in unstructured environments remains a significant challenge due to perception uncertainty, object diversity, and planning constraints. In this project, we develop a real-world grasping pipeline for the Fetch mobile manipulator using RGB-D data from a Kinect sensor. The system integrates motion planning with MoveIt, grasp candidate generation with AnyGrasp, object segmentation with PointNet++, and point cloud preprocessing with Open3D filtering, RANSAC plane removal, and DBSCAN clustering. The focus is on implementing, deploying, and validating the system on real hardware. Performance is evaluated on rigid, deformable, and transparent objects in isolated and cluttered settings, highlighting the strengths and limitations of the approach in practical manipulation tasks.
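As a concrete illustration, the sketch below shows how such a preprocessing stage might look with Open3D. The voxel size, RANSAC threshold, and DBSCAN parameters are illustrative assumptions, not the exact settings used on the Fetch/Kinect setup.

```python
# Minimal sketch of the point cloud preprocessing stage described above.
# Parameter values are assumptions for illustration only.
import numpy as np
import open3d as o3d

def preprocess_cloud(pcd: o3d.geometry.PointCloud) -> list:
    """Downsample, remove the support plane, and cluster the remaining points."""
    # Downsample and drop statistical outliers from the raw Kinect cloud.
    pcd = pcd.voxel_down_sample(voxel_size=0.005)
    pcd, _ = pcd.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)

    # RANSAC plane fit to remove the dominant plane (e.g. the tabletop).
    _, inliers = pcd.segment_plane(distance_threshold=0.01,
                                   ransac_n=3, num_iterations=1000)
    objects = pcd.select_by_index(inliers, invert=True)

    # DBSCAN clustering separates the remaining points into object candidates.
    labels = np.asarray(objects.cluster_dbscan(eps=0.02, min_points=50))
    n_clusters = int(labels.max()) + 1 if labels.size else 0
    return [objects.select_by_index(np.flatnonzero(labels == k).tolist())
            for k in range(n_clusters)]
```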

Methods

Robust Grasp Perception: AnyGrasp

AnyGrasp (Fang et al., 2022) provides dense and temporally consistent 7-DoF grasp pose predictions from partial point clouds. The system consists of a Geometry Processing Module, which samples and predicts stable grasp candidates using point-wise features, and a Temporal Association Module, which tracks grasps across frames for dynamic scenes using feature-based matching. Trained on real-world GraspNet-1Billion data with randomized point dropout, AnyGrasp achieves a 93.3% success rate in bin-picking tasks involving unseen objects.
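To show how grasp detection ties into segmentation downstream, the sketch below filters grasp candidates against the segmented target object and keeps the best-scoring feasible one. The Grasp dataclass and the distance/width thresholds are hypothetical stand-ins, not AnyGrasp's actual output interface or the values used on the robot.

```python
# Hypothetical selection of a grasp candidate for a segmented target object.
from dataclasses import dataclass
from typing import List, Optional
import numpy as np

@dataclass
class Grasp:
    score: float
    translation: np.ndarray   # (3,) grasp center in the camera frame
    rotation: np.ndarray      # (3, 3) gripper orientation
    width: float              # gripper opening in meters

def select_grasp(grasps: List[Grasp],
                 target_points: np.ndarray,   # (N, 3) segmented object points
                 max_dist: float = 0.03,
                 max_width: float = 0.10) -> Optional[Grasp]:
    """Pick the highest-scoring grasp that targets the object and fits the gripper."""
    candidates = []
    for g in grasps:
        # Distance from the grasp center to the closest segmented object point.
        dist = np.min(np.linalg.norm(target_points - g.translation, axis=1))
        if dist < max_dist and g.width < max_width:
            candidates.append(g)
    return max(candidates, key=lambda g: g.score) if candidates else None
```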

Transparent Object Depth Completion: TransCG

To address the challenge of incomplete depth perception for transparent objects, TransCG (Fang et al., 2022) introduces a real-world dataset and an efficient depth completion model, DFNet. DFNet refines noisy RGB-D inputs using dense blocks and a dual-loss strategy focusing on depth and surface normals. Trained on 57,715 images with augmentation, DFNet outperforms prior methods like ClearGrasp and demonstrates real-time performance, improving grasping reliability for transparent and translucent objects.
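For context, once depth completion produces a refined depth map, it can be back-projected into a point cloud for the grasp detector using the standard pinhole camera model. The sketch below assumes placeholder Kinect-style intrinsics, not the calibration actually used in this project.

```python
# Sketch: back-project a completed depth map into a camera-frame point cloud.
# The intrinsics are placeholder values, not the real Kinect calibration.
import numpy as np

def depth_to_points(depth: np.ndarray,        # (H, W) completed depth in meters
                    fx: float = 525.0, fy: float = 525.0,
                    cx: float = 319.5, cy: float = 239.5) -> np.ndarray:
    """Return an (N, 3) point cloud from a depth image via the pinhole model."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]   # drop pixels with no valid depth
```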

Results

Effect of Object Rigidity

Grasping performance differs notably between rigid and deformable objects. Rigid objects yield higher success rates and lower collision rates, likely because their stable geometry aligns well with conventional grasping strategies. Deformable objects introduce unpredictable shape changes and instability during grasping, which often lead to accidental contacts and reduced reliability. These challenges highlight the limitations of current grasping systems, which have historically been tuned for rigid objects; better perception, modeling, and adaptive control will be needed to handle deformable object manipulation reliably.

Object Type | Success Rate | Collision Rate
Rigid       | 70%          | 15%
Deformable  | 67%          | 27%

Effect of Environment Clutter

Grasping performance also varies significantly between isolated and cluttered environments. In isolated settings, success rates are higher, reflecting reduced complexity and minimal interference from surrounding objects. Cluttered environments require the gripper to maneuver around neighboring objects, lowering grasp success rates and causing collisions. The close proximity of multiple objects, combined with irregular surfaces when deformable objects are involved, complicates grasp planning and reduces stability. These observations reinforce the need for grasping strategies that adapt dynamically to cluttered, unstructured scenes.

Scene Type | Success Rate | Collision Rate
Isolated   | 75%          | 0%
Cluttered  | 70%          | Clutter collisions

Effect of Object Transparency

Grasping transparent objects is substantially harder than grasping opaque ones. Success rates are considerably lower for transparent objects, primarily because depth sensing and perception break down on refractive and reflective surfaces. Even with depth completion, incomplete or noisy depth data frequently undermines the initial grasp attempt. Interestingly, while grasp failures were common, collision rates remained low, suggesting that the main difficulty is not obstacle avoidance but establishing reliable grasp points. These results expose a gap in current robotic perception and motivate methods tailored specifically to transparent materials in complex environments.

Object Material | Success Rate | Collision Rate
Opaque          | 70%          | 15%
Transparent     | 30%          | 0%


Conclusion

In this project, we developed a modular robotic grasping system on the Fetch platform, capable of handling rigid, deformable, and transparent objects using RGB-D sensing, depth completion, PointNet++ segmentation, and AnyGrasp grasp detection. The system achieved strong performance on rigid objects in isolated scenes, but challenges remain in cluttered and transparent scenarios due to incomplete depth data and planning limitations. Future work will focus on improving transparent object perception, incorporating closed-loop feedback, and strengthening grasp planning to move closer to human-level manipulation performance.

