March 23, 2026 - Daily Observation

1

MeanFlow Meets Control: Scaling Sampled-Data Control for Swarms

Anqi Dong, Yongxin Chen, Karl H. Johansson, Johan Karlsson

This is a recently published paper in the field of robotics. It contributes new research findings to the community.

arXiv Robotics

2

IndoorR2X: Indoor Robot-to-Everything Coordination with LLM-Driven Planning

Fan Yang, Soumya Teotia, Shaunak A. Mehta, Prajit KrisshnaKumar, Quanting Xie, Jun Liu, Yueqi Song, Li Wenkai, Atsunori ...

This is a recently published paper in the field of robotics. It contributes new research findings to the community.

arXiv Robotics

3

The Robot's Inner Critic: Self-Refinement of Social Behaviors through VLM-based Replanning

Jiyu Lim, Youngwoo Yoon, Kwanghyun Park

This is a recently published paper in the field of robotics. It contributes new research findings to the community.

arXiv Robotics

4

HortiMulti: A Multi-Sensor Dataset for Localisation and Mapping in Horticultural Polytunnels

Shuoyuan Xu, Zhipeng Zhong, Tiago Barros, Matthew Coombes, Cristiano Premebida, Hao Wu, Cunjia Liu

This is a recently published paper in the field of robotics. It contributes new research findings to the community.

arXiv Robotics

5

AGILE: A Comprehensive Workflow for Humanoid Loco-Manipulation Learning

Huihua Zhao, Rafael Cathomen, Lionel Gulich, Wei Liu, Efe Arda Ongan, Michael Lin, Shalin Jain, Soha Pouya, Yan Chang

This is a recently published paper in the field of robotics. It contributes new research findings to the community.

arXiv Robotics

6

6D Robotic OCT Scanning of Curved Tissue Surfaces

Suresh Guttikonda, Maximilian Neidhardt, Vidas Raudonis, Alexander Schlaefer

Robotic optical coherence tomography (OCT) scanning of curved tissue surfaces has been limited by existing translational-only scanning approaches, which cannot handle non-planar geometry. This work introduces a new marker for full six-dimensional hand-eye calibration of robot-mounted OCT probes, achieving highly repeatable transformation estimates and enabling consistent scanning of large curved tissue phantoms. This advance unlocks more flexible robotic OCT imaging for clinical and pre-clinical applications.

arXiv Robotics

7

VP-VLA: Visual Prompting as an Interface for Vision-Language-Action Models

Zixuan Wang, Yuxin Chen, Yuqi Liu, Jinhui Ye, Pengguang Chen, Changsheng Lu, Shu Liu, Jiaya Jia

Existing vision-language-action (VLA) models combine instruction interpretation, spatial grounding, and control into a single black-box forward pass, leading to poor spatial precision and limited out-of-distribution robustness. This work proposes VP-VLA, a dual-system framework that decouples high-level reasoning from low-level execution via a structured visual prompting interface. A high-level planner generates spatial anchors overlaid on input images, while a low-level controller uses these prompts to generate precise actions, improving performance on challenging robotic manipulation tasks.

arXiv Robotics

8

Sim-to-Real of Humanoid Locomotion Policies via Joint Torque Space Perturbation Injection

Junhyeok Rui Cha, Woohyun Cha, Jaeyong Shin, Donghyeon Kim, Jaeheung Park

Existing sim-to-real methods for humanoid locomotion rely on fixed finite parameter domain randomization, which fails to capture complex state-dependent reality gaps like nonlinear actuator dynamics. This work introduces a new approach that injects state-dependent perturbations into joint torque inputs during simulation, using neural networks to model complex uncertainties that parametric randomization cannot capture. Experiments show the method produces humanoid locomotion policies with superior robustness to unseen reality gaps in both simulation and real-world deployment.

arXiv Robotics

9

Directional Mollification for Controlled Smooth Path Generation

Alfredo González-Calvin, Juan F. Jiménez, Héctor García de Marina

Smooth path generation from discrete waypoints is a fundamental requirement for stable robot control, but existing mollification methods confine smoothed paths to the convex hull of the original waypoints, preventing exact waypoint interpolation when required. This work introduces directional mollification, a novel extension of mollification that removes the convex hull constraint while retaining the computational efficiency, formal smoothness, and curvature guarantees of existing methods. The approach offers an improved alternative to spline interpolation and optimization-based path smoothing for autonomous and industrial robots.

arXiv Robotics

10

Partial Attention in Deep Reinforcement Learning for Safe Multi-Agent Control

Turki Bin Mohaya, Peter Seiler

Attention mechanisms have shown strong performance in sequential learning tasks, but have not been widely adapted for safe multi-agent autonomous vehicle control. This work applies partial attention to the QMIX multi-agent reinforcement learning framework, allowing each autonomous vehicle to focus only on the most relevant neighboring vehicles during highway merging scenarios. A multi-objective reward function that balances global safety and flow with individual agent interests improves overall performance over baseline deep reinforcement learning methods in SUMO simulations.

arXiv Robotics
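
One common way to restrict attention to the most relevant neighbors is top-k masking: score every neighbor, keep the k highest scores, and apply softmax over those only. The sketch below illustrates that general idea in plain Python. It is an assumption about what "partial attention" could look like, not the authors' implementation, and the function name `partial_attention` and the toy vectors are hypothetical.

```python
import math

def partial_attention(query, keys, values, k):
    """Top-k masked attention (illustrative assumption, not the paper's code).

    Scores every neighbor with a scaled dot product, keeps the k best,
    and returns (weights, output) where masked neighbors get weight 0.
    """
    d = len(query)
    scores = [sum(q * x for q, x in zip(query, key)) / math.sqrt(d) for key in keys]
    # Indices of the k highest-scoring neighbors; the rest are masked out.
    top = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]
    # Numerically stable softmax restricted to the selected neighbors.
    peak = max(scores[i] for i in top)
    exps = {i: math.exp(scores[i] - peak) for i in top}
    total = sum(exps.values())
    weights = [exps.get(i, 0.0) / total for i in range(len(keys))]
    # Weighted sum of the value vectors.
    output = [sum(w * v[j] for w, v in zip(weights, values))
              for j in range(len(values[0]))]
    return weights, output

# Toy example: the ego vehicle attends to its 2 most relevant neighbors out of 3.
weights, output = partial_attention(
    query=[1.0, 0.0],
    keys=[[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]],
    values=[[1.0], [2.0], [3.0]],
    k=2,
)
print(weights, output)
```

The third neighbor receives zero weight, so it cannot influence the output regardless of its value vector.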

11

Memory-Efficient Boundary Map for Large-Scale Occupancy Grid Mapping

Benxu Tang, Yunfan Ren, Yixi Cai, Fanze Kong, Wenyi Liu, Fangcheng Zhu, Longji Yin, Liuyu Shi, Fu Zhang

Traditional high-resolution large-scale occupancy grid mapping requires storing all voxels in the mapped volume, leading to prohibitive memory usage for many robotic applications. This work introduces a novel memory-efficient representation that only stores boundary voxels (occupied and frontier voxels), with free and unknown voxels automatically represented by regions inside and outside the boundary, respectively. The approach drastically reduces memory requirements for large-scale high-resolution mapping without sacrificing accuracy, enabling deployment on resource-constrained robotic platforms.

arXiv Robotics
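
The storage idea can be illustrated on a toy 2D grid: keep only occupied cells and frontier cells, and drop the free interior and the unknown exterior. This is a minimal sketch under stated assumptions, not the paper's data structure. In particular, it assumes a frontier cell is a free cell with at least one unknown 4-neighbor, and the names `boundary_cells`, `FREE`, `OCC`, and `UNK` are hypothetical.

```python
FREE, OCC, UNK = 0, 1, 2

def boundary_cells(grid):
    """Return {(row, col): label} for occupied and frontier cells only.

    Assumption: a frontier cell is a FREE cell that touches at least one
    UNK cell through a 4-connected neighbor.
    """
    rows, cols = len(grid), len(grid[0])
    boundary = {}
    for r in range(rows):
        for c in range(cols):
            cell = grid[r][c]
            if cell == OCC:
                boundary[(r, c)] = "occupied"
            elif cell == FREE:
                neighbors = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
                if any(0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == UNK
                       for nr, nc in neighbors):
                    boundary[(r, c)] = "frontier"
    return boundary

# A small room: walls on three sides, open to unexplored space on the right.
grid = [
    [OCC, OCC,  OCC,  OCC,  OCC],
    [OCC, FREE, FREE, FREE, UNK],
    [OCC, FREE, FREE, FREE, UNK],
    [OCC, FREE, FREE, FREE, UNK],
    [OCC, OCC,  OCC,  OCC,  OCC],
]
stored = boundary_cells(grid)
print(len(stored), "boundary cells stored out of", len(grid) * len(grid[0]))
```

The sketch only shows which cells would be stored. Recovering whether an arbitrary cell is free or unknown from the boundary alone requires an inside/outside query, which is omitted here.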

12

Can a Robot Walk the Robotic Dog: Triple-Zero Collaborative Navigation for Heterogeneous Multi-Agent Systems

Yaxuan Wang, Yifan Xiang, Ke Li, Xun Zhang, BoWen Ye, Zhuochen Fan, Fei Wei, Tong Yang

Existing collaborative navigation frameworks for heterogeneous multi-robot systems typically require extensive training or simulation before deployment, limiting their real-world adaptability. This work presents Triple Zero Path Planning (TZPP), a zero-training, zero-prior-knowledge, and zero-simulation collaborative navigation framework that uses a coordinator-explorer architecture with multimodal large language model guidance. Implemented on Unitree G1 humanoid and Go2 quadruped robots, TZPP achieves robust, human-comparable efficiency across diverse unseen indoor and outdoor environments, offering a practical path for immediate real-world deployment.

arXiv Robotics

13

BiPreManip: Learning Affordance-Based Bimanual Preparatory Manipulation through Anticipatory Collaboration

Yan Shen, Feng Jiang, Zichen He, Xiaoqi Li, Yuchen Liu, Zhiyu Li, Ruihai Wu, Hao Dong

Many everyday bimanual manipulation tasks require one arm to perform preparatory actions that enable the other arm's final goal-directed grasp or operation, such as pushing an iPad to a table edge before picking it up. Most existing frameworks, however, do not explicitly address this asymmetric collaborative task setting. This work introduces BiPreManip, a visual affordance-based framework that first envisions the final goal action, then generates a sequence of preparatory manipulations for one arm to enable the second arm's operation. The approach advances capabilities for sequential coordinated bimanual manipulation of everyday objects.

arXiv Robotics

14

PRM-as-a-Judge: A Dense Evaluation Paradigm for Fine-Grained Robotic Auditing

Yuheng Ji, Yuyang Liu, Huajie Tan, Xuchuan Huang, Fanding Huang, Yijie Xu, Cheng Chi, Yuting Zhao, Huaihai Lyu, Peterson...

Most robotic policy evaluation relies on binary success rates, which collapse the entire execution trajectory into a single outcome and hide critical qualities like progress, efficiency, and stability. This work proposes PRM-as-a-Judge, a dense evaluation paradigm that uses Process Reward Models to audit robotic policy execution directly from trajectory videos by estimating continuous task progress. The accompanying OPD metric system provides fine-grained insight into execution quality, enabling more detailed diagnosis of policy performance than standard success-rate evaluation.

arXiv Robotics

15

CataractSAM-2: A Domain-Adapted Model for Anterior Segment Surgery Segmentation and Scalable Ground-Truth Annotation

Mohammad Eslami, Dhanvinkumar Ganeshkumar, Saber Kazeminasab, Michael G. Morley, Michael V. Boland, Michael M. Lin, John...

Robotic-assisted cataract surgery requires accurate real-time semantic segmentation of surgical video, but existing models lack the accuracy and annotation tools required for scalable medical robotic perception development. This work introduces CataractSAM-2, a domain-adapted extension of Meta's Segment Anything Model 2 optimized for real-time segmentation of anterior segment cataract surgery. The work also releases an interactive annotation framework that reduces manual labeling effort for scalable ground-truth creation, and the model demonstrates strong zero-shot generalization to glaucoma trabeculectomy procedures.

arXiv Robotics

16

Auction-Based Task Allocation with Energy-Conscientious Trajectory Optimization for AMR Fleets

Jiachen Li, Soovadeep Bakshi, Jian Chu, Shihao Li, Dongmei Chen

Multi-AMR (autonomous mobile robot) fleet task allocation and trajectory optimization typically do not explicitly account for energy consumption, leading to unnecessary battery use in industrial settings. This work presents a hierarchical two-stage framework that combines sequential auction-based task allocation with energy-conscious trajectory optimization using a physics-based battery model. Large-scale experiments across hundreds of factory scenarios show the framework delivers an average 11.8% energy savings over standard nearest-task allocation, with rescheduling latency under 10 ms for dynamic fault and priority handling.

arXiv Robotics
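
A sequential auction can be sketched in a few lines: in each round every robot bids its cost for every unassigned task, the lowest bid wins, and the winner's position moves to the task it just won. The sketch below is a generic illustration, not the authors' framework. It uses straight-line travel distance as a stand-in for the paper's physics-based battery model, and the function name `sequential_auction` and the robot and task names are hypothetical.

```python
import math

def sequential_auction(robots, tasks):
    """Greedy sequential single-item auction (illustrative assumption).

    robots: {robot_name: (x, y)} starting positions
    tasks:  {task_name: (x, y)} task locations
    Bids are Euclidean distances, a placeholder for an energy cost model.
    Returns {robot_name: [task_name, ...]} in the order tasks were won.
    """
    positions = dict(robots)
    remaining = dict(tasks)
    assignment = {name: [] for name in robots}
    while remaining:
        # Collect every (bid, robot, task) triple and pick the cheapest one.
        # Ties are broken by robot name, then task name.
        _, winner, task = min(
            (math.dist(positions[r], loc), r, t)
            for r in positions
            for t, loc in remaining.items()
        )
        assignment[winner].append(task)
        # The winner will end up at the task location, so later bids start there.
        positions[winner] = remaining.pop(task)
    return assignment

plan = sequential_auction(
    robots={"amr1": (0.0, 0.0), "amr2": (10.0, 0.0)},
    tasks={"a": (1.0, 0.0), "b": (9.0, 0.0), "c": (2.0, 0.0)},
)
print(plan)
```

Swapping the distance bid for an energy estimate would only change the first element of each bid tuple, which is why the cost model is kept separate from the auction loop.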

17

SafePilot: A Framework for Assuring LLM-enabled Cyber-Physical Systems

Weizhe Xu, Mengyu Liu, Fanxin Kong

LLM integration into robotic and other cyber-physical systems brings advanced reasoning capabilities, but LLM hallucinations can lead to unsafe or undesirable actions that are not caught by existing system assurance frameworks. This work proposes SafePilot, a hierarchical neuro-symbolic framework that provides end-to-end safety assurance for LLM-enabled cyber-physical systems against attribute-based and temporal task specifications. The framework addresses the core risk of hallucinations in LLM-guided robotics, enabling safer deployment of LLM-powered autonomous systems.

arXiv Robotics

18

A Framework for Closed-Loop Robotic Assembly, Alignment and Self-Recovery of Precision Optical Systems

Seou Choi, Sachin Vaidya, Caio Silva, Shiekh Zia Uddin, Sajib Biswas Shuvo, Shrish Choudhary, Marin Soljačić

While robotic automation has transformed many scientific workflows, high-precision free-space optical system assembly and alignment remains largely manual due to strict spatial and angular tolerances. This work introduces a complete robotics framework for autonomous construction, alignment, and self-recovery of precision optical systems, integrating hierarchical computer vision, optimization routines, and custom end-of-arm tools. The framework demonstrates fully autonomous assembly of a tabletop laser cavity from randomly distributed components, including self-recovery from induced misalignment, opening the door to automated optical experimentation.

arXiv Robotics

19

GaussianSSC: Triplane-Guided Directional Gaussian Fields for 3D Semantic Completion

Ruiqi Xian, Jing Liang, He Yin, Xuewei Qi, Dinesh Manocha

Existing 3D semantic scene completion methods struggle with poor voxel-image alignment and fail to efficiently capture fine-grained geometric details like surface tangency and occlusion asymmetry. This work presents GaussianSSC, a two-stage triplane-guided approach that integrates Gaussian representation benefits into standard voxel grid frameworks without additional memory overhead. On the SemanticKITTI benchmark, GaussianSSC improves occupancy recall by 1.0% and precision by 2.0% over baseline methods, advancing state-of-the-art monocular semantic scene completion for robotic perception.

arXiv Robotics

