26 March 2026

Daily Robotics Digest

63 curated items from arXiv, industry news, and the community

Executive Summary

This edition of the XRollout Daily Robotics Digest collects 63 new preprints and academic works spanning core robotics research domains, with a strong focus on advancing vision-language-action models, multi-agent coordination, and generalist robot control. Key contributions address longstanding challenges in tactile perception for contact-rich interaction, sim-to-real transfer, safe control for humanoids, and scalable simulation asset generation. The collection spans application areas from surgical robotics and agricultural automation to autonomous driving and swarm drone displays, offering new tools and insights for both researchers and practitioners.

📄

New Research Papers

63 items
1

VTAM: Video-Tactile-Action Models for Complex Physical Interaction Beyond VLAs

Haoran Yuan, Weigang Yi, Zhenyu Zhang, Wendi Chen, Yuchen Mo, Jiashi Yin, Xinzhuo Li, Xiangyu Zeng, Chuan Wen, Cewu Lu, ...

Video-Action Models (VAMs) enable strong long-horizon task performance via visual reasoning, but fail to capture fine-grained force and contact information critical for contact-rich physical interactions. This work introduces VTAM, a Video-Tactile-Action Model that integrates tactile sensing to address the limitations of vision-only VAMs. The model enables more precise and stable behavior in scenarios where critical interaction states are not fully observable from vision alone.

2

Planning over MAPF Agent Dependencies via Multi-Dependency PIBT

Zixiang Jiang, Yulun Zhang, Rishi Veerapaneni, Jiaoyang Li

Modern Multi-Agent Path Finding (MAPF) requires efficient algorithms that can plan for hundreds to thousands of agents in congested environments within tight time constraints. The popular PIBT and its extension EPIBT are limited by their rule-based design, which restricts planning to conflicts involving at most one other agent, reducing generality. This work presents Multi-Dependency PIBT, a new approach that expands PIBT to handle multiple concurrent agent dependencies for more flexible multi-agent planning.
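For context, the two standard MAPF conflict types that PIBT-style planners must resolve can be sketched in a few lines. This is an illustrative helper, not the paper's Multi-Dependency algorithm; the path representation (a list of grid cells indexed by timestep, with agents waiting at their final cell) is an assumption.

```python
def first_conflict(path_a, path_b):
    """Return the first MAPF conflict between two timed paths, or None.

    Two standard conflict types: a vertex conflict (both agents occupy the
    same cell at time t) and an edge conflict (the agents swap cells
    between t and t + 1).
    """
    horizon = max(len(path_a), len(path_b))
    # Agents wait at their final cell once their path is exhausted.
    at = lambda p, t: p[min(t, len(p) - 1)]
    for t in range(horizon):
        if at(path_a, t) == at(path_b, t):
            return ("vertex", t, at(path_a, t))
        if (t + 1 < horizon
                and at(path_a, t + 1) == at(path_b, t)
                and at(path_b, t + 1) == at(path_a, t)):
            return ("edge", t, (at(path_a, t), at(path_b, t)))
    return None
```

PIBT resolves such conflicts one neighbor at a time via priority inheritance; the Multi-Dependency extension summarized above generalizes this to several concurrent dependencies.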

3

Rectify, Don't Regret: Avoiding Pitfalls of Differentiable Simulation in Trajectory Prediction

Harsh Yadav, Christian Bohn, Tobias Meisen

Open-loop trajectory prediction models for autonomous driving suffer from compounding errors that cascade from small initial deviations. While differentiable closed-loop simulators aim to solve this problem, they are prone to shortcut learning, where future ground-truth information leaks into model predictions via gradient flow. This work argues for rectifying this leakage rather than training around it, addressing the core issue of non-causal error correction in differentiable simulation.

4

SIMART: Decomposing Monolithic Meshes into Sim-ready Articulated Assets via MLLM

Chuanrui Zhang, Minghan Qin, Yuang Wang, Baifeng Xie, Hang Li, Ziwei Wang

High-quality sim-ready articulated 3D assets are critical for embodied AI and physical simulation, but modern 3D generation focuses primarily on static meshes, creating a supply gap. Existing articulated asset creation methods use multi-stage pipelines that accumulate error, while unified MLLM-based approaches face high memory overhead from dense voxel tokenization that limits scalability. This work introduces SIMART, a method that decomposes monolithic static meshes into sim-ready articulated assets using an MLLM-based approach optimized for lower memory usage and scalability.

5

ABot-PhysWorld: Interactive World Foundation Model for Robotic Manipulation with Physics Alignment

Yuzhi Chen, Ronghan Chen, Dongjie Huo, Yandan Yang, Dekang Qi, Haoyun Liu, Tong Lin, Shuang Zeng, Junjin Xiao, Xinyuan C...

Existing video-based world models for robotic manipulation often generate physically implausible behavior such as object penetration and anti-gravity motion, due to training on generic visual data and likelihood objectives that ignore physical constraints. This work presents ABot-PhysWorld, a 14B-parameter Diffusion Transformer model designed to generate physically plausible, action-controllable manipulation videos. The model is trained on a curated dataset of 3 million physics-annotated manipulation clips and uses a novel DPO-based post-training alignment to enforce physical consistency.
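The DPO-based alignment mentioned above builds on the generic Direct Preference Optimization objective, which can be sketched for a single preference pair. This is a hedged illustration of the standard DPO loss, not the paper's exact physics-alignment variant; the log-probabilities and `beta` are placeholder inputs.

```python
import math

def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Standard DPO loss for one preference pair: reward the policy for
    raising the (reference-relative) log-probability of the preferred
    sample, e.g. a physically plausible clip, over the rejected one.
    """
    margin = beta * ((logp_chosen - ref_chosen) - (logp_rejected - ref_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log sigmoid(margin)
```

When the policy prefers the plausible clip no more than the reference does, the margin is zero and the loss sits at log 2; alignment training pushes the margin positive and the loss down.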

6

PinPoint: Monocular Needle Pose Estimation for Robotic Suturing via Stein Variational Newton and Geometric Residuals

Jesse F. d'Almeida, Tanner Watts, Susheela Sharma Stern, James Ferguson, Alan Kuntz, Robert J. Webster

Reliable 3D needle pose estimation is critical for autonomous robotic suturing, but nearly all existing methods rely on stereoscopic vision. In common monocular endoscopic settings, depth ambiguity and rotational symmetry create a multimodal distribution of feasible poses rather than a single well-defined estimate, making the problem inherently ill-posed. This work introduces PinPoint, a probabilistic variational inference framework based on Stein Variational Newton and geometric residuals that directly accounts for pose ambiguity to enable accurate monocular needle pose estimation.
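The particle-based variational idea behind PinPoint can be illustrated with first-order Stein variational gradient descent (SVGD), the simpler cousin of the Stein Variational Newton method the paper uses. This toy sketch fits 1-D particles to a standard Gaussian; the kernel bandwidth, step size, and target are all placeholder choices, and the point is only that a particle set can represent an ambiguous, multimodal posterior instead of collapsing to one estimate.

```python
import numpy as np

def svgd_step(x, grad_logp, h=0.5, eps=0.1):
    """One SVGD update for 1-D particles x: each particle is pulled toward
    high-density regions (kernel-weighted score) and pushed apart by the
    kernel gradient, so the set approximates the posterior as a whole."""
    diff = x[:, None] - x[None, :]            # diff[i, j] = x_i - x_j
    k = np.exp(-diff**2 / (2 * h))            # RBF kernel matrix
    drive = k @ grad_logp(x)                  # attraction toward high density
    repulse = (diff * k).sum(axis=1) / h      # repulsion keeps particles spread
    return x + eps * (drive + repulse) / len(x)

# Toy target: standard normal, so grad log p(x) = -x.
particles = np.linspace(-4.0, 4.0, 20)
for _ in range(2000):
    particles = svgd_step(particles, lambda p: -p)
```

After the loop the particles approximate the target distribution rather than a single point, which is the behavior needed when depth ambiguity makes several needle poses equally consistent with one monocular view.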

7

Edge Radar Material Classification Under Geometry Shifts

Jannik Hohmann, Dong Wang, Andreas Nüchter

Material classification improves robotic navigation and interaction in conditions where cameras and LiDAR degrade in performance. This work presents a lightweight mmWave radar material classification pipeline optimized for ultra-low-power edge devices that achieves 94.2% macro-F1 score under nominal training geometry. The work also identifies a significant performance drop under realistic geometry shifts such as sensor height changes and small tilts, highlighting a key open challenge for edge radar perception.
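The 94.2% figure above is a macro-F1 score, which weights every material class equally regardless of how often it appears. A minimal sketch of the metric (the class labels are placeholders):

```python
def macro_f1(y_true, y_pred, classes):
    """Macro-F1: unweighted mean of per-class F1 scores, so a rare
    material class counts as much as a common one (unlike accuracy)."""
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores.append(f1)
    return sum(scores) / len(scores)
```

This equal weighting is why the reported geometry-shift drop matters: degradation on even one material class pulls the macro average down directly.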

8

Strain-Parameterized Coupled Dynamics and Dual-Camera Visual Servoing for Aerial Continuum Manipulators

Niloufar Amiri, Farrokh Janabi-Sharifi

Tendon-driven aerial continuum manipulators combine the maneuverability of UAVs with the compliance of continuum robots, but existing coupled dynamic models have high computational cost and do not explicitly account for underactuation of the aerial base. This work presents a generalized dynamic formulation for underactuated coupled TD-ACMs that integrates a strain-parameterized Cosserat rod model with a rigid-body UAV model into a unified framework. The approach also includes a dual-camera visual servoing control scheme for this class of manipulators.

9

Learning Multi-Agent Local Collision-Avoidance for Collaborative Carrying Tasks with Coupled Quadrupedal Robots

Francesca Bray, Simone Tolomei, Andrei Cramariuc, Cesar Cadena, Marco Hutter

Collaborative carrying by multiple quadrupedal robots has great potential for warehouse and construction applications, but existing coordination methods mostly assume obstacle-free environments or rely on pre-recorded maps and off-line planning, making them unsuitable for most real-world scenarios. This work focuses on local collision avoidance for two mechanically coupled quadrupedal robots performing collaborative carrying. The work proposes a learned approach that enables adaptive, on-the-fly collision avoidance without prior maps, supporting deployment in unstructured real environments.

10

A Multimodal Framework for Human-Multi-Agent Interaction

Shaid Hasan, Breenice Lee, Sujan Sarker, Tariq Iqbal

Human-robot interaction is increasingly moving toward multi-robot socially interactive environments, but existing systems struggle to unify multimodal perception, embodied expression, and coordinated decision-making into a single scalable framework. This work introduces a multimodal framework for human-multi-agent interaction where each individual robot acts as an autonomous cognitive agent with integrated multimodal perception and LLM-driven planning grounded in embodiment. A central team-level coordination module manages shared interaction goals to enable natural human interaction with a robot team in shared physical spaces.

11

Efficient Hybrid SE(3)-Equivariant Visuomotor Flow Policy via Spherical Harmonics for Robot Manipulation

Qinglun Zhang, Shen Cheng, Tian Dan, Haoqiang Fan, Guanghui Liu, Shuaicheng Liu

SE(3)-equivariant policies improve data efficiency for robotic manipulation, but existing methods suffer from high computational cost, reliance on single-modality inputs, and instability when combined with fast sampling methods. This work introduces E3Flow, a hybrid SE(3)-equivariant visuomotor flow policy framework built on spherical harmonic representations that unifies efficient rectified flow with stable multi-modal equivariant learning for the first time. The approach addresses key limitations of existing equivariant diffusion policies, enabling more practical deployment for manipulation tasks.

12

AeroScene: Progressive Scene Synthesis for Aerial Robotics

Nghia Vu, Tuong Do, Dzung Tran, Binh X. Nguyen, Hoan Nguyen, Erman Tjiputra, Quang D. Tran, Hai-Nguyen Nguyen, Anh Nguye...

Drone simulators currently rely heavily on manual scene creation, which is time-consuming and difficult to scale, despite the growing impact of generative models across robotics. This work introduces AeroScene, a hierarchical diffusion model for progressive 3D scene synthesis specifically for aerial robotics simulation. The approach uses hierarchy-aware tokenization and multi-branch feature extraction to jointly reason about global scene layout and local details, ensuring physical plausibility of generated scenes and reducing manual effort for simulation environment creation.

13

Path Planning and Reinforcement Learning-Driven Control of On-Orbit Free-Flying Multi-Arm Robots

Álvaro Belmonte-Baeza, José Luis Ramón, Leonard Felicetti, Miguel Cazorla, Jorge Pomares

On-orbit servicing requires reliable motion planning and control for free-flying multi-arm robots, which must handle dynamic and kinematic constraints as well as uncertainty in the space environment. This work presents a hybrid approach that integrates trajectory optimization (TO) for feasible path generation with reinforcement learning (RL) for adaptive trajectory tracking under uncertainty. The multi-arm robot design includes thrusters for body control, enabling redundancy and stability for complex space operations, while the hybrid approach reduces tracking error compared to single-method baselines.

14

LiZIP: An Auto-Regressive Compression Framework for LiDAR Point Clouds

Aditya Shibu, Kayvan Karim, Claudio Zito

The large data volume generated by LiDAR sensors in autonomous vehicles creates processing and V2X transmission bottlenecks. Existing lossless compression methods face a tradeoff between adaptability and computational cost: standard algorithms like LASzip lack adaptability, while deep learning approaches have prohibitive computational overhead. This work introduces LiZIP, a lightweight, near-lossless zero-drift compression framework based on neural predictive coding that uses a compact MLP to predict point coordinates from local context, balancing compression performance and computational efficiency.
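The predictive-coding idea behind LiZIP can be sketched in miniature. The paper uses a compact MLP as the predictor; the sketch below swaps in a simple linear extrapolation on one integer coordinate stream to show the round trip, and the example values are invented.

```python
def encode(coords):
    """Predictive coding: transmit residuals against a predictor instead
    of raw values. On smooth scan lines the residuals are near zero, so a
    downstream entropy coder (not shown) can pack them into far fewer bits.
    A linear extrapolation p[i] = 2*p[i-1] - p[i-2] stands in for LiZIP's
    learned MLP predictor."""
    residuals = list(coords[:2])                  # first two sent verbatim
    for i in range(2, len(coords)):
        pred = 2 * coords[i - 1] - coords[i - 2]
        residuals.append(coords[i] - pred)
    return residuals

def decode(residuals):
    """Invert the encoder exactly: rerun the predictor and add residuals."""
    coords = list(residuals[:2])
    for r in residuals[2:]:
        coords.append(2 * coords[-1] - coords[-2] + r)
    return coords
```

Because encoder and decoder run the same predictor, integer reconstruction is exact, which is the "zero-drift" property the summary highlights.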

15

PHANTOM Hand

Teng Yan, Jiongxu Chen, Qixiang Hua, Yue Yu, Zihang Wang, Yaohua Liu, Bingzhuo Zhong

Tendon-driven underactuated robotic hands excel at adaptive grasping but suffer from kinematic unpredictability and nonlinear force transmission, limiting precise shaping and reliable payload handling in complex manipulation. This work introduces the PHANTOM Hand, a 1:1 human-scale modular underactuated hand with 6 actuators and 15 degrees of freedom. The proposed unified framework bridges the gap between precise analytic motion shaping and robust compliant grasping, addressing the core limitations of traditional underactuated designs.

16

Active Robotic Perception for Disease Detection and Mapping in Apple Trees

Hayden Feddock, Francisco Yandun, Srđan Aćimović, Abhisesh Silwal

Large-scale commercial apple orchards require timely disease monitoring, but manual scouting is labor-intensive, expensive, and often detects outbreaks too late and only at coarse spatial resolution. This work presents an autonomous mobile active perception system that scans dormant trees for targeted detection and high-resolution mapping of fire blight, one of the most devastating diseases affecting apple trees. The system integrates flash-illuminated stereo RGB sensing to enable automated, scalable disease monitoring that can improve orchard management outcomes.

17

AirSimAG: A High-Fidelity Simulation Platform for Air-Ground Collaborative Robotics

Yangjie Cui, Xin Dong, Boyang Gao, Jinwu Xiang, Daochun Li, Zhan Tu

Heterogeneous air-ground collaborative robot systems have strong potential for applications like search and rescue, surveillance, and environmental monitoring, but existing simulation platforms are mostly designed for single-agent dynamics and lack dedicated tools for interactive air-ground collaboration. This work presents AirSimAG, a high-fidelity simulation platform specifically for air-ground collaborative robotics built on the existing AirSim framework. The platform enables realistic testing and development of heterogeneous multi-agent collaboration algorithms for real-world applications.

18

Learning Actuator-Aware Spectral Submanifolds for Precise Control of Continuum Robots

Paul Leonard Wolff, Hugo Buurmeijer, Luis Pabon, John Irvin Alora, Mark Leone, Roshan S. Kaundinya, Amirhossein Kazemipo...

Continuum robots have high-dimensional nonlinear dynamics that are tightly coupled with their actuation mechanisms, making accurate and efficient control challenging. Spectral submanifold reduction is a leading method for reducing high-dimensional nonlinear systems to low-dimensional invariant manifolds, but existing approaches do not explicitly incorporate actuation. This work introduces control-augmented spectral submanifolds (caSSMs) that explicitly include control inputs in the state representation to capture nonlinear state-actuation couplings, enabling more precise control of continuum robots.

19

YOLOv10 with Kolmogorov-Arnold networks and vision-language foundation models for interpretable object detection and trustworthy multimodal AI in computer vision perception

Marios Impraimakis, Daniel Vazquez, Feiyu Zhou

Autonomous vehicle perception systems lack transparency about the reliability of object detection confidence scores in visually degraded or ambiguous scenes, creating a safety challenge for deployment. This work examines a modified YOLOv10 detector that uses Kolmogorov-Arnold networks as an interpretable post-hoc surrogate to model detection trustworthiness using geometric and semantic features. The approach improves the interpretability and trustworthiness of object detection for autonomous driving and other robotic perception applications.

20

Generative Event Pretraining with Foundation Model Alignment

Jianwen Cao, Jiaxu Xing, Nico Messikommer, Davide Scaramuzza

Event cameras offer robust visual sensing under fast motion and challenging illumination, but their unique data format and limited labeled data make it difficult to train transferable event-based visual foundation models. This work introduces Generative Event Pretraining (GEP), a two-stage framework that transfers semantic knowledge from large-scale internet image datasets to event data while learning event-specific temporal features. The approach addresses the data scarcity challenge for event-based perception, enabling better transfer learning across downstream robotics tasks.

21

Task-Aware Positioning for Improvisational Tasks in Mobile Construction Robots via an AI Agent with Multi-LMM Modules

Seongju Jang, Francis Baek, SangHyun Lee

Construction sites are highly dynamic, requiring robots to handle improvisational tasks where task locations, timing, and context are not known in advance, but existing mobile construction robot work rarely addresses this class of tasks. This work proposes an LMM-based AI agent that understands natural language instructions for improvisational tasks, identifies the required task location, and positions the robot accordingly. The agent decomposes functionality into three parallel Large Multimodal Model modules, enabling robust performance on unstructured construction site tasks.

22

Agile-VLA: Few-Shot Industrial Pose Rectification via Implicit Affordance Anchoring

Teng Yan, Zhengyang Pei, Chengyu Shi, Yue Yu, Yikun Chen, Zilong Zhu, Zelin Fang, Kaile Guo, Zihang Wang, Peigen Tian, B...

Deploying Vision-Language-Action (VLA) models on resource-constrained edge devices faces a fundamental conflict between high-latency semantic inference and the high-frequency control required for dynamic industrial manipulation. This work introduces Agile-VLA, a hierarchical framework for industrial pose rectification designed for edge devices like the NVIDIA Jetson Orin Nano. The core innovation is Implicit Affordance Anchoring, which maps geometric visual cues directly to structured parametric action predictions, reducing latency to enable real-time edge control.

23

Grounding Sim-to-Real Generalization in Dexterous Manipulation: An Empirical Study with Vision-Language-Action Models

Ruixing Jin, Zicheng Zhu, Ruixiang Ouyang, Sheng Xu, Bo Yue, Zhizheng Wu, Guiliang Liu

Sim-to-real transfer is critical for learning dexterous manipulation policies, as real-world data collection is prohibitively expensive, but there is a lack of empirical research grounding sim-to-real methods in real-world dexterous manipulation tasks, especially for generalist Vision-Language-Action models. This work presents a systematic empirical study of how different sim-to-real generalization approaches perform for VLA-based dexterous manipulation. The study provides empirical insights to guide future development of more reliable sim-to-real methods for generalist dexterous manipulation policies.

24

DecompGrind: A Decomposition Framework for Robotic Grinding via Cutting-Surface Planning and Contact-Force Adaptation

Shunsuke Araki, Takumi Hachimine, Yuki Saito, Kouhei Ohnishi, Jun Morimoto, Takamitsu Matsubara

Robotic grinding is a widely used manufacturing process, but automating efficient grinding for workpieces of varying shapes and material hardness remains challenging due to variable removal resistance and the difficulty of modeling shape transitions and learning across diverse conditions. This work introduces DecompGrind, a decomposition framework for robotic grinding that splits the problem into cutting-surface planning and contact-force adaptation subproblems. The decomposition approach addresses the challenges of varying contact conditions, enabling more efficient and flexible automated grinding without requiring large amounts of task-specific training data.

25

CATNAV: Cached Vision-Language Traversability for Efficient Zero-Shot Robot Navigation

Aditya Potnis, Francisco Affonso, Shreya Gummadi, Naveen Kumar Uppalapati, Girish Chowdhary

Zero-shot robot navigation in unstructured environments requires assessing traversability relative to a robot's specific embodiment, but existing approaches require task-specific training or high rates of expensive VLM inference. This work introduces CATNAV, a cost-aware traversability navigation framework that uses multimodal LLMs to enable zero-shot embodiment-aware costmap generation without task-specific training. A novel visuosemantic caching mechanism reduces online VLM queries by 85.7% by reusing prior risk assessments for semantically similar frames, enabling efficient real-time deployment.
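The caching idea behind that 85.7% query reduction can be sketched as a similarity-gated lookup. This is an illustrative sketch, not CATNAV's implementation: the embedding source, the cosine threshold, and the linear scan over entries are all assumptions.

```python
class VisuosemanticCache:
    """Reuse a prior VLM risk assessment when a new frame embedding is
    close enough (by cosine similarity) to one already assessed; only
    cache misses trigger an expensive online VLM query."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold
        self.entries = []                      # list of (embedding, assessment)

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = sum(x * x for x in a) ** 0.5
        nb = sum(x * x for x in b) ** 0.5
        return dot / (na * nb) if na and nb else 0.0

    def lookup(self, embedding):
        """Return a cached assessment, or None (caller must query the VLM)."""
        best = max(self.entries,
                   key=lambda e: self._cosine(e[0], embedding),
                   default=None)
        if best and self._cosine(best[0], embedding) >= self.threshold:
            return best[1]
        return None

    def store(self, embedding, assessment):
        self.entries.append((embedding, assessment))
```

The threshold trades safety for savings: a looser threshold reuses more assessments but risks applying a stale risk estimate to a genuinely novel scene.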

26

PhotoAgent: A Robotic Photographer with Spatial and Aesthetic Understanding

Lirong Che, Zhenfeng Gan, Yanbo Chen, Junbo Tan, Xueqian Wang

Embodied robotic photographers must bridge the semantic gap between high-level natural language aesthetic commands and low-level geometric camera control, a challenge that has not been well addressed in prior work. This work introduces PhotoAgent, which integrates LMM chain-of-thought reasoning with an analytical control framework to solve this problem. PhotoAgent first translates subjective aesthetic goals into geometric constraints to compute an initial high-quality viewpoint, then iteratively refines the pose via visual reflection in a photorealistic internal simulator, enabling high-quality robotic photography from natural language instructions.

27

Instrument-Splatting++: Towards Controllable Surgical Instrument Digital Twin Using Gaussian Splatting

Shuojue Yang, Zijian Wu, Chengjiaao Liao, Qian Li, Daiyun Shen, Chang Han Low, Septimiu E. Salcudean, Yueming Jin

Controllable high-fidelity digital twins of surgical instruments are critical for Real2Sim transfer and synthetic data generation for robot-assisted surgery. This work presents Instrument-Splatting++, a monocular 3D Gaussian Splatting framework that reconstructs surgical instruments as fully controllable high-fidelity digital assets. The pipeline uses part-wise geometry pretraining to inject CAD priors into Gaussian primitives, enabling part-aware semantic rendering and controllable pose adjustment for simulation and training.

28

DiSCo: Diffusion Sequence Copilots for Shared Autonomy

Andy Wang, Xu Yan, Brandon McMahan, Michael Zhou, Yuyang Yuan, Johannes Y. Lee, Ali Shreif, Matthew Li, Zhenghao Peng, B...

Shared autonomy combines human input with AI copilot correction to improve performance on complex control tasks like robotic teleoperation, but existing copilot methods struggle to generate action sequences that align consistently with past user behavior and goals. This work introduces DiSCo (Diffusion Sequence Copilots), a diffusion-based shared autonomy method that plans full action sequences consistent with past user actions. The approach significantly improves task performance for human control of high-dimensional robotic systems by generating context-aware corrective action sequences.
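For context, the classic shared-autonomy baseline that sequence-level copilots like DiSCo improve on is simple linear arbitration between the human command and an autonomous correction. This sketch is that baseline, not the paper's diffusion sampler; the blending weight `alpha` and the per-step action format are assumptions.

```python
def arbitrate(user_action, copilot_action, alpha):
    """Linear shared-autonomy arbitration for one timestep: alpha = 0 is
    full human control, alpha = 1 is full autonomy."""
    return [(1 - alpha) * u + alpha * c
            for u, c in zip(user_action, copilot_action)]

def arbitrate_sequence(user_seq, copilot_seq, alpha):
    """Apply the same per-step blend over a whole action sequence; unlike
    DiSCo, this baseline has no model of the user's past behavior or goal."""
    return [arbitrate(u, c, alpha) for u, c in zip(user_seq, copilot_seq)]
```

The limitation the summary describes is visible here: each step is corrected independently, so nothing enforces consistency with the user's earlier actions, which is the gap a sequence-level diffusion copilot targets.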

29

SG-VLA: Learning Spatially-Grounded Vision-Language-Action Models for Mobile Manipulation

Ruisen Tu, Arth Shukla, Sohyun Yoo, Xuanlin Li, Junxi Li, Jianwen Xie, Hao Su, Zhuowen Tu

Vision-Language-Action models show promise for generalist robotic control, but their performance remains subpar for mobile manipulation in complex household environments, which requires reasoning about global scene layout, fine-grained geometry, and high-dimensional continuous actions that exceed the capabilities of standard imitation learning. This work introduces SG-VLA, a framework for learning spatially-grounded VLA models for mobile manipulation that strengthens perception and representation via auxiliary task co-training and multi-modal input enhancement. The approach improves performance on 13-dimensional continuous control for mobile manipulation tasks in complex household environments.