PhyInfoEQA: Towards Human-like Embodied QA with Integrated Physical and Informational Reasoning

Anonymous Authors
Embodied Question Answering · Physical & Informational Reasoning
Preprint · 2026

Key Features

  • First benchmark for integrated physical and informational reasoning. PhyInfoEQA is the first unified benchmark for active embodied question answering that requires agents to jointly reason over physical observations and dynamically acquired digital information.
  • Industrial embodied QA benchmark with diverse scenarios. Built on NVIDIA Isaac Sim, the benchmark contains 1,016 valid task instances spanning 22 industrial task scenarios, covering equipment diagnosis, workflow verification, resource management, and safety compliance.
  • TriPMA active cognitive framework. The proposed TriPlanner–Manager–Actor (TriPMA) framework is a question-driven active cognitive framework that integrates planning, scene graph maintenance, and multi-source information acquisition.

PhyInfoEQA Demo Videos

This demo presents the full PhyInfoEQA pipeline, including embodied exploration, scene graph construction, physical-information fusion, active reasoning, and industrial question answering in complex environments.

PhyInfoEQA Demo — Embodied Physical & Informational Reasoning

The video demonstrates active industrial exploration, scene graph updating, digital information acquisition, and multi-step embodied reasoning using TriPMA.

TriPMA Framework Overview

Fig. 1. Overview of the proposed TriPMA framework.

A concrete workflow example of TriPMA. Given the question "Why did conveyor belt A fail and stop?", the agent reasons over both physical and informational spaces: it navigates across regions (Assembly line work area → Electrical room), maintains an object information list, collects physical observations and digital information on demand, and dynamically updates the scene graph until the question becomes answerable.
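The workflow above can be pictured as a collect, check, and navigate loop. The following is only a minimal sketch under stated assumptions, not the authors' implementation: `AgentState`, `WORLD`, `REQUIRED`, `is_answerable`, `next_region`, and `run_episode` are hypothetical names, and the facts stored in `WORLD` (for example a tripped breaker or an overload log entry) are invented purely for illustration and do not come from the dataset.

```python
from dataclasses import dataclass, field


@dataclass
class AgentState:
    """Hypothetical container for what the agent has gathered so far."""
    region: str
    facts: dict = field(default_factory=dict)    # (object, attribute) -> (value, source)
    visited: list = field(default_factory=list)  # regions in visiting order


# Toy world (invented values): each region exposes facts tagged with their source.
WORLD = {
    "Assembly line work area": {
        ("conveyor belt A", "motion"): ("stopped", "physical"),
    },
    "Electrical room": {
        ("conveyor belt A", "breaker"): ("tripped", "physical"),
        ("conveyor belt A", "fault log"): ("overload", "digital"),
    },
}

# Assumed set of facts that must be known before the question counts as answerable.
REQUIRED = [
    ("conveyor belt A", "motion"),
    ("conveyor belt A", "breaker"),
    ("conveyor belt A", "fault log"),
]


def is_answerable(state, required):
    """Stand-in for the answerability judgment: all required facts are known."""
    return all(key in state.facts for key in required)


def next_region(state, world):
    """Stand-in for the navigation decision: first region not yet visited."""
    for region in world:
        if region not in state.visited:
            return region
    return None


def run_episode(start_region, world, required, max_steps=10):
    """Collect facts in the current region, then move on until answerable."""
    state = AgentState(region=start_region)
    for _ in range(max_steps):
        state.visited.append(state.region)
        # Gather both physical observations and digital information on demand.
        state.facts.update(world[state.region])
        if is_answerable(state, required):
            return state, True
        target = next_region(state, world)
        if target is None:
            break  # nothing left to explore
        state.region = target
    return state, False
```

Calling `run_episode("Assembly line work area", WORLD, REQUIRED)` on this toy world visits the assembly line area first, finds the question not yet answerable, and then moves to the electrical room, mirroring the cross-region step in the example.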

PhyInfoEQA Dataset

Fig. 2. Dataset examples and statistics of PhyInfoEQA.

Task examples and dataset statistics of the PhyInfoEQA dataset.

Core Idea

PhyInfoEQA addresses a critical limitation in existing Embodied Question Answering systems: most current benchmarks rely solely on physical perception and lack support for dynamic informational reasoning.

We introduce PhyInfoEQA, the first unified benchmark that requires embodied agents to jointly reason across physical observations and acquired digital information.

We further propose TriPMA, a question-driven active cognitive framework integrating planning, scene understanding, active exploration, and information acquisition for industrial embodied intelligence.

Highlights

  • 1,016 industrial QA tasks
  • 22 industrial task scenarios
  • TriPMA: active reasoning framework
  • 67.52% / 70.28% (DS / RS)

TriPlanner–Manager–Actor Architecture

Fig. 3. Detailed architecture of TriPMA.

Overview of the proposed TriPMA framework for the PhyInfoEQA task. TriPMA consists of three functional modules: TriPlanner (Planner 1 for answerability judgment, Planner 2 for target decomposition, and Planner 3 for navigation decisions), Manager (maintaining an object information list and a hierarchical scene graph), and Actor (executing cross-region navigation and in-region exploration via multi-source value maps).
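To make the Manager's role more concrete, here is a hedged sketch of one way an object information list and a two-level (region to object) scene graph could be kept in sync. It is an assumption-laden illustration rather than the TriPMA implementation: the class names `ObjectInfo` and `Manager`, the methods `update` and `missing`, and the attribute names used in the usage note are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectInfo:
    """Hypothetical entry of the object information list."""
    name: str
    region: str
    physical: dict = field(default_factory=dict)  # attributes observed in the scene
    digital: dict = field(default_factory=dict)   # attributes acquired from digital sources


class Manager:
    """Keeps an object information list and a region -> objects hierarchy."""

    def __init__(self):
        self.objects = {}      # object name -> ObjectInfo
        self.scene_graph = {}  # region name -> set of object names

    def update(self, name, region, source, attribute, value):
        """Record one attribute and make sure the object sits under its region."""
        if source not in ("physical", "digital"):
            raise ValueError("source must be 'physical' or 'digital'")
        info = self.objects.get(name)
        if info is None:
            info = ObjectInfo(name=name, region=region)
            self.objects[name] = info
        self.scene_graph.setdefault(region, set()).add(name)
        # Store the value in the dictionary that matches its source.
        getattr(info, source)[attribute] = value
        return info

    def missing(self, name, needed):
        """Return the needed attributes not yet known from either source."""
        info = self.objects.get(name)
        known = set() if info is None else set(info.physical) | set(info.digital)
        return [attr for attr in needed if attr not in known]
```

For example, after `update("robotic arm", "Assembly line work area", "physical", "intact", True)`, calling `missing("robotic arm", ["intact", "maintenance record"])` reports that only the (invented) digital attribute `"maintenance record"` still has to be acquired, which is the kind of gap a planner could turn into the next exploration target.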

Qualitative Reasoning Case

Fig. 4. Qualitative reasoning case of TriPMA.

The two RGB-D images used by TriPMA for information acquisition: (a) capturing the equipment status and intactness information of the robotic arm; and (b) capturing the occupancy status of the vehicle bays.

Main Contributions

  • Benchmark: First unified benchmark for active EQA with joint physical and informational reasoning.
  • Dataset: 1,016 industrial tasks on NVIDIA Isaac Sim covering four reasoning types and four layout types.
  • Framework: TriPMA, a question-driven active cognitive framework for integrated perception and information reasoning.

BibTeX

@article{phyinfoeqa2026,
  title={PhyInfoEQA: Towards Human-like Embodied QA with Integrated Physical and Informational Reasoning},
  author={Anonymous Authors},
  journal={arXiv preprint},
  year={2026}
}