We propose Wonderful Team, a zero-shot, single-model, multi-agent system for solving visual robotics tasks. Taking inspiration from recent advances in the multi-agent LLM literature, our system employs specialized agents that collaboratively manage different aspects of a task, from high-level planning to low-level execution. Each agent is responsible for a separate component of task execution: planning, object identification and location, action proposal, memory, and self-correction.
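
To make the division of labor concrete, here is a minimal sketch of how role-specialized agents might share one model and one memory log. It is an illustration only and not the Wonderful Team implementation: the names `Agent`, `Memory`, `solve`, and `fake_model`, the role names, and the retry loop are all assumptions made for this example, and `fake_model` is a deterministic placeholder standing in for a real vision-language model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for the single model shared by every agent:
# it takes a role name and a prompt and returns a text reply.
Model = Callable[[str, str], str]


@dataclass
class Memory:
    """Shared log that every agent appends to (illustrative only)."""
    entries: List[str] = field(default_factory=list)

    def record(self, role: str, text: str) -> None:
        self.entries.append(f"{role}: {text}")


@dataclass
class Agent:
    """One role-specialized wrapper around the shared model."""
    role: str
    model: Model
    memory: Memory

    def run(self, prompt: str) -> str:
        reply = self.model(self.role, prompt)
        self.memory.record(self.role, reply)
        return reply


def solve(task: str, model: Model, max_retries: int = 2) -> Tuple[Dict[str, str], Memory]:
    """Run plan -> locate -> act -> check, retrying when the checker objects."""
    memory = Memory()
    agents = {r: Agent(r, model, memory) for r in ("planner", "locator", "actor", "checker")}
    result: Dict[str, str] = {}
    prompt = task
    for _ in range(max_retries + 1):
        result["plan"] = agents["planner"].run(prompt)
        result["objects"] = agents["locator"].run(result["plan"])
        result["action"] = agents["actor"].run(result["objects"])
        result["verdict"] = agents["checker"].run(result["action"])
        if result["verdict"] == "ok":
            break
        # Self-correction: feed the checker's complaint back to the planner.
        prompt = f"{task} (revise: {result['verdict']})"
    return result, memory


def fake_model(role: str, prompt: str) -> str:
    """Deterministic placeholder so the sketch runs without a real model."""
    if role == "checker":
        return "ok"
    return f"{role} output for [{prompt}]"
```

Calling `solve("put the apple in the bowl", fake_model)` returns the final per-role outputs together with the shared memory log. In a real system, each role would receive its own prompt and image context rather than the previous agent's raw text.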

If you find Wonderful Team useful in your research or applications, please consider citing it using the following BibTeX entry:
@misc{wang2024wonderfulteam,
  title={Wonderful Team: Zero-Shot Physical Task Planning with Visual LLMs},
  author={Zidan Wang and Rui Shen and Bradly Stadie},
  year={2024},
  eprint={2407.19094},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2407.19094},
}