Realizing a General-Purpose Service Robot System That Recovers from Its Own Failures and Keeps Running, Using Promptable Foundation Models (RCJ2024)

May 05, 24

#GPSR #ServiceRobot #PromptEngineering #FoundationModels #FailureRecovery

Slide overview

TRAIL's Open Challenge presentation material for RoboCup@Home Japan Open 2024

TRAIL | Matsuo-Iwasawa Laboratory

@trail_ut

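TRAIL (Tokyo Robotics and AI Lab) is a subgroup of the Matsuo-Iwasawa Laboratory at the University of Tokyo. Aiming to realize intelligence in the real world, it carries out research and development centered on robot learning.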

Text of each page

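RoboCup JapanOpen 2024 Technical Challenge
Realizing a general-purpose service robot system that recovers from its own failures and keeps running, using promptable foundation models
Accepted at ICRA 2024; demo scheduled for 5/16
The University of Tokyo TRAIL
Website 1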

GPSR (General Purpose Service Robot): a robot that adapts flexibly to a variety of tasks and situations and provides services in response to human requests. 2

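The gap between GPSR competitions and real environments
RoboCup (competition): a GPSR event exists
• Teams compete on how many tasks they can complete
• To score points, giving up after a failure and moving on to the next task is an effective strategy 3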

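The gap between GPSR competitions and real environments
RoboCup (competition): a GPSR event exists
• Teams compete on how many tasks they can complete
• To score points, giving up after a failure and moving on to the next task is an effective strategy
Real home environment: it is a problem if a requested task is not carried out
Can we build a system that, even after failing once, recovers automatically and keeps providing service without stopping? 4 https://global.toyota/jp/download/8725271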

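Difficulties of GPSR in real environments
Challenge in real home environments: complex commands
- unnecessary information
- missing necessary information
Skills required of the robot: extracting the necessary information and filling in the missing information
Example: "I'm going shopping, so please put the books away while I'm out." 🤖 Unnecessary? 🤖 On the bookshelf? 5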

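Difficulties of GPSR in real environments
Challenge in real home environments: dynamic environments
- objects get moved
- people move around
Skill required of the robot: acting on the basis of inference
Example: 🤖 "I can't find the apple!" → 🤖 "Maybe someone put it in the refrigerator?" 6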

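Prompts for foundation models
A prompt is an (additional) input used to obtain the desired output from a model
Example: improving reasoning ability in large language models (LLMs)*1
Example: improving recognition accuracy in image recognition models (VLMs) 7
*1: "Large Language Models are Zero-Shot Reasoners", NeurIPS2022 (a result from our laboratory)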
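As a small illustration of "an additional input that changes the output", the paper cited on this slide reports that appending the phrase "Let's think step by step." to a question improves LLM reasoning. The sketch below only builds such a prompt string; the function name and the example question are hypothetical, and no model is called.

```python
# Trigger phrase reported in "Large Language Models are Zero-Shot Reasoners".
ZERO_SHOT_COT_TRIGGER = "Let's think step by step."


def build_zero_shot_cot_prompt(question: str) -> str:
    """Wrap a question in a Q/A layout and append the reasoning trigger.

    Hypothetical helper for illustration; it is not taken from the slides.
    """
    return f"Q: {question.strip()}\nA: {ZERO_SHOT_COT_TRIGGER}"


if __name__ == "__main__":
    # Example question invented for this sketch.
    print(build_zero_shot_cot_prompt(
        "The apple is not on the desk. Where should the robot look next?"
    ))
```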

TRAIL's GPSR system
Uses four foundation models that can be tuned through prompts
[Diagram: Command ("Bring me a bowl."); Environment; Speech Recognition (Silero VAD: Segmentation; Whisper: Transcription); Object Recognition (Detic: Segmentation; CLIP: Classification); Semantic Mapping (CLIP-Fields: Environmental Information); Planning (GPT: LLM Plan); Executable Plan; legend: Foundation Model] 8

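Realizing a system that recovers from failure automatically
The prompt is updated with context such as a description of the failure situation and dialogue with a human
The system recovers from the failure by reconsidering the execution plan and re-running recognition
[Diagram labels: Command: "Bring me an apple..."; Skill 1 to Skill 5; go_to; pick; ?; incorrect skill; failure; alternative skill; human feedback; "That's wrong!"; ask; "Kitchen"; prompt update; plan regeneration; M1, M2, M3; LLM; Human] 9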
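The recovery idea on this slide (execute, detect a failure, update the prompt, regenerate the plan) can be sketched as a short loop. This is a minimal sketch under stated assumptions, not the TRAIL implementation: `run_with_recovery`, `DemoRobot`, and `demo_planner` are hypothetical names, the planner is a stub standing in for the LLM, and the failure sentence appended to the prompt is invented for the example.

```python
from typing import Callable, Dict, List, Optional

Step = Dict[str, str]


def run_with_recovery(command: str,
                      plan_fn: Callable[[str], List[Step]],
                      execute_fn: Callable[[Step], bool],
                      max_replans: int = 3) -> bool:
    """Run a plan; on the first failed step, extend the prompt and replan."""
    prompt = command
    for _ in range(max_replans + 1):
        failed: Optional[Step] = None
        for step in plan_fn(prompt):
            if not execute_fn(step):
                failed = step
                break
        if failed is None:
            return True  # every step of the current plan succeeded
        # Prompt update: describe the failure so the planner can react to it.
        prompt += (f"\nThe robot tried {failed}, but failed."
                   " What should the robot do next?")
    return False


class DemoRobot:
    """Toy executor (assumption): the apple can only be found at the table."""

    def __init__(self) -> None:
        self.location = ""

    def execute(self, step: Step) -> bool:
        if step["function"] == "go_to_location":
            self.location = step["location"]
            return True
        return self.location == "table"


def demo_planner(prompt: str) -> List[Step]:
    """Stub standing in for the LLM: try the desk first, the table after a failure."""
    location = "table" if "failed" in prompt else "desk"
    return [
        {"function": "go_to_location", "location": location},
        {"function": "find_concrete_name_objects", "objects": "apple"},
    ]


if __name__ == "__main__":
    robot = DemoRobot()
    print(run_with_recovery("Bring me the apple.", demo_planner, robot.execute))
```

Capping the number of replans keeps the loop from running forever when no alternative plan works; how the real system bounds its retries is not stated in the slides.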

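Listening to the command
Whisper: a promptable speech recognition model
Silero VAD segments the speech intervals, and Whisper transcribes the segmented audio
[Diagram: Command ("Bring me a bowl."); Speech Recognition (Silero VAD: Segmentation; Whisper: Transcription)]
Supplying hard-to-hear words to Whisper as a prompt improves transcription accuracy 10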
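The last point can be sketched as assembling a hint string from a list of hard-to-hear words. The helper below is hypothetical and only builds the string. In the open-source `whisper` package, `model.transcribe(...)` accepts an `initial_prompt` argument that such a string could be passed to; whether the TRAIL system uses that exact route is an assumption. The example words are taken from the demo command on the final slide.

```python
from typing import Iterable, List


def build_vocabulary_prompt(words: Iterable[str],
                            prefix: str = "Vocabulary: ") -> str:
    """Join hard-to-hear words into one hint string.

    Blank entries and case-insensitive duplicates are dropped.
    The prefix wording is an assumption, not taken from the slides.
    """
    kept: List[str] = []
    seen = set()
    for word in words:
        cleaned = word.strip()
        if cleaned and cleaned.lower() not in seen:
            seen.add(cleaned.lower())
            kept.append(cleaned)
    if not kept:
        return ""
    return prefix + ", ".join(kept) + "."


if __name__ == "__main__":
    hint = build_vocabulary_prompt(["HSR", "Red Bull", "Ashley", "Robin"])
    print(hint)
    # With the `whisper` package installed, the hint could then be supplied as:
    #   model.transcribe(audio_segment, initial_prompt=hint)
```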

Generating the execution plan
GPT-4: a promptable large language model
[Diagram: system overview with Command ("Bring me a bowl."); Environment; Speech Recognition (Silero VAD: Segmentation; Whisper: Transcription); Object Recognition; Semantic Mapping; Planning (GPT: LLM Plan); Executable Plan; legend: Foundation Model]
The task environment and settings, the hardware constraints, and similar conditions are given in the prompt, and the execution plan is generated in two stages.
SystemMessage(content="You are a helpful assistant for a robot."
    " The robot is in a house."
    " Things are organized according to categories in the house. For example, …"
    " Your mission is to convert natural language command into a list of sentences."
    " The robot will execute the sentences in order to complete the task."
    " The operator is at the instruction point and the robot is initially there."
    " The robot has only one arm and can only carry one object at a time. So,…."
    " The robot knows the location of furniture."
    f" The known location names are {location_name_list}."
    " You are good at predicting where each object is located. "
    " So, if a specific furniture to look for the object is not provided, predict…."
    " Just tell the robot what how to perform in order…."
    " The robot has to find the object before grasping it."
    " The robot has to go to a location before finding an object."
    " The robot has to go back to the instruction point."
),
11

Generating the execution plan
Command: "Could you bring me an apple from the desk?"
Stage 1 output: ['Move to the desk.', 'Find apple.', 'Pick up apple.', 'Move to the instruction point.', 'Hand over apple to the operator.']
Stage 2 output: [{'function': 'go_to_location', 'location': 'desk'}, {'function': 'find_concrete_name_objects', 'objects': 'apple'}, {'function': 'pick', 'object': 'apple'}, {'function': 'go_to_location', 'location': 'instruction point'}, {'function': 'hand_over', 'object': 'apple', 'person': 'operator'}] 12

Generating the execution plan
Command: "Could you bring me an apple from the desk?"
Stage 1 output: ['Move to the desk.', 'Find apple.', 'Pick up apple.', 'Move to the instruction point.', 'Hand over apple to the operator.']
Annotations: the robot's execution sequence is expressed as short sentences; the task is broken down into subtasks
Stage 2 output: [{'function': 'go_to_location', 'location': 'desk'}, {'function': 'find_concrete_name_objects', 'objects': 'apple'}, {'function': 'pick', 'object': 'apple'}, {'function': 'go_to_location', 'location': 'instruction point'}, {'function': 'hand_over', 'object': 'apple', 'person': 'operator'}] 13

Generating the execution plan
Command: "Could you bring me an apple from the desk?"
Stage 1 output: ['Move to the desk.', 'Find apple.', 'Pick up apple.', 'Move to the instruction point.', 'Hand over apple to the operator.']
Annotation: each subtask (subgoal) is mapped to an executable function
Stage 2 output: [{'function': 'go_to_location', 'location': 'desk'}, {'function': 'find_concrete_name_objects', 'objects': 'apple'}, {'function': 'pick', 'object': 'apple'}, {'function': 'go_to_location', 'location': 'instruction point'}, {'function': 'hand_over', 'object': 'apple', 'person': 'operator'}] 14
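One way to consume the stage 2 output is a dispatch table that maps each 'function' name to a callable and passes the remaining keys as keyword arguments. The sketch below uses the plan shown on the slide; the skill bodies are hypothetical stand-ins that only return a log string, and the real skill interfaces are not shown in the slides.

```python
from typing import Callable, Dict, List

Step = Dict[str, str]


# Hypothetical stand-ins for the robot's skills; each returns a log line.
def go_to_location(location: str) -> str:
    return f"go to {location}"


def find_concrete_name_objects(objects: str) -> str:
    return f"find {objects}"


def pick(object: str) -> str:  # parameter name mirrors the plan's 'object' key
    return f"pick {object}"


def hand_over(object: str, person: str) -> str:
    return f"hand {object} to {person}"


SKILLS: Dict[str, Callable[..., str]] = {
    "go_to_location": go_to_location,
    "find_concrete_name_objects": find_concrete_name_objects,
    "pick": pick,
    "hand_over": hand_over,
}

# Stage 2 output as shown on the slide.
PLAN: List[Step] = [
    {"function": "go_to_location", "location": "desk"},
    {"function": "find_concrete_name_objects", "objects": "apple"},
    {"function": "pick", "object": "apple"},
    {"function": "go_to_location", "location": "instruction point"},
    {"function": "hand_over", "object": "apple", "person": "operator"},
]


def execute_plan(plan: List[Step],
                 skills: Dict[str, Callable[..., str]] = SKILLS) -> List[str]:
    """Look up each step's skill by name and call it with the other keys."""
    log: List[str] = []
    for step in plan:
        name = step["function"]
        if name not in skills:
            raise KeyError(f"unknown skill: {name}")
        kwargs = {k: v for k, v in step.items() if k != "function"}
        log.append(skills[name](**kwargs))
    return log


if __name__ == "__main__":
    for line in execute_plan(PLAN):
        print(line)
```

Rejecting unknown skill names up front is a design choice of this sketch: it turns a malformed LLM output into an explicit error that a recovery step could act on.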

Object recognition
Detic: a promptable image segmentation model
CLIP: a promptable model that computes the similarity between images and language
Detic performs segmentation, and CLIP classifies the segmented images
[Diagram: Command ("Bring me a bowl."); Environment; Object Recognition (Detic: Segmentation; CLIP: Classification); Speech Recognition (Silero VAD: Segmentation; Whisper: Transcription)]
Supplying prompts makes objects easier to recognize
Detic: "rope": "a photo of a tangled white rope", "jump rope": "a photo of a green jump rope, a type of toy"
CLIP: "white rope": "a photo of a white rope", "jump rope": "a photo of a green jump rope" 15
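The classification step can be pictured as picking the label whose prompt embedding is most similar to the embedding of the segmented image. The sketch below shows only that selection logic in pure Python. The label-to-prompt table is copied from the CLIP example on the slide, but the 2-D vectors are made-up stand-ins for real CLIP embeddings, and `classify` is a hypothetical helper.

```python
import math
from typing import Dict, Sequence

# Label -> prompt text, copied from the CLIP example on the slide.
CLIP_PROMPTS: Dict[str, str] = {
    "white rope": "a photo of a white rope",
    "jump rope": "a photo of a green jump rope",
}

# Made-up 2-D vectors standing in for CLIP text embeddings (assumption).
TOY_TEXT_EMBEDDINGS: Dict[str, Sequence[float]] = {
    "a photo of a white rope": [1.0, 0.1],
    "a photo of a green jump rope": [0.1, 1.0],
}


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def classify(image_embedding: Sequence[float],
             prompts: Dict[str, str] = CLIP_PROMPTS,
             text_embeddings: Dict[str, Sequence[float]] = TOY_TEXT_EMBEDDINGS) -> str:
    """Return the label whose prompt embedding is closest to the image."""
    return max(
        prompts,
        key=lambda label: cosine_similarity(
            image_embedding, text_embeddings[prompts[label]]
        ),
    )


if __name__ == "__main__":
    # Toy image embedding invented for the example.
    print(classify([0.9, 0.2]))
```

Changing a prompt changes its text embedding, which is why rewording a prompt can shift which label wins for the same image segment.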

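The apple was not on the desk! → Recovery
[Diagram labels: Command: "Bring me an apple..."; Skill 1 to Skill 5; go_to; Find apple; failure; Pick apple; Find Ashley; incorrect skill; M3; alternative skill; prompt update; plan regeneration; Find apple; Pick apple; ...]
The prompt states what the final goal is and at which stage the failure occurred, and GPT is then asked to generate a new execution sequence for what to do next. 16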

Recovery: regenerating the execution plan
Command updated via the prompt: "The robot is supposed to bring the apple from the desk. The robot tried to find the apple at the desk, but failed. What should the robot do next?"
Stage 1 output: ['Move to the table.', 'Find apple.', 'Pick up apple.', 'Move to the instruction point.', 'Hand over apple to the operator.']
Stage 2 output: [{'function': 'go_to_location', 'location': 'table'}, {'function': 'find_concrete_name_objects', 'objects': 'apple'}, {'function': 'pick', 'object': 'apple'}, {'function': 'go_to_location', 'location': 'instruction point'}, {'function': 'hand_over', 'object': 'apple', 'person': 'operator'}] 17

Recovery: regenerating the execution plan
Command updated via the prompt: "The robot is supposed to bring the apple from the desk. The robot tried to find the apple at the desk, but failed. What should the robot do next?"
Annotations on the command: what the final goal is ("The robot is supposed to bring the apple from the desk."); where and at what it failed ("The robot tried to find the apple at the desk, but failed."); what the robot should do next ("What should the robot do next?")
Stage 1 output: ['Move to the table.', 'Find apple.', 'Pick up apple.', 'Move to the instruction point.', 'Hand over apple to the operator.']
Stage 2 output: [{'function': 'go_to_location', 'location': 'table'}, {'function': 'find_concrete_name_objects', 'objects': 'apple'}, {'function': 'pick', 'object': 'apple'}, {'function': 'go_to_location', 'location': 'instruction point'}, {'function': 'hand_over', 'object': 'apple', 'person': 'operator'}] 18

Recovery: regenerating the execution plan
Command updated via the prompt: "The robot is supposed to bring the apple from the desk. The robot tried to find the apple at the desk, but failed. What should the robot do next?"
Stage 1 output: ['Move to the table.', 'Find apple.', 'Pick up apple.', 'Move to the instruction point.', 'Hand over apple to the operator.']
Annotations: for example, go and look at a different desk; a plan that takes the failure into account is generated!
Stage 2 output: [{'function': 'go_to_location', 'location': 'table'}, {'function': 'find_concrete_name_objects', 'objects': 'apple'}, {'function': 'pick', 'object': 'apple'}, {'function': 'go_to_location', 'location': 'instruction point'}, {'function': 'hand_over', 'object': 'apple', 'person': 'operator'}] 19
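The updated command on these slides has three parts: the final goal, what failed and where, and a question about the next action. A minimal sketch of a builder for that text follows; the function name and parameters are hypothetical, while the sentence template reproduces the example shown on the slides.

```python
def build_recovery_prompt(goal: str, failed_action: str) -> str:
    """Compose the updated command from the final goal and the failed action.

    Hypothetical helper; the wording follows the example on the slides.
    """
    return (
        f"The robot is supposed to {goal}. "
        f"The robot tried to {failed_action}, but failed. "
        "What should the robot do next?"
    )


if __name__ == "__main__":
    print(build_recovery_prompt(
        goal="bring the apple from the desk",
        failed_action="find the apple at the desk",
    ))
```

Keeping the goal and the failure as separate parameters means the same template can be reused whichever step of the plan fails.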

Command: "Could you fetch an apple from the table?" 20

The robot was able to keep operating for about 15 minutes
Accepted at ICRA 2024; demo scheduled for 5/16
Command: "Okay HSR, you have loads of missions. To start with, I bought a bottle of tea for Ashley, so please receive it and take it to the desk and encourage Ashley. Next, take Red Bull to Robin on the sofa, talk to him to ensure he's not sleeping and hand Red Bull over. If possible, ask if anything is bothering him and please inform his answer to me."
Website 21