Faculty Publications

A comparison of contextual bandit approaches to human-in-the-loop robot task completion with infrequent feedback

Matt McNeill, Fordham University
Damian Lyons, Fordham University

Degree of Contribution

Lead

Document Type

Conference Proceeding

Keywords

Robotics, Machine Learning, Reinforcement learning

Disciplines

Computer Engineering | Robotics

Abstract

Artificially intelligent assistive agents are playing an increasing role in our workplaces and homes. In contrast with currently predominant conversational agents, whose intelligence derives from dialogue trees and external modules, a fully autonomous domestic or workplace robot must carry out more complex reasoning. Such a robot must make good decisions as soon as possible, learn from experience, respond to feedback, and rely on feedback only as much as necessary. In this research, we narrow the focus of a hypothetical robot assistant to a room tidying task in a simulated domestic environment. Given an item, the robot chooses where to put it among many destinations, then optionally receives feedback from a human operator. We frame the problem as a contextual bandit, a reinforcement learning approach frequently used in Web recommendation systems. We evaluate ε-greedy and LinUCB action selection methods under a variety of infrequent feedback scenarios, with several methods for managing the lack of feedback. Our empirical results show that, while early-episode performance and overall accuracy of ε-greedy action selection can be improved through learning from no-response feedback and careful management of remembered training episodes, a baseline LinUCB approach outperforms ε-greedy action selection in early-episode performance, overall accuracy, and simplicity.
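
To make the two action selection rules named in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy tidying task in which the context is a one-hot encoding of the item type, each destination is one arm of a disjoint linear model, and the operator responds with a fixed probability. All names (`LinUCBArm`, `run`, `feedback_prob`, and so on) and all parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mat_vec(M, v):
    return [dot(row, v) for row in M]


class LinUCBArm:
    """One destination, modelled as a ridge-regression arm (disjoint LinUCB)."""

    def __init__(self, dim):
        # Inverse of A = I (ridge prior) and reward-weighted context sum b = 0.
        self.A_inv = [[float(i == j) for j in range(dim)] for i in range(dim)]
        self.b = [0.0] * dim

    def score(self, x, alpha):
        # Estimated reward theta . x plus an exploration bonus scaled by alpha.
        theta = mat_vec(self.A_inv, self.b)
        bonus = alpha * math.sqrt(dot(x, mat_vec(self.A_inv, x)))
        return dot(theta, x) + bonus

    def update(self, x, reward):
        # Sherman-Morrison rank-one update of (A + x x^T)^-1, then b += r * x.
        Ax = mat_vec(self.A_inv, x)
        denom = 1.0 + dot(x, Ax)
        for i in range(len(x)):
            for j in range(len(x)):
                self.A_inv[i][j] -= Ax[i] * Ax[j] / denom
        self.b = [bi + reward * xi for bi, xi in zip(self.b, x)]


def argmax(scores):
    return max(range(len(scores)), key=scores.__getitem__)


def linucb_policy(alpha=1.0):
    def select(arms, x, rng):
        return argmax([arm.score(x, alpha) for arm in arms])
    return select


def epsilon_greedy_policy(epsilon=0.1):
    def select(arms, x, rng):
        if rng.random() < epsilon:          # explore: uniformly random arm
            return rng.randrange(len(arms))
        return argmax([arm.score(x, 0.0) for arm in arms])  # exploit
    return select


def run(select, n_steps=300, n_items=4, n_dest=3, feedback_prob=0.5, seed=0):
    """Simulate the toy task; returns the fraction of correct placements."""
    rng = random.Random(seed)
    arms = [LinUCBArm(n_items) for _ in range(n_dest)]
    correct = 0
    for _ in range(n_steps):
        item = rng.randrange(n_items)
        x = [float(i == item) for i in range(n_items)]   # one-hot context
        choice = select(arms, x, rng)
        # Assumed ground truth: item type k belongs in destination k mod n_dest.
        reward = 1.0 if choice == item % n_dest else 0.0
        correct += int(reward)
        if rng.random() < feedback_prob:    # infrequent feedback: learn only then
            arms[choice].update(x, reward)
    return correct / n_steps


if __name__ == "__main__":
    print("epsilon-greedy accuracy:", run(epsilon_greedy_policy(0.1)))
    print("LinUCB accuracy:", run(linucb_policy(1.0)))
```

In this sketch both policies share the same linear reward estimate, so the only difference is how they explore: ε-greedy explores at random with probability `epsilon`, while LinUCB adds a confidence bonus that shrinks as an arm accumulates feedback for a given context. Steps without operator feedback are simply skipped here; the no-response learning and episode-memory strategies mentioned in the abstract are not modelled.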

Publication Title

31st IEEE Int. Conf. on Tools with AI (ICTAI 2019), Nov 4-6, Portland, Oregon, 2019.

Article Number

1075

Publication Date

11-2019

Language

United States

Recommended Citation

Matt McNeill, Damian Lyons, “A Comparison of Contextual Bandit Approaches to Human-in-the-Loop Robot Task Completion with Infrequent Feedback.” To appear: 31st IEEE Int. Conf. on Tools with AI (ICTAI 2019), Nov 4-6, Portland, Oregon, 2019.

Version

Published

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Included in

Robotics Commons
