Title: Towards Human–AI Safety
2/20/24 – 2pm in WEB L102
Abstract: As generative artificial intelligence (AI) interacts with people at an unprecedented scale, from behavior predictors that guide autonomous cars’ decision-making to language models that converse with millions of end-users, interest in the problem of human–AI safety has exploded. However, the safety consequences of an AI model’s outputs cannot be determined in isolation: they are tightly coupled with the responses and behavior of human users over time. My group has been formalizing this research challenge by unifying ideas from machine learning with control theory, which rigorously models the safety-critical feedback loops induced by interaction. In this talk, I will describe our past and recent work towards human–AI safety, grounded in applications such as autonomous driving and personal robotic manipulation. Specifically, I will discuss how to detect and mitigate anomalous human–robot interactions, how robots can reliably learn from human feedback, and how we can ensure that vision-based robots understand end-user preferences and behave accordingly.

Bio: Andrea Bajcsy is an Assistant Professor in the Robotics Institute at Carnegie Mellon University. She received her doctoral degree in Electrical Engineering & Computer Science from UC Berkeley. She works at the intersection of robotics, machine learning, and human–AI interaction. Her research develops theoretical frameworks and practical algorithms for intelligent embodied agents to safely interact with people, in applications such as personal robotic manipulators, quadrotors, and autonomous vehicles. Her work is funded by the NSF and has been featured in NBC News, WIRED magazine, and the Robohub podcast. She is the recipient of an Honorable Mention for the T-RO Best Paper Award, the NSF Graduate Research Fellowship, and the UC Berkeley Chancellor’s Fellowship, and has worked at NVIDIA Research and the Max Planck Institute for Intelligent Systems.