Technical AI Safety Podcast

May 15, 2021

with Joel Z. Leibo

Feedback form

Request an episode

Multi-agent Reinforcement Learning in Sequential Social Dilemmas

by Joel Z. Leibo, Vinicius Zambaldi, Marc Lanctot, Janusz Marecki, Thore Graepel
Abstract: "Matrix games like Prisoner's Dilemma have guided research on social dilemmas for decades. However, they necessarily treat the...

Mar 11, 2021

with Alex Turner


Optimal Policies Tend to Seek Power

by Alexander Matt Turner, Logan Smith, Rohin Shah, Andrew Critch, Prasad Tadepalli

Abstract: "Some researchers have speculated that capable reinforcement learning agents are often incentivized to seek resources and power in pursuit of...

Feb 1, 2021

with Greg Anderson


Neurosymbolic Reinforcement Learning with Formally Verified Exploration

by Greg Anderson, Abhinav Verma, Isil Dillig, Swarat Chaudhuri
Abstract: "We present Revel, a partially neural reinforcement learning (RL) framework for provably safe exploration in continuous...

Jan 5, 2021

with Bettina Könighofer and Rüdiger Ehlers


Safe Reinforcement Learning via Shielding

by Mohammed Alshiekh, Roderick Bloem, Rüdiger Ehlers, Bettina Könighofer, Scott Niekum, Ufuk Topcu
Abstract: "Reinforcement learning algorithms discover policies that maximize reward, but do not necessarily...

Dec 7, 2020

The Technical AI Safety Podcast is supported by the Center for Enabling Effective Altruist Learning and Research, or CEEALAR. CEEALAR, known to some as the EA Hotel, is a nonprofit focused on alleviating...