OpenAI News page 76

OpenAI News July 27, 2017 07:00

Better exploration with parameter noise

We’ve found that adding adaptive noise to the parameters of reinforcement learning algorithms frequently boosts performance. This exploration method is simple to implement and very rarely decreases performance, so it’s worth trying on any problem.

OpenAI News July 20, 2017 07:00

Proximal Policy Optimization

We’re releasing a new class of reinforcement learning algorithms, Proximal Policy Optimization (PPO), which perform comparably or better than state-of-the-art approaches while being much simpler to implement and tune. PPO has become the default...

Models Policy

OpenAI Models Policy

OpenAI News July 17, 2017 07:00

Robust adversarial inputs

We’ve created images that reliably fool neural network classifiers when viewed from varied scales and perspectives. This challenges a claim from last week that self-driving cars would be hard to trick maliciously since they capture images from multiple...

OpenAI News July 05, 2017 07:00

Hindsight Experience Replay

Dealing with sparse rewards is one of the biggest challenges in Reinforcement Learning (RL). We present a novel technique called Hindsight Experience Replay which allows sample-efficient learning from rewards which are sparse and binary and therefore avoid...

Policy

OpenAI News July 01, 2017 07:00

Teacher–student curriculum learning

We propose Teacher–Student Curriculum Learning (TSCL), a framework for automatic curriculum learning, where the Student tries to learn a complex task and the Teacher automatically chooses subtasks from a given set for the Student to train on. We describe a...

Infrastructure

OpenAI News June 28, 2017 07:00

Faster physics in Python

We’re open-sourcing a high-performance Python library for robotic simulation using the MuJoCo engine, developed over our past year of robotics research.

OpenAI News June 13, 2017 07:00

Learning from human preferences

One step towards building safe AI systems is to remove the need for humans to write goal functions, since using a simple proxy for a complex goal, or getting the complex goal a bit wrong, can lead to undesirable and even dangerous behavior. In collaboration...

Models

OpenAI News June 08, 2017 07:00

Learning to cooperate, compete, and communicate

Multiagent environments where agents compete for resources are stepping stones on the path to AGI. Multiagent environments have two useful properties: first, there is a natural curriculum—the difficulty of the environment is determined by the skill of your...

Agents

OpenAI News June 05, 2017 07:00

UCB exploration via Q-ensembles

We show how an ensemble of Q*-functions can be leveraged for more effective exploration in deep reinforcement learning. We build on well established algorithms from the bandit setting, and adapt them to the Q-learning setting. We propose an exploration...

OpenAI News May 24, 2017 07:00

OpenAI Baselines: DQN

We’re open-sourcing OpenAI Baselines, our internal effort to reproduce reinforcement learning algorithms with performance on par with published results. We’ll release the algorithms over upcoming months; today’s release includes DQN and three of its variants.

Models

OpenAI Models

OpenAI News May 16, 2017 07:00

Robots that learn

We’ve created a robotics system, trained entirely in simulation and deployed on a physical robot, which can learn a new task after seeing it done once.

OpenAI News May 15, 2017 07:00

Roboschool

We are releasing Roboschool: open-source software for robot simulation, integrated with OpenAI Gym.

Models

OpenAI Models