OpenAI News

OpenAI News April 21, 2017 07:00

Equivalence between policy gradients and soft Q-learning

Two of the leading approaches for model-free reinforcement learning are policy gradient methods and Q-learning methods. Q-learning methods can be effective and sample-efficient when they work; however, it is not well understood why they work, since...

Policy

Policy Target

OpenAI News April 10, 2017 07:00

Stochastic Neural Networks for hierarchical reinforcement learning

Deep reinforcement learning has achieved many impressive results in recent years. However, tasks with sparse rewards or long horizons continue to pose significant challenges. To tackle these important problems, we propose a general framework that first...

Policy Infrastructure

OpenAI News April 06, 2017 07:00

Unsupervised sentiment neuron

We’ve developed an unsupervised system which learns an excellent representation of sentiment, despite being trained only to predict the next character in the text of Amazon reviews.

OpenAI News April 01, 2017 07:00

Spam detection in the physical world

We’ve created the world’s first Spam-detecting AI trained entirely in simulation and deployed on a physical robot.

OpenAI News March 24, 2017 07:00

Evolution strategies as a scalable alternative to reinforcement learning

We’ve discovered that evolution strategies (ES), an optimization technique that’s been known for decades, rivals the performance of standard reinforcement learning (RL) techniques on modern RL benchmarks (e.g. Atari/MuJoCo), while overcoming many of RL’s...

OpenAI News March 21, 2017 07:00

One-shot imitation learning

Imitation learning has been commonly applied to solve different tasks in isolation. This usually requires either careful feature engineering, or a significant number of samples. This is far from what we desire: ideally, robots should be able to learn from...

OpenAI News March 20, 2017 07:00

Distill

We’re excited to support today’s launch of Distill, a new kind of journal aimed at excellent communication of machine learning results (novel or existing).

OpenAI News March 16, 2017 07:00

Learning to communicate

In this post we’ll outline new OpenAI research in which agents develop their own language.

Models Agents

OpenAI News March 15, 2017 07:00

Emergence of grounded compositional language in multi-agent populations

By capturing statistical patterns in large corpora, machine learning has enabled significant advances in natural language processing, including in machine translation, question answering, and sentiment analysis. However, for agents to intelligently interact...

Agents

OpenAI News March 12, 2017 08:00

Prediction and control with temporal segment models

Models

OpenAI News March 06, 2017 08:00

Third-person imitation learning

Reinforcement learning (RL) makes it possible to train agents capable of achieving sophisticated goals in complex and uncertain environments. A key difficulty in reinforcement learning is specifying a reward function for the agent to optimize....

Agents

OpenAI News February 24, 2017 08:00

Attacking machine learning with adversarial examples

Adversarial examples are inputs to machine learning models that an attacker has intentionally designed to cause the model to make a mistake; they’re like optical illusions for machines. In this post we’ll show how adversarial examples work across different...

Models