
Policy Distillation

Introduction

[Paper]: Policy Distillation

The following statements from the paper are key to understanding this technique:

  1. Distillation is a method to transfer knowledge from a teacher model $T$ to a student model $S$.
  2. Goals:
    1. It is “used to extract the policy of a reinforcement learning agent and train a new network that performs at the expert level while being dramatically smaller and more efficient.”
    2. It is “used to consolidate multiple task-specific policies into a single policy.”
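The teacher-to-student transfer in goal 1 can be made concrete. The paper trains the student by minimizing the KL divergence between the teacher's temperature-softened Q-value distribution and the student's output distribution. Below is a minimal NumPy sketch of that loss; the function names (`softmax`, `distillation_kl_loss`) and the batch-of-states layout are illustrative assumptions, not code from the paper.

```python
import numpy as np

def softmax(x, temperature=1.0):
    """Temperature-softened softmax over the last axis (numerically stable)."""
    z = x / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_kl_loss(teacher_q, student_logits, temperature=0.01):
    """KL(teacher || student) between the temperature-sharpened teacher
    policy and the student policy, averaged over a batch of states.

    teacher_q:      (batch, n_actions) Q-values from the teacher agent
    student_logits: (batch, n_actions) raw outputs of the student network

    A low temperature sharpens the teacher's Q-values toward its greedy
    action, as the paper suggests for distilling value-based agents.
    """
    p_teacher = softmax(teacher_q, temperature)  # sharpened teacher policy
    p_student = softmax(student_logits)          # student policy
    eps = 1e-12                                  # avoid log(0)
    kl = (p_teacher * (np.log(p_teacher + eps)
                       - np.log(p_student + eps))).sum(axis=-1)
    return kl.mean()
```

Training the (smaller) student network then amounts to minimizing this loss over states sampled from the teacher's replay memory.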

The following sections have not been finished yet.

Single-Game Policy Distillation

Multi-Task Policy Distillation

This post is licensed under CC BY 4.0 by the author.