What do you do with...

  1. Reinforcement Learning. Acting in a dynamic environment to maximize a delayed (deferred) reward.
    • A/B Testing - Bandit algorithms as an adaptive alternative to randomized controlled trials.
    • Drug Discovery
  2. Generative Adversarial Networks (GANs).
    • Useful when a reward function is difficult to define, but examples of what good looks like are available.
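
The A/B testing item under Reinforcement Learning can be illustrated with a minimal epsilon-greedy bandit. This is only a sketch under stated assumptions: the function name `epsilon_greedy`, the simulated conversion rates in `true_rates`, and the defaults for `epsilon` and `n_rounds` are all hypothetical and chosen for illustration. Epsilon-greedy is one of several bandit strategies, not necessarily the one these notes have in mind.

```python
import random


def epsilon_greedy(true_rates, n_rounds=2000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy bandit over variants with Bernoulli rewards.

    true_rates is a hypothetical list of per-variant conversion rates used
    only to simulate user responses; in a real experiment these are unknown.
    Returns (pulls, wins): how often each variant was shown and converted.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    pulls = [0] * n_arms
    wins = [0] * n_arms

    for _ in range(n_rounds):
        untried = [a for a in range(n_arms) if pulls[a] == 0]
        if untried:
            # Show every variant at least once before estimating rates.
            arm = untried[0]
        elif rng.random() < epsilon:
            # Explore: pick a variant uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the variant with the best observed rate so far.
            arm = max(range(n_arms), key=lambda a: wins[a] / pulls[a])

        # Simulated user response (1 = conversion, 0 = no conversion).
        reward = 1 if rng.random() < true_rates[arm] else 0
        pulls[arm] += 1
        wins[arm] += reward

    return pulls, wins


if __name__ == "__main__":
    pulls, wins = epsilon_greedy([0.1, 0.9])
    print("pulls per variant:", pulls)
    print("wins per variant:", wins)
```

Unlike a fixed 50/50 split, this allocation shifts traffic toward the variant that appears to perform better while the experiment is still running. The price is that the classical statistical guarantees of a randomized controlled trial no longer apply directly.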