I'm trying to implement the paper *A Curiosity Algorithm for Robots Based on the Free Energy Principle* in a reinforcement-learning environment using PyTorch, but I'm unclear on how curiosity is actually calculated.
As the paper describes, I built a 'transitioner' that predicts the next state given the current state and action. One layer of the transitioner is a Bayesian linear layer from Blitz; the distribution over that layer's weights is $q_\psi(w) = \mathcal{N}(w\,|\,\mu, \sigma^2)$, where $\psi = \{\mu, \sigma\}$. Curiosity is then defined as the Kullback–Leibler divergence $D_{KL}[q_\psi(w|s_{t+1})\,\|\,q_\psi(w)]$.
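For reference, here is a minimal, self-contained sketch of my transitioner. The Bayesian layer is a hand-rolled mean-field stand-in for `blitz.modules.BayesianLinear` (so the snippet runs without Blitz), and all layer sizes are placeholders:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class BayesianLinear(nn.Module):
    """Mean-field Gaussian linear layer: a simplified stand-in for
    blitz.modules.BayesianLinear. Variational parameters psi = {mu, rho},
    with sigma = softplus(rho) to keep sigma positive."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight_mu = nn.Parameter(torch.zeros(out_features, in_features))
        self.weight_rho = nn.Parameter(torch.full((out_features, in_features), -3.0))
        self.bias = nn.Parameter(torch.zeros(out_features))

    def sigma(self):
        return F.softplus(self.weight_rho)

    def forward(self, x):
        # Reparameterization trick: w = mu + sigma * eps, eps ~ N(0, I).
        eps = torch.randn_like(self.weight_mu)
        w = self.weight_mu + self.sigma() * eps
        return x @ w.t() + self.bias


class Transitioner(nn.Module):
    """Predicts s_{t+1} from (s_t, a_t); one Bayesian layer, rest deterministic."""

    def __init__(self, state_dim, action_dim, hidden=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU()
        )
        self.bayes = BayesianLinear(hidden, state_dim)

    def forward(self, state, action):
        return self.bayes(self.encoder(torch.cat([state, action], dim=-1)))
```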
$q_\psi(w)$ is easily available, but $q_\psi(w|s_{t+1})$ is described as the value of $q$ that minimizes the free energy $F = D_{KL}[q_\psi(w)\,\|\,p_\psi(w)] - \mathbb{E}_{q_\psi(w)}[\log p_\psi(s_{t+1}|w)]$, and $p_\psi$ isn't given. I think I understand the concept, but I'm puzzled about how to implement this.
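My current reading is that $q_\psi(w|s_{t+1})$ is simply whatever the variational parameters become after taking gradient steps on $F$ for the observed transition, and that the unspecified $p_\psi(w)$ is just a fixed prior (I'm assuming a standard normal). Here is a self-contained toy sketch of that reading, with placeholder dimensions, a single Bayesian weight matrix standing in for the transitioner, and a unit-variance Gaussian likelihood (also my assumption):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
in_dim, out_dim = 6, 4  # placeholder sizes: (state_dim + action_dim) -> state_dim
mu = torch.zeros(out_dim, in_dim, requires_grad=True)
rho = torch.full((out_dim, in_dim), -3.0, requires_grad=True)  # sigma = softplus(rho)


def gaussian_kl(mu_q, sigma_q, mu_p, sigma_p):
    # Closed-form KL between diagonal Gaussians, summed over all weights.
    return (torch.log(sigma_p / sigma_q)
            + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2 * sigma_p ** 2)
            - 0.5).sum()


def free_energy(mu, rho, x, target, prior_mu, prior_sigma):
    # F = KL[q(w) || p(w)] - E_q[log p(s_{t+1}|w)], with the expectation
    # estimated by one Monte Carlo sample and a unit-variance Gaussian likelihood.
    sigma = F.softplus(rho)
    w = mu + sigma * torch.randn_like(mu)
    nll = 0.5 * ((x @ w.t() - target) ** 2).sum()
    return gaussian_kl(mu, sigma, prior_mu, prior_sigma) + nll


# A fake observed transition (s_t, a_t) -> s_{t+1}.
x = torch.randn(1, in_dim)
s_next = torch.randn(1, out_dim)

# Snapshot q_psi(w) BEFORE the update.
mu_old = mu.detach().clone()
sigma_old = F.softplus(rho).detach().clone()

# One gradient step on F approximates q_psi(w | s_{t+1}).
opt = torch.optim.SGD([mu, rho], lr=1e-2)
opt.zero_grad()
free_energy(mu, rho, x, s_next,
            torch.zeros_like(mu), torch.ones_like(mu)).backward()
opt.step()

# Curiosity = KL[q_psi(w | s_{t+1}) || q_psi(w)], again in closed form.
curiosity = gaussian_kl(mu.detach(), F.softplus(rho.detach()), mu_old, sigma_old)
print(float(curiosity))  # a non-negative scalar
```

Is computing curiosity as the closed-form KL between the post-update and pre-update variational parameters the intended approach, or does the paper mean something else by minimizing $F$?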