Registration is open to join Cohort 8, a group learning Active Inference from the textbook.
Active Inference
Active inference is a theoretical framework that relates pragmatics and epistemics. It models the freedom we have when our cognitive models do not match our perceived reality: we can either update our models or act to adjust our reality. The free energy principle states that we behave so as to reduce uncertainty, which is to say, to reduce surprise.
Active inference is a robust conceptual modeling language which allows us to connect with the perspectives of all manner of frameworks. Look for Active Inference concepts in our Theory Translator based on Wondrous Wisdom.
Active inference is of special importance to Econet because it is embraced by both Daniel Friedman, President of the Active Inference Institute, and Andrius Kulikauskas, a student of Active Inference. Daniel has set up the Math 4 Wisdom Coda, where they explore connections between Active Inference and Wondrous Wisdom.
Learning Active Inference
Andrius is learning Active Inference with Cohort 8. We're spending two weeks on each chapter, two chapters in parallel, starting with theoretical Chapter 1 and application Chapter 2.
Here are some references for learning Active Inference.
A comprehensive foundation is provided by the Active Inference Textbook.
The video by Shamil Chandaria has a mathematical exposition of free energy.
Free Energy
Andrius: This is my understanding based on the above video.
{$P$} describes the probabilities given by the first mind, the neural mind, and {$Q$} describes the probabilities given by the second mind, the conceptual mind. We have {$P\sim Q$}.
(Shannon) Relative Entropy: KL Divergence
{$$D_{KL}(p\parallel q)=\sum_{x\in X}p(x)\textrm{log}\frac{p(x)}{q(x)}$$}
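A minimal sketch of this formula in code, with illustrative numbers of my own:

```python
# Kullback-Leibler divergence between two discrete distributions,
# matching D_KL(p || q) = sum_x p(x) log(p(x)/q(x)), with 0 log 0 = 0.
from math import log

def kl_divergence(p, q):
    """D_KL(p || q) for distributions given as lists of probabilities."""
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]
q = [0.9, 0.1]
print(kl_divergence(p, q))   # > 0: q poorly models p
print(kl_divergence(p, p))   # 0: no divergence from itself
```

Note that the divergence is asymmetric, {$D_{KL}(p\parallel q)\neq D_{KL}(q\parallel p)$} in general, which is why "distance" appears in scare quotes below.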
What are the causes of my sensory data? Bayes' theorem gives: {$P(v|u)=\frac{P(u|v)P(v)}{P(u)}$}
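A toy instance of this inversion (the cause names and probabilities are my own illustrative assumptions):

```python
# Bayes' theorem on a toy example: infer the hidden cause v of a
# sensory datum u (say, a rustling sound). Numbers are made up.
P_v = {"cat": 0.8, "raccoon": 0.2}           # prior P(v) over causes
P_u_given_v = {"cat": 0.1, "raccoon": 0.7}   # likelihood P(u|v) of the datum

P_u = sum(P_v[v] * P_u_given_v[v] for v in P_v)               # evidence P(u)
posterior = {v: P_v[v] * P_u_given_v[v] / P_u for v in P_v}   # P(v|u)
print(posterior)   # the datum shifts belief toward the a-priori unlikely cause
```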
Use an approximate posterior {$Q$} and learn its parameters (synaptic weights) {$\phi$}. We have {$Q(v|u)\sim P(v|u)$}.
Minimize the Kullback-Leibler divergence ('distance') between {$Q$} and the true posterior {$P$} by changing the parameters {$\phi$}.
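Computationally, "changing the parameters {$\phi$}" can be sketched as gradient descent. Since the true posterior is unknown, one minimizes the free energy {$F=\mathbb{E}_Q[\textrm{ln}Q(v|u)-\textrm{ln}P(v,u)]$}, which differs from the KL divergence to the true posterior only by the constant {$\textrm{ln}P(u)$} (derived below). A sketch with made-up numbers and a single parameter:

```python
# Sketch: fit Q(v|u) = (sigmoid(phi), 1 - sigmoid(phi)) on a two-state toy
# model by gradient descent on the free energy. All numbers are made up.
from math import exp, log

P_v = [0.7, 0.3]           # prior P(v)
P_u_given_v = [0.2, 0.9]   # likelihood P(u|v) for the observed datum u

def free_energy(phi):
    q = 1 / (1 + exp(-phi))
    Q = [q, 1 - q]
    # F = E_Q[ln Q(v|u) - ln P(v,u)], where P(v,u) = P(v) P(u|v)
    return sum(qi * (log(qi) - log(pv * pu))
               for qi, pv, pu in zip(Q, P_v, P_u_given_v))

phi = 0.0
for _ in range(2000):      # finite-difference gradient descent on phi
    grad = (free_energy(phi + 1e-6) - free_energy(phi - 1e-6)) / 2e-6
    phi -= 0.1 * grad

q_learned = 1 / (1 + exp(-phi))
P_u = sum(pv * pu for pv, pu in zip(P_v, P_u_given_v))
posterior = P_v[0] * P_u_given_v[0] / P_u   # true posterior P(v=0|u), by Bayes
print(q_learned, posterior)                 # minimizing F recovers the posterior
```

Because this {$Q$} can represent any two-state distribution, the minimum of {$F$} coincides with the true posterior; with a more restricted parameterization, {$Q$} would only approximate it.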
{$$\underset{\phi}{\textrm{min}}\;D_{KL}[Q(v|u)\parallel P(v|u)]=\sum_v Q(v|u)\textrm{ln} \left [\frac{Q(v|u)}{P(v|u)} \right ]$$}
{$$=\mathbb{E}_Q[\textrm{ln}Q(v|u)-\textrm{ln}P(v|u)]$$}
{$$=\mathbb{E}_Q[\textrm{ln}Q(v|u)-\textrm{ln}P(u|v)-\textrm{ln}P(v)+\textrm{ln}P(u)]$$}
{$$=\mathbb{E}_Q[\textrm{ln}Q(v|u)-\textrm{ln}P(v)]+\mathbb{E}_Q[-\textrm{ln}P(u|v)]+\textrm{ln}P(u)$$}
{$$=D_{KL}[Q(v|u)\parallel P(v)]+\mathbb{E}_Q[-\textrm{ln}P(u|v)]+\textrm{ln}P(u)$$}
{$\textrm{ln}P(u)$} constant
{$D_{KL}[Q(v|u)\parallel P(v)]$} {$KL$}-divergence of the approximate posterior from the prior
{$\mathbb{E}_Q[-\textrm{ln}P(u|v)]$} expected surprise: how surprising is the sensory data
{$D_{KL}[Q(v|u)\parallel P(v)]+\mathbb{E}_Q[-\textrm{ln}P(u|v)]$} free energy
{$F=\sum_v Q(v|u)[-\textrm{ln}P(v,u)]-\sum_v -Q(v|u)\textrm{ln}Q(v|u)$} free energy = average energy minus entropy
{$p_\nu = \frac{e^{-\beta E_\nu}}{Z}$} probability of being in the energy state {$E_\nu$} (Boltzmann distribution)
{$-\textrm{ln}P(v,u)$} energy of the explanation
{$\sum_v Q(v|u)[-\textrm{ln}P(v,u)]$} average energy
{$\sum_v -Q(v|u)\textrm{ln}Q(v|u)$} entropy
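The identities above can be checked numerically. A sketch on a two-state toy model (illustrative numbers, my own): free energy computed as KL-to-prior plus expected surprise equals average energy minus entropy, and differs from the KL divergence to the true posterior exactly by the constant {$\textrm{ln}P(u)$}.

```python
# Numeric check of the free-energy identities on a toy model:
# hidden state v in {0, 1}, one observed datum u. Numbers are made up.
from math import log

P_v = [0.7, 0.3]            # prior P(v)
P_u_given_v = [0.2, 0.9]    # likelihood P(u|v) for the observed u
P_vu = [pv * pu for pv, pu in zip(P_v, P_u_given_v)]   # joint P(v,u)
P_u = sum(P_vu)                                        # evidence P(u)
P_v_given_u = [pvu / P_u for pvu in P_vu]              # true posterior, by Bayes

Q = [0.4, 0.6]              # some approximate posterior Q(v|u)

def kl(p, q):
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# F as KL-to-prior plus expected surprise:
F = kl(Q, P_v) + sum(qi * -log(pu) for qi, pu in zip(Q, P_u_given_v))

# F as average energy minus entropy, with energy -ln P(v,u):
avg_energy = sum(qi * -log(pvu) for qi, pvu in zip(Q, P_vu))
entropy = sum(-qi * log(qi) for qi in Q)
print(abs(F - (avg_energy - entropy)))           # ~0: same quantity

# F differs from the KL to the true posterior by the constant ln P(u):
print(abs(F - (kl(Q, P_v_given_u) - log(P_u))))  # ~0: same quantity
```

Since the KL term is nonnegative, {$F\geq -\textrm{ln}P(u)$}: free energy is an upper bound on the surprise of the data.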
Free energy in thermodynamics
Andrius: I am studying the different kinds of energy so that I can understand free energy, which is fundamental to Active Inference.
Internal energy {$U$} has to do with energy on the microscopic scale, the kinetic energy and potential energy of particles.
Heat {$Q$} has to do with energy transfer on a macroscopic scale across a boundary.
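For orientation (a standard fact from thermodynamics, added by the editor): the Helmholtz free energy is {$F = U - TS$}, the internal energy minus temperature times entropy. Variational free energy has the same shape, average energy minus entropy, which is where the name comes from.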
See also: Active Inference at Math 4 Wisdom
Activities Grounded in Active Inference
Examples from chapter 7
- Perceptual processing (listening to an amateur musician) - deducing the (intended) signal
- Decision-making and planning (a rat navigating a T-maze by reading an informative cue) - epistemic vs. pragmatic value
- Information seeking (eye saccades) - seeking precision where you can get it, yielding the streetlight effect
- Learning and novelty (a synthetic worm exploring its environment) - changing the generative model
- Hierarchical or deep inference (words and sentences) - separate time scales
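In the textbook's treatment, the epistemic/pragmatic trade-off is made quantitative by scoring policies with expected free energy: pragmatic value is the expected log-preference of outcomes, epistemic value is the expected information gain about hidden states. A sketch of the T-maze choice, where all state names, outcome names, and probabilities are my own illustrative assumptions:

```python
# Toy expected-free-energy comparison for a T-maze-like choice.
# Illustrative assumptions throughout, not taken from the textbook.
from math import log

prior = {"left": 0.5, "right": 0.5}   # belief about where the reward is

# P(outcome | context, action); arm outcomes are noisy, the cue is reliable
likelihood = {
    "go-left":  {"left": {"reward": 0.7, "nothing": 0.3},
                 "right": {"reward": 0.3, "nothing": 0.7}},
    "go-right": {"left": {"reward": 0.3, "nothing": 0.7},
                 "right": {"reward": 0.7, "nothing": 0.3}},
    "read-cue": {"left": {"cue-left": 1.0},
                 "right": {"cue-right": 1.0}},
}

# log-preferences over outcomes (the pragmatic side)
log_pref = {"reward": log(0.5), "nothing": log(0.1),
            "cue-left": log(0.2), "cue-right": log(0.2)}

def expected_free_energy(action):
    # predictive outcome distribution P(o) = sum_s prior(s) P(o|s)
    p_o = {}
    for s, ps in prior.items():
        for o, po in likelihood[action][s].items():
            p_o[o] = p_o.get(o, 0.0) + ps * po
    pragmatic = sum(p * log_pref[o] for o, p in p_o.items())
    # epistemic value: expected KL from prior to posterior over contexts,
    # i.e. the mutual information between context and outcome
    epistemic = 0.0
    for o, po in p_o.items():
        for s, ps in prior.items():
            p_s_given_o = ps * likelihood[action][s].get(o, 0.0) / po
            if p_s_given_o > 0:
                epistemic += po * p_s_given_o * log(p_s_given_o / ps)
    return -pragmatic - epistemic   # lower G = better one-step policy

G = {a: expected_free_energy(a) for a in likelihood}
print(G)
```

With these numbers, reading the cue wins: its outcomes are less preferred than a possible reward, but it fully resolves uncertainty about the context, while the noisy arms barely do.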
What I want to model
- How issues come to matter (to the continuous, procedural mind) - discovering critical points and navigating with regard to them
- How meaning arises (to the discrete, declarative mind) - for a language of critical points
How do prediction and the resulting error come into play?