I'm currently diving into the Bayesian world and I find it pretty fascinating. So far I've understood that applying Bayes' rule, i.e. $$\text{posterior} = \frac{\text{likelihood}\times \text{prior}}{\text{evidence}},$$ is most of the time intractable because the evidence in the denominator requires integrating over a high-dimensional parameter space. One way to solve this is to use a prior conjugate to the likelihood, as the analytical form of the posterior is then known and the calculations are simplified.
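To make the conjugacy point concrete, here is the standard Beta-Bernoulli example (my own illustration): with a $\mathrm{Beta}(\alpha, \beta)$ prior on a Bernoulli parameter $\theta$ and $s$ successes in $n$ trials,
$$p(\theta \mid \text{data}) \propto \theta^{s}(1-\theta)^{n-s} \cdot \theta^{\alpha-1}(1-\theta)^{\beta-1} = \theta^{\alpha+s-1}(1-\theta)^{\beta+n-s-1},$$
which is the kernel of a $\mathrm{Beta}(\alpha+s,\ \beta+n-s)$ distribution, so the posterior is identified by its functional form alone and the evidence only supplies the normalizing constant.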
So far so good. Now I've read about Bayesian sequential filtering and smoothing techniques such as the Kalman filter or the Rauch-Tung-Striebel smoother (find references here). As far as I understand, at time step $k$, instead of calculating the complete posterior distribution $p(X_k|Y_k)$ with $X_k=[x_1, \dots, x_k]$ and $Y_k=[y_1, \dots, y_k]$, a Markov chain is assumed and only the marginal $p(x_k|Y_k)$ is estimated in a recursive manner. That is, the posterior calculated at time step $k$ serves as the prior for the next time step. I guess Bayes' rule is somehow involved in these calculations.
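To check my understanding of that recursion, here is a minimal sketch of one Kalman filter step in Python (the linear-Gaussian model matrices and all variable names are my own illustration, not from a specific reference):

```python
import numpy as np

def kalman_step(m, P, y, F, Q, H, R):
    """One recursive step: posterior at k-1 -> prediction -> posterior at k.

    m, P : mean/covariance of the previous posterior (acts as the prior)
    y    : new observation y_k
    F, Q : linear transition model x_k = F x_{k-1} + noise, noise covariance Q
    H, R : linear measurement model y_k = H x_k + noise, noise covariance R
    """
    # Predict: push the previous posterior through the transition model.
    m_pred = F @ m
    P_pred = F @ P @ F.T + Q

    # Update: condition the Gaussian prediction on y_k (Bayes' rule in closed form).
    S = H @ P_pred @ H.T + R             # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)  # Kalman gain
    m_post = m_pred + K @ (y - H @ m_pred)
    P_post = P_pred - K @ S @ K.T
    # (m_post, P_post) parametrize p(x_k | Y_k) and serve as the prior for step k+1.
    return m_post, P_post


if __name__ == "__main__":
    # 1-D random walk observed with noise: x_k = x_{k-1} + q_k, y_k = x_k + r_k
    m, P = np.zeros(1), np.eye(1)
    for y in [0.4, 0.9, 1.2]:
        m, P = kalman_step(m, P, np.array([y]),
                           F=np.eye(1), Q=0.1 * np.eye(1),
                           H=np.eye(1), R=0.5 * np.eye(1))
    print(m, P)
```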
Furthermore, both techniques assume the posterior to always be Gaussian, and therefore closed-form solutions are obtained. Now I was wondering: which restriction makes the whole process tractable, i.e. eliminates the need to compute the evidence?
I guess it's the Gaussian assumption, i.e. the prior, the predicted, and the posterior distributions are all assumed to be Gaussian, and therefore the updated distributions are obtained without computing the evidence - is this correct, and does this relate to conjugate distributions?
Or is it the fact that we assume a Markov chain and do not consider all states at each time step?
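For reference, here is how I understand the closed-form Gaussian update that my question is about (notation mine): with a Gaussian predicted prior $p(x_k \mid Y_{k-1}) = \mathcal{N}(x_k;\ m_k^-,\ P_k^-)$ and a linear-Gaussian likelihood $p(y_k \mid x_k) = \mathcal{N}(y_k;\ H x_k,\ R)$, the numerator of Bayes' rule is again Gaussian in $x_k$,
$$p(x_k \mid Y_k) \propto \mathcal{N}(y_k;\ H x_k,\ R)\,\mathcal{N}(x_k;\ m_k^-,\ P_k^-) \propto \mathcal{N}(x_k;\ m_k,\ P_k),$$
with $S_k = H P_k^- H^\top + R$, $K_k = P_k^- H^\top S_k^{-1}$, $m_k = m_k^- + K_k\,(y_k - H m_k^-)$, and $P_k = P_k^- - K_k S_k K_k^\top$, so the evidence only ever supplies the normalizing constant.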