Basically, all generalisation is based on prior knowledge [47]. Training data provides information only at the observed data points, and prior knowledge, such as the piecewise smoothness of natural phenomena, is needed for generalising to future data. This means that all learning involves priors, which are either implicit or made explicit as in Bayesian inference. In practice, the priors are often too strict.
Bayesian solutions are sensitive to the priors. Bad or wrong prior information can lead to too complex a model and, in a sense, to overfitting [42]. Selecting appropriate priors requires considerable expert work, and therefore so-called non-informative priors [37] are sought. They could be used in the case of complete ignorance about the problem at hand. With hierarchical priors [18], some of the choices can be moved to higher levels of the model, which contain less information.
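As a simple illustration of the hierarchical idea (the specific distributions and symbols below are only an example, not taken from the text), a prior on a weight $w$ can be made hierarchical by giving its variance its own vague hyperprior:
\begin{align}
p(w \mid \sigma^2) &= \mathcal{N}(w;\, 0,\, \sigma^2), \\
p(\sigma^2 \mid \alpha_0, \beta_0) &= \text{Inv-Gamma}(\sigma^2;\, \alpha_0, \beta_0),
\end{align}
where broad choices of the hyperparameters $\alpha_0$ and $\beta_0$ encode far less information than fixing $\sigma^2$ directly would. This is the sense in which the choices moved to the higher level carry less information.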