Chapters 1-3
Concepts still unfamiliar:
tensor, the Moore-Penrose pseudoinverse of a matrix, gradient, matrix derivatives, derivatives w.r.t. a tensor, Jacobian matrix, Hessian matrix, Shannon entropy of a random variable, Kullback-Leibler divergence between two probability distributions.
Tensor: an array of numbers arranged on a regular grid with a variable number of axes.
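A quick sketch to make the definition concrete (assumes NumPy, which the notes do not mention): a rank-3 tensor is just an array indexed along three axes.

    import numpy as np

    # A 2 x 3 x 4 grid of numbers: three axes, so a rank-3 tensor.
    A = np.arange(24).reshape(2, 3, 4)
    print(A.ndim)      # 3 axes
    print(A.shape)     # (2, 3, 4)
    print(A[1, 2, 3])  # the element A_{1,2,3} -> 23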
Norm: in machine learning, a norm measures the size of a vector; the L^p norm is ||x||_p = (sum_i |x_i|^p)^(1/p) for p >= 1.
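As a concrete check of the definition (a sketch assuming NumPy, not from the notes), the common norms of a small vector:

    import numpy as np

    x = np.array([3.0, -4.0])
    l1   = np.sum(np.abs(x))        # L^1 norm: 7.0
    l2   = np.sqrt(np.sum(x ** 2))  # L^2 (Euclidean) norm: 5.0
    linf = np.max(np.abs(x))        # L^infinity (max) norm: 4.0

    # The same values via NumPy's built-in norm function.
    print(np.linalg.norm(x, 1), np.linalg.norm(x, 2), np.linalg.norm(x, np.inf))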
The Moore-Penrose pseudoinverse: pp. 43-44, still not fully understood.
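A minimal sketch that may help (assumes NumPy): in practice the pseudoinverse is computed from the SVD A = U D V^T as A+ = V D+ U^T, where D+ takes the reciprocal of the nonzero singular values, and A+ y gives the least-squares solution of A x = y when no exact solution exists.

    import numpy as np

    A = np.array([[1.0, 2.0],
                  [3.0, 4.0],
                  [5.0, 6.0]])       # tall matrix: A x = y has no exact solution in general
    y = np.array([1.0, 2.0, 2.0])

    # Pseudoinverse via the SVD A = U D V^T, with D+ built from the reciprocal singular values.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    A_pinv = Vt.T @ np.diag(1.0 / s) @ U.T   # assumes all singular values are nonzero

    x = A_pinv @ y                           # least-squares solution of A x = y
    print(np.allclose(A_pinv, np.linalg.pinv(A)))  # matches NumPy's built-in pinv: True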
A Gaussian mixture model is a universal approximator of densities, in the sense that any smooth density can be approximated with any specific nonzero amount of error by a Gaussian mixture model with enough components.
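A small sketch of the object in question (assumes NumPy; gmm_density and the specific weights, means, and standard deviations are hypothetical): a 1-D mixture density is a weighted sum of Gaussian densities, p(x) = sum_i w_i N(x; mu_i, sigma_i^2), with the weights summing to 1.

    import numpy as np

    def gmm_density(x, weights, means, stds):
        """Evaluate a 1-D Gaussian mixture density at the points in x."""
        x = np.asarray(x, dtype=float)[..., None]   # broadcast x against the components
        comps = np.exp(-0.5 * ((x - means) / stds) ** 2) / (stds * np.sqrt(2 * np.pi))
        return comps @ weights                      # weighted sum of component densities

    weights = np.array([0.3, 0.5, 0.2])             # hypothetical mixture parameters
    means   = np.array([-2.0, 0.0, 3.0])
    stds    = np.array([0.5, 1.0, 0.8])
    print(gmm_density([-2.0, -1.0, 0.0, 3.0], weights, means, stds))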
A matrix is isotropic if it is proportional to the identity matrix, i.e., of the form c I for some scalar c (for example, the covariance sigma^2 I of an isotropic Gaussian).
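A tiny sketch of the definition (assumes NumPy; is_isotropic is a hypothetical helper, not a library function):

    import numpy as np

    def is_isotropic(M, tol=1e-12):
        """True if M equals c * I for the scalar c = M[0, 0]."""
        return np.allclose(M, M[0, 0] * np.eye(M.shape[0]), atol=tol)

    print(is_isotropic(2.5 * np.eye(3)))           # True: proportional to the identity
    print(is_isotropic(np.diag([1.0, 2.0, 3.0])))  # False: diagonal but not isotropic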