Numerical Weather Prediction (NWP) is fundamentally an initial value problem for a set of partial differential equations describing the physics of the atmosphere. Due to the non-linear and chaotic nature of these equations, they must be solved iteratively using finite difference (or spectral) approximations. Furthermore, the solutions exhibit extreme sensitivity to small perturbations of the initial state. The goal of data assimilation is to incorporate new observational data into a prior NWP forecast in a way that minimizes the errors in the initial state of a new NWP forecast. Over the last several decades, advances in data assimilation have contributed to steady improvements in operational weather forecasts on multiple scales, from thunderstorms to prolonged heat waves. While data assimilation is a broad and growing field, this presentation will focus on the two most commonly used approaches: the ensemble-based and variational frameworks. The basic theory of each framework will be introduced and used to show how and why these techniques work, with emphasis on the relative advantages and limitations of each. A relatively new hybrid technique that combines the two frameworks will also be briefly discussed. This theoretical background will then be used to provide a deeper understanding of the practical challenges of estimating the error covariances that are central to both frameworks in real-world applications of full complexity.