Observer module

This section describes the most important tools of the Observer module with examples. For a more detailed description of the methods refer to the API.

The Observer module contains several state (and parameter) observers. Observers are used to infer states and parameters from measurements, which is essential when not all states are directly measurable or when measurements are corrupted by noise.

The Observer module contains the following classes:

  • Moving Horizon Estimator (MHE)

  • Kalman Filter (KF)

  • Extended Kalman Filter (EKF)

  • Unscented Kalman Filter (UKF)

  • Particle Filter (PF)

Moving Horizon Estimator

The class MovingHorizonEstimator (alias MHE) implements a Moving Horizon Estimator for state and parameter estimation. The MHE solves an optimization problem over a moving time window to estimate states and parameters by minimizing the mismatch between predicted and measured outputs.

For a complete example of using the MHE, see the Chemical Reaction MHE Example.

Mathematical formulation

The MHE is formulated for a discrete-time nonlinear system (HILO-MPC can work with both continuous-time and discrete-time models, automatically discretizing continuous models):

\[\begin{split}\begin{aligned} x_{k+1} &= f(x_k, u_k, w_k, p), \\ y_k &= h(x_k, u_k, p) + v_k \end{aligned}\end{split}\]

where \(x_k \in \mathbb{R}^{n_x}\) is the state, \(u_k \in \mathbb{R}^{n_u}\) is the input, \(w_k \in \mathbb{R}^{n_x}\) is the process noise, \(p \in \mathbb{R}^{n_p}\) are parameters, \(y_k \in \mathbb{R}^{n_y}\) are measurements, and \(v_k \in \mathbb{R}^{n_y}\) is measurement noise.

At time \(t\), the Moving Horizon Estimator solves an optimization problem over a sliding window of length \(N\) to estimate the state trajectory \(\{x_{t-N}, \ldots, x_t\}\) (and optionally parameters \(p\)):

\[\begin{split}\begin{aligned} \min_{\substack{x_{t-N:t},\\ w_{t-N:t-1},\\ p}}\; & \underbrace{\Vert x_{t-N}-\bar{x}_{t-N} \Vert_{P_{t-N}^{-1}}^2}_{\text{arrival cost}} + \sum_{k=t-N}^{t-1} \underbrace{\Vert w_k \Vert_{Q_k^{-1}}^2}_{\text{process noise penalty}} + \sum_{k=t-N}^{t} \underbrace{\Vert y_k - h(x_k,u_k,p) \Vert_{R_k^{-1}}^2}_{\text{measurement residual}} \\ \text{s.t.}\quad & x_{k+1} = f(x_k, u_k, w_k, p),\quad k=t-N,\ldots,t-1, \\ & x_k \in \mathcal{X},\quad k=t-N,\ldots,t, \\ & w_k \in \mathcal{W},\quad k=t-N,\ldots,t-1, \\ & p \in \mathcal{P}, \\ & c(x_k, u_k, p) \leq 0,\quad k=t-N,\ldots,t, \\ & e(x_k, u_k, p) = 0,\quad k=t-N,\ldots,t. \end{aligned}\end{split}\]

where \(\Vert z \Vert_M^2 = z^\top M z\) denotes the weighted squared norm.

  • \(N\) is the horizon length (see horizon).

  • \(P_{t-N}\) weights the arrival/prior term (see quad_arrival_cost.add_states(weights=...)).

  • \(R_k\) weights measurement residuals (see quad_stage_cost.add_measurements(weights=...)).

  • \(Q_k\) weights process or state noise (see quad_stage_cost.add_state_noise(weights=...)).

  • \(u_k\) and \(y_k\) are the inputs and measurements provided via add_measurements(y_meas=..., u_meas=...).

  • \(\bar{x}_{t-N}\) is the prior/arrival state supplied via estimate(x_arrival=..., p_arrival=...).

  • \(\mathcal{X}\), \(\mathcal{W}\), and \(\mathcal{P}\) are box-constraint sets defined by the lower and upper bounds \(x^{\min}, x^{\max}, w^{\min}, w^{\max}, p^{\min}, p^{\max}\) (see set_box_constraints(x_lb=..., x_ub=..., w_lb=..., w_ub=..., p_lb=..., p_ub=...)).

  • \(c(\cdot) \le 0\) and \(e(\cdot) = 0\) are nonlinear inequality and equality constraints (see stage_constraint.constraint, stage_constraint.lb, stage_constraint.ub).
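The bullet points above map directly onto the MHE API. The following condensed sketch collects the corresponding calls in one place; the variable names ('x1', 'x2', 'y1'), the weights, and the bounds are placeholders, and each call is explained in detail in the subsections below.

from hilo_mpc import MHE

# Condensed setup sketch; names, weights and bounds are placeholders
mhe = MHE(model)
mhe.horizon = 10                                      # N
mhe.quad_arrival_cost.add_states(names=['x1', 'x2'],
                                 weights=[1, 1])      # arrival cost weights
mhe.quad_stage_cost.add_measurements(names=['y1'],
                                     weights=[10])    # measurement residual weights
mhe.quad_stage_cost.add_state_noise(names=['x1', 'x2'],
                                    weights=[1, 1])   # process noise weights
mhe.set_box_constraints(x_lb=[0, 0], x_ub=[10, 10])   # state bounds
mhe.setup()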

Note

You can pass either a NumPy matrix, a list, or a NumPy array as the weights. If you pass a list or a one-dimensional array, a matrix of appropriate dimensions will be created internally with the given values on the diagonal and zeros everywhere else.
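For example (the names 'y1', 'y2' and the weight values are arbitrary), passing a list is equivalent to passing the corresponding diagonal matrix:

import numpy as np

# Passing a list of weights ...
mhe.quad_stage_cost.add_measurements(names=['y1', 'y2'], weights=[10, 5])

# ... is equivalent to passing the corresponding diagonal weight matrix
mhe.quad_stage_cost.add_measurements(names=['y1', 'y2'],
                                     weights=np.diag([10, 5]))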

Note

All the parameter variables of the model are assumed to be unknown and will be estimated by the MHE. If you want to keep them constant, you can set equal lower and upper bounds, or substitute them with an actual number directly in the model.

The solution of this optimization problem provides the smoothed state estimate \(\hat{x}_t\) at the current time, along with estimates of parameters \(\hat{p}\) if included in the optimization.

To set up an MHE properly you need to define at least:

  • A dynamic model

  • A horizon length (via the horizon property)

  • A stage cost function (typically using quad_stage_cost)

  • An arrival cost function (typically using quad_arrival_cost)

The MHE can be initialized as follows:

from hilo_mpc import Model, MHE

# Initialize MHE with a model
mhe = MHE(model)

Required information, such as the model dynamics or the sampling time, will be automatically extracted from the Model instance.

Horizon length

The horizon length determines how many past measurements are considered in the estimation problem. A longer horizon typically provides better estimates but increases computational cost:

mhe = MHE(model)
mhe.horizon = 10  # Use 10 past measurements

Cost functions

The MHE uses two types of costs:

  • Stage cost: penalizes the mismatch between predicted and measured outputs at each time step

  • Arrival cost: penalizes the deviation from the prior estimate at the beginning of the horizon

Stage cost

The stage cost is typically defined to penalize measurement errors and state/parameter noise:

# Add measurement error terms
mhe.quad_stage_cost.add_measurements(names=['y1', 'y2'],
                                     weights=[10, 10])

# Add state noise terms (if using state noise modeling)
mhe.quad_stage_cost.add_state_noise(names=['x1', 'x2'],
                                    weights=[1, 1])

Arrival cost

The arrival cost provides information about the initial state estimate at the beginning of the horizon:

# Set arrival cost for states
mhe.quad_arrival_cost.add_states(names=['x1', 'x2'],
                                 weights=[1, 1])

Adding measurements and estimation

Measurements are added to the MHE using the add_measurements() method, and estimation is performed using estimate():

# Add measurement
mhe.add_measurements(y_meas=[measurement_value], u_meas=[input_value])

# Estimate states once horizon is reached
x_estimated = mhe.estimate(x_arrival=x0)
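In a receding-horizon setting, measurements are added at every sampling instant and estimate() is called once the horizon is filled. A minimal loop sketch, where get_measurement() and applied_inputs are placeholders for the actual plant or data source:

x_est = x0  # prior passed to the arrival cost
for k in range(n_steps):
    y_k = get_measurement(k)   # placeholder: measurement at step k
    u_k = applied_inputs[k]    # placeholder: input applied at step k
    mhe.add_measurements(y_meas=y_k, u_meas=u_k)
    if k >= mhe.horizon:
        x_est = mhe.estimate(x_arrival=x_est)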

Constraints

The MHE supports both box constraints and nonlinear constraints to enforce physical bounds and system constraints.

Box constraints

Box constraints impose simple lower and upper bounds on states, process noise, and parameters:

# Set box constraints for states
mhe.set_box_constraints(x_lb=[0, 0, 0], x_ub=[10, 10, 10])

# Set constraints for states and process noise
mhe.set_box_constraints(x_lb=[0, 0, 0], x_ub=[10, 10, 10],
                       w_lb=[-0.1, -0.1, -0.1], w_ub=[0.1, 0.1, 0.1])

# Add parameter bounds when estimating parameters
mhe.set_box_constraints(x_lb=[0, 0, 0], x_ub=[10, 10, 10],
                       p_lb=[0, 0], p_ub=[5, 5])

Nonlinear constraints

Nonlinear inequality and equality constraints can be added using the stage_constraint property:

# Add inequality constraint: 0.1 <= Ca <= 5
mhe.stage_constraint.constraint = model.x[0]  # Ca
mhe.stage_constraint.lb = 0.1
mhe.stage_constraint.ub = 5

# For equality constraints, set lb = ub
# For inequality constraints c(x) >= a, use lb=a and ub=ca.inf
# For inequality constraints c(x) <= b, use lb=-ca.inf and ub=b
# (ca.inf denotes infinity from CasADi, i.e. import casadi as ca)

Setup

Before using the MHE, it must be set up using the setup() method:

mhe.setup(options={'integration_method': 'collocation'})

Non-uniform sampling intervals

The MHE supports non-uniform sampling intervals, allowing measurements to be taken at irregular time intervals. This is particularly useful when dealing with multi-rate sensors or event-triggered measurements.

In the mathematical program, this corresponds to time steps with variable lengths \(\Delta t_k\), which affect the discretized dynamics constraint \(x_{k+1} = f_{\Delta t_k}(x_k, u_k, w_k)\). The chosen integration method (options={'integration_method': ...}) determines how \(f_{\Delta t_k}(\cdot)\) is formed from the continuous-time model.
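As an illustration only (not HILO-MPC's internal code), a variable-step map \(f_{\Delta t_k}\) can for instance be formed by taking a single explicit Runge-Kutta 4 step of length \(\Delta t_k\) over placeholder continuous-time dynamics; in HILO-MPC the map is built with the integration method chosen in setup():

import numpy as np

def f_continuous(x, u, w):
    # placeholder continuous-time dynamics: dx/dt = f(x, u) + w
    return np.array([x[1], -x[0] + u]) + w

def f_discrete(x, u, w, dt):
    # one RK4 step of length dt approximates x_{k+1} = f_dt(x_k, u_k, w_k)
    k1 = f_continuous(x, u, w)
    k2 = f_continuous(x + 0.5 * dt * k1, u, w)
    k3 = f_continuous(x + 0.5 * dt * k2, u, w)
    k4 = f_continuous(x + dt * k3, u, w)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)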

Multi-rate measurements

When different sensors provide measurements at different rates, the MHE can handle this naturally by accepting measurements only when they are available.

Kalman Filter

The class KalmanFilter (alias KF) implements the Kalman filter developed by Rudolf E. Kálmán. The Kalman filter is optimal for linear systems with Gaussian noise and provides recursive state estimation with low computational cost.

Mathematical formulation

The Kalman filter is designed for linear systems of the form:

\[\begin{split}\begin{aligned} x_{k+1} &= A x_k + B u_k + w_k, \\ y_k &= C x_k + D u_k + v_k \end{aligned}\end{split}\]

where \(w_k \sim \mathcal{N}(0, Q)\) is process noise and \(v_k \sim \mathcal{N}(0, R)\) is measurement noise.

The Kalman filter operates in two steps:

Prediction step (time update):

\[\begin{split}\begin{aligned} \hat{x}_{k|k-1} &= A \hat{x}_{k-1|k-1} + B u_k, \\ P_{k|k-1} &= A P_{k-1|k-1} A^\top + Q \end{aligned}\end{split}\]

Update step (measurement update):

\[\begin{split}\begin{aligned} K_k &= P_{k|k-1} C^\top (C P_{k|k-1} C^\top + R)^{-1}, \\ \hat{x}_{k|k} &= \hat{x}_{k|k-1} + K_k (y_k - C \hat{x}_{k|k-1} - D u_k), \\ P_{k|k} &= (I - K_k C) P_{k|k-1} \end{aligned}\end{split}\]

where:

  • \(\hat{x}_{k|k-1}\) is the predicted state estimate

  • \(\hat{x}_{k|k}\) is the updated state estimate after incorporating measurement \(y_k\)

  • \(P_{k|k-1}\) is the predicted error covariance

  • \(P_{k|k}\) is the updated error covariance

  • \(K_k\) is the Kalman gain matrix

  • \(Q\) is the process noise covariance → provided as Q parameter in kf.estimate()

  • \(R\) is the measurement noise covariance → provided as R parameter in kf.estimate()
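As a standalone illustration of the equations above (not HILO-MPC's internal implementation), one prediction/update cycle can be written in plain NumPy; all matrices and vectors are placeholders:

import numpy as np

def kf_step(x, P, u, y, A, B, C, D, Q, R):
    # Prediction step (time update)
    x_pred = A @ x + B @ u
    P_pred = A @ P @ A.T + Q
    # Update step (measurement update)
    S = C @ P_pred @ C.T + R                       # innovation covariance
    K = P_pred @ C.T @ np.linalg.inv(S)            # Kalman gain
    x_upd = x_pred + K @ (y - C @ x_pred - D @ u)
    P_upd = (np.eye(len(x)) - K @ C) @ P_pred
    return x_upd, P_upd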

To set up the Kalman filter you need an already set up Model instance with linear dynamics. Additionally you might want to supply a plot backend (via the plot_backend keyword argument) in order to visualize the estimation results later on. At the moment only Matplotlib and Bokeh are supported for plotting.

The Kalman filter can be initialized as follows:

from hilo_mpc import KF

# Initialize Kalman filter
kf = KF(model)

Required information, such as the model dynamics or the sampling time, will be automatically extracted from the Model instance.

Setting noise covariances

The Kalman filter requires specification of process noise covariance (Q) and measurement noise covariance (R):

import numpy as np

# Set process noise covariance
Q = np.eye(n_states) * 0.01

# Set measurement noise covariance
R = np.eye(n_measurements) * 0.1

Setup and estimation

After initialization, the Kalman filter must be set up before use:

# Setup the Kalman filter
kf.setup()

# Set initial state estimate and covariance
x0 = [initial_state_estimate]
P0 = np.eye(n_states)  # Initial covariance

# Perform estimation
x_estimated = kf.estimate(x0=x0, y=measurement, u=input, P=P0, Q=Q, R=R)

The estimate() method performs both the prediction and update steps of the Kalman filter algorithm.

Extended Kalman Filter

The class ExtendedKalmanFilter (alias EKF) implements the Extended Kalman Filter for nonlinear systems. The EKF linearizes the nonlinear system dynamics and measurement equations around the current state estimate at each time step.

Mathematical formulation

The EKF is designed for nonlinear systems:

\[\begin{split}\begin{aligned} x_{k+1} &= f(x_k, u_k) + w_k, \\ y_k &= h(x_k, u_k) + v_k \end{aligned}\end{split}\]

where \(w_k \sim \mathcal{N}(0, Q)\) and \(v_k \sim \mathcal{N}(0, R)\).

At each time step, the EKF linearizes the dynamics around the previous estimate \(\hat{x}_{k-1|k-1}\) and the measurement equation around the predicted estimate \(\hat{x}_{k|k-1}\):

\[\begin{split}\begin{aligned} F_k &= \left. \frac{\partial f}{\partial x} \right|_{x=\hat{x}_{k-1|k-1}, u=u_k}, \\ H_k &= \left. \frac{\partial h}{\partial x} \right|_{x=\hat{x}_{k|k-1}, u=u_k} \end{aligned}\end{split}\]

Prediction step:

\[\begin{split}\begin{aligned} \hat{x}_{k|k-1} &= f(\hat{x}_{k-1|k-1}, u_k), \\ P_{k|k-1} &= F_k P_{k-1|k-1} F_k^\top + Q \end{aligned}\end{split}\]

Update step:

\[\begin{split}\begin{aligned} K_k &= P_{k|k-1} H_k^\top (H_k P_{k|k-1} H_k^\top + R)^{-1}, \\ \hat{x}_{k|k} &= \hat{x}_{k|k-1} + K_k (y_k - h(\hat{x}_{k|k-1}, u_k)), \\ P_{k|k} &= (I - K_k H_k) P_{k|k-1} \end{aligned}\end{split}\]

The key difference from the linear Kalman filter is that:

  1. The state prediction uses the nonlinear function \(f(\cdot)\) directly

  2. The Jacobian matrices \(F_k\) and \(H_k\) are computed at each step through automatic differentiation (see the sketch after this list)

  3. These Jacobians replace the constant matrices \(A\) and \(C\) in the covariance updates
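As an illustration of point 2, such Jacobians can be obtained with CasADi's automatic differentiation; the pendulum-like dynamics below are placeholders and the snippet does not show HILO-MPC's internal code:

import casadi as ca

x = ca.SX.sym('x', 2)
u = ca.SX.sym('u')
# placeholder nonlinear dynamics f(x, u)
f = ca.vertcat(x[1], -ca.sin(x[0]) + u)

# F_k: Jacobian of f with respect to x as a callable function
F = ca.Function('F', [x, u], [ca.jacobian(f, x)])
F_k = F([0.1, 0.0], 0.0)   # numeric Jacobian at the current estimate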

The EKF is suitable for systems where the dynamics are:

  • Nonlinear but can be approximated well by linearization

  • Subject to Gaussian or approximately Gaussian noise

Initialization is similar to the standard Kalman filter:

from hilo_mpc import EKF

# Initialize Extended Kalman filter
ekf = EKF(model)

The same methods and properties as for the Kalman filter apply. The key difference is that the EKF automatically linearizes the nonlinear model at each estimation step.

Setup and estimation

# Setup the EKF
ekf.setup()

# Perform estimation (same interface as KF)
x_estimated = ekf.estimate(x0=x0, y=measurement, u=input, P=P0, Q=Q, R=R)

The EKF handles the linearization internally, so the user interface remains the same as the standard Kalman filter.

Unscented Kalman Filter

The class UnscentedKalmanFilter (alias UKF) implements the Unscented Kalman Filter, which uses the unscented transformation to handle nonlinearities. Unlike the EKF, the UKF does not require explicit linearization and can provide better estimates for highly nonlinear systems.

Mathematical formulation

The UKF is also designed for nonlinear systems:

\[\begin{split}\begin{aligned} x_{k+1} &= f(x_k, u_k) + w_k, \\ y_k &= h(x_k, u_k) + v_k \end{aligned}\end{split}\]

where \(w_k \sim \mathcal{N}(0, Q)\) and \(v_k \sim \mathcal{N}(0, R)\).

The UKF uses sigma points to capture the mean and covariance of the state distribution through nonlinear transformations. For an \(n_x\)-dimensional state, the UKF generates \(2n_x + 1\) sigma points:

\[\begin{split}\begin{aligned} \mathcal{X}_{k-1}^{(0)} &= \hat{x}_{k-1|k-1}, \\ \mathcal{X}_{k-1}^{(i)} &= \hat{x}_{k-1|k-1} + \left(\sqrt{(n_x + \lambda) P_{k-1|k-1}}\right)_i, \quad i=1,\ldots,n_x, \\ \mathcal{X}_{k-1}^{(i)} &= \hat{x}_{k-1|k-1} - \left(\sqrt{(n_x + \lambda) P_{k-1|k-1}}\right)_{i-n_x}, \quad i=n_x+1,\ldots,2n_x \end{aligned}\end{split}\]

where \(\lambda = \alpha^2 (n_x + \kappa) - n_x\) is a scaling parameter.

Prediction step:

  1. Propagate sigma points through the dynamics:

\[\mathcal{X}_{k|k-1}^{(i)} = f(\mathcal{X}_{k-1}^{(i)}, u_k), \quad i=0,\ldots,2n_x\]
  2. Compute predicted mean and covariance:

\[\begin{split}\begin{aligned} \hat{x}_{k|k-1} &= \sum_{i=0}^{2n_x} W_i^{(m)} \mathcal{X}_{k|k-1}^{(i)}, \\ P_{k|k-1} &= \sum_{i=0}^{2n_x} W_i^{(c)} (\mathcal{X}_{k|k-1}^{(i)} - \hat{x}_{k|k-1})(\mathcal{X}_{k|k-1}^{(i)} - \hat{x}_{k|k-1})^\top + Q \end{aligned}\end{split}\]

Update step:

  1. Generate new sigma points and propagate through measurement function:

\[\mathcal{Y}_{k|k-1}^{(i)} = h(\mathcal{X}_{k|k-1}^{(i)}, u_k), \quad i=0,\ldots,2n_x\]
  2. Compute predicted measurement and innovation covariance:

\[\begin{split}\begin{aligned} \hat{y}_{k|k-1} &= \sum_{i=0}^{2n_x} W_i^{(m)} \mathcal{Y}_{k|k-1}^{(i)}, \\ P_{yy} &= \sum_{i=0}^{2n_x} W_i^{(c)} (\mathcal{Y}_{k|k-1}^{(i)} - \hat{y}_{k|k-1})(\mathcal{Y}_{k|k-1}^{(i)} - \hat{y}_{k|k-1})^\top + R, \\ P_{xy} &= \sum_{i=0}^{2n_x} W_i^{(c)} (\mathcal{X}_{k|k-1}^{(i)} - \hat{x}_{k|k-1})(\mathcal{Y}_{k|k-1}^{(i)} - \hat{y}_{k|k-1})^\top \end{aligned}\end{split}\]
  3. Compute Kalman gain and update state:

\[\begin{split}\begin{aligned} K_k &= P_{xy} P_{yy}^{-1}, \\ \hat{x}_{k|k} &= \hat{x}_{k|k-1} + K_k (y_k - \hat{y}_{k|k-1}), \\ P_{k|k} &= P_{k|k-1} - K_k P_{yy} K_k^\top \end{aligned}\end{split}\]

The weights are given by:

\[\begin{split}\begin{aligned} W_0^{(m)} &= \frac{\lambda}{n_x + \lambda}, \\ W_0^{(c)} &= \frac{\lambda}{n_x + \lambda} + (1 - \alpha^2 + \beta), \\ W_i^{(m)} &= W_i^{(c)} = \frac{1}{2(n_x + \lambda)}, \quad i=1,\ldots,2n_x \end{aligned}\end{split}\]

This sigma-point approach often provides more accurate estimates than the EKF, especially for systems with strong nonlinearities, since no explicit linearization is required.
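The sigma-point generation and weights defined above can be illustrated with a small standalone NumPy sketch of the unscented transform (independent of HILO-MPC's implementation; func is any placeholder mapping from the state space to an output space):

import numpy as np

def unscented_transform(x_mean, P, func, alpha=1e-3, beta=2.0, kappa=0.0):
    n = x_mean.size
    lam = alpha**2 * (n + kappa) - n
    S = np.linalg.cholesky((n + lam) * P)          # matrix square root
    # 2n + 1 sigma points
    sigmas = np.vstack([x_mean,
                        x_mean + S.T,
                        x_mean - S.T])
    # weights for mean and covariance
    Wm = np.full(2 * n + 1, 1.0 / (2 * (n + lam)))
    Wc = Wm.copy()
    Wm[0] = lam / (n + lam)
    Wc[0] = lam / (n + lam) + (1 - alpha**2 + beta)
    # propagate sigma points and recombine
    Y = np.array([func(s) for s in sigmas])
    y_mean = Wm @ Y
    y_cov = (Wc[:, None] * (Y - y_mean)).T @ (Y - y_mean)
    return y_mean, y_cov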

Initialization

from hilo_mpc import UKF

# Initialize Unscented Kalman filter
ukf = UKF(model)

The UKF has additional tuning parameters that control the distribution of sigma points:

  • \(\alpha\) (ukf.alpha) - Spread of sigma points around the mean (typically \(10^{-3} \leq \alpha \leq 1\))

  • \(\beta\) (ukf.beta) - Incorporate prior knowledge of distribution (2 is optimal for Gaussian distributions)

  • \(\kappa\) (ukf.kappa) - Secondary scaling parameter (typically 0 or \(3-n_x\))

# Set UKF-specific parameters
ukf.alpha = 0.001  # Spread of sigma points (typically 0.001 to 1)
ukf.beta = 2.0     # Incorporate prior knowledge of distribution (2 is optimal for Gaussian)
ukf.kappa = 0.0    # Secondary scaling parameter (typically 0 or 3-n where n is state dimension)

Setup and estimation

# Setup the UKF
ukf.setup()

# Perform estimation
x_estimated = ukf.estimate(x0=x0, y=measurement, u=input, P=P0, Q=Q, R=R)

The UKF provides the same interface as the KF and EKF but uses the unscented transformation internally.

Particle Filter

The class ParticleFilter (alias PF) implements a particle filter (Sequential Monte Carlo method) for state estimation. The particle filter can handle:

  • Highly nonlinear dynamics

  • Non-Gaussian noise distributions

  • Multimodal state distributions

Mathematical formulation

The particle filter represents the probability distribution of the state using a set of \(N_s\) particles (samples) \(\{x_k^{(i)}, w_k^{(i)}\}_{i=1}^{N_s}\), where \(x_k^{(i)}\) is the \(i\)-th particle and \(w_k^{(i)}\) is its associated weight.

Consider the nonlinear system:

\[\begin{split}\begin{aligned} x_{k+1} &= f(x_k, u_k) + w_k, \\ y_k &= h(x_k, u_k) + v_k \end{aligned}\end{split}\]

The particle filter algorithm consists of the following steps:

1. Initialization (\(k=0\)):

Sample \(N_s\) particles from the initial distribution:

\[x_0^{(i)} \sim p(x_0), \quad w_0^{(i)} = \frac{1}{N_s}, \quad i=1,\ldots,N_s\]

2. Prediction step:

Propagate each particle through the system dynamics, adding a process noise sample \(\xi_k^{(i)}\) drawn from the process noise distribution \(p(w)\) (the symbol \(w_k^{(i)}\) is reserved for the particle weights):

\[x_{k}^{(i)} = f(x_{k-1}^{(i)}, u_k) + \xi_k^{(i)}, \quad \xi_k^{(i)} \sim p(w)\]

3. Update step:

Update particle weights based on the likelihood of the measurement:

\[\begin{split}\begin{aligned} \tilde{w}_k^{(i)} &= w_{k-1}^{(i)} \cdot p(y_k | x_k^{(i)}, u_k), \\ w_k^{(i)} &= \frac{\tilde{w}_k^{(i)}}{\sum_{j=1}^{N_s} \tilde{w}_k^{(j)}} \end{aligned}\end{split}\]

where, for Gaussian measurement noise with covariance \(R\), the likelihood is typically:

\[p(y_k | x_k^{(i)}, u_k) = \frac{1}{\sqrt{(2\pi)^{n_y}|R|}} \exp\left(-\frac{1}{2}\Vert y_k - h(x_k^{(i)}, u_k) \Vert_{R^{-1}}^2\right)\]

4. State estimate:

The state estimate is computed as the weighted mean of particles:

\[\hat{x}_k = \sum_{i=1}^{N_s} w_k^{(i)} x_k^{(i)}\]

5. Resampling:

To prevent particle degeneracy (all weight concentrated on few particles), resampling is performed when the effective sample size falls below a threshold:

\[N_{\text{eff}} = \frac{1}{\sum_{i=1}^{N_s} (w_k^{(i)})^2}\]

Various resampling schemes exist (multinomial, systematic, stratified). After resampling, weights are reset to \(w_k^{(i)} = 1/N_s\).
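For illustration, a single bootstrap particle filter iteration (prediction, Gaussian-likelihood weight update, weighted-mean estimate and multinomial resampling) can be sketched in plain NumPy; the dynamics f, measurement function h and all tuning values are placeholders and the code is independent of HILO-MPC:

import numpy as np

rng = np.random.default_rng(0)

def pf_step(particles, weights, u, y, f, h, Q, R, resample_threshold=0.5):
    Ns, nx = particles.shape
    # 1) Prediction: propagate particles with sampled process noise
    noise = rng.multivariate_normal(np.zeros(nx), Q, size=Ns)
    particles = np.array([f(p, u) for p in particles]) + noise
    # 2) Update: weight by the Gaussian measurement likelihood
    R_inv = np.linalg.inv(R)
    residuals = np.array([y - h(p, u) for p in particles])
    loglik = -0.5 * np.einsum('ij,jk,ik->i', residuals, R_inv, residuals)
    weights = weights * np.exp(loglik - loglik.max())
    weights /= weights.sum()
    # 3) Estimate: weighted mean of the particles
    x_est = weights @ particles
    # 4) Resampling when the effective sample size drops too low
    if 1.0 / np.sum(weights**2) < resample_threshold * Ns:
        idx = rng.choice(Ns, size=Ns, p=weights)   # multinomial resampling
        particles = particles[idx]
        weights = np.full(Ns, 1.0 / Ns)
    return particles, weights, x_est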

Optional enhancements:

  • Roughening: Add small random noise to particles after resampling to prevent sample impoverishment

  • Prior editing: Reject particles that are inconsistent with known constraints before weight update

Because the state distribution is represented by a set of particles rather than a Gaussian, the particle filter is very flexible but more computationally intensive than the Kalman-filter-based observers.

Initialization

from hilo_mpc import PF

# Initialize Particle filter
pf = PF(model, plot_backend='bokeh')

Setting the number of particles

The number of particles \(N_s\) affects both accuracy and computational cost:

# Set number of particles
pf.sample_size = 1000  # More particles = better accuracy but higher cost

Variants and options

The particle filter supports different variants and options:

# Enable roughening to prevent particle depletion
pf = PF(model, roughening=True, K=0.2)

# Enable prior editing
pf = PF(model, prior_editing=True)

Setup and estimation

# Setup the particle filter
pf.setup()

# Perform estimation
x_estimated = pf.estimate(y=measurement, u=input, Q=Q, R=R)

The particle filter automatically handles particle propagation, weight computation, and resampling.

Choosing the Right Observer

The choice of observer depends on several factors:

System Characteristics:

  • Linear dynamics with Gaussian noise: KF

  • Nonlinear dynamics that are well approximated by linearization, Gaussian noise: EKF

  • Strongly nonlinear dynamics, approximately Gaussian noise: UKF

  • Highly nonlinear dynamics, non-Gaussian or multimodal distributions: PF

  • Constraints on states, noise, or parameters, or combined state and parameter estimation: MHE

Computational Resources:

  • Low computational cost: KF (fastest) or EKF

  • Moderate computational cost: UKF

  • High computational cost acceptable: PF or MovingHorizonEstimator (MHE)

Estimation Requirements:

  • Fast, recursive estimates at every sampling instant: KF, EKF, UKF

  • Enforcement of physical bounds and nonlinear constraints: MHE

  • Simultaneous estimation of unknown model parameters: MHE

  • A full (possibly multimodal) representation of the state distribution: PF

Summary Table:

Observer | Typical system      | Noise assumption         | Notable features
-------- | ------------------- | ------------------------ | ----------------
KF       | Linear              | Gaussian                 | Recursive, lowest computational cost
EKF      | Nonlinear (mild)    | Gaussian                 | Jacobians computed at each step
UKF      | Nonlinear           | Approximately Gaussian   | Sigma-point (unscented) transformation
PF       | Highly nonlinear    | Non-Gaussian, multimodal | Monte Carlo particle representation
MHE      | Linear or nonlinear | Weighted noise penalties | Moving-window optimization, constraints, parameter estimation