Sensitivity Analysis
The fundamental problem in linear systems analysis concerns the equation:
$$A\mathbf{x} = \mathbf{y}$$where \(A \in \mathbb{R}^{n \times n}\) denotes the system matrix and \(\mathbf{y} \in \mathbb{R}^n\) represents the given right-hand side. We seek to determine \(\mathbf{x} \in \mathbb{R}^n\) satisfying this relation.
The Invertible Case
When the matrix \(A\) possesses full rank and is consequently invertible, the solution follows directly:
$$\mathbf{x} = A^{-1}\mathbf{y}$$
Behavior Under Perturbations
A central question concerns the stability of solutions under small perturbations. Consider the case where \(\mathbf{y}\) undergoes a perturbation \(\Delta\mathbf{y}\):
$$\mathbf{y} \longrightarrow \mathbf{y} + \Delta\mathbf{y}$$The corresponding change in the solution \(\mathbf{x}\) can be characterized by:
$$\mathbf{x} \longrightarrow \mathbf{x} + \Delta\mathbf{x}$$
Derivation of the Sensitivity Relation
The perturbed quantities must satisfy the original system equation:
$$A(\mathbf{x} + \Delta\mathbf{x}) = \mathbf{y} + \Delta\mathbf{y}$$Expanding yields:
$$A\mathbf{x} + A\Delta\mathbf{x} = \mathbf{y} + \Delta\mathbf{y}$$Since \(A\mathbf{x} = \mathbf{y}\), these terms cancel, leaving:
$$A\Delta\mathbf{x} = \Delta\mathbf{y}$$Multiplying by the inverse yields:
$$\Delta\mathbf{x} = A^{-1}\Delta\mathbf{y}$$
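The relation \(\Delta\mathbf{x} = A^{-1}\Delta\mathbf{y}\) is easy to verify numerically. Below is a minimal sketch using numpy; the matrix, right-hand side, and perturbation are arbitrary illustrative choices.

```python
import numpy as np

# Arbitrary invertible example matrix and right-hand side (illustrative only).
A = np.array([[3.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, 1.0])
dy = np.array([1e-3, -2e-3])  # small perturbation of the data

x = np.linalg.solve(A, y)            # solution of A x = y
x_pert = np.linalg.solve(A, y + dy)  # solution of the perturbed system

# The sensitivity relation: the change in the solution is A^{-1} dy.
dx_direct = x_pert - x
dx_formula = np.linalg.solve(A, dy)
print(np.allclose(dx_direct, dx_formula))  # True
```

The check holds for any invertible \(A\): solving the perturbed system and subtracting the original solution reproduces \(A^{-1}\Delta\mathbf{y}\) up to rounding.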
A Fundamental Inequality
The subsequent analysis requires a fundamental result concerning the interaction between matrix norms and vector norms: for any matrix \(B\) and any vector \(\mathbf{y}\) of compatible dimension,
$$\left\|B\mathbf{y}\right\|_2 \leq \left\|B\right\|_2 \left\|\mathbf{y}\right\|_2$$Proof. The matrix 2-norm is defined as:
$$\left\|B\right\|_2 = \max_{\left\|\mathbf{z}\right\|_2 = 1} \left\|B\mathbf{z}\right\|_2$$This definition is equivalent to:
$$\left\|B\right\|_2 = \max_{\mathbf{z} \neq 0} \frac{\left\|B\mathbf{z}\right\|_2}{\left\|\mathbf{z}\right\|_2}$$Since this is a maximum, for any particular nonzero vector \(\mathbf{y}\) we have:
$$\left\|B\right\|_2 \geq \frac{\left\|B\mathbf{y}\right\|_2}{\left\|\mathbf{y}\right\|_2}$$Multiplying both sides by \(\left\|\mathbf{y}\right\|_2\) yields the desired inequality. ∎
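The inequality \(\left\|B\mathbf{z}\right\|_2 \leq \left\|B\right\|_2 \left\|\mathbf{z}\right\|_2\) can be spot-checked numerically. A small sketch (numpy; the random matrix and vectors are arbitrary test data):

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))  # arbitrary test matrix
norm_B = np.linalg.norm(B, 2)    # matrix 2-norm = largest singular value

# For many random vectors, ||B z|| never exceeds ||B|| * ||z||.
ok = True
for _ in range(1000):
    z = rng.standard_normal(4)
    ok &= np.linalg.norm(B @ z) <= norm_B * np.linalg.norm(z) + 1e-12
print(ok)  # True
```

The small tolerance absorbs floating-point rounding; mathematically the inequality is exact.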
Relative Sensitivity Bound
The relation above admits a refinement in terms of relative changes. Applying the submultiplicative property yields:
$$\left\|\Delta\mathbf{x}\right\|_2 = \left\|A^{-1}\Delta\mathbf{y}\right\|_2 \leq \left\|A^{-1}\right\|_2 \left\|\Delta\mathbf{y}\right\|_2$$From the equation, the submultiplicative property provides:
$$\left\|\mathbf{y}\right\|_2 = \left\|A\mathbf{x}\right\|_2 \leq \left\|A\right\|_2 \left\|\mathbf{x}\right\|_2$$Combining these inequalities produces the fundamental sensitivity bound:
$$\frac{\left\|\Delta\mathbf{x}\right\|_2}{\left\|\mathbf{x}\right\|_2} \leq \left\|A\right\|_2 \left\|A^{-1}\right\|_2 \, \frac{\left\|\Delta\mathbf{y}\right\|_2}{\left\|\mathbf{y}\right\|_2} = \kappa(A) \, \frac{\left\|\Delta\mathbf{y}\right\|_2}{\left\|\mathbf{y}\right\|_2}$$where \(\kappa(A) = \left\|A\right\|_2 \left\|A^{-1}\right\|_2\) denotes the condition number of \(A\). This bound relates the relative change in the solution to the relative change in the data through the condition number of the matrix.
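The relative sensitivity bound can be verified directly on a small system. A numerical sketch (numpy; matrix and perturbation are arbitrary examples):

```python
import numpy as np

A = np.array([[4.0, 1.0], [2.0, 3.0]])  # arbitrary invertible example
y = np.array([1.0, -1.0])
dy = 1e-4 * np.array([3.0, 4.0])        # small data perturbation

# Condition number kappa(A) = ||A|| * ||A^{-1}|| in the 2-norm.
kappa = np.linalg.norm(A, 2) * np.linalg.norm(np.linalg.inv(A), 2)

x = np.linalg.solve(A, y)
dx = np.linalg.solve(A, y + dy) - x

rel_x = np.linalg.norm(dx) / np.linalg.norm(x)   # relative change in solution
rel_y = np.linalg.norm(dy) / np.linalg.norm(y)   # relative change in data
print(rel_x <= kappa * rel_y)  # True
```

`np.linalg.cond(A, 2)` computes the same quantity directly; the explicit product above mirrors the definition in the text.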
Interpretation of the Condition Number
The magnitude of the condition number characterizes the sensitivity of the solution to perturbations in the data:
Small condition number: When \(\kappa(A)\) is close to \(1\), the system is well-conditioned. Relative perturbations in the data propagate to the solution with essentially no amplification.
Large condition number: When \(\kappa(A) \gg 1\), the system is ill-conditioned. Even small relative perturbations in the data can be amplified in the solution by a factor as large as \(\kappa(A)\).
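The ill-conditioned case is vividly illustrated by the Hilbert matrix, a standard example of extreme ill-conditioning. A sketch (numpy; the size 8 and the perturbation scale are arbitrary choices):

```python
import numpy as np

# The 8x8 Hilbert matrix, a classic ill-conditioned example.
n = 8
i, j = np.indices((n, n))
H = 1.0 / (i + j + 1)

print(np.linalg.cond(H))  # on the order of 1e10

# Exact solution is the all-ones vector; perturb the data by ~1e-8.
y = H @ np.ones(n)
dy = 1e-8 * np.random.default_rng(1).standard_normal(n)
x_pert = np.linalg.solve(H, y + dy)

# The resulting solution error is many orders of magnitude larger
# than the perturbation in the data.
print(np.linalg.norm(x_pert - np.ones(n)))
```

A data perturbation invisible at eight decimal places produces a visibly wrong solution, exactly as the bound with \(\kappa(A) \gg 1\) permits.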
Spectral Characterization
For an invertible matrix \(A\), the singular value decomposition provides a complete characterization of the condition number. Since invertibility requires all singular values to be strictly positive, the decomposition takes the form:
$$A = U \begin{bmatrix} \sigma_1 & & 0 \\ & \ddots & \\ 0 & & \sigma_n \end{bmatrix} V^T$$where the singular values are ordered \(\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_n > 0\). The condition \(\sigma_n > 0\) is equivalent to invertibility, as any zero singular value would render the matrix singular. The inverse matrix admits the representation:
$$A^{-1} = V \begin{bmatrix} \frac{1}{\sigma_1} & & 0 \\ & \ddots & \\ 0 & & \frac{1}{\sigma_n} \end{bmatrix} U^T$$The operator norm of \(A\) equals the largest singular value, while the norm of the inverse equals the reciprocal of the smallest singular value. These observations yield:
$$\left\|A\right\|_2 = \sigma_1, \quad \left\|A^{-1}\right\|_2 = \frac{1}{\sigma_n}$$The condition number therefore reduces to the ratio of extreme singular values:
$$\kappa(A) = \frac{\sigma_1}{\sigma_n}$$Since \(\sigma_1 \geq \sigma_n\), the condition number satisfies:
$$\kappa(A) \geq 1$$
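The spectral characterization \(\kappa(A) = \sigma_1/\sigma_n\) agrees with the definition via norms. A numerical sketch (numpy; the matrix is an arbitrary example):

```python
import numpy as np

A = np.array([[2.0, 1.0], [0.0, 0.5]])  # arbitrary invertible example
s = np.linalg.svd(A, compute_uv=False)  # singular values, descending

kappa_svd = s[0] / s[-1]                # sigma_1 / sigma_n
kappa_def = np.linalg.norm(A, 2) * np.linalg.norm(np.linalg.inv(A), 2)

print(np.isclose(kappa_svd, kappa_def))  # True
print(kappa_svd >= 1.0)                  # True: sigma_1 >= sigma_n
```

`np.linalg.cond(A)` defaults to exactly this 2-norm ratio of extreme singular values.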
The Orthogonal Case
An important special case arises when the matrix \(A\) is orthogonal, characterized by the property:
$$A^TA = I$$The singular values of any matrix correspond to the square roots of the eigenvalues of \(AA^T\). For an orthogonal matrix, this product reduces to the identity, whose eigenvalues are all unity. Consequently, all singular values equal one:
$$\sigma_1 = \sigma_2 = \cdots = \sigma_n = 1$$The condition number therefore attains its minimum possible value:
$$\kappa(A) = \frac{\sigma_1}{\sigma_n} = \frac{1}{1} = 1$$
Perturbations in the Matrix
The analysis extends to the case where the matrix itself undergoes perturbation. Consider the perturbed system where both the matrix and the solution change:
$$A \longrightarrow A + \Delta A, \quad \mathbf{x} \longrightarrow \mathbf{x} + \Delta\mathbf{x}$$The perturbed quantities must satisfy the equation with the original right-hand side:
$$(A + \Delta A)(\mathbf{x} + \Delta\mathbf{x}) = \mathbf{y}$$Expanding the left-hand side yields:
$$A\mathbf{x} + A\Delta\mathbf{x} + \Delta A \mathbf{x} + \Delta A \Delta\mathbf{x} = \mathbf{y}$$Since \(A\mathbf{x} = \mathbf{y}\), these terms cancel, leaving:
$$A\Delta\mathbf{x} + \Delta A \mathbf{x} + \Delta A \Delta\mathbf{x} = \mathbf{0}$$Isolating the term involving \(A\Delta\mathbf{x}\) produces:
$$A\Delta\mathbf{x} = -\Delta A(\mathbf{x} + \Delta\mathbf{x})$$Multiplying by the inverse yields the fundamental relation:
$$\Delta\mathbf{x} = -A^{-1}\Delta A(\mathbf{x} + \Delta\mathbf{x})$$
Derivation of the Bound
Taking norms of both sides and applying the submultiplicative property:
$$\left\|\Delta\mathbf{x}\right\|_2 = \left\|A^{-1}\Delta A(\mathbf{x} + \Delta\mathbf{x})\right\|_2$$Setting \(B = A^{-1}\Delta A\) and \(\mathbf{z} = \mathbf{x} + \Delta\mathbf{x}\), the submultiplicative property provides:
$$\left\|\Delta\mathbf{x}\right\|_2 = \left\|B\mathbf{z}\right\|_2 \leq \left\|B\right\|_2 \left\|\mathbf{z}\right\|_2$$Expanding the norm of the product matrix:
$$\left\|\Delta\mathbf{x}\right\|_2 \leq \left\|A^{-1}\Delta A\right\|_2 \left\|\mathbf{x} + \Delta\mathbf{x}\right\|_2$$Applying the submultiplicative property once more:
$$\left\|\Delta\mathbf{x}\right\|_2 \leq \left\|A^{-1}\right\|_2 \left\|\Delta A\right\|_2 \left\|\mathbf{x} + \Delta\mathbf{x}\right\|_2$$Introducing the condition number by multiplying and dividing by \(\left\|A\right\|_2\):
$$\left\|\Delta\mathbf{x}\right\|_2 \leq \left\|A^{-1}\right\|_2 \left\|A\right\|_2 \cdot \frac{\left\|\Delta A\right\|_2}{\left\|A\right\|_2} \cdot \left\|\mathbf{x} + \Delta\mathbf{x}\right\|_2$$This yields the fundamental bound for matrix perturbations:
$$\left\|\Delta\mathbf{x}\right\|_2 \leq \kappa(A) \, \frac{\left\|\Delta A\right\|_2}{\left\|A\right\|_2} \, \left\|\mathbf{x} + \Delta\mathbf{x}\right\|_2$$Dividing both sides by \(\left\|\mathbf{x} + \Delta\mathbf{x}\right\|_2\) produces the relative error bound:
$$\frac{\left\|\Delta\mathbf{x}\right\|_2}{\left\|\mathbf{x} + \Delta\mathbf{x}\right\|_2} \leq \kappa(A) \, \frac{\left\|\Delta A\right\|_2}{\left\|A\right\|_2}$$
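The matrix-perturbation bound \(\left\|\Delta\mathbf{x}\right\|_2 / \left\|\mathbf{x} + \Delta\mathbf{x}\right\|_2 \leq \kappa(A) \left\|\Delta A\right\|_2 / \left\|A\right\|_2\) can be checked numerically. A sketch (numpy; matrix and perturbation are arbitrary examples):

```python
import numpy as np

A = np.array([[3.0, 0.0], [1.0, 2.0]])
dA = 1e-4 * np.array([[1.0, -2.0], [0.5, 1.0]])  # small matrix perturbation
y = np.array([1.0, 2.0])

x = np.linalg.solve(A, y)
x_pert = np.linalg.solve(A + dA, y)  # same right-hand side, perturbed matrix
dx = x_pert - x

kappa = np.linalg.cond(A, 2)
lhs = np.linalg.norm(dx) / np.linalg.norm(x_pert)          # x_pert = x + dx
rhs = kappa * np.linalg.norm(dA, 2) / np.linalg.norm(A, 2)
print(lhs <= rhs)  # True
```

Note that the relative error is measured against the perturbed solution \(\mathbf{x} + \Delta\mathbf{x}\), matching the derivation above.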
Sensitivity to Joint Perturbations
The analysis culminates in the case where both the matrix and the right-hand side undergo simultaneous perturbation. Consider the perturbed system:
$$(A + \Delta A)(\mathbf{x} + \Delta\mathbf{x}) = \mathbf{y} + \Delta\mathbf{y}$$Expanding and using the relation \(A\mathbf{x} = \mathbf{y}\) yields:
$$A\Delta\mathbf{x} = \Delta\mathbf{y} - \Delta A(\mathbf{x} + \Delta\mathbf{x})$$Multiplying by the inverse produces:
$$\Delta\mathbf{x} = A^{-1}\Delta\mathbf{y} - A^{-1}\Delta A(\mathbf{x} + \Delta\mathbf{x})$$Taking norms and applying the triangle inequality:
$$\left\|\Delta\mathbf{x}\right\|_2 = \left\|A^{-1}\Delta\mathbf{y} - A^{-1}\Delta A(\mathbf{x} + \Delta\mathbf{x})\right\|_2 \leq \left\|A^{-1}\Delta\mathbf{y}\right\|_2 + \left\|A^{-1}\Delta A(\mathbf{x} + \Delta\mathbf{x})\right\|_2$$The submultiplicative property yields:
$$\left\|\Delta\mathbf{x}\right\|_2 \leq \left\|A^{-1}\right\|_2 \left\|\Delta\mathbf{y}\right\|_2 + \left\|A^{-1}\right\|_2 \left\|\Delta A\right\|_2 \left\|\mathbf{x} + \Delta\mathbf{x}\right\|_2$$Dividing by \(\left\|\mathbf{x} + \Delta\mathbf{x}\right\|_2\) and introducing the condition number:
$$\frac{\left\|\Delta\mathbf{x}\right\|_2}{\left\|\mathbf{x} + \Delta\mathbf{x}\right\|_2} \leq \left\|A^{-1}\right\|_2 \frac{\left\|\Delta\mathbf{y}\right\|_2}{\left\|\mathbf{y}\right\|_2} \frac{\left\|\mathbf{y}\right\|_2}{\left\|\mathbf{x} + \Delta\mathbf{x}\right\|_2} + \kappa(A) \frac{\left\|\Delta A\right\|_2}{\left\|A\right\|_2}$$Since \(\left\|\mathbf{y}\right\|_2 = \left\|A\mathbf{x}\right\|_2 \leq \left\|A\right\|_2 \left\|\mathbf{x}\right\|_2\), the first term becomes:
$$\frac{\left\|\Delta\mathbf{x}\right\|_2}{\left\|\mathbf{x} + \Delta\mathbf{x}\right\|_2} \leq \kappa(A) \frac{\left\|\Delta\mathbf{y}\right\|_2}{\left\|\mathbf{y}\right\|_2} \frac{\left\|\mathbf{x}\right\|_2}{\left\|\mathbf{x} + \Delta\mathbf{x}\right\|_2} + \kappa(A) \frac{\left\|\Delta A\right\|_2}{\left\|A\right\|_2}$$The triangle inequality provides \(\left\|\mathbf{x}\right\|_2 = \left\|\mathbf{x} + \Delta\mathbf{x} - \Delta\mathbf{x}\right\|_2 \leq \left\|\mathbf{x} + \Delta\mathbf{x}\right\|_2 + \left\|\Delta\mathbf{x}\right\|_2\). Substituting yields:
$$\frac{\left\|\Delta\mathbf{x}\right\|_2}{\left\|\mathbf{x} + \Delta\mathbf{x}\right\|_2} \leq \kappa(A) \frac{\left\|\Delta\mathbf{y}\right\|_2}{\left\|\mathbf{y}\right\|_2} \left(1 + \frac{\left\|\Delta\mathbf{x}\right\|_2}{\left\|\mathbf{x} + \Delta\mathbf{x}\right\|_2}\right) + \kappa(A) \frac{\left\|\Delta A\right\|_2}{\left\|A\right\|_2}$$Rearranging to isolate the relative error term:
$$\frac{\left\|\Delta\mathbf{x}\right\|_2}{\left\|\mathbf{x} + \Delta\mathbf{x}\right\|_2} \leq \frac{\kappa(A)}{1 - \kappa(A) \frac{\left\|\Delta\mathbf{y}\right\|_2}{\left\|\mathbf{y}\right\|_2}} \left(\frac{\left\|\Delta\mathbf{y}\right\|_2}{\left\|\mathbf{y}\right\|_2} + \frac{\left\|\Delta A\right\|_2}{\left\|A\right\|_2}\right)$$valid under the assumption \(\kappa(A) \frac{\left\|\Delta\mathbf{y}\right\|_2}{\left\|\mathbf{y}\right\|_2} < 1\), which allows the rearrangement. The amplification factor is the quotient involving the condition number. For a given parameter \(\gamma > 1\), the amplification factor remains at most \(\gamma\) precisely when:
$$\kappa(A) \leq \frac{\gamma}{1 + \gamma \frac{\left\|\Delta\mathbf{y}\right\|_2}{\left\|\mathbf{y}\right\|_2}}$$This result demonstrates that the effect of joint perturbations remains controlled by the condition number of the matrix, with the amplification factor providing an upper bound on the relative error propagation.
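The joint-perturbation bound can also be verified numerically. A sketch (numpy; the matrix and both perturbations are arbitrary small examples, chosen so that \(\kappa(A)\left\|\Delta\mathbf{y}\right\|_2/\left\|\mathbf{y}\right\|_2 < 1\)):

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 3.0]])
y = np.array([1.0, 1.0])
dA = 1e-5 * np.array([[1.0, 2.0], [-1.0, 0.5]])
dy = 1e-5 * np.array([2.0, -3.0])

x = np.linalg.solve(A, y)
x_pert = np.linalg.solve(A + dA, y + dy)  # both matrix and data perturbed
dx = x_pert - x

kappa = np.linalg.cond(A, 2)
rel_y = np.linalg.norm(dy) / np.linalg.norm(y)
rel_A = np.linalg.norm(dA, 2) / np.linalg.norm(A, 2)

# The bound requires kappa * rel_y < 1, which holds for these small perturbations.
lhs = np.linalg.norm(dx) / np.linalg.norm(x_pert)
rhs = kappa / (1 - kappa * rel_y) * (rel_y + rel_A)
print(lhs <= rhs)  # True
```

For perturbations this small, the amplification factor is close to \(\kappa(A)\) itself.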
Sensitivity Analysis for Least Squares
The least squares problem concerns the minimization:
$$\min_{\mathbf{x} \in \mathbb{R}^n} \left\|A\mathbf{x} - \mathbf{y}\right\|_2$$where \(A \in \mathbb{R}^{m \times n}\) denotes the system matrix and \(\mathbf{y} \in \mathbb{R}^m\) represents the data vector. The solution set admits the characterization:
$$S = A^{+}\mathbf{y} + \mathcal{N}(A)$$where \(A^{+}\) denotes the Moore-Penrose pseudoinverse and \(\mathcal{N}(A)\) the null space of the matrix.
The Perturbed Problem
Consider the perturbed least squares problem where the data vector undergoes a perturbation \(\Delta\mathbf{y}\):
$$\min_{\mathbf{x} \in \mathbb{R}^n} \left\|A\mathbf{x} - (\mathbf{y} + \Delta\mathbf{y})\right\|_2$$The solution set of the perturbed problem takes the form:
$$S_p = A^{+}(\mathbf{y} + \Delta\mathbf{y}) + \mathcal{N}(A)$$The perturbed solution set is therefore a translate of the original:
$$S_p = S + A^{+}\Delta\mathbf{y}$$
The Minimum-Norm Solution
Among all solutions to the least squares problem, the minimum-norm solution possesses unique properties. Denote by \(\mathbf{x}^+\) the minimum-norm solution of the original problem and by \(\mathbf{x}^+ + \Delta\mathbf{x}\) the minimum-norm solution of the perturbed problem. These solutions satisfy:
$$S = A^{+}\mathbf{y} + \mathcal{N}(A) = \mathbf{x}^+ + \mathcal{N}(A)$$ $$S_p = A^{+}(\mathbf{y} + \Delta\mathbf{y}) + \mathcal{N}(A) = (\mathbf{x}^+ + \Delta\mathbf{x}) + \mathcal{N}(A)$$The perturbation in the minimum-norm solution therefore satisfies:
$$\Delta\mathbf{x} = A^{+}\Delta\mathbf{y}$$where \(A^+\) denotes the Moore-Penrose pseudoinverse of the matrix \(A\).
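The relation \(\Delta\mathbf{x} = A^{+}\Delta\mathbf{y}\) for the minimum-norm solution can be checked with numpy's `pinv` and `lstsq`. A sketch (the wide matrix and perturbation are arbitrary examples):

```python
import numpy as np

# A wide (underdetermined) example: infinitely many least squares solutions.
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0]])
y = np.array([1.0, 2.0])
dy = np.array([0.01, -0.02])

A_pinv = np.linalg.pinv(A)     # Moore-Penrose pseudoinverse
x_min = A_pinv @ y             # minimum-norm solution
x_min_pert = A_pinv @ (y + dy)

# The perturbation of the minimum-norm solution is exactly A^+ dy.
print(np.allclose(x_min_pert - x_min, A_pinv @ dy))  # True

# np.linalg.lstsq also returns the minimum-norm solution here.
x_lstsq = np.linalg.lstsq(A, y, rcond=None)[0]
print(np.allclose(x_min, x_lstsq))  # True
```

Both routes agree because `lstsq` computes the minimum-norm least squares solution, which coincides with \(A^{+}\mathbf{y}\).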
Geometry of Bounded Perturbations
Consider the constraint that perturbations in the data vector remain bounded in norm:
$$\left\|\delta\mathbf{y}\right\|_2 \leq 1$$The corresponding set of induced perturbations in the solution admits the characterization:
$$\mathcal{E} = \{\, \delta\mathbf{x} = B\,\delta\mathbf{y} : \left\|\delta\mathbf{y}\right\|_2 \leq 1 \,\}$$where \(B \in \mathbb{R}^{n \times m}\) denotes the linear map connecting perturbations in the data to perturbations in the solution (in the least squares setting above, \(B = A^+\)). This set describes all possible changes in the solution induced by bounded perturbations in the data.
The Two-Dimensional Case
For geometric intuition, consider first the case \(n = 2\). The Euclidean ball of radius \(r\) in the plane admits the description:
$$\{\, \mathbf{x} \in \mathbb{R}^2 : x_1^2 + x_2^2 \leq r^2 \,\}$$This constraint can be expressed in normalized form as:
$$\frac{x_1^2}{r^2} + \frac{x_2^2}{r^2} \leq 1$$Equivalently, in quadratic form notation:
$$\mathbf{x}^T \begin{bmatrix} \frac{1}{r^2} & 0 \\ 0 & \frac{1}{r^2} \end{bmatrix} \mathbf{x} \leq 1$$Introducing the change of variables \(\mathbf{y} = \frac{1}{r}\mathbf{x}\), the condition reduces to the unit ball constraint \(\left\|\mathbf{y}\right\|_2 \leq 1\). The disk of radius \(r\) therefore emerges as the image of the unit ball under the scaling transformation \(\mathbf{x} = r\mathbf{y}\).
Quadratic Form Characterization
In higher dimensions, the general perturbation set takes the form:
$$\mathbf{x} = B\mathbf{y}, \quad \left\|\mathbf{y}\right\|_2 \leq 1$$When the matrix \(B\) is invertible, the unit-ball condition becomes:
$$\left\|B^{-1}\mathbf{x}\right\|_2^2 \leq 1$$Expanding the quadratic form yields:
$$\mathbf{x}^T (B^{-1})^T B^{-1} \mathbf{x} \leq 1$$Defining the symmetric matrix:
$$P = BB^T$$the constraint admits the representation:
$$\mathbf{x}^T P^{-1} \mathbf{x} \leq 1$$The geometric structure of the perturbation set depends on the properties of the matrix \(P\):
Invertible case: When \(B\) is invertible, the matrix \(P = BB^T\) is symmetric positive definite. The perturbation set is a nondegenerate ellipsoid with nonempty interior.
Singular case: When \(B\) is not invertible, the matrix \(P\) is only positive semidefinite. The perturbation set degenerates into a lower-dimensional ellipsoid, geometrically flattened into a subspace.
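The identity between the image of the unit ball under \(B\) and the set \(\{\mathbf{x} : \mathbf{x}^T P^{-1}\mathbf{x} \leq 1\}\) with \(P = BB^T\) can be verified pointwise. A sketch (numpy; the matrix \(B\) is an arbitrary invertible example):

```python
import numpy as np

B = np.array([[2.0, 1.0], [0.0, 1.0]])  # arbitrary invertible example map
P = B @ B.T
P_inv = np.linalg.inv(P)

# Points on the unit circle map to points on the boundary x^T P^{-1} x = 1.
ok = True
for t in np.linspace(0, 2 * np.pi, 100):
    y_circ = np.array([np.cos(t), np.sin(t)])  # ||y|| = 1
    x = B @ y_circ
    ok &= np.isclose(x @ P_inv @ x, 1.0)
print(ok)  # True
```

Algebraically this is immediate: \(\mathbf{x}^T (BB^T)^{-1}\mathbf{x} = \mathbf{y}^T\mathbf{y}\) when \(\mathbf{x} = B\mathbf{y}\).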
Quadratic Forms and Semi-Axis Lengths
A quadratic form in variables \(\mathbf{x} \in \mathbb{R}^n\) is defined as
$$ q(\mathbf{x}) = \mathbf{x}^T P \mathbf{x} = \sum_{i=1}^n \sum_{j=1}^n p_{ij} x_i x_j, $$where \(P = (p_{ij})\) is a symmetric matrix. Such expressions play a central role in geometry and optimization, since their level sets describe familiar shapes such as ellipsoids, paraboloids, or hyperboloids, depending on the eigenvalues of \(P\).
In particular, when \(P\) is symmetric and positive definite, the set
$$ \{\mathbf{x} \in \mathbb{R}^n : \mathbf{x}^T P^{-1} \mathbf{x} \leq 1\} $$represents a bounded ellipsoid centered at the origin. The requirement of positive definiteness is crucial: all eigenvalues of \(P\) must be strictly positive in order for the region to be closed and finite. If an eigenvalue is zero, the ellipsoid degenerates into a lower-dimensional figure; if any eigenvalue is negative, the quadratic form no longer defines an ellipsoid but rather an unbounded hyperbolic region.
This becomes transparent through the spectral theorem. Since \(P\) is symmetric, it admits the decomposition
$$ P = U \Lambda U^T, $$where \(U\) is an orthogonal matrix whose columns are eigenvectors of \(P\), and \(\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)\) contains the corresponding eigenvalues. Substituting this form into the quadratic constraint gives
$$ \mathbf{x}^T U \Lambda^{-1} U^T \mathbf{x} \leq 1. $$By introducing rotated coordinates \(\mathbf{z} = U^T \mathbf{x}\), the inequality simplifies to
$$ \mathbf{z}^T \Lambda^{-1} \mathbf{z} = \sum_{i=1}^n \frac{z_i^2}{\lambda_i} \leq 1. $$This canonical representation makes the geometry explicit: the eigenvectors of \(P\) specify the directions of the principal axes of the ellipsoid, while the eigenvalues give the squared semi-axis lengths, so that
$$ a_i = \sqrt{\lambda_i}. $$Positive definiteness of \(P\) thus ensures that the quadratic form describes a genuine ellipsoid, with strictly positive and finite semi-axes aligned along the eigenvectors of \(P\).
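With \(P = BB^T\), the semi-axis lengths \(\sqrt{\lambda_i}\) coincide with the singular values of \(B\). A numerical sketch (numpy; \(B\) is an arbitrary example):

```python
import numpy as np

B = np.array([[3.0, 1.0], [1.0, 2.0]])  # arbitrary invertible example
P = B @ B.T

eigvals = np.linalg.eigvalsh(P)                 # eigenvalues, ascending
semi_axes = np.sqrt(eigvals)[::-1]              # a_i = sqrt(lambda_i), descending
sing_vals = np.linalg.svd(B, compute_uv=False)  # singular values, descending

# The semi-axis lengths of {x : x^T P^{-1} x <= 1} are the singular values of B.
print(np.allclose(semi_axes, sing_vals))  # True
```

This is the spectral theorem in action: the eigenvalues of \(BB^T\) are the squares of the singular values of \(B\).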
The Rank-Deficient Case
When the matrix \(B \in \mathbb{R}^{n \times m}\) has rank \(r < \min(n,m)\), the perturbation set degenerates to a lower-dimensional ellipsoid. The singular value decomposition yields:
$$B = U \Sigma V^T, \quad \Sigma = \text{diag}(\sigma_1, \ldots, \sigma_r, 0, \ldots, 0)$$where \(\sigma_1 \geq \cdots \geq \sigma_r > 0\). The perturbation set admits the representation:
$$\mathcal{E} = \{\, \mathbf{x} = U_r \Sigma_r \mathbf{z} : \left\|\mathbf{z}\right\|_2 \leq 1 \,\}$$where \(U_r \in \mathbb{R}^{n \times r}\) contains the first \(r\) left singular vectors and \(\Sigma_r = \text{diag}(\sigma_1, \ldots, \sigma_r)\). The geometric properties are:
- The set \(\mathcal{E}\) lies in the \(r\)-dimensional subspace spanned by the columns of \(U_r\).
- The semi-axis directions are the columns of \(U_r\), with lengths \(\sigma_1, \ldots, \sigma_r\).
The quadratic form admits expression through the Moore-Penrose pseudoinverse. Defining:
$$P = BB^T = U \,\text{diag}(\sigma_1^2, \ldots, \sigma_r^2, 0, \ldots, 0)\, U^T$$the perturbation set satisfies:
$$\mathcal{E} = \{\, \mathbf{x} \in \text{Range}(P) : \mathbf{x}^T P^{\dagger} \mathbf{x} \leq 1 \,\}$$where the pseudoinverse takes the form:
$$P^{\dagger} = U \,\text{diag}(\sigma_1^{-2}, \ldots, \sigma_r^{-2}, 0, \ldots, 0)\, U^T$$This formulation makes explicit the degeneration of the ellipsoid outside the range of \(P\).
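The rank-deficient case can be checked with a small example. A sketch (numpy; the rank-1 matrix \(B\) is an arbitrary illustrative choice):

```python
import numpy as np

# Rank-1 example: B maps the plane onto a line, so the ellipse degenerates.
B = np.array([[1.0, 2.0],
              [2.0, 4.0]])  # second row = 2 * first row, rank 1
P = B @ B.T
P_dag = np.linalg.pinv(P)   # Moore-Penrose pseudoinverse of P

print(np.linalg.matrix_rank(B))  # 1

# Images of unit vectors lie in Range(P) and satisfy x^T P^dag x <= 1.
rng = np.random.default_rng(2)
ok = True
for _ in range(200):
    dy = rng.standard_normal(2)
    dy /= np.linalg.norm(dy)  # ||dy|| = 1
    x = B @ dy
    ok &= x @ P_dag @ x <= 1.0 + 1e-9
print(ok)  # True
```

The degenerate ellipsoid here is a line segment: all images of the unit circle lie along the single left singular vector of \(B\).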
Connection to the Singular Value Decomposition
The singular value decomposition of the matrix \(B\) provides a complete geometric interpretation. The decomposition takes the form:
$$B = U \Sigma V^T$$where \(U\) and \(V\) denote orthogonal matrices and \(\Sigma = \text{diag}(\sigma_1, \ldots, \sigma_n)\) contains the singular values. This factorization admits a geometric interpretation as a sequence of transformations:
Rotation: The matrix \(V^T\) rotates the input space; the unit ball, being rotationally invariant, is mapped to itself.
Stretching: The diagonal matrix \(\Sigma\) stretches the ball into an axis-aligned ellipsoid with semi-axes \(\sigma_1, \ldots, \sigma_n\).
Rotation: The matrix \(U\) applies a final rotation, orienting the ellipsoid in the output space.
This decomposition establishes that any linear transformation maps spherical regions to ellipsoidal regions, with the singular values determining the lengths of the ellipsoid’s semi-axes.
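The rotate-stretch-rotate reading of the SVD can be traced step by step on a single point. A sketch (numpy; the matrix and the sample point on the unit circle are arbitrary):

```python
import numpy as np

B = np.array([[2.0, 1.0], [1.0, 3.0]])  # arbitrary example
U, s, Vt = np.linalg.svd(B)

# Push a point of the unit circle through the three stages.
y = np.array([np.cos(0.7), np.sin(0.7)])  # ||y|| = 1

step1 = Vt @ y              # rotation/reflection: still a unit vector
step2 = np.diag(s) @ step1  # stretching by the singular values
step3 = U @ step2           # final rotation into the output space

print(np.isclose(np.linalg.norm(step1), 1.0))  # True: V^T preserves length
print(np.allclose(step3, B @ y))               # True: composition equals B
```

Strictly, \(U\) and \(V^T\) may include reflections as well as rotations, but either way they preserve lengths, so the geometric picture of sphere-to-ellipsoid mapping is unchanged.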
Least Squares, Perturbations, and Ellipsoids
Consider first the least squares problem
$$ \min_x \|Ax - y\|_2, $$whose solution set is given by
$$ S = A^+ y + \mathcal{N}(A), $$where \(A^+\) is the Moore–Penrose pseudoinverse and \(\mathcal{N}(A)\) denotes the null space of \(A\). In particular, the minimum-norm solution is
$$ x^* = A^+ y. $$Now suppose the right-hand side is perturbed, and consider
$$ \min_x \|A x - (y + \Delta y)\|_2. $$The new solution set becomes
$$ S_p = S + A^+ \Delta y, $$so that the updated minimum-norm solution is
$$ x^* + \Delta x = A^+ (y + \Delta y). $$To study the effect of perturbations, define the set
$$ \mathcal{E} = \{ \Delta x \in \mathbb{R}^n : \Delta x = A^+ \Delta y, \;\; \|\Delta y\|_2 \leq 1 \}. $$This is precisely the image of the unit ball under the linear transformation \(A^+\). Geometrically, \(\mathcal{E}\) is an ellipsoid. Its semi-axes are determined by the singular values of \(A\).
– The semi-axis directions of \(\mathcal{E}\) are the columns of \(V\).
– The semi-axis lengths are $$ \frac{1}{\sigma_1}, \;\; \frac{1}{\sigma_2}, \;\; \ldots, \;\; \frac{1}{\sigma_r}, \;\; 0, \ldots, 0, $$ where \(r = \mathrm{rank}(A)\).
In two dimensions this can be visualized as the image of a unit circle in the space of perturbations \(\Delta y\), which under the mapping \(A^+\) becomes an ellipse in the space of corrections \(\Delta x\). The more ill-conditioned the matrix \(A\), the more elongated this ellipse becomes.
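The elongation of the perturbation ellipse is quantified by \(1/\sigma_r\): the worst-case unit perturbation \(\Delta y\) is stretched by exactly \(\|A^+\|_2\). A sketch (numpy; the tall matrix with singular values \(1\) and \(0.1\) is an arbitrary example):

```python
import numpy as np

# Tall example: m = 3 data points, n = 2 unknowns, singular values (1, 0.1).
A = np.array([[1.0, 0.0],
              [0.0, 0.1],
              [0.0, 0.0]])
A_pinv = np.linalg.pinv(A)
s = np.linalg.svd(A, compute_uv=False)

# The longest semi-axis of the perturbation ellipse equals 1/sigma_r,
# which is the 2-norm of the pseudoinverse.
print(np.isclose(np.linalg.norm(A_pinv, 2), 1.0 / s[-1]))  # True

# Sample unit perturbations: no induced correction exceeds 1/sigma_r.
rng = np.random.default_rng(3)
dys = rng.standard_normal((500, 3))
dys /= np.linalg.norm(dys, axis=1, keepdims=True)  # ||dy|| = 1 rows
lengths = np.linalg.norm(dys @ A_pinv.T, axis=1)   # ||A^+ dy|| per sample
print(lengths.max() <= 1.0 / s[-1] + 1e-12)        # True
```

Shrinking \(\sigma_r\) (making \(A\) more ill-conditioned) stretches the ellipse of corrections \(\Delta x\) correspondingly.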