FTS Lesson: Principal Components: Theory
See the Examples of PCA lesson for an illustration of the technique.
Let $r_1, \dots, r_n$ be the $n$ interest rates or, more generally, asset returns.
If we are analyzing interest rates, we typically study changes in interest rates. For stock returns, we can study either changes in returns or the returns themselves. The discussion below works with changes, but you can do exactly the same with stock returns. Let
$\Delta r$
denote the vector of changes in rates minus their mean, so that each component of $\Delta r$ has mean zero.
Let $\Sigma$ be the covariance matrix of $\Delta r$, so $\Sigma$ is an $n \times n$ matrix.
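As a minimal sketch of this setup, the demeaned changes and their covariance matrix can be computed with NumPy. The rate levels below are hypothetical numbers chosen only for illustration, and the variable names (`rates`, `dr`, `sigma`) are assumptions, not part of the lesson.

```python
import numpy as np

# Hypothetical data: 6 observations (rows) of n = 3 interest rates (columns), in percent.
rates = np.array([
    [2.00, 2.50, 3.00],
    [2.05, 2.52, 3.01],
    [2.02, 2.55, 3.05],
    [2.10, 2.60, 3.04],
    [2.08, 2.57, 3.08],
    [2.15, 2.66, 3.10],
])

changes = np.diff(rates, axis=0)       # period-to-period changes, shape (5, 3)
dr = changes - changes.mean(axis=0)    # subtract each column's mean
sigma = np.cov(dr, rowvar=False)       # n x n sample covariance matrix (divides by N - 1)

print(sigma.shape)                     # (3, 3)
```

Here each column is one rate, so `rowvar=False` tells `np.cov` to treat columns as the variables.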
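We believe that there are $n$ independent factors, $F_1, \dots, F_n$, which determine $\Delta r$. So write
$\Delta r = BF$, where $B$ is an $n \times n$ matrix and $F$ is the vector of factors $F_1, \dots, F_n$.
We need to find $B$ and $F$ so that the factors are independent.
Assuming $B$ is invertible, rewrite this as:
$F = A \Delta r$ for an $n \times n$ matrix $A = B^{-1}$.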
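Then, the covariance matrix of $F$ is
$\Sigma_F = A \Sigma A'$, where $\Sigma$ is the covariance matrix of $\Delta r$ and $A'$ is the transpose of $A$.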
Since we want the factors to be independent, we need this matrix to be a diagonal matrix, so that all the covariances between different factors are zero (which is necessary, though not sufficient, for independence).
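The rule that a linear transformation $A$ turns a covariance matrix $\Sigma$ into $A \Sigma A'$ can be checked numerically. The sketch below uses simulated data and an arbitrary matrix `A`; both are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)         # fixed seed, illustrative data only
dr = rng.normal(size=(500, 3))         # 500 hypothetical observations of 3 rate changes
dr = dr - dr.mean(axis=0)              # demean each column
sigma = np.cov(dr, rowvar=False)       # covariance matrix of the changes

A = rng.normal(size=(3, 3))            # an arbitrary 3 x 3 transformation
F = dr @ A.T                           # row t of F is A @ dr[t]
sigma_F = np.cov(F, rowvar=False)      # covariance matrix of the transformed data

print(np.allclose(sigma_F, A @ sigma @ A.T))   # True
```

For an arbitrary `A` the result is generally not diagonal, which is why a special choice of `A` is needed.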
So we need to find a matrix $A$ that transforms $\Sigma$ into a diagonal matrix. This is what eigenvalues and eigenvectors let you do.
Given a matrix $\Sigma$, an eigenvalue of $\Sigma$ is a number $\lambda$ that solves the equation
$\Sigma a = \lambda a$ for some non-zero vector $a$.
If $\lambda$ is an eigenvalue of $\Sigma$ and $\Sigma a = \lambda a$, then $a$ is the associated eigenvector.
An $n \times n$ matrix has $n$ eigenvalues, counted with multiplicity. If $\Sigma$ is symmetric, then these eigenvalues are always real numbers. If $\Sigma$ is singular, then zero is an eigenvalue (that is, $\Sigma a = 0$ for some non-zero vector $a$).
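To make the definition concrete, here is a small sketch using NumPy's `np.linalg.eigh`, which is intended for symmetric matrices. The 3 x 3 matrix is a hypothetical covariance matrix chosen only for illustration.

```python
import numpy as np

# Hypothetical symmetric covariance matrix, for illustration only.
sigma = np.array([
    [4.0, 2.0, 0.6],
    [2.0, 3.0, 0.4],
    [0.6, 0.4, 1.0],
])

# eigh returns real eigenvalues in ascending order and the eigenvectors as columns.
eigenvalues, eigenvectors = np.linalg.eigh(sigma)

# Largest deviation from the defining equation sigma @ a = lam * a, per eigenpair.
residuals = [
    np.max(np.abs(sigma @ a - lam * a))
    for lam, a in zip(eigenvalues, eigenvectors.T)
]
print(max(residuals) < 1e-10)          # True

# A singular symmetric matrix has zero among its eigenvalues.
singular = np.array([[1.0, 1.0], [1.0, 1.0]])
print(np.linalg.eigvalsh(singular))    # approximately [0, 2]
```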
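Let $A$ be the matrix whose columns are the eigenvectors, let $\Lambda$ be the diagonal matrix with the eigenvalues $\lambda_1, \dots, \lambda_n$ on its diagonal, and suppose $A$ is non-singular (that is, the eigenvectors are linearly independent). Then,
$\Sigma A = A \Lambda$ and so $\Lambda = A^{-1} \Sigma A$, and we have almost achieved our goal of diagonalizing $\Sigma$. The only problem is that we want a diagonal matrix of the form
$A \Sigma A'$
while we have achieved
$A^{-1} \Sigma A$
So we need to “fix” $A$ so that $A' = A^{-1}$. It turns out that if $\Sigma$ is a symmetric matrix, as every covariance matrix is, then we can choose the eigenvectors to be “orthonormal”, which means that the transpose of $A$ multiplied by $A$ is the identity matrix: $A'A = I$, so that $A' = A^{-1}$. With this choice $\Lambda = A' \Sigma A$, so the matrix that maps $\Delta r$ into the factors is the transpose of the eigenvector matrix. From here on we relabel and let $A$ denote this transpose, that is, the matrix whose rows are the orthonormal eigenvectors, which gives $A \Sigma A' = \Lambda$.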
If we pick such an $A$, we have what we want:
Let $F = A \Delta r$, where $A$ is an orthonormal matrix whose rows are eigenvectors of $\Sigma$. Then the covariance matrix of $F$ is $\Sigma_F = A \Sigma A' = \Lambda$, the diagonal matrix of eigenvalues.
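The diagonalization can be verified numerically. This sketch assumes $A$ is arranged with the orthonormal eigenvectors as its rows, so that $F = A \Delta r$, and it reuses a hypothetical 3 x 3 covariance matrix chosen only for illustration.

```python
import numpy as np

# Hypothetical symmetric covariance matrix, for illustration only.
sigma = np.array([
    [4.0, 2.0, 0.6],
    [2.0, 3.0, 0.4],
    [0.6, 0.4, 1.0],
])

eigenvalues, V = np.linalg.eigh(sigma)   # columns of V are orthonormal eigenvectors
A = V.T                                  # rows of A are the eigenvectors
Lam = np.diag(eigenvalues)               # diagonal matrix of eigenvalues

print(np.allclose(A.T @ A, np.eye(3)))   # True: A'A = I, so A' is the inverse of A
print(np.allclose(A @ sigma @ A.T, Lam)) # True: the factor covariance is diagonal
```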
One last point: this transformation preserves the total variance.
Since $\Lambda$ is a diagonal matrix, the sum of the variances of the factors is simply the sum of its diagonal elements, i.e. $\lambda_1 + \lambda_2 + \dots + \lambda_n$.
The sum of the diagonal elements of a (square) matrix is called the trace of the matrix.
It turns out that trace(GH) = trace(HG) for square matrices G and H.
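Since $\Lambda = A \Sigma A'$ and $A'A = I$, we get:
$\mathrm{trace}(\Lambda) = \mathrm{trace}(A \Sigma A') = \mathrm{trace}(\Sigma A' A) = \mathrm{trace}(\Sigma)$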
But $\mathrm{trace}(\Sigma)$ is the sum of the variances of $\Delta r$ (the changes in interest rates or stock returns).
So the sum of the variances of the factors equals the sum of the variances of the changes in rates.
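Both facts used in this argument can be checked with a short sketch: the cyclic property of the trace, and the equality between the sum of the eigenvalues and the trace of the covariance matrix. The matrices below are hypothetical and serve only as an illustration.

```python
import numpy as np

# Hypothetical symmetric covariance matrix, for illustration only.
sigma = np.array([
    [4.0, 2.0, 0.6],
    [2.0, 3.0, 0.4],
    [0.6, 0.4, 1.0],
])

rng = np.random.default_rng(1)           # fixed seed, arbitrary test matrices
G = rng.normal(size=(3, 3))
H = rng.normal(size=(3, 3))
print(np.isclose(np.trace(G @ H), np.trace(H @ G)))   # True

eigenvalues = np.linalg.eigvalsh(sigma)  # variances of the factors
total_factor_variance = eigenvalues.sum()
total_rate_variance = np.trace(sigma)    # sum of the variances of the changes

print(np.isclose(total_factor_variance, total_rate_variance))   # True
```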
So we can calculate how much of the total variance each factor explains. Factor 1 explains the proportion
$\frac{\lambda_1}{\lambda_1 + \lambda_2 + \dots + \lambda_n}$
of the total variance.
So the largest eigenvalue corresponds to the factor that explains most of the variance, the second largest eigenvalue corresponds to the factor that explains most of the remaining variance, and so on.
So we usually order the factors so that the first factor is the one with the largest eigenvalue, and so on.
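The ordering and the explained proportions can be sketched as follows. `np.linalg.eigh` returns eigenvalues in ascending order, so they are reversed here; the covariance matrix is again a hypothetical one chosen only for illustration.

```python
import numpy as np

# Hypothetical symmetric covariance matrix, for illustration only.
sigma = np.array([
    [4.0, 2.0, 0.6],
    [2.0, 3.0, 0.4],
    [0.6, 0.4, 1.0],
])

eigenvalues, V = np.linalg.eigh(sigma)

order = np.argsort(eigenvalues)[::-1]    # indices from largest to smallest eigenvalue
eigenvalues = eigenvalues[order]
V = V[:, order]                          # keep each eigenvector with its eigenvalue

explained = eigenvalues / eigenvalues.sum()   # proportion of variance per factor
cumulative = np.cumsum(explained)             # proportion explained by the first k factors

for k, (p, c) in enumerate(zip(explained, cumulative), start=1):
    print(f"factor {k}: {p:.1%} of total variance, {c:.1%} cumulative")
```

Reordering the columns of `V` together with the eigenvalues matters: each eigenvector must stay paired with its own eigenvalue.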
With spot interest rates, we typically find that 2 or 3 factors explain 90 to 95% of the variance. This happens because interest rates of different maturities are highly correlated. This is not true for all countries and not always true for other curves such as forward curves. This is certainly not true for stock returns.
For interest rates, when we calculate the eigenvalues, it usually turns out that the eigenvector associated with the largest eigenvalue has components that are all positive (or, equivalently, all of the same sign, since an eigenvector can be multiplied by -1). So we call this the “shift” factor: it changes all rates in the same direction. The second eigenvector typically moves short and long rates in opposite directions, and is called a “twist” factor. The third factor typically moves medium-term rates in the opposite direction to both short and long rates, and is called the “butterfly” factor.
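The shift, twist and butterfly shapes can be illustrated with a stylized model rather than real data. The sketch below assumes, purely for illustration, that the correlation between the changes in rates at maturities $i$ and $j$ is $\rho^{|i-j|}$ with $\rho = 0.9$; this correlation structure, the number of maturities and the helper `sign_changes` are assumptions, not something taken from the lesson or estimated from a market.

```python
import numpy as np

n = 8                                    # hypothetical number of maturities
rho = 0.9                                # assumed correlation between adjacent maturities
idx = np.arange(n)
corr = rho ** np.abs(idx[:, None] - idx[None, :])   # stylized correlation matrix

eigenvalues, V = np.linalg.eigh(corr)
order = np.argsort(eigenvalues)[::-1]    # largest eigenvalue first
eigenvalues, V = eigenvalues[order], V[:, order]


def sign_changes(v):
    """Count how many times consecutive components of v switch sign."""
    signs = np.sign(v)
    return int(np.sum(signs[:-1] * signs[1:] < 0))


counts = [sign_changes(V[:, k]) for k in range(3)]
print(counts)
print(np.round(V[:, :3], 2))             # loadings of the first three factors
```

Under this stylized assumption the counts should come out as 0, 1 and 2: the first eigenvector moves all maturities the same way (shift), the second changes sign once along the curve (twist), and the third changes sign twice (butterfly). Real curves need not follow this pattern exactly.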