Principal Components Analysis (PCA) in Quantitative Finance
February 10, 2012
Principal Components Analysis (PCA) is one of the most widely used mathematical techniques in Quantitative Finance. Institutional portfolio managers use it to allocate funds amongst assets and asset classes. Interest rate structurers and quants use it to model the yield curve and analyze its shape, and many rate quants use it to implement the famous HJM Model. Many rates and fixed income traders use it to hedge their portfolios, quantitative equity traders use it to develop algorithms to buy and sell stocks, and FX algorithmic traders use it to generate price signals.
Even outside of Quantitative Finance, PCA is everywhere in our lives, from biology, physics, engineering and economics to software development and internet search engines. The most famous and powerful internet search engine, Google, ranks pages with PageRank, an algorithm built on the same eigenvector mathematics that underlies PCA. It can be safely said that without these eigenvector methods there would be no Google Search.
Some of the specific applications of PCA in the field of quantitative finance are:
- Analyzing the shape of the yield curve;
- Hedging fixed income portfolios;
- Implementing interest rate models, such as the Heath-Jarrow-Morton (HJM) model, calibrating the Libor model, etc.;
- Forecasting portfolio returns and analyzing the risk of large institutional portfolios;
- Developing asset allocation algorithms for equity portfolios;
- Developing long short equity trading algorithms and analyzing pairs;
- Analyzing market risk of asset portfolios, developing innovative risk analysis strategies such as the "risk on, risk off" strategy, etc.;
- Analyzing and forecasting the volatility surface skew.
PCA is a methodology to reduce the dimensionality of a complex problem. Say a fund manager has 1,000 stocks in his portfolio. If we were to analyze all the stocks quantitatively, we would need a 1,000 × 1,000 correlation matrix. Even with modern computing power, this problem can get very unwieldy and cumbersome. But what if there are, say, 20 factors (think of these "factors" as some kind of "mathematical variables") which explain the movement of all the 1,000 stocks in the manager's universe? Then by analyzing those 20 factors we can get a handle on the dynamics of the entire 1,000 stock universe. This way, a 1,000 stock portfolio gets reduced to a 20 factor portfolio, where each of these 20 factors is uncorrelated with the other factors and together they explain most of the movement of all the 1,000 stocks. This is what PCA does.
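As a rough illustration of this reduction, the sketch below simulates a small universe whose returns are driven by a handful of hidden factors and then counts how many principal components are needed to explain most of the variance. Everything here is an assumption made for illustration: the simulated data, the sizes (50 assets, 3 factors), the 95% threshold, and the helper name `components_needed` are all hypothetical and are not taken from the article.

```python
import numpy as np

def components_needed(returns, threshold=0.95):
    """Smallest number of PCs whose eigenvalues explain `threshold` of total variance."""
    corr = np.corrcoef(returns, rowvar=False)        # columns are assets
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]     # eigvalsh sorts ascending; reverse it
    cumulative_share = np.cumsum(eigenvalues) / eigenvalues.sum()
    # Index of the first cumulative share that reaches the threshold, as a count
    return int(np.searchsorted(cumulative_share, threshold) + 1)

# Simulated (not real) returns: 50 assets driven by 3 hidden factors plus small noise
rng = np.random.default_rng(0)
n_days, n_assets, n_factors = 2000, 50, 3
factors = rng.standard_normal((n_days, n_factors))
loadings = rng.standard_normal((n_factors, n_assets))
noise = 0.1 * rng.standard_normal((n_days, n_assets))
returns = factors @ loadings + noise

k = components_needed(returns)
print(f"{k} of {n_assets} components explain 95% of the variance")
```

With real market data the cut-off is rarely this clean, so the threshold and the number of retained components remain judgment calls.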
The estimation of the factors is the most crucial part of the methodology. These "mathematical factors" are known as the "Principal Components" (PCs) of the asset correlation matrix. Implementation of the PCA methodology entails estimation of PCs of a particular asset correlation matrix or the variance covariance matrix using mathematical techniques. Once the PCs are estimated, applying them to analyze a certain problem in finance is easy.
There are two ways to estimate the PCs mathematically.
- Using the eigen decomposition of a correlation or a variance-covariance matrix. This entails estimating the eigenvectors and eigenvalues of a correlation (or a variance-covariance) matrix and then estimating the PCs using these eigenvectors. The eigenvectors of a symmetric correlation (or variance-covariance) matrix form the coefficients of the PCs. Each eigenvector has an associated eigenvalue, which tells us how important that eigenvector is in explaining the variance of the asset returns or the correlation amongst them. The eigenvalues therefore help us decide how many eigenvectors (factors) are important enough to keep in our analysis. If there are 1,000 assets in our portfolio, resulting in a 1,000 × 1,000 correlation matrix, then there would be 1,000 eigenvectors. However, perhaps only 20 or 25 of these eigenvectors would be able to explain, say, 98% of all variance amongst the asset returns. We would then retain only these 20 or 25 eigenvectors and throw out the remaining 980 or 975. Eigenvalues and eigenvectors of a symmetric matrix can be calculated using established mathematical techniques such as the Jacobi algorithm, the Power method, etc.
- There is another method to estimate the PCs in which one does not have to resort to explicitly calculating the eigenvectors and eigenvalues using a particular algorithm. This method uses an optimization technique involving matrix algebra: maximization of the product of the PCs and the variance-covariance matrix. If we assume that there are three PCs (say, in a three asset problem), $PC_1$, $PC_2$ and $PC_3$, each depicting an array (or a vector), then we can find the first PC by maximizing the product $PC_1^{T} \times V \times PC_1$ subject to the constraint $PC_1^{T} \times PC_1 = 1$, where the sign $\times$ implies matrix multiplication between an array and a matrix, $PC_1^{T}$ symbolizes the transpose of the array $PC_1$ and $V$ stands for the variance-covariance matrix. Similarly, we can estimate the second and the third PCs. However, because all PCs are uncorrelated with each other, the orthogonality conditions need to be satisfied as well, i.e. $PC_1^{T} \times PC_2 = 0$, $PC_1^{T} \times PC_3 = 0$, $PC_2^{T} \times PC_3 = 0$, and so on.
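The two approaches above can be compared on a small example. The following is a minimal sketch, not a definitive implementation. It assumes a made-up 3-asset variance-covariance matrix called `sigma` (the numbers are purely illustrative), uses NumPy's `eigh` for the eigen decomposition, and uses a plain power iteration (the Power method mentioned above) as a simple stand-in for maximizing the variance of a unit-length weight vector. The helper name `first_pc_by_power_iteration` is hypothetical.

```python
import numpy as np

# Hypothetical 3-asset variance-covariance matrix (illustrative numbers only)
sigma = np.array([
    [0.040, 0.018, 0.012],
    [0.018, 0.090, 0.030],
    [0.012, 0.030, 0.160],
])

# Method 1: eigen decomposition. eigh is meant for symmetric matrices and
# returns eigenvalues in ascending order, so reorder to put PC1 first.
eigenvalues, eigenvectors = np.linalg.eigh(sigma)
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]        # column j holds the coefficients of PC(j+1)
explained = eigenvalues / eigenvalues.sum()  # share of total variance per PC

# Method 2: find the unit vector w that maximizes w' sigma w.
# Power iteration converges to that maximizer when the largest eigenvalue is distinct.
def first_pc_by_power_iteration(matrix, n_iter=500):
    w = np.ones(matrix.shape[0]) / np.sqrt(matrix.shape[0])
    for _ in range(n_iter):
        w = matrix @ w
        w /= np.linalg.norm(w)               # keep the constraint w'w = 1
    return w, float(w @ matrix @ w)

w1, variance1 = first_pc_by_power_iteration(sigma)

# Sanity check of the maximization: no random unit vector beats PC1's variance
rng = np.random.default_rng(1)
trial = rng.standard_normal((10_000, 3))
trial /= np.linalg.norm(trial, axis=1, keepdims=True)
trial_variances = np.einsum("ij,jk,ik->i", trial, sigma, trial)

print("explained variance shares:", np.round(explained, 3))
print("PC1 variance (eigh vs power iteration):", eigenvalues[0], variance1)
print("best random unit vector:", trial_variances.max())
```

The columns of `eigenvectors` are mutually orthogonal and of unit length, which is the orthogonality condition described above. The sign of each eigenvector is arbitrary, so `w1` may come out as the negative of the first column.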
Both of the above methods can be easily implemented in an Excel™ spreadsheet, though the eigen decomposition would need to be coded in VBA. We talk about PCA a lot in our CFE Course, and most of the above applications are implemented in Excel™ spreadsheets as part of our CFE Course and CFE Seminars.
- Market Risk Analysis: Pricing, Hedging and Trading Financial Instruments, Part III, Carol Alexander, John Wiley & Sons.
- Fixed Income Securities, Bruce Tuckman & Angel Serrat, John Wiley & Sons.