This dissertation studies high-dimensional data and their regularization through dimension reduction and penalization. We start with two real-world problems to illustrate the practical difficulties of analyzing high-dimensional data and the remedies available. In Chapter 1, we model and predict the U.S. stock market, where the number of stocks far exceeds the number of days relevant to the current market. Working within an existing statistical arbitrage framework, we reduce the dimension of the problem using correspondence analysis. We develop a data-driven regression model and highlight some common statistical methods that improve our predictions. In Chapter 2, we attempt to detect and predict system anomalies in large ...