Regression Analysis - CFA Level 1.

Linear Regression.

A linear regression is constructed by fitting a line through a scatter plot of paired observations on two variables. The sketch below illustrates an example of a linear regression line drawn through a series of (X, Y) observations:

Figure 2. Linear Regression.

A linear regression line is usually determined quantitatively by a best-fit procedure such as least squares (i.e., by minimizing the sum of the squared distances between the observed points and the fitted line). In linear regression, one variable is plotted on the X axis and the other on the Y axis. The X variable is said to be the independent variable, and the Y variable the dependent variable. When analyzing two random variables, you must choose which variable is independent and which is dependent. The choice of independent and dependent follows from the hypothesis - for many examples, this distinction should be intuitive.

The most popular use of regression analysis is on investment returns, where the market index is independent while the individual security or mutual fund is dependent on the market. In essence, regression analysis formulates a hypothesis that the movement in one variable (Y) depends on the movement in the other (X).

Regression Equation.

The regression equation describes the relationship between two variables and is given by the general format:

Formula 2: Y = a + bX + ε

where a is the intercept, b is the slope, and ε is an error term. The slope b measures the sensitivity of the dependent variable to the independent variable: each time X increases (or decreases) by one unit, Y increases (or decreases) by b units. The intercept a indicates the value of Y at the point where X = 0. Thus, if X indicated market returns, the intercept would show how the dependent variable performs when the market has a flat quarter with returns of 0. In investment parlance, a manager has a positive alpha because a linear regression between the manager's performance and the performance of the market has an intercept a greater than 0.
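To make the mechanics concrete, here is a minimal sketch of estimating a and b by least squares in Python. The quarterly return figures for the market index (X) and the fund (Y) are made up for illustration; only the least-squares formulas themselves come from the discussion above.

```python
import numpy as np

# Hypothetical quarterly returns (%): market index (X, independent)
# and a mutual fund (Y, dependent). Illustrative numbers only.
x = np.array([2.0, -1.5, 3.0, 0.5, 4.0, -2.0, 1.0, 2.5])
y = np.array([2.6, -1.2, 3.9, 0.8, 4.8, -2.3, 1.4, 3.0])

# Least-squares estimates:
#   b = sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
#   a = mean(y) - b * mean(x)
x_bar, y_bar = x.mean(), y.mean()
b = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
a = y_bar - b * x_bar

# In the investment interpretation above, a plays the role of alpha and b of beta.
print(f"intercept a = {a:.3f}, slope b = {b:.3f}")
```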
Linear Regression - Assumptions.

Drawing conclusions about the dependent variable requires that we make six assumptions, the classic assumptions of the linear regression model:

1. The relationship between the dependent variable Y and the independent variable X is linear in the slope and intercept parameters a and b. This requirement means that neither regression parameter can be multiplied or divided by another regression parameter, and that the parameters cannot enter the equation in a nonlinear form (for example, squared). In other words, we cannot construct a "linear" model in which the parameters appear nonlinearly.

2. The independent variable X is not random.

3. The expected value of the error term is zero.

Assumptions #2 and #3 allow the linear regression model to produce estimates for slope b and intercept a.

4. The variance of the error term is constant for all observations. Assumption #4 is known as the homoskedasticity assumption. When a linear regression is heteroskedastic, its error terms vary and the model may not be useful in predicting values of the dependent variable.

5. The error term is uncorrelated across observations. This assumption is necessary to estimate the variances of the parameters.

6. The distribution of the error terms is normal. Assumption #6 allows hypothesis-testing methods to be applied to linear-regression models.

Standard Error of Estimate.

Abbreviated SEE, this measure gives an indication of how well a linear regression model is working. It compares actual values of the dependent variable Y to the predicted values that would have resulted had Y followed exactly from the linear regression. For example, take a case where a company's financial analyst has developed a regression model relating annual GDP growth to company sales growth by an estimated equation of the form Y = a + bX. Assume the analyst has five years of data, where the predicted sales growth for each year is a function of the model and that year's GDP growth. For each year, we record the actual sales growth (Yi), the predicted growth (Ŷi), the residual (Yi - Ŷi), and the squared residual.

To find the standard error of the estimate, we take the sum of all squared residual terms, divide by (n - 2), and then take the square root of the result. In this case, with five observations, n - 2 = 3, so the sum of the squared residuals is divided by 3 before the square root is taken. The computation for the standard error is relatively similar to that of the standard deviation for a sample (n - 2 is used instead of n - 1). The SEE gives some indication of the predictive quality of a regression model, with lower SEE numbers indicating that more accurate predictions are possible. However, the standard-error measure doesn't indicate the extent to which the independent variable explains variations in the dependent variable.
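A minimal sketch of the SEE calculation in Python, using hypothetical actual and predicted sales-growth figures (the table values from the original example are not reproduced here, so these numbers are purely illustrative):

```python
import numpy as np

# Hypothetical actual vs. predicted sales growth (%) over five years.
actual    = np.array([2.0, 3.5, 1.0, 4.0, 2.5])   # Yi
predicted = np.array([2.3, 3.1, 1.4, 3.6, 2.8])   # predicted growth from the fitted model

residuals = actual - predicted
ssr = np.sum(residuals ** 2)          # sum of squared residuals
n = len(actual)
see = np.sqrt(ssr / (n - 2))          # divide by n - 2, then take the square root

print(f"sum of squared residuals = {ssr:.4f}, SEE = {see:.4f}")
```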
Coefficient of Determination.

Like the standard error, this statistic gives an indication of how well a linear-regression model serves as an estimator of values for the dependent variable. It works by measuring the fraction of total variation in the dependent variable that can be explained by variation in the independent variable. In this context, total variation is made up of two parts:

Total variation = explained variation + unexplained variation

The coefficient of determination, or explained variation as a percentage of total variation, is the first of these two terms. It is sometimes expressed as 1 - (unexplained variation / total variation). For a simple linear regression with one independent variable, the simple method for computing the coefficient of determination is squaring the correlation coefficient between the dependent and independent variables. Since the correlation coefficient is given by r, the coefficient of determination is popularly known as R-squared (R²). For example, if the correlation coefficient is 0.76, R-squared is (0.76)² = 0.578. R-squared terms are usually expressed as percentages; thus 0.578 would be expressed as 57.8%.

A second method of computing this number would be to find the total variation in the dependent variable Y as the sum of the squared deviations from the sample mean. Next, calculate the standard error of the estimate following the process outlined in the previous section. The coefficient of determination is then computed as (total variation in Y - unexplained variation in Y) / total variation in Y. This second method is necessary for multiple regressions, where there is more than one independent variable, but for our context we will be provided with r (the correlation coefficient) and can square it to obtain R-squared.

What R² tells us is the portion of the changes in the dependent variable Y that are explained by changes in the independent variable X. An R² of 57.8% tells us that 57.8% of the changes in Y result from X; it also means that 1 - 57.8% = 42.2% of the changes in Y are unexplained by X and are the result of other factors. So the higher the R-squared, the better the predictive power of the linear-regression model.

Regression Coefficients.

For either regression coefficient (intercept a, or slope b), a confidence interval can be determined with the following information: an estimated parameter value from the sample; the standard error of the estimated coefficient (which is computed using the SEE); the significance level for the t-distribution; and the degrees of freedom (which is the sample size - 2). For a slope coefficient, the confidence interval is given by b ± (critical t-value × standard error of the coefficient).

For example, take five years of quarterly returns (20 observations, so 18 degrees of freedom) for a mutual fund regressed against the S&P 500, where the slope coefficient b is found to be 1.18. At a 5% significance level, the critical value from Student's t-distribution for 18 degrees of freedom, multiplied by the standard error of the coefficient, gives the half-width of a confidence interval centered on 1.18. Our interpretation is that there is only a 5% chance that the true slope of the population lies outside that interval - that is, only a 5% chance that the fund's true sensitivity to the S&P 500 is below the lower bound or above the upper bound.
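The sketch below shows this interval calculation in Python. The slope (1.18), sample size (20 quarterly observations), and 5% significance level come from the example above; the standard error of the coefficient is a made-up figure for illustration, since the example's value is not given here.

```python
from scipy import stats

b_hat = 1.18        # estimated slope from the example above
n = 20              # five years of quarterly returns
df = n - 2          # degrees of freedom = 18
s_b = 0.15          # standard error of the slope coefficient (hypothetical value)

# Two-tailed 95% confidence interval: b ± t_crit * s_b
t_crit = stats.t.ppf(0.975, df)       # critical t-value for 5% significance, two-tailed
lower, upper = b_hat - t_crit * s_b, b_hat + t_crit * s_b
print(f"t_crit = {t_crit:.3f}, 95% CI for slope: [{lower:.2f}, {upper:.2f}]")
```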
Hypothesis Testing and Regression Coefficients.

Regression coefficients are frequently tested using the hypothesis-testing procedure. Depending on what the analyst is intending to prove, we can test a slope coefficient to determine whether it explains changes in the dependent variable, and the extent to which it explains those changes. Betas (slope coefficients) can be tested to determine whether they are above or below 1 (more volatile or less volatile than the market). Alphas (the intercept coefficient) can be tested on a regression between a mutual fund and the relevant market index to determine whether there is evidence of a sufficiently positive alpha (suggesting value added by the fund manager).

The mechanics of hypothesis testing are similar to the examples we have used previously. A null hypothesis is chosen based on a not-equal-to, greater-than, or less-than case, with the alternative satisfying all values not covered in the null case. Suppose, in our previous example where we regressed a mutual fund's returns on the S&P 500, we want to test whether the fund is more volatile than the market. A fund equal in volatility to the market will have a slope b of 1.0, so we choose the null hypothesis (H0) as the case where the slope is less than or equal to 1.0. The alternative hypothesis Ha has b > 1.0. We know that this is a greater-than case (i.e., a one-tailed test).

Example: Interpreting a Hypothesis Test.

From our sample, we had an estimated b of 1.18. Our test statistic is computed with this formula:

t = (estimated coefficient - hypothesized coefficient) / standard error of the coefficient

For this example, our calculated test statistic falls below the one-tailed rejection level of about 1.73 (the critical t-value for 18 degrees of freedom at the 5% significance level), so we cannot reject the null hypothesis. Interpretation: the hypothesis that b > 1 for this fund probably needs more observations (degrees of freedom) to be proven with statistical significance. Also, with 1.18 only slightly above 1.0, the estimated slope may simply not be far enough from 1 to be distinguished statistically with this sample.

Example: Interpreting a Regression Coefficient.

The CFA exam is likely to give the summary statistics of a linear regression and ask for interpretation. To illustrate, assume a regression between a small-cap growth fund and the Russell 2000 index produces the following summary statistics: a correlation coefficient between the fund and the index, an intercept of -0.417, and a slope of 1.317. What does each of these numbers tell us?

Much of the variation in the fund is explained by variation in the Russell 2000 index. This follows from the square of the correlation coefficient, which equals R-squared, the fraction of the fund's variation explained by the index.

The fund will slightly underperform the index when index returns are flat. This results from the value of the intercept being -0.417: when X = 0 in the regression equation, the dependent variable is equal to the intercept.

The fund will on average be more volatile than the index. This follows from the slope of the regression line being 1.317, which is greater than 1. Additional risk is compensated with additional reward, with the reverse being true in down markets.

Predicted values of the fund's return, given a return for the market, can be found by solving for Y = -0.417 + 1.317X (where X = the Russell 2000 return).
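As a final sketch, the fitted equation from this last example can be used to generate predicted fund returns. The intercept and slope are the example's -0.417 and 1.317; the index returns fed into the equation below are made-up values chosen to illustrate the flat, up, and down market cases discussed above.

```python
# Predicted fund return from the example regression: Y = -0.417 + 1.317 * X
a, b = -0.417, 1.317

# Hypothetical Russell 2000 returns (%) to illustrate the interpretation above.
for index_return in (0.0, 5.0, -5.0):
    fund_return = a + b * index_return
    print(f"index return {index_return:+.1f}%  ->  predicted fund return {fund_return:+.2f}%")
```

A flat index quarter produces a small negative predicted return (the intercept), while gains and losses in the index are amplified by the slope of 1.317.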