1). The US Environmental Protection Agency collected magnesium uptake data, which are included in the file “magnes.xls” available on the unit webpage. The data set contains the amount of magnesium uptake measured at different times under two different treatments. It is anticipated that the two treatments may result in different regression equations.
(a) A quadratic model is suggested in which magnesium uptake is regressed on time:
$E(Y) = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 z$
where Y is the magnesium uptake, x is the time and z is an indicator variable representing the treatment. Fit this regression model. Report $R^2$ and the fitted equation of the model, and check the model assumptions with appropriate residual diagnostics.
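As a starting point for part (a), the following is a minimal sketch of a least-squares fit of the quadratic-plus-indicator model. It uses synthetic placeholder data, not the measurements in “magnes.xls”; the coefficient values, the sample size and all variable names are assumptions made purely for illustration.

```python
import numpy as np

# Synthetic stand-in data (NOT the magnes.xls measurements).
rng = np.random.default_rng(0)
x = np.tile(np.arange(1.0, 11.0), 2)      # times 1..10, repeated for each treatment
z = np.repeat([0.0, 1.0], 10)             # indicator: 0 = treatment 1, 1 = treatment 2
y = 2.0 + 1.5 * x - 0.08 * x**2 + 0.7 * z + rng.normal(0.0, 0.3, x.size)

# Design matrix with columns 1, x, x^2, z.
X = np.column_stack([np.ones_like(x), x, x**2, z])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Residuals and R^2 = 1 - SSE / SST.
fitted = X @ beta_hat
resid = y - fitted
sse = float(resid @ resid)
sst = float(((y - y.mean()) ** 2).sum())
r_squared = 1.0 - sse / sst

# Leverages (diagonal of the hat matrix) and internally studentized residuals,
# which can be plotted against the fitted values or against normal quantiles.
H = X @ np.linalg.solve(X.T @ X, X.T)
leverage = np.diag(H)
s2 = sse / (x.size - X.shape[1])
studentized = resid / np.sqrt(s2 * (1.0 - leverage))

print("beta_hat:", np.round(beta_hat, 3))
print("R^2:", round(r_squared, 4))
```

With the real data, x, z and y would be read from the spreadsheet instead of being simulated; the remaining steps stay the same.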
(b) A researcher wants to determine whether the simple indicator variable is really appropriate. The question is equivalent to asking whether the two separate treatment models
$E(y) = \beta_0 + \beta_1 x + \beta_2 x^2$ for treatment 1
$E(y) = \gamma_0 + \gamma_1 x + \gamma_2 x^2$ for treatment 2
satisfy the hypothesis $H_0: \beta_1 = \gamma_1$ and $\beta_2 = \gamma_2$.
One way of testing this hypothesis is through the following steps:
(1) Combine the two models into one, $E(y) = X\beta$, with an appropriate design matrix $X$ and coefficient vector $\beta$. In this question $\beta$ is $6 \times 1$.
(2) Identify a matrix $C$ such that the above hypothesis can be expressed as $H_0: C\beta = 0$.
Clearly specify the matrices $X$, $\beta$ and $C$.
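One possible way to set up steps (1) and (2) is sketched below. It assumes the coefficient vector is ordered as (β0, β1, β2, γ0, γ1, γ2)', so that observations from treatment 1 occupy the first three columns of X and observations from treatment 2 occupy the last three. The helper name `build_design` and the tiny example input are hypothetical; other orderings of the coefficient vector are equally valid and lead to a correspondingly permuted C.

```python
import numpy as np

def build_design(x, z):
    """Block design: rows with z == 0 fill columns 0-2, rows with z == 1 fill columns 3-5."""
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=float)
    quad = np.column_stack([np.ones_like(x), x, x**2])
    return np.hstack([quad * (1.0 - z)[:, None], quad * z[:, None]])

# Contrast matrix for H0: beta1 = gamma1 and beta2 = gamma2,
# with coefficient order (beta0, beta1, beta2, gamma0, gamma1, gamma2).
C = np.array([[0.0, 1.0, 0.0, 0.0, -1.0, 0.0],
              [0.0, 0.0, 1.0, 0.0, 0.0, -1.0]])

# Tiny made-up example: two time points per treatment.
X = build_design([1, 2, 1, 2], [0, 0, 1, 1])
print(X)
```

Each row of C picks out one difference (β1 − γ1 or β2 − γ2), so Cβ = 0 holds exactly when both equalities in the hypothesis hold.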
(c) Hence test the above hypothesis using the $T^2$ test.
(d) Also perform the above test by comparing the full and reduced models.
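The sketch below illustrates how parts (c) and (d) relate. It assumes that the $T^2$ test in (c) refers to the usual general linear hypothesis statistic, $T^2 = (C\hat\beta)'[C(X'X)^{-1}C']^{-1}(C\hat\beta)/s^2$, with $T^2/q$ compared against an F distribution on $q$ and $n - p$ degrees of freedom; if the unit defines $T^2$ differently, the code should be adapted. The data are synthetic (generated with the hypothesis true), and all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.tile(np.arange(1.0, 11.0), 2)
z = np.repeat([0.0, 1.0], 10)
# Synthetic responses with hypothetical coefficients; NOT the magnes.xls data.
y = 2.0 + 0.7 * z + 1.5 * x - 0.08 * x**2 + rng.normal(0.0, 0.3, x.size)

quad = np.column_stack([np.ones_like(x), x, x**2])
X_full = np.hstack([quad * (1 - z)[:, None], quad * z[:, None]])  # separate quadratics
X_red = np.column_stack([1 - z, z, x, x**2])                      # common x and x^2 terms
C = np.array([[0, 1, 0, 0, -1, 0],
              [0, 0, 1, 0, 0, -1]], dtype=float)

def sse(X, y):
    """Return the residual sum of squares and the least-squares coefficients."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r), beta

n, p = X_full.shape
q = C.shape[0]
sse_full, beta_full = sse(X_full, y)
sse_red, _ = sse(X_red, y)
s2 = sse_full / (n - p)

# (c) Quadratic form in C beta_hat, scaled by the residual variance estimate.
d = C @ beta_full
V = C @ np.linalg.inv(X_full.T @ X_full) @ C.T
t2 = float(d @ np.linalg.solve(V, d)) / s2
f_wald = t2 / q

# (d) Extra-sum-of-squares F statistic from the full and reduced fits.
f_nested = ((sse_red - sse_full) / q) / s2

print("F from C-matrix form:", round(f_wald, 4))
print("F from full vs reduced:", round(f_nested, 4))
```

The two statistics agree up to rounding error, which is a useful check on the choice of X and C. Either value is compared with the upper quantile of the F distribution with q and n − p degrees of freedom.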
2). The file “air.xls” (see the unit webpage) contains 42 measurements on air-pollution variables recorded at 12:00 noon in the Los Angeles area on different days.
(a) Obtain the sample correlation matrix R.
(b) Find the eigenvalues and eigenvectors of R. Then determine how many common factors are needed for the factor analysis (FA) model.
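Parts (a) and (b) can be sketched as follows. The data matrix here is a random placeholder with 42 rows and an arbitrary five columns, not the contents of “air.xls”, and the eigenvalue-greater-than-one rule used at the end is only one common heuristic for choosing the number of factors; a scree plot or a cumulative-proportion threshold are alternatives.

```python
import numpy as np

rng = np.random.default_rng(2)
data = rng.normal(size=(42, 5))     # placeholder: 42 observations, 5 hypothetical variables
data[:, 1] += 0.8 * data[:, 0]      # induce some correlation so the example is not trivial
data[:, 3] += 0.6 * data[:, 2]

# Sample correlation matrix (variables are in columns).
R = np.corrcoef(data, rowvar=False)

# eigh is appropriate for symmetric matrices; it returns eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Heuristic: count eigenvalues of R that exceed 1.
m_kaiser = int((eigvals > 1.0).sum())
cum_prop = np.cumsum(eigvals) / eigvals.sum()

print("eigenvalues:", np.round(eigvals, 3))
print("cumulative proportion:", np.round(cum_prop, 3))
print("factors suggested by the eigenvalue > 1 rule:", m_kaiser)
```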
For the following questions, assume that the number of common factors is m = 2.
(c) Estimate the factor loadings $\lambda_{jk}$ and specific variances $\psi_j$ using the principal component approach.
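A minimal sketch of the principal component approach is given below: the loading matrix is formed from the m leading eigenvectors of R, each scaled by the square root of its eigenvalue, and the specific variances are the diagonal of R minus the communalities. The function name `pc_factor` and the 3 × 3 correlation matrix are made up for illustration, and eigenvector signs (hence loading signs) are arbitrary.

```python
import numpy as np

def pc_factor(R, m):
    """Principal component estimates of loadings (p x m) and specific variances (p,)."""
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1][:m]              # indices of the m largest eigenvalues
    L = eigvecs[:, order] * np.sqrt(eigvals[order])    # scale each eigenvector column
    psi = np.diag(R) - (L**2).sum(axis=1)              # diagonal of R minus communalities
    return L, psi

# Made-up correlation matrix, used only to exercise the function.
R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])
L, psi = pc_factor(R, m=2)
communalities = (L**2).sum(axis=1)
residual = R - L @ L.T - np.diag(psi)   # off-diagonal entries show the lack of fit

print("loadings:\n", np.round(L, 3))
print("specific variances:", np.round(psi, 3))
```

By construction the residual matrix has a zero diagonal, so the quality of the m-factor fit is judged from its off-diagonal entries.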
(d) Estimate the above quantities again using maximum likelihood (ML). How do the ML and principal component solutions differ?
(e) Calculate the factor scores from the ML estimates by:
(1) weighted least squares, and
(2) regression approach.
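The two scoring methods in (e) can be sketched as follows for standardized observations z. The weighted least squares scores are $(L'\Psi^{-1}L)^{-1}L'\Psi^{-1}z$ and the regression scores are taken here as $L'(LL' + \Psi)^{-1}z$; some texts replace $LL' + \Psi$ by the sample matrix R in the regression formula, so check which version the unit uses. The loadings, specific variances and data below are hypothetical placeholders, not ML estimates from “air.xls”.

```python
import numpy as np

# Hypothetical loadings for p = 4 variables and m = 2 factors (placeholders only).
L = np.array([[0.8, 0.1],
              [0.7, 0.3],
              [0.2, 0.9],
              [0.1, 0.6]])
psi = 1.0 - (L**2).sum(axis=1)      # specific variances implied for standardized variables

# Placeholder data, standardized column by column.
rng = np.random.default_rng(3)
Z = rng.normal(size=(6, 4))
Z = (Z - Z.mean(axis=0)) / Z.std(axis=0, ddof=1)

Psi_inv = np.diag(1.0 / psi)
Delta = L.T @ Psi_inv @ L           # m x m matrix L' Psi^{-1} L

# (1) Weighted least squares scores, one row per observation.
scores_wls = Z @ Psi_inv @ L @ np.linalg.inv(Delta)

# (2) Regression scores using the fitted matrix L L' + Psi.
Sigma = L @ L.T + np.diag(psi)
scores_reg = Z @ np.linalg.solve(Sigma, L)

print("WLS scores:\n", np.round(scores_wls, 3))
print("regression scores:\n", np.round(scores_reg, 3))
```

The regression scores are a shrunken version of the weighted least squares scores: in row form, scores_reg equals scores_wls multiplied by Delta (I + Delta)^{-1}.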
3). The weekly rates of return for five stocks listed on the New York Stock Exchange are given in the file “stock.xls”; see the unit webpage.
(a) Construct the sample covariance (or correlation) matrix S and find the sample principal components.
(b) Determine the proportion and the cumulative proportion of the total variance explained by each principal component.
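Parts (a) and (b) can be sketched as below using the covariance matrix. The returns are simulated placeholders (a shared hypothetical market factor plus noise, with an arbitrary 100 rows), not the values in “stock.xls”; to work with the correlation matrix instead, replace `np.cov` with `np.corrcoef` and standardize the columns before computing the scores.

```python
import numpy as np

rng = np.random.default_rng(4)
# Placeholder weekly returns for five stocks; NOT the stock.xls data.
market = rng.normal(0.0, 0.02, size=(100, 1))
returns = market + rng.normal(0.0, 0.01, size=(100, 5))

# Sample covariance matrix (variables are in columns).
S = np.cov(returns, rowvar=False)

# Eigen-decomposition, sorted so the first component has the largest variance.
eigvals, eigvecs = np.linalg.eigh(S)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Principal component scores: centered data projected onto the eigenvectors.
scores = (returns - returns.mean(axis=0)) @ eigvecs

# Proportion and cumulative proportion of total variance explained.
proportion = eigvals / eigvals.sum()
cumulative = np.cumsum(proportion)

print("eigenvalues:", eigvals)
print("proportion:", np.round(proportion, 3))
print("cumulative:", np.round(cumulative, 3))
```

The sample variance of each column of scores equals the corresponding eigenvalue, which gives a quick check that the components were computed correctly.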
(c) Construct a scree plot of the eigenvalues. Can the five stock variables be summarized in fewer than five dimensions?