The data file co2data.csv contains monthly mean atmospheric CO2 concentrations measured at Mauna Loa, Hawaii, from January 1959 to December 2003. Atmospheric CO2 concentrations show a distinct seasonal pattern, reflecting the annual cycle of plant activity. The data set has four columns: CO2 (monthly CO2 concentration in ppm), mon (calendar month), year, and months (number of months since January 1959). The data plot (Figure 6.30) shows an unmistakable increasing temporal trend in CO2 concentrations. In this question you are asked to quantify the magnitude of this increase. A frequently used statistical method for estimating a temporal trend is to fit a linear regression model of the CO2 concentration against a time variable (e.g., the number of months since a starting point). In this case, the column months in the data set is such a time variable.
(a) Fit a simple regression model using CO2 as the response variable and months as the predictor variable. Quantify the temporal trend (monthly or annual rate of increase in CO2 concentration) and discuss the potential problems of the model. (Hint: plot the residuals against months.)
(b) Refit the model by using mon as a second (factor) predictor and explain the temporal trend in CO2 concentrations.
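As a rough illustration of the fit asked for in part (a), the following Python sketch estimates an ordinary least-squares line by hand and converts the monthly slope to an annual rate. It does not read co2data.csv: the series below is synthetic, and its baseline, slope, curvature, and seasonal amplitude are arbitrary assumptions chosen only to give the code something to run on. The helper name fit_line is likewise hypothetical.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = ybar - b * xbar
    return a, b

# SYNTHETIC stand-in for the months and CO2 columns (NOT the Mauna Loa data).
# January 1959 to December 2003 spans 45 years, i.e. 540 monthly values.
months = list(range(540))
co2 = [
    315.0 + 0.07 * m + 0.0001 * m * m      # assumed trend with mild curvature
    + 3.0 * math.sin(2 * math.pi * m / 12)  # assumed seasonal cycle
    for m in months
]

intercept, slope = fit_line(months, co2)

# The slope is in ppm per month; multiply by 12 for ppm per year.
annual_rate = 12 * slope

# Residuals to plot against months, as the hint in part (a) suggests.
residuals = [y - (intercept + slope * m) for m, y in zip(months, co2)]

print(f"monthly rate: {slope:.4f} ppm, annual rate: {annual_rate:.3f} ppm")
```

With the real data, the months and CO2 columns would replace the two synthetic lists, and the residuals list is what one would plot against months.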
In both models, the plot of residuals versus fitted values shows a systematic pattern. What might cause such a pattern?
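For part (b), one way to sketch the model with mon as a factor, without a regression library, is to centre both CO2 and months within each calendar month and regress the centred values. This yields the same least-squares slope as fitting months together with twelve month indicators. Again, the series is synthetic (the mildly accelerating trend and the sine-shaped seasonal cycle are assumptions, not the Mauna Loa measurements), and fit_month_factor is a hypothetical helper name.

```python
import math
from collections import defaultdict

def fit_month_factor(months, mon, y):
    """Least-squares fit of y = alpha[mon] + beta * months.

    Returns (beta, alpha), where alpha maps each calendar month to its intercept.
    """
    groups = defaultdict(list)
    for x, k, v in zip(months, mon, y):
        groups[k].append((x, v))

    # Mean of the time variable and of the response within each calendar month.
    means = {}
    for k, pairs in groups.items():
        n = len(pairs)
        means[k] = (sum(p[0] for p in pairs) / n, sum(p[1] for p in pairs) / n)

    # Common slope from the within-month centred values.
    sxx = sum((x - means[k][0]) ** 2 for x, k in zip(months, mon))
    sxy = sum((x - means[k][0]) * (v - means[k][1])
              for x, k, v in zip(months, mon, y))
    beta = sxy / sxx

    # One intercept per calendar month.
    alpha = {k: ym - beta * xm for k, (xm, ym) in means.items()}
    return beta, alpha

# SYNTHETIC stand-in for the data set (NOT the Mauna Loa data).
months = list(range(540))             # months since January 1959
mon = [m % 12 + 1 for m in months]    # calendar month, 1 to 12
co2 = [
    315.0 + 0.07 * m + 0.0001 * m * m       # assumed trend with mild curvature
    + 3.0 * math.sin(2 * math.pi * m / 12)  # assumed seasonal cycle
    for m in months
]

beta, alpha = fit_month_factor(months, mon, co2)
residuals = [v - (alpha[k] + beta * x) for x, k, v in zip(months, mon, co2)]

# Spread of the month intercepts: a rough measure of the seasonal swing.
seasonal_range = max(alpha.values()) - min(alpha.values())

print(f"trend: {12 * beta:.3f} ppm per year, seasonal range: {seasonal_range:.2f} ppm")
```

Here beta is the trend after the seasonal cycle is absorbed by the month intercepts, and the residuals list can be plotted against the fitted values or against months to look for any remaining pattern.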