
EC2019 Econometrics II

Programming Project

Listed below are the topics for this module's project, which counts for 40% of the total mark. You must select one of the four questions below.

You must submit the write-up of your results, as well as your R script, through Blackboard. The word limit is 2,000 words; your R script does not count towards this limit. Collusion and plagiarism will not be tolerated.

1. The Granger & Newbold experiment using two independent ARIMA(1,1,1) series.

Ensure you set a random seed for this question.

Let $y_t$ and $x_t$ be generated by the following processes:

$$y_t = y_{t-1} + \varepsilon_t - \theta_1 \varepsilon_{t-1}; \qquad \varepsilon_t \sim iin(0, 1) \qquad (1)$$

$$x_t = x_{t-1} + v_t - \theta_2 v_{t-1}; \qquad v_t \sim iin(0, 1) \qquad (2)$$

with

$$E(\varepsilon_t v_{t-s}) = 0, \quad \forall s, \qquad (3)$$

where $iin$ stands for independently, identically, and normally distributed.

Simulate $r$ pairs of 500 consecutive data points from processes (1) and (2) with your choice of values for $\theta_1$ and $\theta_2$, ensuring that the invertibility condition is met for both processes. For every simulated pair, estimate the following by least squares:

$$y_t = \alpha + \beta x_t + \epsilon_t; \qquad t = 1, 2, \cdots, 500 \qquad (4)$$

and conduct the following hypothesis test:

$$H_0: \beta = 0 \qquad H_1: \beta \neq 0 \qquad (5)$$

Under the assumption that the test statistic follows a standard normal distribution, report the following:

1. The average $t$-ratio.
2. The average absolute $t$-ratio.
3. The proportion of the $r$ regressions for which the null hypothesis of test (5) is rejected at the 5% significance level.
4. The average $R^2$.
5. The proportion of the $r$ regressions with $R^2 > 0.7$.

Choose a sufficient number of replications, $r$, for reliable results. Conduct this experiment and report the above statistics for at least two different combinations of values for $\theta_1$ and $\theta_2$, giving reasons for your choices. Alternatively, you can let one of the parameters increase in increments of 0.1 (for example, $\theta_1 = -0.9$, then $\theta_1 = -0.8$, all the way up to $\theta_1 = 0.9$), keeping $\theta_2$ fixed at a single value throughout. Discuss whether there are any obvious relationships between the value or magnitude of the parameters and the results of points 1 to 5 above.

Compare your results with those obtained in Class 4, where two independent random walks were used.
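The logic of the experiment can be sketched in a few lines. This is an illustrative outline only, not a model answer (the project itself must be submitted in R): the seed, the number of replications, and the values $\theta_1 = 0.5$, $\theta_2 = -0.5$ are arbitrary choices, and the OLS algebra is written out by hand to make the computation explicit.

```python
import random

def simulate_series(n, theta, rng):
    # y_t = y_{t-1} + e_t - theta * e_{t-1}, e_t ~ N(0, 1), as in (1)-(2);
    # the presample error e_0 is set to zero
    series, level, e_prev = [], 0.0, 0.0
    for _ in range(n):
        e = rng.gauss(0.0, 1.0)
        level += e - theta * e_prev
        e_prev = e
        series.append(level)
    return series

def ols_stats(y, x):
    # regression (4) by hand: returns the t-ratio of beta-hat and R^2
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = my - beta * mx
    ssr = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    se_beta = (ssr / (n - 2) / sxx) ** 0.5
    return beta / se_beta, 1.0 - ssr / syy

rng = random.Random(42)                    # fixed seed, as the question requires
r, theta1, theta2 = 200, 0.5, -0.5         # arbitrary illustrative choices
rejections = 0
for _ in range(r):
    y = simulate_series(500, theta1, rng)
    x = simulate_series(500, theta2, rng)
    t_ratio, r2 = ols_stats(y, x)
    if abs(t_ratio) > 1.96:                # 5% two-sided N(0,1) critical value
        rejections += 1
print("rejection frequency:", rejections / r)
```

Although the two series are independent by construction, the rejection frequency comes out far above the nominal 5%: this is exactly the spurious-regression phenomenon the question asks you to document.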

2. Find the size of the Augmented Dickey-Fuller test conducted on a particular ARMA(2,2) process.

Ensure you set a random seed for this question.

In Class 4, when testing for an AR unit root in the monthly time series of the U.S. civilian unemployment rate, you found that the ADF test rejected the null hypothesis despite strong indications from the correlogram and the KPSS test that the series had been generated by a unit root process.

Select the best-fitting ARMA(p, q) process by AIC and BIC on the first difference of the unemployment series, with maximum $p + q = 5$. You will find that both criteria select ARMA(2,2) for this series, and that the fitted model is given by[1]:

$$\Delta y_t = \tilde{c} + \tilde{\phi}_1 \Delta y_{t-1} + \tilde{\phi}_2 \Delta y_{t-2} + \epsilon_t + \tilde{\theta}_1 \epsilon_{t-1} + \tilde{\theta}_2 \epsilon_{t-2}; \qquad \epsilon_t \sim iin(0, \tilde{\sigma}^2) \qquad (6)$$

with the maximum likelihood estimates:

$$\tilde{c} = 5.527 \times 10^{-6}; \quad \tilde{\phi}_1 = 1.587; \quad \tilde{\phi}_2 = -0.6914; \quad \tilde{\theta}_1 = -1.525; \quad \tilde{\theta}_2 = 0.7327; \quad \tilde{\sigma}^2 = 0.03204 \qquad (7)$$

where $y_t$ denotes the unemployment series in levels.

I would like you to find out whether the apparently incorrect rejection of the null by the ADF test on the U.S. unemployment series was simply a case of Type I error, or the result of a size distortion that arises when a series contains significant MA components, as reported in the literature by Schwert (1989)[2] amongst others.

Simulate $r$ time series from the ARMA(2,2) process of (6) with the parameter values given in (7). The U.S. unemployment rate series for which the ADF test wrongly rejected the null had 817 observations, so, for comparable results, the simulated series should have 800 observations. For each simulated series $\Delta y_t$, do the following:

1. Create the level series $y_t$, $t = 1, 2, \cdots, 800$.
2. Conduct the ADF test on $y_t$ with 5 lags, Case 2, as $y_t$ would have no obvious trending behaviour given that the drift term $\tilde{c}$ is practically zero. The test equation for this is given by:

$$\Delta y_t = c + \rho y_{t-1} + \sum_{i=1}^{5} \pi_i \Delta y_{t-i} + \epsilon_t \qquad (8)$$

3. Save the ADF test statistic:

$$t = \frac{\hat{\rho}}{s.e.(\hat{\rho})} \qquad (9)$$

where $\hat{\rho}$ denotes the least squares estimate of $\rho$ and $s.e.(\hat{\rho})$ its standard error.

Report the proportion of the $r$ ADF tests in which the null hypothesis is rejected at the 5% and 1% significance levels.

Discuss the implications of your results.

[1] Recall that the arma function defines the MA terms with positive coefficients.
[2] Schwert, G.W. (1989) "Tests for Unit Roots: A Monte Carlo Investigation", Journal of Business and Economic Statistics, 7(2), pp. 147-159.
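Steps 1 to 3 can be sketched as follows. Again, this is an illustrative outline rather than a model answer (the submitted script must be in R): the seed and the deliberately small number of replications are arbitrary, the test equation (8) is estimated by writing out the OLS normal equations by hand, and only the 5% case is shown, using the asymptotic Case 2 critical value of roughly -2.86 (the 1% case is analogous with roughly -3.43).

```python
import random
from itertools import accumulate

def simulate_arma22(n, c, phi1, phi2, th1, th2, sd, rng):
    # ARMA(2,2) for the differences, eq. (6); MA signs follow the arma convention
    dy, d1, d2, e1, e2 = [], 0.0, 0.0, 0.0, 0.0
    for _ in range(n):
        e = rng.gauss(0.0, sd)
        d = c + phi1 * d1 + phi2 * d2 + e + th1 * e1 + th2 * e2
        d2, d1, e2, e1 = d1, d, e1, e
        dy.append(d)
    return dy

def solve(A, b):
    # Gaussian elimination with partial pivoting, for the OLS normal equations
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[piv] = M[piv], M[col]
        for i in range(col + 1, n):
            f = M[i][col] / M[col][col]
            for k in range(col, n + 1):
                M[i][k] -= f * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

def adf_tstat(levels, diffs, lags=5):
    # test equation (8) by OLS; returns the t-ratio of rho-hat, eq. (9)
    usable = range(lags, len(diffs))   # first 5 observations are lost
    rows = [[1.0, levels[t - 1]] + [diffs[t - i] for i in range(1, lags + 1)]
            for t in usable]
    dep = [diffs[t] for t in usable]
    k = lags + 2
    XtX = [[sum(row[i] * row[j] for row in rows) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * d for row, d in zip(rows, dep)) for i in range(k)]
    beta = solve(XtX, Xty)
    resid = [d - sum(b * x for b, x in zip(beta, row)) for row, d in zip(rows, dep)]
    s2 = sum(u * u for u in resid) / (len(rows) - k)
    inv_col = solve(XtX, [1.0 if i == 1 else 0.0 for i in range(k)])  # 2nd column of (X'X)^-1
    return beta[1] / (s2 * inv_col[1]) ** 0.5

rng = random.Random(7)            # fixed seed, as the question requires
r = 50                            # deliberately small to keep the sketch fast
rejections_5 = 0
for _ in range(r):
    dy = simulate_arma22(800, 5.527e-06, 1.587, -0.6914, -1.525, 0.7327,
                         0.03204 ** 0.5, rng)
    y = list(accumulate(dy))      # step 1: build the level series
    if adf_tstat(y, dy) < -2.86:  # Case 2, asymptotic 5% critical value
        rejections_5 += 1
print("rejection frequency at 5%:", rejections_5 / r)
```

In your own script, use a much larger $r$ and report both significance levels; the point of the exercise is to compare the resulting rejection frequencies with the nominal sizes.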
The ADF test involves nothing more than estimating the test equation by OLS and saving the test statistic for one of the estimated parameters. As with any linear model, the ADF test equation of (8) can be written as the standard multiple regression model we encountered in Chapter 3 of Econometrics I:

$$y = X\beta + \epsilon \qquad (10)$$

where $y$, $X$, $\beta$, and $\epsilon$ are defined as:

$$
y_{(T \times 1)} = \begin{bmatrix} \Delta y_1 \\ \Delta y_2 \\ \Delta y_3 \\ \vdots \\ \Delta y_{T-1} \\ \Delta y_T \end{bmatrix}; \quad
X_{(T \times 7)} = \begin{bmatrix}
1 & y_0 & \Delta y_0 & \Delta y_{-1} & \Delta y_{-2} & \Delta y_{-3} & \Delta y_{-4} \\
1 & y_1 & \Delta y_1 & \Delta y_0 & \Delta y_{-1} & \Delta y_{-2} & \Delta y_{-3} \\
1 & y_2 & \Delta y_2 & \Delta y_1 & \Delta y_0 & \Delta y_{-1} & \Delta y_{-2} \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
1 & y_{T-2} & \Delta y_{T-2} & \Delta y_{T-3} & \Delta y_{T-4} & \Delta y_{T-5} & \Delta y_{T-6} \\
1 & y_{T-1} & \Delta y_{T-1} & \Delta y_{T-2} & \Delta y_{T-3} & \Delta y_{T-4} & \Delta y_{T-5}
\end{bmatrix}; \quad
\beta_{(7 \times 1)} = \begin{bmatrix} c \\ \rho \\ \pi_1 \\ \pi_2 \\ \pi_3 \\ \pi_4 \\ \pi_5 \end{bmatrix}; \quad
\epsilon_{(T \times 1)} = \begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \epsilon_3 \\ \vdots \\ \epsilon_{T-1} \\ \epsilon_T \end{bmatrix} \qquad (11)
$$

Writing out the regression model of (10) with (11) indeed gives the ADF test equation of (8). Here, $T = 800$. However, the vectors and matrices written out in (11) make it clear that the usable observations in practice are those in rows 6 to 800, since data on $y_0$ and on $\Delta y_{-4}, \Delta y_{-3}, \cdots, \Delta y_0$ do not exist. You lose the first five observations as a result of having to regress on the past five values. Therefore, in practice, when estimating the ADF test equation of (8) on 800 simulated data points, your vector of dependent variables $y$ will be of dimension 795 × 1, and your regressor matrix $X$ will be 795 × 7.

3. Compare the mid-horizon out-of-sample predictive performance of two ARMA(p, q) models estimated on the growth rate of U.S. industrial production.

Let $y_t$ denote the proportional growth rate of the monthly U.S. industrial production index series you have encountered before. We saw in Class 3 that AIC selected ARMA(4,1) and BIC selected ARMA(1,1). We discussed the difference between in-sample and out-of-sample predictions in the same class.
Recall that in-sample predictions were none other than fitted values, whereas out-of-sample predictions, for the purposes of evaluating the relative predictive performance of competing models, required successive estimation of the models on different sub-samples of the data. You learnt how to compute out-of-sample predictions for $h = 1$, where $h$ is the forecast horizon.

For this topic, I would like you to use the same data and evaluate the two models' predictive performance by computing predictions $\hat{y}_{t+h|t}$ for $h = 2, 3, \cdots, 6$. These are mid-horizon predictions up to and including six months ahead, using information available at time $t$ only; therefore:

$$\hat{y}_{t+2|t} = E[y_{t+2} \mid I_t] \qquad (12)$$
$$\hat{y}_{t+3|t} = E[y_{t+3} \mid I_t] \qquad (13)$$
$$\vdots$$
$$\hat{y}_{t+6|t} = E[y_{t+6} \mid I_t] \qquad (14)$$

where $I_t$ denotes all information available up to and including time $t$. The predictive time period is the last 6 years of the sample, between January 2013 and December 2018 inclusive.

Take ARMA(1,1) for example:

$$y_t = c + \phi y_{t-1} + \epsilon_t + \theta \epsilon_{t-1} \qquad (15)$$

and consider the one-step-ahead out-of-sample prediction ($h = 1$) for the first predictive time period, January 2013. The prediction you want to compute is given by:

$$\hat{y}_{13M1|12M12} = E[y_{13M1} \mid I_{12M12}] \qquad (16)$$

where 13M1 and 12M12 refer to January 2013 and December 2012, respectively. As we are predicting with the knowledge available at December 2012, the parameter estimates and the residual series used to compute (16) are those obtained from estimating an ARMA(1,1) on the sub-sample of data up to and including December 2012. Let these estimates be denoted by $\tilde{c}^{(1)}$, $\tilde{\phi}^{(1)}$, and $\tilde{\theta}^{(1)}$, where the superscript denotes that these are parameters obtained from the first sub-sample. Also, let $\tilde{\epsilon}^{(1)}_t$ denote today's residual from the same sub-sample. We saw in Class 3 that the predictive formula for $h = 1$ was given by:

$$\hat{y}_{13M1|12M12} = \tilde{c}^{(1)} + \tilde{\phi}^{(1)} y_t + \tilde{\theta}^{(1)} \tilde{\epsilon}^{(1)}_t \qquad (17)$$

The difference between (17) and the actual observation $y_{13M1}$ is the forecast error.

Consider predicting two time periods ahead, for (12), still standing at December 2012. Note that the actual observation and the forecasting formula for $h = 2$ are given by:

$$y_{t+2} = c + \phi y_{t+1} + \epsilon_{t+2} + \theta \epsilon_{t+1}, \qquad (18)$$
$$\hat{y}_{t+2|t} = c + \phi \hat{y}_{t+1|t}, \qquad (19)$$

since all terms on the right-hand side of (18) are in the future.

For predicting February 2013 standing at December 2012, the sub-sample remains the same as the one we have just used for the one-step-ahead prediction. From the prediction formula of (19), $\hat{y}_{13M2|12M12}$ is given by:

$$\hat{y}_{13M2|12M12} = \tilde{c}^{(1)} + \tilde{\phi}^{(1)} \hat{y}_{13M1|12M12} \qquad (20)$$

You need to substitute the prediction computed in (17) into $\hat{y}_{13M1|12M12}$ of (20) above. Continue predicting up to and including six months ahead, saving the prediction error each time. Do the same for the other competitor model, ARMA(4,1). Note that you will need to work out the prediction formulae for this process. Refer to Chapter 4, Section 2 of my lecture notes for the forecasting formulae for ARMA(p, q) processes.

Compute two sets of statistics on root mean square error (RMSE): one from ARMA(1,1) and another from ARMA(4,1). Discuss what these indicate regarding the predictive behaviour of the two candidate ARMA models.
4. Compare the in-sample predictive performance of AR(2) with a dummy variable and VAR(2) with $n = 3$.

The file "EC2019ProjQ4.csv" contains three quarterly time series on the following variables:

1. Real GDP index.
2. Three-month Treasury bill rate, a measure of the short-term nominal interest rate.
3. Inflation rate.

All data are for the U.K., between 1981Q1 and 2016Q4. The first series is the one you have been using in computer classes. The interest rate and inflation rate are recorded in percentages.

We already know that the natural log of the real GDP series (let this be denoted by $y_{1t}$) has been generated by a unit root process. We also know from conducting the Box-Jenkins model selection procedure that AR(2) adequately captures all the patterns in its first difference, $\Delta y_{1t}$.

4.1 Test for a unit root in the other two series. Discuss your results. If you find a unit root in one or both of them, reduce them to stationarity.

4.2 A plot of $\Delta y_{1t}$ reveals a negative outlier in 2008Q4 due to the financial crisis of 2008: real GDP decreased by 2.28% in a single quarter. Outliers such as this one disproportionately affect least squares estimation, since OLS minimises the squares of residuals.

One method of dealing with outliers is to include a dummy variable in the model. Consider a dummy variable which is 1 in 2008Q4 and zero everywhere else. Let this be denoted by $D08Q4$. Inserting this dummy into any univariate model allows the observation for 2008Q4 alone to have a different intercept (recall what dummy variables do from Econometrics I), thus absorbing the sudden reduction in the series. Estimate the following model on $\Delta y_{1t}$:

$$\Delta y_{1t} = c + \beta D08Q4 + \phi_1 \Delta y_{1,t-1} + \phi_2 \Delta y_{1,t-2} + \epsilon_t; \qquad t = 1, 2, \cdots, T \qquad (21)$$

where $\beta$ is the parameter on the regressor $D08Q4$. The model of (21) is none other than an AR(2) on first differences with an exogenous variable, the dummy.

Estimate (21) using the entire sample, 143 observations. Using the parameter estimates obtained, compute in-sample predictions for the level $y_{1t}$ for the last 7 years of the sample: the predictive time period is between 2010Q1 and 2016Q4 inclusive, giving 28 quarters. Obtain the sequence of forecast errors:

$$e_{1t,ar} = y_{1t} - \hat{y}^u_{1t} \qquad (22)$$

where $\hat{y}^u_{1t}$ denotes the in-sample prediction from the univariate model of (21). Compute the root mean square error (RMSE).

4.3 Consider the following VAR(2):

$$\mathbf{y}_t = \mathbf{c} + \Phi_1 \mathbf{y}_{t-1} + \Phi_2 \mathbf{y}_{t-2} + \boldsymbol{\epsilon}_t; \qquad t = 1, 2, \cdots, T; \qquad \boldsymbol{\epsilon}_t \sim iid(\mathbf{0}, \Omega) \qquad (23)$$

where

$$
\mathbf{y}_t = \begin{bmatrix} \Delta y_{1t} \\ y_{2t} \\ y_{3t} \end{bmatrix}; \quad
\mathbf{c} = \begin{bmatrix} c_1 \\ c_2 \\ c_3 \end{bmatrix}; \quad
\Phi_i = \begin{bmatrix} \phi_{i,11} & \phi_{i,12} & \phi_{i,13} \\ \phi_{i,21} & \phi_{i,22} & \phi_{i,23} \\ \phi_{i,31} & \phi_{i,32} & \phi_{i,33} \end{bmatrix}; \quad
\boldsymbol{\epsilon}_t = \begin{bmatrix} \epsilon_{1t} \\ \epsilon_{2t} \\ \epsilon_{3t} \end{bmatrix} \qquad (24)
$$

$y_{2t}$ and $y_{3t}$ denote the nominal interest rate and the inflation rate, respectively, reduced to stationarity (if necessary). You need to ensure that all three elements of $\mathbf{y}_t$ are stationary. $\mathbf{c}$ is a 3 × 1 vector of constants, $\Phi_i$, $i = 1, 2$, are 3 × 3 parameter matrices, $\boldsymbol{\epsilon}_t$ is a three-dimensional vector white noise process, and $\Omega$ is the 3 × 3 variance-covariance matrix of $\boldsymbol{\epsilon}_t$.

Estimate (23) using the entire sample. Do the past interest rate and inflation rate determine the current real GDP growth rate?

Using the parameter estimates obtained by fitting the VAR(2), compute in-sample predictions, again for the last 7 years of the sample. Compute the sequence of forecast errors:

$$e_{1t,var} = y_{1t} - \hat{y}^v_{1t} \qquad (25)$$

where $\hat{y}^v_{1t}$ denotes the prediction of $y_{1t}$ computed from the VAR(2) of (23). Obtain the root mean square error. Discuss which of the two models, (21) and (23), has the better in-sample predictive performance and why.
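The error and RMSE computations of (22) and (25) can be sketched as below. This is an illustrative outline only (the project must be in R); the helper `level_prediction` and all numbers in the example calls are hypothetical, and it assumes one natural way of forming the in-sample level prediction: the predicted growth rate from (21) is added back onto the previous quarter's actual (log) level.

```python
def rmse(actual, predicted):
    # root mean square error over the predictive period, as in (22) and (25)
    errors = [a - p for a, p in zip(actual, predicted)]
    return (sum(e * e for e in errors) / len(errors)) ** 0.5

def level_prediction(y_prev, c, beta, d, phi1, phi2, dy_lag1, dy_lag2):
    # one in-sample prediction of the level series from model (21):
    # predicted growth added back onto last quarter's actual (log) level
    dy_hat = c + beta * d + phi1 * dy_lag1 + phi2 * dy_lag2
    return y_prev + dy_hat

# hypothetical numbers purely for illustration
print(level_prediction(y_prev=100.0, c=0.002, beta=-0.02, d=0.0,
                       phi1=0.3, phi2=0.1, dy_lag1=0.01, dy_lag2=0.005))
print(rmse([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))
```

The same `rmse` applies to both (21) and (23): collect the 28 quarterly forecast errors from each model over 2010Q1 to 2016Q4 and compare the two resulting statistics.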