BAYESIAN ESTIMATION OF SIMULTANEOUS EQUATION MODEL WITH LAGGED ENDOGENOUS VARIABLES AND FIRST ORDER SERIALLY CORRELATED ERRORS

Most simultaneous equation models include lagged endogenous and/or exogenous variables, and it may be misleading to assume that the errors are normally distributed when in practice they follow distributions that are not normal. The classical methods of estimating the parameters of simultaneous equation models are usually affected by the presence of autocorrelation among the error terms. Unfortunately, in practice the form of correlation between the pairs of random deviates is unknown. In this paper, classical and Bayesian methods for the estimation of a simultaneous equation model with lagged endogenous variables and first order serially correlated errors are considered. The small sample properties of the methods are compared at correlation levels ρ = 0.2, 0.5 and 0.8. The Bayesian estimator produced better parameter estimates, with smaller standard errors, than the classical method. The standard deviations of the Bayesian estimator are consistently better than those of the OLS estimator for the sample sizes considered. For example, the standard deviations of the Bayesian estimator for b (the coefficient of the lagged endogenous variable, y) when ρ = 0.2 at N = 10, 15, 20 and 25 were 0.07712781, 0.05433923, 0.03230012 and 0.03177252 respectively, while those of OLS were 0.0784732, 0.4718914, 0.05701936 and 1.31422868. However, when ρ = 0.8, the standard deviations were 0.0548055, 0.03860254, 0.02572899 and 0.02126175 for the Bayesian estimator and 0.0562190, 0.03882345, 0.053676 and 0.0315632 for OLS. Notably, even at a high correlation level, the estimates produced by the Bayesian method are closer to the parameter values, and the standard deviations decrease as the sample size increases.
Hence, the Bayesian estimation method might be a better choice when lagged endogenous variables are included in a simultaneous equation model with autocorrelated disturbances, since it appeared to give better results than the classical approach.


INTRODUCTION
The simultaneous equations model (SEM) is a very important topic in econometrics. Haavelmo (1943) presented some important statistical implications of a linear SEM, notably that the stochastic equations should not be estimated separately; the restrictions imposed upon the same variables by other equations ought to be taken into consideration. A simultaneous equations model can be under-identified, just-identified or over-identified, depending on how each parameter of the model uniquely contributes to the endogenous variable. The just-identified model, in which the equations are exactly identified, is considered in this work. The indirect least squares method, two-stage least squares method, k-class estimators, three-stage least squares method, full information maximum likelihood method, the Jackknife instrumental variable method due to Angrist et al. (1999), and the Lancaster (2004) and Blomquist and Dahlberg (1999) methods are the well-known classical inferential approaches that have been in use. They are mostly extensions of the two basic single-equation techniques, the ordinary least squares and maximum likelihood estimators. The 'true' model structure is assumed unknown and is estimated. However, Dreze (1962) argued that such classical inference has a shortcoming in that the available information on parameters is ignored; for instance, it is known that the marginal propensity to consume lies in the unit interval, information that could be used. Bayesian inference, by contrast, combines prior information on the parameter of interest with the likelihood function to give the posterior distribution. The posterior distribution thus provides updated information on the parameter(s) under study (O'Hagan (1994), Press (1969) and Koop (2003)).
Fair (1970) worked on various classical methods for estimating simultaneous equation models with lagged endogenous variables and first order serially correlated errors. His methods differed in the number of instrumental variables used. The asymptotic and small sample properties of the estimators were derived to ensure consistent estimates. Kelejian and Prucha (2004) developed estimation theory for a simultaneous system of spatially interrelated cross-sectional equations, an extension of the widely used single equation model of Cliff and Ord (1973, 1981). In modeling the disturbance process, they allowed for both spatial correlation and correlation across equations. They suggested computationally simple limited and full information instrumental variable estimators for the parameters of the system and gave formal large sample results. They introduced both a limited information estimator, termed the FGS2SLS estimator, and a full information estimator, termed the FGS3SLS estimator, and derived their asymptotic properties. The estimators are based on an approximation of the optimal instruments and as a result they are computationally simple even in large samples. Fingleton and Le Gallo (2008) worked on estimation methods for models including an endogenous spatial lag, additional endogenous variables due to system feedback and an autoregressive or a moving average error process. They extended the feasible generalized spatial two stage least squares estimators of Kelejian and Prucha (1998) and Fingleton and Le Gallo (2006), and also considered HAC estimation in a spatial framework as suggested by Kelejian and Prucha (1999). An empirical example using real estate data was used to illustrate the different estimators. The finite sample properties of the estimators were finally investigated by means of Monte Carlo simulation.
Olubusoye and Okewole (2014) considered a two-equation model containing a combination of just-identified and over-identified equations. A Bayesian analysis of the multi-equation econometric model as well as a Monte Carlo study was carried out using WinBUGS (the Windows version of the software Bayesian inference Using Gibbs Sampling). Three different prior variances, 10, 100 and 1000, were specified to assess the sensitivity to the prior variance specification. The result of the Monte Carlo study showed that a prior variance of 10 gave the smallest mean squared error. The kernel density plots also showed that the distribution of the posterior estimates from the prior variance of 10 was the closest to the distribution obtained theoretically. Adepoju and Idowu (2015) considered a case in which lagged endogenous variables were included among the predetermined variables to investigate the effects of lag inclusion on three simultaneous equation estimators. Six sample sizes, 20, 30, 40, 100, 500 and 1000, each replicated 1000 times, were simulated using the Monte Carlo method. The estimation techniques considered were Ordinary Least Squares (OLS), Two-Stage Least Squares (2SLS) and Three-Stage Least Squares (3SLS). The estimators were then evaluated using total absolute bias, standard error, variance and root mean square error of the estimates. The results showed that OLS provided the best estimates for all the cases considered, followed by 2SLS. The presence of lagged variables means that the OLS estimator is not a linear unbiased estimator because of its inconsistency, and as a result an instrumental variable estimator might be used instead. Hence, 2SLS will be the best estimator to use. In assessing the robustness of the estimators, two major criteria were employed, namely the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (or Schwarz Bayesian Criterion (SBC)). 2SLS performed better than 3SLS on these criteria, making it a more robust estimator. Finally, all the estimators revealed a remarkable asymptotic pattern.
A comparative study of the classical and the Bayesian approaches is thus necessary so as to take advantage of their strengths and to investigate possible ways of improving on their weaknesses. The need to carry out valid, generally acceptable, appropriate and convenient estimation of the SEM has brought about quite a number of studies on the classical and the Bayesian procedures. Gao and Lahiri (2001) focused on weak instruments: in cases with very weak instruments, no estimator was superior to another, while in the case of weak endogeneity, Zellner's MELO (minimum expected loss), a Bayesian procedure, was the best. Their results showed that under certain scenarios (see Gao and Lahiri, 2001), of all the estimators, the BMOM (Bayesian method of moments) performed best. However, the Jackknife instrumental variable estimator, a classical procedure due to Angrist, Imbens and Krueger (1999) and Blomquist and Dahlberg (1999), performed poorly throughout. These studies reflect some Bayesian estimation methods for SEM: the BMOM proposed by Zellner, and the methods used by Chao and Phillips (1998), Geweke (1989, 1996), Geisser (1965) and Kleibergen and Zivot (2003). Other related works on simultaneous equation problems are Zellner (1971, 1979, 1997a, 1997b), Nagar (1959, 1969), Li and Poirier (2003), Poirier (1995), Kleibergen and Van Dijk (1998, 2002) and Kloek and Van Dijk (1978).
The major objective of this study is to compare the asymptotic behaviours of classical and Bayesian estimators at different levels of first order serially correlated errors. The rest of the paper is structured as follows: section 2 specifies the model considered in this study, section 3 discusses the classical and the Bayesian methodologies, section 4 describes the data generation process using the Monte Carlo method, section 5 presents the results of the experiment, and section 6 concludes the paper.

THE MODEL
The model in (1) is a just-identified model where $y_{1t}$ and $y_{2t}$ are observations on two endogenous variables, and $y_{1t-1}$ and $y_{2t-1}$ are the lagged values of $y_{1t}$ and $y_{2t}$ respectively. In the matrix form of model (1), the observations on the two endogenous variables form a $2 \times n$ matrix, $\Gamma$ is a $5 \times 2$ matrix of coefficients for the endogenous variables, $X$ is a $1 \times n$ vector of observations on the predetermined variables, $B$ is a $2 \times 2$ matrix of coefficients for the predetermined variables, and $U = (U_1, U_2)$ is a $2 \times n$ matrix of random disturbance terms, where $n$ is the number of observations.
To obtain the reduced form of (1), consider equations (5) and (6), which can be written in linear form.

The two equations (7) and (8) are used to determine the values of the endogenous variables at each point in terms of the predetermined variables and the structural disturbance terms.

2.0 Classical and Bayesian Methodology

Classical Methodology
Consider a model in which $y^{*} = Py$, $X^{*} = PX$ and $\varepsilon^{*} = P\varepsilon$. Since $\Sigma$ is a positive definite matrix, it follows that there exists an $N \times N$ matrix $P$ such that $P \Sigma P^{\prime} = I_N$. The Ordinary Least Squares estimates are obtained by differentiating the sum of squared residuals with respect to $\beta$ and equating the derivative to zero, which gives the estimate of $\beta$.
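The steps described above can be written out explicitly. The following is a sketch of the standard textbook derivation, assuming the underlying model is the linear regression $y = X\beta + \varepsilon$; the symbol $S(\beta)$ for the sum of squared residuals is introduced here for illustration only.

```latex
\begin{align*}
y^{*} &= X^{*}\beta + \varepsilon^{*},
  \qquad y^{*} = Py,\quad X^{*} = PX,\quad \varepsilon^{*} = P\varepsilon, \\
S(\beta) &= (y^{*} - X^{*}\beta)^{\prime}(y^{*} - X^{*}\beta), \\
\frac{\partial S(\beta)}{\partial \beta}
  &= -2X^{*\prime}y^{*} + 2X^{*\prime}X^{*}\beta, \\
\hat{\beta} &= (X^{*\prime}X^{*})^{-1}X^{*\prime}y^{*}.
\end{align*}
```

The last line follows from setting the derivative to zero and requires $X^{*\prime}X^{*}$ to be invertible.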

The Likelihood Function
Using the definition of the multivariate normal density, the likelihood function for the model can be written in terms of the following quantities: $v = N - k$, the degrees of freedom; $\hat{\beta} = (X^{*\prime}X^{*})^{-1}X^{*\prime}y^{*}$, the ordinary least squares estimator; and the variance of the error (mean square error).

The Priors for Normal Linear Regression Models (NLRM)
Since the form of the likelihood function determines the structure of the prior that allows easy interpretation and computation, the natural conjugate Normal-Gamma prior density is used. Thus, if we elicit a prior for $\beta$ conditional on $h$ of the form
$\beta \mid h \sim N(\beta^{*}, V^{*})$ … (13)
and a prior for $h$ of the form
$h \sim G(s^{*-2}, v^{*})$ … (14)
then the joint prior for the two parameters is of Normal-Gamma form.
ADEPOJU ADEDAYO A., ALABA OLUWAYEMISI O. AND OGUNDUNMADE TAYO P.

Generation of Monte Carlo Data
The main task is the generation of the stochastic dependent (endogenous) variables $Y_{1t}$ and $Y_{2t}$, which are subsequently used in estimating the parameters of the model.

4.1 Generation of Random Disturbance Terms, U
This is the method of generating the random disturbance terms $U_{1t}$ and $U_{2t}$.
Recall that

4.1.1 Generation of Random Disturbance Terms, V
This is the method of generating the random disturbance terms $V_{1t}$ and $V_{2t}$. $\Sigma$ is the variance-covariance matrix $(V_t V_t^{\prime})$, decomposed into a non-singular upper triangular matrix $P_1$ and a lower triangular matrix. Let
$P_1 = \begin{pmatrix} \rho_{11} & \rho_{12} \\ 0 & \rho_{22} \end{pmatrix}$
be the upper triangular matrix. Solving the resulting four equations gives
$\rho_{11} = 1.802776$, $\rho_{12} = 0.8660254$, $\rho_{21} = 0.8660254$, $\rho_{22} = 1.732051$.
Thus, the pair of standard deviates can be transformed into a pair of random normal variables. Making use of the data generated, the parameter estimates are obtained using the classical and the Bayesian methods. The following criteria are used in comparing the two estimation methods: mean and mean squared error (MSE) of the estimates.
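The transformation of standard deviates described above can be illustrated with a short Python sketch. It takes the reported entries of $P_1$, computes the covariance matrix they imply, and maps independent $N(0,1)$ deviates $z$ into correlated disturbances. The transform $v = P_1 z$, the function names, the seed and the number of draws are assumptions made for illustration, since the exact transformation formula is not shown in the text.

```python
import random

# Entries of the upper triangular factor P1 as reported in the text.
P1 = [[1.802776, 0.8660254],
      [0.0, 1.732051]]


def implied_covariance(p):
    """Return the 2 x 2 product P P' for P given as nested lists."""
    return [[sum(p[i][k] * p[j][k] for k in range(2)) for j in range(2)]
            for i in range(2)]


def correlated_pair(p, rng):
    """Map two independent N(0, 1) deviates z to v = P z (assumed transform)."""
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return (p[0][0] * z1 + p[0][1] * z2,
            p[1][0] * z1 + p[1][1] * z2)


def sample_covariance(pairs):
    """Unbiased sample covariance matrix of a list of (v1, v2) pairs."""
    n = len(pairs)
    m1 = sum(a for a, _ in pairs) / n
    m2 = sum(b for _, b in pairs) / n
    s11 = sum((a - m1) ** 2 for a, _ in pairs) / (n - 1)
    s22 = sum((b - m2) ** 2 for _, b in pairs) / (n - 1)
    s12 = sum((a - m1) * (b - m2) for a, b in pairs) / (n - 1)
    return [[s11, s12], [s12, s22]]


# Covariance implied by the reported factor: approximately [[4, 1.5], [1.5, 3]].
sigma = implied_covariance(P1)

rng = random.Random(12345)  # arbitrary seed, for reproducibility only
draws = [correlated_pair(P1, rng) for _ in range(20000)]
sigma_hat = sample_covariance(draws)

print(sigma)
print(sigma_hat)
```

With many draws, the sample covariance of the transformed pairs should be close to the implied matrix, which gives a quick check that the reported factor entries are mutually consistent.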

Generation of Endogenous Variables
With numerical values already assigned to the structural parameters, we have all the values required for the generation of the endogenous variables. Consider the AR(1) process for the disturbance terms $U_{1t}$ and $U_{2t}$. Solving for $y_{1t}$ and $y_{2t}$ in equations (7) and (8), after evaluating the reduced-form parameters from the assumed structural parameters, gives the generating equations, in which $e_{1t}$ and $e_{2t}$ are generated from $N(0,1)$. The disturbance terms $U_{1t}$ and $U_{2t}$ are generated recursively using the AR(1) model.
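The recursive generation of the AR(1) disturbances can be sketched in Python as follows. The sketch assumes the usual AR(1) form $u_t = \rho u_{t-1} + e_t$ with $e_t \sim N(0,1)$; the starting value $u_0 = 0$, the function name `generate_ar1` and the seed are hypothetical choices that are not taken from the study.

```python
import random


def generate_ar1(rho, n, rng, u0=0.0):
    """Generate u_1..u_n from u_t = rho * u_{t-1} + e_t with e_t ~ N(0, 1).

    Returns the disturbances and the innovations that produced them.
    The starting value u0 is an assumption of this sketch.
    """
    u, e = [], []
    prev = u0
    for _ in range(n):
        shock = rng.gauss(0.0, 1.0)
        prev = rho * prev + shock
        e.append(shock)
        u.append(prev)
    return u, e


rng = random.Random(2024)  # arbitrary seed, for reproducibility only

# One pair of disturbance series for each correlation level and sample size
# considered in the study.
series = {}
for rho in (0.2, 0.5, 0.8):
    for n in (10, 15, 20, 25):
        u1, e1 = generate_ar1(rho, n, rng)
        u2, e2 = generate_ar1(rho, n, rng)
        series[(rho, n)] = (u1, u2)

print(series[(0.8, 10)][0])
```

In a full replication these series would be substituted into the reduced-form equations to produce the endogenous variables; that step is omitted here because it depends on the model equations.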

Estimation of the structural parameters
Having generated the values of the exogenous and endogenous variables as well as the disturbance terms, the next step is the estimation of the structural parameters at the 0.2, 0.5 and 0.8 correlation levels. The structural parameter estimates are obtained using the statistical software R (version 3.3.3).

Criteria for evaluating the performance of the estimators
For easy comparison and evaluation of the performance of the estimators, the following criteria are used.

(i) Mean values of the parameter estimates.
(ii) Standard deviation.
The estimation methods used in this study are OLS (the classical method) and the Bayesian approach.
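The two criteria listed above can be computed from a set of replicated estimates as in the minimal Python sketch below. The replicate values are hypothetical numbers chosen only to show the calculation and are not results from this study; the use of the sample (n - 1) standard deviation is also an assumption.

```python
import statistics


def summarize(estimates):
    """Return the mean and the sample standard deviation of the estimates."""
    return statistics.fmean(estimates), statistics.stdev(estimates)


# Hypothetical replicated estimates of a single coefficient
# (illustrative numbers only, not taken from the study).
replicates = [0.48, 0.52, 0.50, 0.47, 0.53]

mean_est, sd_est = summarize(replicates)
print(mean_est, sd_est)
```

Applying the same summary to the OLS and Bayesian estimates of each coefficient, at each sample size and correlation level, gives the kind of entries reported in the tables.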

Result and Discussion
The summary of point estimates presented in Tables 1-3 reflects some properties of the two estimation methods under discussion. Interestingly, the results of OLS and Bayesian inference are not too different. Table 1 gives the estimates of the two estimators for $\rho = 0.2$ and sample sizes 10, 15, 20 and 25. In general, the standard deviations of the parameter estimates decrease as the sample size increases. However, the Bayesian method performs better than OLS since it produces smaller standard deviations. The observations in Table 1 are similar to those recorded in Table 2 for $\rho = 0.5$; the standard deviations are also smaller for both estimators. Interestingly, the parameter estimates and the standard deviations of the two estimators are closer. In Table 3, for $\rho = 0.8$, the estimates are much closer to the parameter values than in the other cases considered. The standard deviations are again better for the Bayesian than for the OLS estimator, but closer. Clearly, as the sample size increases, the standard errors of the estimators decrease. In Table 3, where the rho of the error term ($e_t$) was raised to 0.8, the estimates from the two methods were more concentrated around the class containing the true value than in Tables 1 and 2, where the rho values were 0.2 and 0.5 respectively. This is an indication that the distribution of the exogenous variables also affects the properties of the estimators. We noticed that in Tables 1 and 2, the standard deviation was questionably large for the classical method when N = 20; this is as a result of the characteristic of the Bayesian method.
Table 1: Estimates and standard deviations (in parentheses) of the estimators for

DISCUSSION
Tables 1-3 presented above reflect some properties of the two estimation methods under discussion. For the classical method, only OLS was considered because the classical estimators give the same estimates for this model, it being a just-identified model. For all the sample sizes considered under the different correlation levels, the Bayesian estimates performed better than the classical estimates, mostly in the small sample cases.
The performance of the two estimators was considered for a two-equation simultaneous model with lagged endogenous variables and autocorrelated errors. This study is similar to the work of Fair (1970). However, Fair used classical approaches with different instrumental variables and derived the asymptotic and small sample properties of the estimators without estimating the parameters of the model, whereas in this work the Bayesian approach is used to estimate the parameters of the simultaneous equation model with lagged endogenous variables and first order serially correlated error terms, and the results are compared with the classical method.
The parameters of equations 1 and 2 are assumed to be $\alpha_{12} = 0$. The autocorrelated error was set at rho = 0.2, 0.5 and 0.8 (low, moderate and high correlation levels). The distributions of the estimates were close for the two estimation methods. As expected, as the sample size increases, the standard error of the estimators reduces. At rho = 0.2, the estimates from the Bayesian method were more concentrated around the true value than those from the classical method considered; this is an indication that the lag of the exogenous variables also affects the properties of the estimators. It is noticed that at rho = 0.5, which implies a moderate autocorrelation level, the standard deviations are questionably large for the classical method when N = 20 and N = 25; this is a result of lagged endogenous variables that are uncharacteristic of the Bayesian method.

CONCLUSION
The estimation of simultaneous equation models in econometric research should be approached with care. The choice of estimation method, as observed in this work, affects the estimates in terms of bias and consistency, especially when dealing with small samples. The Bayesian estimation method has gained a lot of attention recently, which makes practical statistical inference more interesting (Lahiri et al. (2000), Herman and Dijk (2002)). The present study showed that the Bayesian estimation method performs better than the classical method for smaller sample sizes, but it becomes simply impossible to estimate the parameters when the sample size is large. The model in this study is a just-identified model, and the classical estimation method gives estimates relatively close to the assumed parameter values. The Bayesian estimation method, being more easily applied, might be a better choice since it appears to give better results than the classical approach.
$x_{1t}$, $x_{2t}$, $x_{3t}$ are observations on exogenous variables, $v_{1t}$ and $v_{2t}$ are the disturbance terms, and $a_{12}$, $b_{11}$, $b_{12}$, $b_{14}$, $a_{21}$, $b_{22}$, $b_{23}$ and $b_{25}$ are scalar parameters.