Regression with Dummy Explanatory Variables: Some Methodological Issues

Peer Reviewed
1 January 2013

Tanzania Economic Review

The study considers the inclusion of dummy explanatory variables in regression estimation for both cross-section and time series data. In survey data, a continuous variable may be categorized into a dummy variable whenever the quality, reliability and internal consistency of the continuous data are in question. In the process of categorization, vital information may be compromised; in other words, there is a loss of information. Furthermore, an arbitrary choice of cutoff point may yield different regression estimates depending on the cutoff chosen. To illustrate this, we used the latest Tanzania DHS dataset to regress the length of the inter-birth interval on the length of breast-feeding at different cutoff points. The results suggest that a multiple regression with multiple dummy explanatory variables yields a predictable result with a relatively better fit. The use and misuse of dummy explanatory variables in time series is also summarized by introducing a structural break to estimate the effect of nominal GDP and policy changes on imports. A modeling exercise that took into consideration the theoretical issues and an appropriate empirical implementation yielded a result that was not counterintuitive. In general, when adopting a dummy explanatory variable to represent a continuous variable or to capture a structural break, the formulation, specification and estimation process should be based on sound economic theory and on a scientifically based estimation technique.
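
The sensitivity to the cutoff point can be illustrated with a minimal sketch on synthetic data. Nothing below comes from the Tanzania DHS dataset or from the study's own estimates: the variable names, the gamma-distributed breast-feeding durations, the assumed linear relationship and the cutoff values are all hypothetical and chosen only to show the mechanism. The sketch fits one regression on the continuous variable, one regression per single-cutoff dummy, and one regression with a dummy for each duration band.

```python
import numpy as np


def ols(X, y):
    """Return OLS coefficients and R^2 for a design matrix X that includes an intercept."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1.0 - resid.var() / y.var()
    return beta, r2


# Synthetic data (assumption): skewed breast-feeding durations in months and
# an inter-birth interval that depends linearly on them plus noise.
rng = np.random.default_rng(0)
n = 2000
breastfeed = rng.gamma(shape=3.0, scale=5.0, size=n)
interval = 18.0 + 0.6 * breastfeed + rng.normal(0.0, 6.0, size=n)

ones = np.ones(n)

# 1) Regression on the continuous explanatory variable.
_, r2_cont = ols(np.column_stack([ones, breastfeed]), interval)

# 2) One dummy per regression: 1 if duration >= cutoff, else 0.
cutoffs = [6, 12, 18, 24]  # hypothetical cutoff points, in months
single = {}
for c in cutoffs:
    d = (breastfeed >= c).astype(float)
    beta, r2 = ols(np.column_stack([ones, d]), interval)
    single[c] = (beta[1], r2)

# 3) Multiple dummies: one per duration band, lowest band as the reference.
bands = np.digitize(breastfeed, cutoffs)  # band index 0..len(cutoffs)
D = np.column_stack(
    [(bands == k).astype(float) for k in range(1, len(cutoffs) + 1)]
)
_, r2_multi = ols(np.column_stack([ones, D]), interval)

print(f"continuous variable: R^2 = {r2_cont:.3f}")
for c, (coef, r2) in single.items():
    print(f"single dummy, cutoff {c:>2}: coef = {coef:6.2f}, R^2 = {r2:.3f}")
print(f"multiple dummies:    R^2 = {r2_multi:.3f}")
```

In this sketch the single-dummy coefficient shifts with the cutoff even though the underlying relationship is the same. The multiple-dummy model fits at least as well as any single-dummy model, because each single dummy is a sum of the band dummies.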

Publication | 18 December 2013