Econometrics Toolbox™ includes a self-contained framework for implementing Bayesian linear regression. The framework contains two groups of prior models for the regression coefficients β and the disturbance variance σ²:
Standard Bayesian linear regression prior models — The five prior model objects in this group range from the simple conjugate normal-inverse-gamma prior model through flexible prior models specified by draws from the prior distributions or a custom function. Although standard prior models can serve several purposes, they are best suited for posterior estimation, simulation (from the joint or, for most models, conditional posterior), and forecasting from the posterior predictive distribution.
Bayesian prior models for predictor variable selection — The models in this group can perform Bayesian lasso regression or stochastic search variable selection (SSVS). They are best suited for posterior estimation, during which the predictor selection algorithm occurs. The resulting posterior model is represented by draws from the Gibbs sampler (an empirical model object), and the estimation summary contains the results from the predictor selection algorithm. The estimation procedure does not remove insignificant or redundant variables for you, but in a well-tuned model, their coefficient estimates are close to zero. Therefore, as with standard models, you can simulate draws from the posterior distribution or forecast from the posterior predictive distribution without having to remove any variables.
The general workflow for estimating features of the posterior distributions, and then forecasting responses given new predictor data, is:
1. Use bayeslm to create a prior model object that represents your assumptions about the joint prior distribution, or a prior model object that is suited for predictor selection.
2. Pass the prior model object and the predictor and response data to the estimate function. By default, estimate returns a model object that represents the posterior distribution.
3. Pass the model object representing the posterior distribution to forecast.
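For instance, the following minimal sketch runs all three steps for a conjugate prior. The names X (predictor matrix), y (response vector), and XF (new predictor data) are assumptions for illustration, not part of the toolbox:

p = size(X,2);                                 % Number of predictors
PriorMdl = bayeslm(p,'ModelType','conjugate'); % Step 1: conjugate prior model
PosteriorMdl = estimate(PriorMdl,X,y);         % Step 2: posterior model object
yF = forecast(PosteriorMdl,XF);                % Step 3: posterior predictive forecasts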
For details on the standard Bayesian linear regression workflow, see Workflow for Standard Bayesian Linear Regression Models, and for Bayesian predictor selection, see Workflow for Bayesian Predictor Selection.
The following procedure describes the workflow for estimating features of the posterior distributions, and then forecasting responses given predictor data.
Choose a joint prior distribution for (β,σ²). Then, using bayeslm, create the Bayesian linear regression model object that completely specifies your beliefs about the joint prior distribution. This table contains the available prior model objects.
Model Object | Joint Prior Distribution of (β,σ²) | When to Create
---|---|---
conjugateblm | Given σ², β is Gaussian with mean μ and covariance σ²V. σ² is inverse gamma with shape A and scale B (the normal-inverse-gamma conjugate model). | Create this model object when all of the following are true: you have fairly informative prior knowledge, the prior dependence between β and σ² is acceptable, and you want analytically tractable posteriors and fast estimation.
semiconjugateblm | β is Gaussian with mean μ and covariance V. σ² is inverse gamma with shape A and scale B. β and σ² are independent a priori. | Create this model object when all of the following are true: you have fairly informative prior knowledge, prior independence between β and σ² is justifiable, and you can accept estimation by simulation instead of analytical results.
diffuseblm | The joint prior density is proportional to 1/σ². | Create this model object when all of the following are true: you have little prior knowledge about (β,σ²), and you want the data to dominate the posterior.
customblm | A function handle of a custom function that computes the log of the joint prior distribution. | Create this model object when you want to specify the log of the joint prior distribution. This specification allows for maximum flexibility.
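For instance, a sketch of creating one prior model object of each type for three predictors; the flat log prior passed to the custom model is an arbitrary illustration, not a recommendation:

p = 3;                                               % Number of predictors
PriorConj = bayeslm(p,'ModelType','conjugate');      % conjugateblm object
PriorSemi = bayeslm(p,'ModelType','semiconjugate');  % semiconjugateblm object
PriorDiff = bayeslm(p,'ModelType','diffuse');        % diffuseblm object
logPDF = @(params)0;                                 % Arbitrary flat log joint prior of [beta;sigma2]
PriorCust = bayeslm(p,'ModelType','custom','LogPDF',logPDF);  % customblm object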
Given the data, estimate features of the posterior distributions. The functions to use in this step depend on your analysis goals.
Function | Goals
---|---
estimate | Obtain a model object that represents the posterior distribution for forecasting, and an estimation summary containing posterior means, covariances, and 95% credible intervals of β and σ².
simulate | Draw a sample from the joint posterior distribution (or, for semiconjugate and custom models, from the conditional posteriors), for example, to estimate functions of β and σ².
If you have a custom prior model (customblm object), then choose a Markov chain Monte Carlo (MCMC) sampler when you call estimate or simulate. This table contains a list of supported MCMC samplers. After choosing a sampler, try the default tuning parameter values first.
MCMC Sampler | Specify Using | Description |
---|---|---|
Hamiltonian Monte Carlo (HMC) | 'Sampler',"hmc" | Because the HMC sampler tunes itself, and resulting samples mix well and converge to their stationary distribution more quickly, try this sampler first. To increase sampling speed, supply the gradient of the log PDF for all or some of the parameters. |
Random walk Metropolis | 'Sampler',"metropolis" | If the sample size is reasonably large and the prior does not dominate the likelihood, then try this sampler. Supported proposal distributions are multivariate normal and multivariate t distributions. Tuning parameters include the distribution, its scale matrix, and its degrees of freedom. |
Slice | 'Sampler',"slice" (default) | To achieve adequate mixing and convergence, carefully tune the typical sampling-interval width. Values are application dependent. |
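For instance, assuming a customblm prior model PriorMdl and data X and y are in the workspace, a sketch of selecting the HMC sampler during estimation:

PosteriorMdl = estimate(PriorMdl,X,y,'Sampler',"hmc");  % Estimate by using HMC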
After estimating a nonanalytical posterior by using an MCMC sampler, inspect the posterior or conditional posterior draws for adequate mixing. For more details, see Posterior Estimation and Simulation Diagnostics.
If the quality of the samples is not satisfactory, then create a sampler options structure by using sampleroptions, which allows you to specify tuning parameter values that are appropriate for the sampler. For example, to specify a random walk Metropolis sampler that uses a multivariate t proposal distribution with 5 degrees of freedom, enter:
options = sampleroptions('Sampler',"metropolis",'Distribution',"mvt",...
    'DegreeOfFreedom',5)
Then, pass the sampler options structure to estimate or simulate by using the 'Options' name-value pair argument.
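For instance, a sketch, again assuming that the customblm model PriorMdl and data X and y are in the workspace:

PosteriorMdl = estimate(PriorMdl,X,y,'Options',options);  % Use the tuned Metropolis sampler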
Forecast responses given the new predictor data using forecast. The forecast function constructs forecasts from the posterior predictive distribution. Analytical posterior predictive distributions are available for conjugateblm and diffuseblm prior models. For all other prior models, forecast resorts to Monte Carlo sampling. As with estimation and simulation, you can choose an MCMC sampler for customblm models. If forecast uses an MCMC sampler, you should inspect the posterior or conditional posterior draws for adequate mixing.
The following procedure describes the workflow for performing Bayesian predictor selection for linear regression models, and then forecasting responses given predictor data.
Econometrics Toolbox supports two Bayesian predictor selection algorithms: Bayesian lasso regression and SSVS. Choose a predictor selection algorithm, which implies a joint prior distribution for (β,σ²). Then, using bayeslm, create the Bayesian linear regression prior model object that performs the selected predictor selection algorithm, and optionally specify the tuning parameter value. This table contains the available prior model objects for predictor selection. For details on the forms of the prior distributions, see Posterior Estimation and Inference.
Model Object | Predictor Selection Algorithm | Tuning Parameter | When to Create
---|---|---|---
mixconjugateblm | SSVS [1] | V, the variances of the two Gaussian mixture components (variable-inclusion and variable-exclusion) of each coefficient prior | You want posterior variable-inclusion probabilities for the predictors, and the prior dependence between β and σ² is acceptable.
mixsemiconjugateblm | SSVS | Same as mixconjugateblm | Same as mixconjugateblm, except that prior independence between β and σ² is justifiable.
lassoblm | Bayesian lasso regression [3] | λ, the lasso shrinkage parameter | You want to force insignificant and redundant predictor variables to have coefficients with posterior mode of zero and fairly tight 95% posterior credible intervals around that mode.
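For instance, a sketch of creating predictor selection prior models for three predictors:

p = 3;                                              % Number of predictors
PriorSSVS = bayeslm(p,'ModelType','mixconjugate');  % mixconjugateblm object (SSVS)
PriorLasso = bayeslm(p,'ModelType','lasso');        % lassoblm object (Bayesian lasso)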
Because predictor selection algorithms can be sensitive to differing scales of the predictor variables (Bayesian lasso regression, in particular), determine the scale of the predictors by passing the data to boxplot, or by estimating their means and standard deviations by using mean and std, respectively.
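For instance, a sketch assuming a predictor matrix X is in the workspace:

boxplot(X)           % Compare predictor scales visually
predMeans = mean(X)  % Column means of the predictors
predStds = std(X)    % Column standard deviations of the predictors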
For SSVS prior models, plot the prior distributions of the coefficients by using plot. The plots give you an idea about how the densities of the two Gaussian components balance. You can adjust the variances of a coefficient by using dot notation. For example, to attribute prior variances of 1000 and 0.1 to the variable-inclusion and variable-exclusion components, respectively, for coefficient j, enter:

PriorMdl.V(j,:) = [1000 0.1];
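A fuller sketch of this step, assuming three predictors and a hypothetical coefficient index j:

p = 3;
PriorMdl = bayeslm(p,'ModelType','mixconjugate');  % SSVS prior
plot(PriorMdl)                                     % Inspect the two-component prior densities
j = 2;                                             % Hypothetical coefficient index
PriorMdl.V(j,:) = [1000 0.1];                      % Inclusion and exclusion component variances
plot(PriorMdl)                                     % Re-inspect after the adjustment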
For more details on V, see [1].

For lasso prior models, determine a regularization path, that is, a series of values for λ to iterate through during posterior estimation. Values are data dependent.
If the predictor variables have similar scales, specify a left boundary that forces most variables into the model (that is, attribute almost no penalty), specify a right boundary that forces almost all coefficients to 0, and specify enough values in between the boundaries to search the space adequately. By default, the software attributes a shrinkage of 0.01 to the intercept and 1 to all coefficients.
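For instance, a minimal sketch of a log-spaced regularization path; the boundary values 0.01 and 100 are assumptions that you should adapt to your data:

Lambda = logspace(-2,2,100);  % 100 shrinkage values from 0.01 to 100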
If the scales of the predictor variables differ by orders of magnitude, then you can rescale or standardize the data. However, these actions make interpreting the coefficients difficult. Instead, specify a regularization path and use it to create a matrix of repeated rows. For those variables whose scales are small or large, multiply the corresponding rows by the appropriate order of magnitude to magnify the penalty for small coefficients or reduce the penalty for large coefficients.
For example, suppose a regression model has three predictors. The first two predictors have similar scales, but the third predictor has a scale that is 3 orders larger. Suppose the 100-element regularization path is in the 1-by-100 vector Lambda. To create a matrix of shrinkage values, enter the following code:
p = 3;                                   % Number of predictors
numcoeffs = p + 1;                       % Number of coefficients: intercept and predictors
LambdaMat = repmat(Lambda,numcoeffs,1);  % One row of shrinkages per coefficient
LambdaMat(4,:) = 1e3*LambdaMat(4,:);     % Magnify the penalty for the large-scale predictor
For more on specifying λ, see [3].
In a for loop, pass the predictor and response data to estimate to estimate features of the posterior distributions for each value of the tuning parameter. For each iteration, store the posterior model object and estimation summary table in a cell array. estimate uses the Gibbs sampler to sample successively from the full conditionals (see Analytically Tractable Posteriors). You can change aspects of the Gibbs sampler, such as the thinning factor, using the available options.
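A sketch of the loop, assuming a lassoblm prior model PriorMdl, data X and y, and a scalar-valued regularization path in the vector Lambda from the previous step:

numLambda = numel(Lambda);
PosteriorMdls = cell(numLambda,1);  % Posterior model objects
Summaries = cell(numLambda,1);      % Estimation summary tables
for k = 1:numLambda
    PriorMdl.Lambda = Lambda(k);    % Set the shrinkage for this iteration
    [PosteriorMdls{k},Summaries{k}] = estimate(PriorMdl,X,y);
end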
Posterior models (empiricalblm model objects) store the draws from the full conditionals, among other things.
Estimation summary tables are MATLAB® tables that include:
Estimates of the mean (Mean) and covariance matrix (Covariances) of the marginal posterior π(β|y,x), and the mean and variance of π(σ²|y,x).
95% equitailed credible intervals (CI95), which are the 0.025-quantile and the 0.975-quantile of the retained Monte Carlo sample.
Posterior variable-inclusion probabilities (Regime), for SSVS prior models only. Values below a threshold that you determine indicate that the corresponding predictor is insignificant or redundant.
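For instance, assuming Summary is one of the stored estimation summary tables, a sketch of extracting these quantities:

Summary.Mean    % Posterior means of the coefficients and sigma2
Summary.CI95    % 95% equitailed credible intervals
Summary.Regime  % Variable-inclusion probabilities (SSVS models only)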
Although you estimate multiple posterior models, a good practice is to inspect the posterior draws for adequate mixing, particularly for models estimated using the boundaries of the tuning parameter values. For more details, see Posterior Estimation and Simulation Diagnostics.
Determine the best-performing posterior model. Two examples are:
Choose the simplest model that minimizes the mean squared error (MSE, the means of Sigma2) among all posterior models. This method is simple, but the resulting models might not generalize well.
Choose the simplest model that minimizes forecast MSE (see the sketch after this list):
Partition the data into estimation and forecast samples.
For all selected tuning parameter values, estimate the posterior distribution by using the estimation sample data.
Compute the forecast MSE using the forecast sample and responses forecasted by using the posterior predictive distribution.
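A sketch of the forecast MSE computation for one candidate tuning parameter value, assuming an estimation sample (XEst, yEst) and a holdout forecast sample (XF, yF); the partition itself is application dependent:

PosteriorMdl = estimate(PriorMdl,XEst,yEst);  % Estimate by using the estimation sample
yHat = forecast(PosteriorMdl,XF);             % Forecast the holdout responses
fMSE = mean((yF - yHat).^2);                  % Forecast mean squared error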
Forecast responses given new predictor data using forecast. The forecast function constructs forecasts from the posterior predictive distribution by implementing a Gibbs sampler. Inspect the draws for adequate mixing.
[1] George, E. I., and R. E. McCulloch. "Variable Selection Via Gibbs Sampling." Journal of the American Statistical Association. Vol. 88, No. 423, 1993, pp. 881–889.
[2] Marin, J. M., and C. P. Robert. Bayesian Core: A Practical Approach to Computational Bayesian Statistics. New York: Springer Science+Business Media, LLC, 2007.
[3] Park, T., and G. Casella. "The Bayesian Lasso." Journal of the American Statistical Association. Vol. 103, No. 482, 2008, pp. 681–686.
conjugateblm | customblm | diffuseblm | empiricalblm | lassoblm | mixconjugateblm | mixsemiconjugateblm | semiconjugateblm