estimate

Fit autoregressive integrated moving average (ARIMA) model to data

Syntax

EstMdl = estimate(Mdl,y)

EstMdl = estimate(Mdl,y,Name,Value)

[EstMdl,EstParamCov,logL,info] = estimate(___)

Description

EstMdl = estimate(Mdl,y) estimates parameters of the partially specified ARIMA(p,D,q) model Mdl given the observed univariate time series y using maximum likelihood. EstMdl is the corresponding fully specified ARIMA model that stores the parameter estimates.
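For orientation, a nonseasonal ARIMA(p,D,q) model without a regression component is commonly written in lag operator notation as shown here. This is the standard textbook form rather than something stated on this page, and it follows the same sign convention as the ARIMAX equations in the examples:

$ϕ(L) (1 - L)^{D} y_{t} = c + θ(L) ε_{t},$

where $ϕ(L) = 1 - ϕ_{1} L - \ldots - ϕ_{p} L^{p}$ and $θ(L) = 1 + θ_{1} L + \ldots + θ_{q} L^{q}$. For instance, AR coefficients {0.5 -0.3} give $ϕ(L) = 1 - 0.5 L + 0.3 L^{2}$.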

EstMdl = estimate(Mdl,y,Name,Value) uses additional options specified by one or more name-value pair arguments. For example, 'X',X includes a linear regression component in the model for the exogenous data in X.

[EstMdl,EstParamCov,logL,info] = estimate(___) also returns the variance-covariance matrix associated with the estimated parameters EstParamCov, optimized loglikelihood objective function value logL, and summary information info, using any of the input argument combinations in the previous syntaxes.
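As a minimal sketch of the four-output syntax, the following code fits an AR(1) template to data simulated from an assumed AR(1) process. The DGP parameter values, the seed, and the sample size are illustrative choices and do not come from this page.

```matlab
% Simulate 200 observations from an assumed AR(1) process (illustrative values).
rng(1); % For reproducibility
DGP = arima('AR',0.5,'Constant',0,'Variance',1);
y = simulate(DGP,200);

% Fit an AR(1) template and request every output.
Mdl = arima(1,0,0);
[EstMdl,EstParamCov,logL,info] = estimate(Mdl,y,'Display','off');

% Three parameters are estimated: the constant, AR{1}, and the variance.
size(EstParamCov)   % Dimensions of the covariance matrix (3-by-3)
logL                % Optimized loglikelihood value (scalar)
info.exitflag       % Optimizer exit flag
```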

Examples

Estimate ARMA Model

Fit an ARMA(2,1) model to simulated data.

Simulate Data from Known Model

Suppose that the data generating process (DGP) is

$y_{t} = 0.5 y_{t - 1} - 0.3 y_{t - 2} + ε_{t} + 0.2 ε_{t - 1},$

where $ε_{t}$ is a series of iid Gaussian random variables with mean 0 and variance 0.1.

Create the ARMA(2,1) model representing the DGP.

DGP = arima('AR',{0.5,-0.3},'MA',0.2,...
    'Constant',0,'Variance',0.1)

DGP = 
  arima with properties:

     Description: "ARIMA(2,0,1) Model (Gaussian Distribution)"
    Distribution: Name = "Gaussian"
               P: 2
               D: 0
               Q: 1
        Constant: 0
              AR: {0.5 -0.3} at lags [1 2]
             SAR: {}
              MA: {0.2} at lag [1]
             SMA: {}
     Seasonality: 0
            Beta: [1×0]
        Variance: 0.1

DGP is a fully specified arima model object.

Simulate a random 500 observation path from the ARMA(2,1) model.

rng(5); % For reproducibility
T = 500;
y = simulate(DGP,T);

y is a 500-by-1 column vector representing a simulated response path from the ARMA(2,1) model DGP.

Estimate Model

Create an ARMA(2,1) model template for estimation.

Mdl = arima(2,0,1)

Mdl = 
  arima with properties:

     Description: "ARIMA(2,0,1) Model (Gaussian Distribution)"
    Distribution: Name = "Gaussian"
               P: 2
               D: 0
               Q: 1
        Constant: NaN
              AR: {NaN NaN} at lags [1 2]
             SAR: {}
              MA: {NaN} at lag [1]
             SMA: {}
     Seasonality: 0
            Beta: [1×0]
        Variance: NaN

Mdl is a partially specified arima model object. Only required, nonestimable parameters that determine the model structure are specified. NaN-valued properties, including $ϕ_{1}$, $ϕ_{2}$, $θ_{1}$, $c$, and $σ^{2}$, are unknown model parameters to be estimated.

Fit the ARMA(2,1) model to y.

EstMdl = estimate(Mdl,y)

 
    ARIMA(2,0,1) Model (Gaussian Distribution):
 
                  Value      StandardError    TStatistic      PValue  
                _________    _____________    __________    __________

    Constant    0.0089018       0.018417       0.48334         0.62886
    AR{1}         0.49563        0.10323        4.8013      1.5767e-06
    AR{2}        -0.25495       0.070155       -3.6341      0.00027897
    MA{1}         0.27737        0.10732        2.5846       0.0097492
    Variance      0.10004      0.0066577        15.027      4.9017e-51

EstMdl = 
  arima with properties:

     Description: "ARIMA(2,0,1) Model (Gaussian Distribution)"
    Distribution: Name = "Gaussian"
               P: 2
               D: 0
               Q: 1
        Constant: 0.00890178
              AR: {0.495632 -0.254951} at lags [1 2]
             SAR: {}
              MA: {0.27737} at lag [1]
             SMA: {}
     Seasonality: 0
            Beta: [1×0]
        Variance: 0.100043

MATLAB® displays a table containing an estimation summary, which includes parameter estimates and inferences. For example, the Value column contains corresponding maximum-likelihood estimates, and the PValue column contains $p$-values for the asymptotic $t$-test of the null hypothesis that the corresponding parameter is 0.

EstMdl is a fully specified, estimated arima model object; its estimates resemble the parameter values of the DGP.
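The TStatistic and PValue columns of the summary above can be reproduced from the Value and StandardError columns. This sketch assumes that each p-value is two-sided and based on the standard normal distribution, which is consistent with the displayed numbers but is not stated explicitly on this page. It uses only base MATLAB (erfc), and the variable names are illustrative.

```matlab
% Values copied from the ARMA(2,1) estimation summary.
value = [0.0089018; 0.49563; -0.25495; 0.27737; 0.10004];
se    = [0.018417; 0.10323; 0.070155; 0.10732; 0.0066577];

tStat  = value./se;                 % Ratio of each estimate to its standard error
pValue = erfc(abs(tStat)/sqrt(2));  % Two-sided standard normal tail, 2*(1 - Phi(|t|))

table(value,se,tStat,pValue, ...
    'RowNames',{'Constant','AR{1}','AR{2}','MA{1}','Variance'})
```

Small differences in the last displayed digit can occur because the inputs are rounded values copied from the display.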

Apply Equality Constraints to Parameters During Estimation

Fit an AR(2) model to simulated data while holding the model constant fixed during estimation.

Simulate Data from Known Model

Suppose the DGP is

$y_{t} = 0.5 y_{t - 1} - 0.3 y_{t - 2} + ε_{t},$

where $ε_{t}$ is a series of iid Gaussian random variables with mean 0 and variance 0.1.

Create the AR(2) model representing the DGP.

DGP = arima('AR',{0.5,-0.3},...
    'Constant',0,'Variance',0.1);

Simulate a random 500 observation path from the model.

rng(5); % For reproducibility
T = 500;
y = simulate(DGP,T);

Create Model Object Specifying Constraint

Assume that the mean of $y_{t}$ is 0, which implies that $c$ is 0.

Create an AR(2) model for estimation. Set $c$ to 0.

Mdl = arima('ARLags',1:2,'Constant',0)

Mdl = 
  arima with properties:

     Description: "ARIMA(2,0,0) Model (Gaussian Distribution)"
    Distribution: Name = "Gaussian"
               P: 2
               D: 0
               Q: 0
        Constant: 0
              AR: {NaN NaN} at lags [1 2]
             SAR: {}
              MA: {}
             SMA: {}
     Seasonality: 0
            Beta: [1×0]
        Variance: NaN

Mdl is a partially specified arima model object. Specified parameters include all required parameters and the model constant. NaN-valued properties, including $ϕ_{1}$, $ϕ_{2}$, and $σ^{2}$, are unknown model parameters to be estimated.

Estimate Model

Fit the AR(2) model template containing the constraint to y.

EstMdl = estimate(Mdl,y)

 
    ARIMA(2,0,0) Model (Gaussian Distribution):
 
                 Value      StandardError    TStatistic      PValue  
                ________    _____________    __________    __________

    Constant           0             0            NaN             NaN
    AR{1}        0.56342      0.044225          12.74      3.5474e-37
    AR{2}       -0.29355      0.041786        -7.0252       2.137e-12
    Variance     0.10022      0.006644         15.085      2.0476e-51

EstMdl = 
  arima with properties:

     Description: "ARIMA(2,0,0) Model (Gaussian Distribution)"
    Distribution: Name = "Gaussian"
               P: 2
               D: 0
               Q: 0
        Constant: 0
              AR: {0.563425 -0.293554} at lags [1 2]
             SAR: {}
              MA: {}
             SMA: {}
     Seasonality: 0
            Beta: [1×0]
        Variance: 0.100222

EstMdl is a fully specified, estimated arima model object; its estimates resemble the parameter values of the AR(2) model DGP. The value of $c$ in the estimation summary and object display is 0, and corresponding inferences are trivial or do not apply.

Initialize Model Estimation Using Presample Response Data

Because an ARIMA model is a function of previous values, estimate requires presample data to initialize the model early in the sampling period. Although estimate backcasts for presample data by default, you can specify the required presample data instead. The P property of an arima model object specifies the required number of presample observations.

Load Data

Load the US equity index data set Data_EquityIdx.

load Data_EquityIdx

The table DataTable includes the time series variable NYSE, which contains daily NYSE composite closing prices from January 1990 through December 1995.

Convert the table to a timetable.

dt = datetime(dates,'ConvertFrom','datenum','Format','yyyy-MM-dd');
TT = table2timetable(DataTable,'RowTimes',dt);
T = size(TT,1); % Total sample size

Create Model Template

Suppose that an ARIMA(1,1,1) model is appropriate for modeling the NYSE composite series during the sample period.

Create an ARIMA(1,1,1) model template for estimation.

Mdl = arima(1,1,1)

Mdl = 
  arima with properties:

     Description: "ARIMA(1,1,1) Model (Gaussian Distribution)"
    Distribution: Name = "Gaussian"
               P: 2
               D: 1
               Q: 1
        Constant: NaN
              AR: {NaN} at lag [1]
             SAR: {}
              MA: {NaN} at lag [1]
             SMA: {}
     Seasonality: 0
            Beta: [1×0]
        Variance: NaN

Mdl is a partially specified arima model object.

Partition Sample

Create vectors of indices that partition the sample into presample and estimation sample periods, so that the presample occurs first and contains Mdl.P = 2 observations, and the estimation sample contains the remaining observations.

presample = 1:Mdl.P;
estsample = (Mdl.P + 1):T;

Estimate Model

Fit an ARIMA(1,1,1) model to the estimation sample. Specify the presample responses.

EstMdl = estimate(Mdl,TT{estsample,"NYSE"},'Y0',TT{presample,"NYSE"});

 
    ARIMA(1,1,1) Model (Gaussian Distribution):
 
                 Value      StandardError    TStatistic    PValue 
                ________    _____________    __________    _______

    Constant     0.15775      0.097888         1.6115      0.10706
    AR{1}       -0.21985       0.15652        -1.4046      0.16015
    MA{1}        0.28529       0.15393         1.8534      0.06382
    Variance       17.17       0.20065         85.573            0

EstMdl is a fully specified, estimated arima model object.

Specify Initial Parameter Values for Optimization

Fit an ARIMA(1,1,1) model to the daily close of the NYSE Composite Index. Specify initial parameter values obtained from an analysis of a pilot sample.

Load Data

Load the US equity index data set Data_EquityIdx.

load Data_EquityIdx

The table DataTable includes the time series variable NYSE, which contains daily NYSE composite closing prices from January 1990 through December 1995.

Convert the table to a timetable.

dt = datetime(dates,'ConvertFrom','datenum','Format','yyyy-MM-dd');
TT = table2timetable(DataTable,'RowTimes',dt);

Fit Model to Pilot Sample

Suppose that an ARIMA(1,1,1) model is appropriate for modeling the NYSE composite series during the sample period.

Create an ARIMA(1,1,1) model template for estimation.

Mdl = arima(1,1,1);

Mdl is a partially specified arima model object.

Treat the first two years (through December 31, 1991) as a pilot sample for obtaining initial parameter values when fitting the model to the remaining data. Fit the model to the pilot sample.

endPilot = datetime(1991,12,31);
pilottr = timerange(TT.Time(1),endPilot,'days');

EstMdl0 = estimate(Mdl,TT{pilottr,"NYSE"},'Display','off');

EstMdl0 is a fully specified, estimated arima model object.

Estimate Model

Fit an ARIMA(1,1,1) model to the estimation sample. Specify the estimated parameters from the pilot sample fit as initial values for optimization.

esttr = timerange(endPilot + days(1),TT.Time(end),'days');

c0 = EstMdl0.Constant;
ar0 = EstMdl0.AR;
ma0 = EstMdl0.MA;
var0 = EstMdl0.Variance;

EstMdl = estimate(Mdl,TT{esttr,"NYSE"},'Constant0',c0,'AR0',ar0,...
   'MA0',ma0,'Variance0',var0);

 
    ARIMA(1,1,1) Model (Gaussian Distribution):
 
                 Value     StandardError    TStatistic    PValue 
                _______    _____________    __________    _______

    Constant    0.17424       0.11648         1.4959      0.13468
    AR{1}       -0.2262       0.18587         -1.217      0.22362
    MA{1}       0.29047       0.18276         1.5893      0.11199
    Variance     20.053       0.27603          72.65            0

EstMdl is a fully specified, estimated arima model object.

Estimate ARIMA Model Containing Exogenous Predictors (ARIMAX)

Fit an ARIMAX model to simulated time series data.

Simulate Predictor and Response Data

Create the ARIMAX(2,1,0) model for the DGP, represented by $y_{t}$ in the equation

$(1 - 0.5 L + 0.3 L^{2}) (1 - L)^{1} y_{t} = 2 + 1.5 x_{1, t} + 2.6 x_{2, t} - 0.3 x_{3, t} + ε_{t},$

where $ε_{t}$ is a series of iid Gaussian random variables with mean 0 and variance 0.1.

DGP = arima('AR',{0.5,-0.3},'D',1,'Constant',2,...
    'Variance',0.1,'Beta',[1.5 2.6 -0.3]);

Assume that the exogenous variables $x_{1, t}$, $x_{2, t}$, and $x_{3, t}$ are represented by the AR(1) processes

$\begin{array}{l} x_{1, t} = 0.1 x_{1, t - 1} + η_{1, t} \\ x_{2, t} = 0.2 x_{2, t - 1} + η_{2, t} \\ x_{3, t} = 0.3 x_{3, t - 1} + η_{3, t}, \end{array}$

where $η_{i, t}$ follows a Gaussian distribution with mean 0 and variance 0.01 for $i \in \{1, 2, 3\}$. Create ARIMA models that represent the exogenous variables.

MdlX1 = arima('AR',0.1,'Constant',0,'Variance',0.01);
MdlX2 = arima('AR',0.2,'Constant',0,'Variance',0.01);
MdlX3 = arima('AR',0.3,'Constant',0,'Variance',0.01);

Simulate length 1000 exogenous series from the AR models. Store the simulated data in a matrix.

T = 1000;
rng(10); % For reproducibility
x1 = simulate(MdlX1,T);
x2 = simulate(MdlX2,T);
x3 = simulate(MdlX3,T);
X = [x1 x2 x3];

X is a 1000-by-3 matrix of simulated time series data. Each row corresponds to an observation in the time series, and each column corresponds to an exogenous variable.

Simulate a length 1000 series from the DGP. Specify the simulated exogenous data.

y = simulate(DGP,T,'X',X);

y is a 1000-by-1 vector of response data.

Estimate Model

Create an ARIMA(2,1,0) model template for estimation.

Mdl = arima(2,1,0)

Mdl = 
  arima with properties:

     Description: "ARIMA(2,1,0) Model (Gaussian Distribution)"
    Distribution: Name = "Gaussian"
               P: 3
               D: 1
               Q: 0
        Constant: NaN
              AR: {NaN NaN} at lags [1 2]
             SAR: {}
              MA: {}
             SMA: {}
     Seasonality: 0
            Beta: [1×0]
        Variance: NaN

The model description (Description property) and value of Beta suggest that the partially specified arima model object Mdl is agnostic of the exogenous predictors.

Estimate the ARIMAX(2,1,0) model; specify the exogenous predictor data. Because estimate backcasts for presample responses (a process that requires presample predictor data for ARIMAX models), fit the model to the latest T – Mdl.P responses. (Alternatively, you can specify presample responses by using the 'Y0' name-value pair argument.)

EstMdl = estimate(Mdl,y((Mdl.P + 1):T),'X',X);

 
    ARIMAX(2,1,0) Model (Gaussian Distribution):
 
                 Value      StandardError    TStatistic      PValue   
                ________    _____________    __________    ___________

    Constant      1.7519       0.021143        82.859                0
    AR{1}        0.56076       0.016511        33.963      7.9497e-253
    AR{2}       -0.26625       0.015966       -16.676       1.9636e-62
    Beta(1)       1.4764        0.10157        14.536       7.1228e-48
    Beta(2)       2.5638        0.10445        24.547      4.6633e-133
    Beta(3)     -0.34422       0.098623       -3.4903       0.00048249
    Variance     0.10673      0.0047273        22.577      7.3161e-113

EstMdl is a fully specified, estimated arima model object.

When you estimate the model by using estimate and supply the exogenous data by specifying the 'X' name-value pair argument, MATLAB® recognizes the model as an ARIMAX(2,1,0) model and includes a linear regression component for the exogenous variables.

The estimated model is

$(1 - 0.56 L + 0.27 L^{2}) (1 - L)^{1} y_{t} = 1.75 + 1.48 x_{1, t} + 2.56 x_{2, t} - 0.34 x_{3, t} + ε_{t},$

which resembles the model DGP. Because MATLAB returns the AR coefficients of the model expressed in difference-equation notation, their signs are opposite in the lag operator polynomial of the equation.

Compute Estimated Standard Errors

Load the US equity index data set Data_EquityIdx.

load Data_EquityIdx

The table DataTable includes the time series variable NYSE, which contains daily NYSE composite closing prices from January 1990 through December 1995.

Convert the table to a timetable.

dt = datetime(dates,'ConvertFrom','datenum','Format','yyyy-MM-dd');
TT = table2timetable(DataTable,'RowTimes',dt);

Suppose that an ARIMA(1,1,1) model is appropriate for modeling the NYSE composite series during the sample period.

Fit an ARIMA(1,1,1) model to the data, and return the estimated parameter covariance matrix.

Mdl = arima(1,1,1);
[EstMdl,EstParamCov] = estimate(Mdl,TT{:,"NYSE"});

 
    ARIMA(1,1,1) Model (Gaussian Distribution):
 
                 Value      StandardError    TStatistic     PValue 
                ________    _____________    __________    ________

    Constant     0.15746      0.097832         1.6095       0.10751
    AR{1}       -0.21997       0.15642        -1.4063       0.15964
    MA{1}        0.28541       0.15382         1.8555      0.063527
    Variance      17.159       0.20038         85.632             0

EstParamCov

EstParamCov = 4×4

    0.0096   -0.0002    0.0002    0.0023
   -0.0002    0.0245   -0.0240   -0.0060
    0.0002   -0.0240    0.0237    0.0057
    0.0023   -0.0060    0.0057    0.0402

EstMdl is a fully specified, estimated arima model object. Rows and columns of EstParamCov correspond to the rows in the table of estimates and inferences; for example, $\widehat{\mathrm{Cov}}(\hat{ϕ}_{1}, \hat{θ}_{1}) = -0.024$.

Compute estimated parameter standard errors by taking the square root of the diagonal elements of the covariance matrix.

estParamSE = sqrt(diag(EstParamCov))

estParamSE = 4×1

    0.0978
    0.1564
    0.1538
    0.2004

Compute a Wald-based 95% confidence interval for $ϕ$.

T = size(TT,1); % Effective sample size
phihat = EstMdl.AR{1};
sephihat = estParamSE(2);
ciphi = phihat + tinv([0.025 0.975],T - 3)*sephihat

ciphi = 1×2

   -0.5267    0.0867

The interval contains 0, which suggests that $ϕ$ is insignificant.

Compute Fitted Response Values

Load the US equity index data set Data_EquityIdx.

load Data_EquityIdx

The table DataTable includes the time series variable NYSE, which contains daily NYSE composite closing prices from January 1990 through December 1995.

Convert the table to a timetable.

dt = datetime(dates,'ConvertFrom','datenum','Format','yyyy-MM-dd');
TT = table2timetable(DataTable,'RowTimes',dt);
T = size(TT,1);

Suppose that an ARIMA(1,1,1) model is appropriate for modeling the NYSE composite series during the sample period.

Fit an ARIMA(1,1,1) model to the data. Specify the required presample and turn off the estimation display.

Mdl = arima(1,1,1);

preidx = 1:Mdl.P;
estidx = (Mdl.P + 1):T;
EstMdl = estimate(Mdl,TT{estidx,"NYSE"},...
    'Y0',TT{preidx,"NYSE"},'Display','off');

Infer the residuals $\hat{ε}_{t}$ from the estimated model. Specify the required presample.

resid = infer(EstMdl,TT{estidx,"NYSE"},...
    'Y0',TT{preidx,"NYSE"});

resid is a (T – Mdl.P)-by-1 vector of residuals.

Compute the fitted values $\hat{y}_{t}$.

yhat = TT{estidx,"NYSE"} - resid;

Plot the observations and the fitted values on the same graph.

plot(TT.Time(estidx),TT{estidx,"NYSE"},'r',TT.Time(estidx),yhat,'b--','LineWidth',2)

The fitted values closely track the observations.

Plot the residuals versus the fitted values.

plot(yhat,resid,'.')
ylabel('Residuals')
xlabel('Fitted values')

Residual variance appears larger for larger fitted values. One remedy for this behavior is to apply the log transform to the data.
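As a rough sketch of the log-transform remedy mentioned above, the following code refits the same ARIMA(1,1,1) template to the log of the series and redraws the residual plot. It makes no claim about how much the transform helps for these data. The names logNYSE, EstMdlLog, residLog, and yhatLog are illustrative, and the sketch lets estimate and infer backcast for the presample instead of specifying 'Y0'.

```matlab
load Data_EquityIdx                   % Provides the table DataTable
logNYSE = log(DataTable.NYSE);        % Log of the closing prices

Mdl = arima(1,1,1);
EstMdlLog = estimate(Mdl,logNYSE,'Display','off');

residLog = infer(EstMdlLog,logNYSE);  % Residuals on the log scale
yhatLog = logNYSE - residLog;         % Fitted values on the log scale

plot(yhatLog,residLog,'.')
ylabel('Residuals (log scale)')
xlabel('Fitted values (log scale)')
```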

Input Arguments

`Mdl` — Partially specified ARIMA model
`arima` model object

Partially specified ARIMA model used to indicate constrained and estimable model parameters, specified as an arima model object returned by arima or estimate. Properties of Mdl describe the model structure and specify the parameters.

estimate fits unspecified (NaN-valued) parameters to the data y.

estimate treats specified parameters as equality constraints during estimation.

`y` — Single path of response data
numeric column vector

Single path of response data to which the model Mdl is fit, specified as a numeric column vector. The last observation of y is the latest observation.

Data Types: double

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: 'Y0',Y0,'X',X uses the vector Y0 as presample responses required for estimation, and includes a linear regression component for the exogenous predictor data in X.

Estimation Options

`'X'` — Exogenous predictor data
matrix

Exogenous predictor data for the linear regression component, specified as the comma-separated pair consisting of 'X' and a matrix.

The columns of X are separate, synchronized time series. The last row contains the latest observations.

If you do not specify presample response data using the 'Y0' name-value pair argument, the number of rows of X must be at least numel(y) + Mdl.P. Otherwise, the number of rows of X must be at least the length of y.

If the number of rows of X exceeds the number needed, estimate uses the latest observations only.

estimate synchronizes X and y so that the latest observations (last rows) occur simultaneously.

By default, estimate does not estimate the regression coefficients, regardless of their presence in Mdl.

Data Types: double
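The row requirement for X when you supply 'Y0' can be sketched as follows. The example simulates an assumed AR(1) model with two predictors (all parameter values and the seed are illustrative), holds out the first Mdl.P responses as the presample, and passes only the rows of X that align with the estimation sample.

```matlab
rng(2); % For reproducibility
T = 300;
X = randn(T,2);                        % Two illustrative predictor series
DGP = arima('AR',0.4,'Constant',1,'Variance',0.5,'Beta',[2 -1]);
y = simulate(DGP,T,'X',X);

Mdl = arima(1,0,0);                    % Mdl.P is 1
pre = 1:Mdl.P;                         % Presample indices
est = (Mdl.P + 1):T;                   % Estimation sample indices

% With Y0 supplied, X needs only as many rows as the estimation sample.
EstMdl = estimate(Mdl,y(est),'Y0',y(pre),'X',X(est,:),'Display','off');
EstMdl.Beta                            % Estimated regression coefficients
```

Passing all T rows of X would also work, because estimate uses only the latest rows it needs.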

`'Options'` — Optimization options
`optimoptions` optimization controller

Optimization options, specified as the comma-separated pair consisting of 'Options' and an optimoptions optimization controller. For details on modifying the default values of the optimizer, see optimoptions or fmincon in Optimization Toolbox™.

For example, to change the constraint tolerance to 1e-6, set Options = optimoptions(@fmincon,'ConstraintTolerance',1e-6,'Algorithm','sqp'). Then, pass Options into estimate using 'Options',Options.

By default, estimate uses the same default options as fmincon, except Algorithm is 'sqp' and ConstraintTolerance is 1e-7.
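A short sketch of passing custom optimizer settings follows. It requires Optimization Toolbox, and the simulated data, the seed, and the 'MaxIterations' value are illustrative assumptions rather than recommendations.

```matlab
rng(3); % For reproducibility
DGP = arima('AR',0.5,'MA',0.2,'Constant',0,'Variance',0.1);
y = simulate(DGP,300);

% Loosen the constraint tolerance and cap the number of iterations.
Options = optimoptions(@fmincon,'ConstraintTolerance',1e-6, ...
    'Algorithm','sqp','MaxIterations',500);

EstMdl = estimate(arima(1,0,1),y,'Options',Options,'Display','off');
```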

`'Display'` — Command Window display option
`'params'` (default) | `'diagnostics'` | `'full'` | `'iter'` | `'off'` | string vector | cell vector of character vectors

Command Window display option, specified as the comma-separated pair consisting of 'Display' and one or more of the values in this table.

Value	Information Displayed
`'diagnostics'`	Optimization diagnostics
`'full'`	Maximum likelihood parameter estimates, standard errors, t statistics, iterative optimization information, and optimization diagnostics
`'iter'`	Iterative optimization information
`'off'`	None
`'params'`	Maximum likelihood parameter estimates, standard errors, and t statistics

Example: 'Display','off' is well suited for running a simulation that estimates many models.

Example: 'Display',{'params','diagnostics'} displays all estimation results and the optimization diagnostics.

Data Types: char | cell | string

Presample Specifications

`'Y0'` — Presample response data
numeric column vector

Presample response data for initializing the model, specified as the comma-separated pair consisting of 'Y0' and a numeric column vector.

The length of Y0 must be at least Mdl.P. If Y0 has extra rows, estimate uses only the latest Mdl.P presample responses. The last row contains the latest presample response.

By default, estimate backward forecasts (backcasts) for the necessary amount of presample responses.

For details on partitioning data for estimation, see Time Base Partitions for ARIMA Model Estimation.

Data Types: double

`'E0'` — Presample innovations
numeric column vector

Presample innovations ε_t for initializing the model, specified as the comma-separated pair consisting of 'E0' and a numeric column vector.

The length of E0 must be at least Mdl.Q. If E0 has extra rows, estimate uses only the latest Mdl.Q presample innovations. The last row contains the latest presample innovation.

If Mdl.Variance is a conditional variance model object, such as a garch model, estimate can require more than Mdl.Q presample innovations.

By default, estimate sets all required presample innovations to 0, which is their mean.

Data Types: double

`'V0'` — Presample conditional variances
numeric positive column vector

Presample conditional variances σ²_t for initializing any conditional variance model, specified as the comma-separated pair consisting of 'V0' and a numeric positive column vector.

The length of V0 must be at least the number of observations required to initialize the conditional variance model (see estimate). If V0 has extra rows, estimate uses only the latest observations. The last row contains the latest observation.

If the variance is constant, estimate ignores V0.

By default, estimate sets the necessary presample conditional variances to the average of the squared inferred innovations.

Data Types: double

Initial Value Specifications

`'Constant0'` — Initial estimate of model constant
numeric scalar

Initial estimate of the model constant c, specified as the comma-separated pair consisting of 'Constant0' and a numeric scalar.

By default, estimate derives initial estimates using standard time series techniques.

Data Types: double

`'AR0'` — Initial estimates of nonseasonal AR polynomial coefficients
numeric vector

Initial estimates of the nonseasonal AR polynomial coefficients $ϕ(L)$, specified as the comma-separated pair consisting of 'AR0' and a numeric vector.

The length of AR0 must equal the number of lags associated with nonzero coefficients in the nonseasonal AR polynomial. Elements of AR0 correspond to elements of Mdl.AR.

By default, estimate derives initial estimates using standard time series techniques.

Data Types: double

`'SAR0'` — Initial estimates of seasonal autoregressive polynomial coefficients
numeric vector

Initial estimates of the seasonal autoregressive polynomial coefficients $Φ(L)$, specified as the comma-separated pair consisting of 'SAR0' and a numeric vector.

The length of SAR0 must equal the number of lags associated with nonzero coefficients in the seasonal autoregressive polynomial SARLags. Elements of SAR0 correspond to elements of Mdl.SAR.

By default, estimate derives initial estimates using standard time series techniques.

Data Types: double

`'MA0'` — Initial estimates of nonseasonal moving average polynomial coefficients
numeric vector

Initial estimates of the nonseasonal moving average polynomial coefficients $θ(L)$, specified as the comma-separated pair consisting of 'MA0' and a numeric vector.

The length of MA0 must equal the number of lags associated with nonzero coefficients in the nonseasonal moving average polynomial MALags. Elements of MA0 correspond to elements of Mdl.MA.

By default, estimate derives initial estimates using standard time series techniques.

Data Types: double

`'SMA0'` — Initial estimates of seasonal moving average polynomial coefficients
numeric vector

Initial estimates of the seasonal moving average polynomial coefficients $Θ(L)$, specified as the comma-separated pair consisting of 'SMA0' and a numeric vector.

The length of SMA0 must equal the number of lags associated with nonzero coefficients in the seasonal moving average polynomial SMALags. Elements of SMA0 correspond to elements of Mdl.SMA.

By default, estimate derives initial estimates using standard time series techniques.

Data Types: double

`'Beta0'` — Initial estimates of regression coefficients
numeric vector

Initial estimates of the regression coefficients β, specified as the comma-separated pair consisting of 'Beta0' and a numeric vector.

The length of Beta0 must equal the number of columns of X. Elements of Beta0 correspond to the predictor variables represented by the columns of X.

By default, estimate derives initial estimates using standard time series techniques.

Data Types: double

`'DoF0'` — Initial estimate of t-distribution degrees-of-freedom parameter
`10` (default) | positive scalar

Initial estimate of the t-distribution degrees-of-freedom parameter ν, specified as the comma-separated pair consisting of 'DoF0' and a positive scalar. DoF0 must exceed 2.

Data Types: double

`'Variance0'` — Initial estimates of variances of innovations
positive scalar | cell vector of name-value pair arguments

Initial estimates of variances of innovations, specified as the comma-separated pair consisting of 'Variance0' and a positive scalar or a cell vector of name-value pair arguments.

`Mdl.Variance` Value	Description	`'Variance0'` Value
Numeric scalar or `NaN`	Constant variance	Positive scalar
`garch`, `egarch`, or `gjr` model object	Conditional variance model	Cell vector of name-value pair arguments for specifying initial estimates, see the `estimate` function of the conditional variance model objects

By default, estimate derives initial estimates using standard time series techniques.

Example: For a model with a constant variance, set 'Variance0',2 to specify an initial variance estimate of 2.

Example: For a composite conditional mean and variance model, set 'Variance0',{'Constant0',2,'ARCH0',0.1} to specify an initial estimate of 2 for the conditional variance model constant, and an initial estimate of 0.1 for the lag 1 coefficient in the ARCH polynomial.

Data Types: double | cell
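For a composite conditional mean and variance model, the cell-vector form of 'Variance0' might be used as in this sketch. The AR(1)-GARCH(1,1) data-generating values, the seed, and the initial values are assumptions chosen for illustration, and the names inside the cell vector are the initial-value arguments of the garch estimate function.

```matlab
rng(4); % For reproducibility
VarDGP = garch('Constant',0.05,'GARCH',0.8,'ARCH',0.1);
DGP = arima('AR',0.3,'Constant',0,'Variance',VarDGP);
y = simulate(DGP,1000);

% AR(1) conditional mean with a GARCH(1,1) conditional variance template.
Mdl = arima('ARLags',1,'Variance',garch(1,1));

% Initial values for the variance model go in a cell vector of name-value pairs.
EstMdl = estimate(Mdl,y,'Display','off', ...
    'Variance0',{'Constant0',0.1,'GARCH0',0.7,'ARCH0',0.15});
EstMdl.Variance      % Estimated garch model object
```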

Note

NaNs in input data indicate missing values. estimate uses listwise deletion to delete all sampled times (rows) in the input data containing at least one missing value. Specifically, estimate performs these steps:

Synchronize, or merge, the presample data sets E0, V0, and Y0 and the effective sample data X and y to create the separate sets Presample and EffectiveSample.
Remove all rows from Presample and EffectiveSample containing at least one NaN.

Listwise deletion reduces the sample size and can create irregular time series.
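The row-removal step can be mimicked in base MATLAB as follows. This is only an illustration of listwise deletion on hypothetical data (y and X below are made up), not the internal implementation of estimate.

```matlab
% Hypothetical response and predictor data with missing values.
y = [1.2; NaN; 0.7; 1.1; 0.9];
X = [0.1 0.3; 0.2 0.1; NaN 0.4; 0.0 0.2; 0.5 0.6];

keep = ~any(isnan([y X]),2);   % True for rows with no NaN in y or X
yClean = y(keep);              % Rows 2 and 3 are dropped
XClean = X(keep,:);
```

Rows 1, 4, and 5 remain, so the cleaned series is no longer regularly spaced in time.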

Output Arguments

`EstMdl` — Estimated ARIMA model
`arima` model object

Estimated ARIMA model, returned as an arima model object.

EstMdl is a copy of Mdl that has NaN values replaced with parameter estimates. EstMdl is fully specified.

`EstParamCov` — Estimated covariance matrix of maximum likelihood estimates
positive semidefinite numeric matrix

Estimated covariance matrix of maximum likelihood estimates known to the optimizer, returned as a positive semidefinite numeric matrix.

The rows and columns contain the covariances of the parameter estimates. The standard error of each parameter estimate is the square root of the corresponding main diagonal entry.

The rows and columns corresponding to any parameters held fixed as equality constraints are zero vectors.

Parameters corresponding to the rows and columns of EstParamCov appear in the following order:

Constant
Nonzero AR coefficients at positive lags, from the smallest to largest lag
Nonzero SAR coefficients at positive lags, from the smallest to largest lag
Nonzero MA coefficients at positive lags, from the smallest to largest lag
Nonzero SMA coefficients at positive lags, from the smallest to largest lag
Regression coefficients (when you specify exogenous data X), ordered by the columns of X
Variance parameters, a scalar for constant variance models and vector for conditional variance models (see estimate for the order of parameters)
Degrees of freedom (t-innovation distribution only)

Data Types: double
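The ordering and the zero rows for fixed parameters can be inspected as in this sketch, which fits an AR(2) template with the constant held at 0 to simulated data. The DGP values and the seed are illustrative, and paramNames is a hypothetical label list written out by hand in the order given above.

```matlab
rng(5); % For reproducibility
DGP = arima('AR',{0.5,-0.3},'Constant',0,'Variance',0.1);
y = simulate(DGP,500);

Mdl = arima('ARLags',1:2,'Constant',0);   % Constant is an equality constraint
[~,EstParamCov] = estimate(Mdl,y,'Display','off');

% Label rows in the documented order: constant, AR lags, variance.
paramNames = {'Constant','AR{1}','AR{2}','Variance'};
array2table(EstParamCov,'RowNames',paramNames)
```

The first row and column, which correspond to the fixed constant, contain zeros.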

`logL` — Optimized loglikelihood objective function value
numeric scalar

Optimized loglikelihood objective function value, returned as a numeric scalar.

Data Types: double

`info` — Optimization summary
structure array

Optimization summary, returned as a structure array with the fields described in this table.

Field	Description
`exitflag`	Optimization exit flag (see `fmincon` in Optimization Toolbox)
`options`	Optimization options controller (see `optimoptions` and `fmincon` in Optimization Toolbox)
`X`	Vector of final parameter estimates
`X0`	Vector of initial parameter estimates

For example, you can display the vector of final estimates by entering info.X in the Command Window.

Data Types: struct
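
The following sketch shows one way to inspect info after estimation. It assumes the usual `fmincon` convention that a positive exit flag indicates the solver stopped at a candidate solution; the data and messages are illustrative.

```matlab
% Simulate illustrative data from a known ARMA(2,1) process (assumed setup).
rng(5); % For reproducibility
DGP = arima('AR',{0.5,-0.3},'MA',0.2,'Constant',0,'Variance',0.1);
y = simulate(DGP,500);

% Fit the model and request only the optimization summary.
Mdl = arima(2,0,1);
[EstMdl,~,~,info] = estimate(Mdl,y,'Display','off');

% Check the exit flag (assumption: positive values signal convergence,
% following the fmincon convention).
if info.exitflag > 0
    disp('Optimization stopped at a candidate solution.')
else
    warning('Optimization may not have converged (exitflag = %d).',info.exitflag)
end

% Compare the initial and final parameter vectors.
initialEstimates = info.X0;
finalEstimates = info.X;
disp([initialEstimates(:) finalEstimates(:)])
```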

Tips

To access values of the estimation results, including the number of free parameters in the model, pass EstMdl to summarize.
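
As a brief illustration of this tip, the sketch below passes an estimated model to summarize and reads the number of estimated parameters from the returned structure. The field name `NumEstimatedParameters` reflects the structure that summarize is understood to return for arima models; treat it as an assumption and confirm it against the summarize reference page.

```matlab
% Simulate illustrative data from a known ARMA(2,1) process (assumed setup).
rng(5); % For reproducibility
DGP = arima('AR',{0.5,-0.3},'MA',0.2,'Constant',0,'Variance',0.1);
y = simulate(DGP,500);

% Fit the model quietly.
Mdl = arima(2,0,1);
EstMdl = estimate(Mdl,y,'Display','off');

% Request the summary as a structure instead of printing it.
results = summarize(EstMdl);

% Assumed field name; see the lead-in above.
numFree = results.NumEstimatedParameters;
disp(numFree)
```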

Algorithms

- estimate infers innovations and conditional variances (when present) of the underlying response series, and then uses constrained maximum likelihood to fit the model Mdl to the response data y.
- Because you can specify presample data inputs Y0, E0, and V0 of differing lengths, estimate assumes that all specified sets have these characteristics:
  - The final observation (row) in each set occurs simultaneously.
  - The first observation in the estimation sample immediately follows the last observation in the presample, with respect to the sampling frequency.
- If you specify the 'Display' name-value pair argument, the value overrides the Diagnostics and Display settings of the 'Options' name-value pair argument. Otherwise, estimate displays optimization information using 'Options' settings.
- estimate uses the outer product of gradients (OPG) method to perform covariance matrix estimation.
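
The presample timing assumption described above can be sketched as follows. The series is split so that the presample ends immediately before the estimation sample begins. The split point (the first `Mdl.P` observations) and the variable names `y0` and `yEst` are illustrative choices, not requirements.

```matlab
% Simulate illustrative data from a known ARMA(2,1) process (assumed setup).
rng(5); % For reproducibility
DGP = arima('AR',{0.5,-0.3},'MA',0.2,'Constant',0,'Variance',0.1);
y = simulate(DGP,500);

Mdl = arima(2,0,1);

% Reserve the first Mdl.P observations as presample responses, so the
% last presample observation directly precedes the estimation sample.
y0 = y(1:Mdl.P);
yEst = y((Mdl.P + 1):end);

% 'Display','off' overrides any display settings carried by 'Options'.
EstMdl = estimate(Mdl,yEst,'Y0',y0,'Display','off');
disp(EstMdl.AR)
```

Splitting the series this way keeps the presample and the estimation sample contiguous, which matches the second characteristic in the list above.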

References

[1] Box, George E. P., Gwilym M. Jenkins, and Gregory C. Reinsel. Time Series Analysis: Forecasting and Control. 3rd ed. Englewood Cliffs, NJ: Prentice Hall, 1994.

[2] Enders, Walter. Applied Econometric Time Series. Hoboken, NJ: John Wiley & Sons, Inc., 1995.

[3] Greene, William H. Econometric Analysis. 6th ed. Upper Saddle River, NJ: Prentice Hall, 2008.

[4] Hamilton, James D. Time Series Analysis. Princeton, NJ: Princeton University Press, 1994.
