int __stdcall NDK_MLR_FITTED(double **X,
                             size_t   nXSize,
                             size_t   nXVars,
                             LPBYTE   mask,
                             size_t   nMaskLen,
                             double  *Y,
                             size_t   nYSize,
                             double   intercept,
                             WORD     nRetType)
Returns the fitted values of the conditional mean, residuals or leverage measures.
- Returns
- status code of the operation
- Return values
-
NDK_SUCCESS: Operation successful.
NDK_FAILED: Operation unsuccessful. See Macros for the full list.
- Parameters
-
[in] X is the independent (explanatory) variables data matrix, such that each column represents one variable.
[in] nXSize is the number of observations (rows) in X.
[in] nXVars is the number of independent (explanatory) variables (columns) in X.
[in] mask is the boolean array that selects the explanatory variables to include in the model. If missing, all variables in X are included.
[in] nMaskLen is the number of elements in mask.
[in] Y is the response or dependent variable data array (one-dimensional array).
[in] nYSize is the number of observations in Y.
[in] intercept is the constant or intercept value to fix (e.g. zero). If missing (i.e. NaN), the intercept is not fixed and is computed normally.
[in] nRetType is a switch to select the return output:
- 1 = Fitted/conditional mean (default)
- 2 = Residuals
- 3 = Standardized residuals
- 4 = Leverage factor (H)
- 5 = Cook's distance (D)
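The parameter list above can be made concrete with a small input-preparation sketch. Everything in it is an assumption rather than part of the SDK: the helper names, the enum names for the nRetType codes, the row-pointer layout (rows[i][j] is observation i, variable j), the reading of any non-zero mask byte as "include", and the use of a NULL mask to stand in for "missing".

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names for the documented nRetType codes 1..5. */
enum mlr_ret_type {
    MLR_RET_FITTED       = 1,  /* fitted / conditional mean (default) */
    MLR_RET_RESIDUALS    = 2,
    MLR_RET_STD_RESIDUAL = 3,
    MLR_RET_LEVERAGE     = 4,
    MLR_RET_COOKS_D      = 5
};

/* Hypothetical helper: build a row-pointer table over a contiguous
   row-major buffer so it can be passed where a double** is expected.
   Assumed layout: rows[i][j] = observation i, variable j.
   The caller frees the returned table (not the data buffer). */
double **make_row_pointers(double *data, size_t n_rows, size_t n_cols)
{
    double **rows = malloc(n_rows * sizeof *rows);
    if (rows == NULL)
        return NULL;
    for (size_t i = 0; i < n_rows; ++i)
        rows[i] = data + i * n_cols;
    return rows;
}

/* Hypothetical helper: count the variables a byte mask selects.
   Assumed convention: any non-zero byte means "include"; a NULL mask
   stands in for "missing" and selects every column. */
size_t count_selected(const unsigned char *mask, size_t mask_len, size_t n_vars)
{
    if (mask == NULL)
        return n_vars;
    assert(mask_len == n_vars);
    size_t k = 0;
    for (size_t j = 0; j < mask_len; ++j)
        if (mask[j])
            ++k;
    return k;
}
```

With inputs prepared this way, the documented argument order is X, nXSize, nXVars, mask, nMaskLen, Y, nYSize, intercept, nRetType, with NaN passed as intercept when it should be estimated rather than fixed.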
- Remarks
-
- The underlying model is described here.
- The regression fitted (aka estimated) conditional mean is calculated as follows: \[\hat y_i = E \left[ Y | x_{i1} \cdots x_{ip} \right] = \alpha + \hat \beta_1 \times x_{i1} + \cdots + \hat \beta_p \times x_{ip}\] Residuals are defined as follows: \[e_i = y_i - \hat y_i\] The standardized residuals are calculated as follows: \[\bar e_i = \frac{e_i}{\hat \sigma_i}\] Where:
- \(\hat y_i\) is the estimated regression value for the i-th observation.
- \(e_i\) is the error term (residual) for the i-th observation.
- \(\bar e_i\) is the standardized error term for the i-th observation.
- \(\hat \sigma_i \) is the standard error for the i-th observation.
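The three formulas above translate directly into loops. The following is a minimal sketch, not the SDK's implementation: the function names are hypothetical, X is assumed to be a flat row-major array (x[i*p + j]), the coefficients are assumed to be already estimated, and the per-observation standard errors \(\hat \sigma_i\) are taken as an input because this page does not say how they are derived.

```c
#include <stddef.h>

/* y_hat[i] = alpha + sum_j beta[j] * x[i*p + j]
   x is n-by-p, row-major (assumed layout). */
void fitted_values(const double *x, size_t n, size_t p,
                   double alpha, const double *beta, double *y_hat)
{
    for (size_t i = 0; i < n; ++i) {
        double acc = alpha;
        for (size_t j = 0; j < p; ++j)
            acc += beta[j] * x[i * p + j];
        y_hat[i] = acc;
    }
}

/* e[i] = y[i] - y_hat[i] */
void residuals(const double *y, const double *y_hat, size_t n, double *e)
{
    for (size_t i = 0; i < n; ++i)
        e[i] = y[i] - y_hat[i];
}

/* e_bar[i] = e[i] / sigma[i]; sigma[i] is supplied by the caller. */
void standardized_residuals(const double *e, const double *sigma,
                            size_t n, double *e_bar)
{
    for (size_t i = 0; i < n; ++i)
        e_bar[i] = e[i] / sigma[i];
}
```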
- For the influential data analysis, MLR_FITTED computes two values for each observation in the sample data: the leverage statistic and Cook's distance.
- Leverage statistics describe the influence that each observed value has on the fitted value for that same observation. By definition, the diagonal elements of the hat matrix are the leverages. \[H = X \left(X^\top X \right)^{-1} X^\top\] \[L_i = h_{ii}\] Where:
- \(H\) is the hat matrix for uncorrelated error terms.
- \(X\) is an \(N \times (p+1)\) matrix of explanatory variables where the first column is all ones.
- \(L_i\) is the leverage statistic for the i-th observation.
- \(h_{ii}\) is the i-th diagonal element in the hat matrix.
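The diagonal of the hat matrix can be obtained without forming \(H\) itself, since \(h_{ii} = d_i^\top (X^\top X)^{-1} d_i\), where \(d_i\) is the i-th row of the design matrix (a leading 1 followed by the explanatory values). The sketch below illustrates this with plain Gaussian elimination; it is not how the SDK computes leverage, the function names are hypothetical, and the input is assumed to be a flat row-major n-by-p array without the column of ones.

```c
#include <math.h>
#include <stddef.h>
#include <stdlib.h>

/* Solve A z = b for a k-by-k row-major A by Gaussian elimination with
   partial pivoting. A and b are overwritten; b holds z on return.
   Returns 0 on success, -1 if a zero pivot is met (singular matrix). */
static int solve_inplace(double *A, double *b, size_t k)
{
    for (size_t c = 0; c < k; ++c) {
        size_t piv = c;
        for (size_t r = c + 1; r < k; ++r)
            if (fabs(A[r * k + c]) > fabs(A[piv * k + c]))
                piv = r;
        if (A[piv * k + c] == 0.0)
            return -1;
        if (piv != c) {
            for (size_t j = 0; j < k; ++j) {
                double t = A[c * k + j];
                A[c * k + j] = A[piv * k + j];
                A[piv * k + j] = t;
            }
            double t = b[c]; b[c] = b[piv]; b[piv] = t;
        }
        for (size_t r = c + 1; r < k; ++r) {
            double f = A[r * k + c] / A[c * k + c];
            for (size_t j = c; j < k; ++j)
                A[r * k + j] -= f * A[c * k + j];
            b[r] -= f * b[c];
        }
    }
    for (size_t c = k; c-- > 0; ) {          /* back substitution */
        for (size_t j = c + 1; j < k; ++j)
            b[c] -= A[c * k + j] * b[j];
        b[c] /= A[c * k + c];
    }
    return 0;
}

/* Design row i: a leading 1 followed by the p explanatory values. */
static void design_row(const double *x, size_t p, size_t i, double *d)
{
    d[0] = 1.0;
    for (size_t j = 0; j < p; ++j)
        d[j + 1] = x[i * p + j];
}

/* Writes h[i] = d_i' (X'X)^{-1} d_i for i = 0..n-1.
   Returns 0 on success, -1 on allocation failure or singular X'X. */
int leverage(const double *x, size_t n, size_t p, double *h)
{
    size_t k = p + 1;
    double *work = malloc((2 * k * k + 2 * k) * sizeof *work);
    if (work == NULL)
        return -1;
    double *G = work, *A = G + k * k, *d = A + k * k, *z = d + k;

    for (size_t a = 0; a < k * k; ++a)
        G[a] = 0.0;
    for (size_t i = 0; i < n; ++i) {         /* G = X'X */
        design_row(x, p, i, d);
        for (size_t a = 0; a < k; ++a)
            for (size_t b = 0; b < k; ++b)
                G[a * k + b] += d[a] * d[b];
    }

    int rc = 0;
    for (size_t i = 0; i < n && rc == 0; ++i) {
        design_row(x, p, i, d);
        for (size_t a = 0; a < k * k; ++a) A[a] = G[a];
        for (size_t a = 0; a < k; ++a) z[a] = d[a];
        rc = solve_inplace(A, z, k);         /* z = (X'X)^{-1} d_i */
        if (rc == 0) {
            double s = 0.0;
            for (size_t a = 0; a < k; ++a) s += d[a] * z[a];
            h[i] = s;
        }
    }
    free(work);
    return rc;
}
```

A useful sanity check is that the leverages sum to \(p + 1\), the trace of \(H\). Re-solving the system for every row keeps the sketch short; a factorization computed once would be the more economical choice for large samples.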
- Cook's distance measures the effect of deleting a given observation. Data points with large residuals (outliers) and/or high leverage may distort the outcome and accuracy of a regression. Points with a large Cook's distance merit closer examination in the analysis. \[D_i = \frac{e_i^2}{p \ \mathrm{MSE}}\left[\frac{h_{ii}}{(1-h_{ii})^2}\right]\] Where:
- \(D_i\) is the Cook's distance for the i-th observation.
- \(h_{ii}\) is the leverage statistic (i.e. the i-th diagonal element in the hat matrix).
- \(\mathrm{MSE}\) is the mean square error of the regression model.
- \(p\) is the number of explanatory variables.
- \(e_i\) is the error term (residual) for the i-th observation.
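As a worked instance of the formula above, the following sketch evaluates \(D_i\) from a residual, a leverage value, the model MSE and \(p\). The function name is hypothetical, and MSE is taken as an input because this page does not spell out how it is computed.

```c
#include <stddef.h>

/* Cook's distance for one observation, following the formula above:
   D = e^2 / (p * MSE) * h / (1 - h)^2
   e   : residual of the observation
   h   : leverage (diagonal element of the hat matrix), expected in [0, 1)
   mse : mean square error of the regression (supplied by the caller)
   p   : number of explanatory variables, as defined on this page */
double cooks_distance(double e, double h, double mse, size_t p)
{
    double one_minus_h = 1.0 - h;
    return (e * e) / ((double)p * mse) * (h / (one_minus_h * one_minus_h));
}
```

For example, with e = 2, h = 0.5, MSE = 1 and p = 2, the first factor is 4/2 = 2 and the second is 0.5/0.25 = 2, so D = 4. As h approaches 1 the denominator vanishes, so a caller may want to treat such observations separately.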
- The sample data may include missing values.
- Each column in the input matrix corresponds to a separate variable.
- Each row in the input matrix corresponds to an observation.
- Observations (i.e. rows) with missing values in X or Y are removed.
- The number of rows of the response variable (Y) must be equal to the number of rows of the explanatory variables (X).
- The MLR_FITTED function is available starting with version 1.60 APACHE.
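The missing-value rule in the remarks (rows with a missing value in X or Y are removed) amounts to listwise deletion. A minimal sketch follows, assuming that missing values are encoded as NaN and that X is a flat row-major array; the function name is hypothetical and the SDK's internal handling may differ.

```c
#include <math.h>
#include <stddef.h>

/* Compact x (n-by-p, row-major) and y (length n) in place, keeping
   only the rows where neither y[i] nor any x[i*p + j] is NaN.
   Returns the number of rows kept; row order is preserved. */
size_t drop_missing_rows(double *x, double *y, size_t n, size_t p)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; ++i) {
        int missing = isnan(y[i]);
        for (size_t j = 0; j < p && !missing; ++j)
            missing = isnan(x[i * p + j]);
        if (missing)
            continue;
        if (kept != i) {
            for (size_t j = 0; j < p; ++j)
                x[kept * p + j] = x[i * p + j];
            y[kept] = y[i];
        }
        ++kept;
    }
    return kept;
}
```

Because the function removes matching rows from both arrays together, X and Y still have the same number of rows afterwards, which is the precondition stated in the remarks.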
- Requirements
-
Header: SFSDK.H
Library: SFSDK.LIB
DLL: SFSDK.DLL
Namespace: NumXLAPI
Class: SFSDK
Scope: Public
Lifetime: Static