Model-Based Covariance Estimation for Regression M- and GM-Estimators

Tobias Schoch

1 Introduction

The population regression model is given by

$$\xi:\quad Y_i = \mathbf{x}_i^T \theta + \sigma v_i E_i, \qquad \theta \in \mathbb{R}^p,\ \sigma > 0,\ i \in U,$$

where the population $U$ is of size $N$; the parameters $\theta$ and $\sigma$ are unknown; the $\mathbf{x}_i$'s are known values (possibly containing outliers), $\mathbf{x}_i \in \mathbb{R}^p$, $1 \leq p < N$; the $v_i$'s are known positive (heteroscedasticity) constants; the errors $E_i$ are independent and identically distributed (i.i.d.) random variables with zero expectation and unit variance; it is assumed that $\sum_{i \in U} \mathbf{x}_i \mathbf{x}_i^T / v_i$ is a non-singular ($p \times p$) matrix.

It is assumed that a sample $s$ is drawn from $U$ with sampling design $p(s)$ such that the independence structure of model $\xi$ is maintained. The sample regression GM-estimator of $\theta$ is defined as the root of the estimating equation $\widehat{\Psi}_n(\theta, \sigma) = \mathbf{0}$ (for all $\sigma > 0$), where

$$\widehat{\Psi}_n(\theta, \sigma) = \sum_{i \in s} w_i \Psi_i(\theta, \sigma) \qquad\text{with}\qquad \Psi_i(\theta, \sigma) = \eta\!\left(\frac{y_i - \mathbf{x}_i^T\theta}{\sigma v_i},\, \mathbf{x}_i\right) \frac{\mathbf{x}_i}{\sigma v_i},$$

where the $w_i$'s are the sampling weights and the function $\eta: \mathbb{R} \times \mathbb{R}^p \to \mathbb{R}$ parametrizes the following estimators:

$$\begin{array}{ccc}
\text{M-estimator} & \text{Mallows GM-estimator} & \text{Schweppe GM-estimator}\\[2pt]
\eta(r, \mathbf{x}) = \psi(r) & \eta(r, \mathbf{x}) = \psi(r)\, h(\mathbf{x}) & \eta(r, \mathbf{x}) = \psi\big(r / h(\mathbf{x})\big)\, h(\mathbf{x})
\end{array}$$

where $\psi: \mathbb{R} \to \mathbb{R}$ is a continuous, bounded, and odd (possibly redescending) function, and $h: \mathbb{R}^p \to \mathbb{R}_+$ is a weight function.
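For instance, a frequently used choice for $\psi$ is the Huber $\psi$-function with tuning constant $k > 0$ (Huber, 1981),

$$\psi_k(r) = \max\big\{-k,\, \min(k, r)\big\} =
\begin{cases}
r & \text{if } |r| \le k,\\
k\,\operatorname{sign}(r) & \text{otherwise},
\end{cases}$$

which is continuous, bounded, and odd, but not redescending.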

2 Covariance estimation

The model-based covariance matrix of $\theta$ is (Hampel et al., 1986, Chap. 6.3)

$$\mathrm{cov}_\xi(\theta, \sigma) = M^{-1}(\theta, \sigma)\, Q(\theta, \sigma)\, M^{-T}(\theta, \sigma) \qquad \text{for known } \sigma > 0, \tag{1}$$

where

$$M(\theta,\sigma) = \sum_{i=1}^N \mathrm{E}_\xi\big\{\Psi_i'(\theta,\sigma)\big\}, \quad\text{where}\quad \Psi_i'(\theta,\sigma) = \frac{\partial}{\partial \theta^*}\,\Psi_i(Y_i, \mathbf{x}_i; \theta^*, \sigma)\,\bigg|_{\theta^* = \theta},$$

$$Q(\theta,\sigma) = \frac{1}{N}\sum_{i=1}^N \mathrm{E}_\xi\big\{\Psi_i(Y_i,\mathbf{x}_i;\theta,\sigma)\,\Psi_i(Y_i,\mathbf{x}_i;\theta,\sigma)^T\big\},$$

and $\mathrm{E}_\xi$ denotes expectation with respect to model $\xi$. For the sample regression GM-estimator $\hat{\theta}_n$, the matrices $M$ and $Q$ must be estimated. In place of $M$ and $Q$ in (1), we have

$$\begin{array}{ccc}
\text{M-estimator} & \text{Mallows GM-estimator} & \text{Schweppe GM-estimator}\\[2pt]
\widehat{M}_{\mathrm{M}} = \overline{\psi'}\; X^T W X & \widehat{M}_{\mathrm{Mal}} = \overline{\psi'}\; X^T W H X & \widehat{M}_{\mathrm{Sch}} = X^T W S_1 X\\
\widehat{Q}_{\mathrm{M}} = \overline{\psi^2}\; X^T W X & \widehat{Q}_{\mathrm{Mal}} = \overline{\psi^2}\; X^T W H^2 X & \widehat{Q}_{\mathrm{Sch}} = X^T W S_2 X
\end{array}$$

where

$$W = \operatorname{diag}_{i=1,\ldots,n}\{w_i\} \qquad\text{and}\qquad H = \operatorname{diag}_{i=1,\ldots,n}\{h(\mathbf{x}_i)\},$$

$$\overline{\psi'} = \frac{1}{\hat{N}}\sum_{i\in s} w_i\,\psi'\!\left(\frac{r_i}{\hat{\sigma} v_i}\right) \qquad\text{and}\qquad \overline{\psi^2} = \frac{1}{\hat{N}}\sum_{i\in s} w_i\,\psi^2\!\left(\frac{r_i}{\hat{\sigma} v_i}\right),$$

$$S_1 = \operatorname{diag}_{i=1,\ldots,n}\{s_{1i}\}, \qquad\text{with}\qquad s_{1i} = \frac{1}{\hat{N}}\sum_{j\in s} w_j\,\psi'\!\left(\frac{r_j}{h(\mathbf{x}_i)\,\hat{\sigma} v_j}\right),$$

and

$$S_2 = \operatorname{diag}_{i=1,\ldots,n}\{s_{2i}\}, \qquad\text{with}\qquad s_{2i} = \frac{1}{\hat{N}}\sum_{j\in s} w_j\,\psi^2\!\left(\frac{r_j}{h(\mathbf{x}_i)\,\hat{\sigma} v_j}\right).$$

In these expressions, $X = (\mathbf{x}_1, \ldots, \mathbf{x}_n)^T$ denotes the design matrix of the sample, $r_i = y_i - \mathbf{x}_i^T \hat{\theta}_n$ the $i$-th residual, $\hat{\sigma}$ the estimate of $\sigma$, and $\hat{N} = \sum_{i \in s} w_i$.
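For instance, for the M-estimator, substituting $\widehat{M}_{\mathrm{M}}$ and $\widehat{Q}_{\mathrm{M}}$ for $M$ and $Q$ in the sandwich formula (1) yields

$$\widehat{M}_{\mathrm{M}}^{-1}\,\widehat{Q}_{\mathrm{M}}\,\widehat{M}_{\mathrm{M}}^{-T} = \frac{\overline{\psi^2}}{\big(\overline{\psi'}\big)^2}\,\big(X^T W X\big)^{-1},$$

which is, up to the leading scalar factor, the expression in (2) of Section 3; this is why the matrices in Section 3 are given only up to a scalar.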

Remarks

3 Implementation

The main function, which is only a wrapper, is cov_reg_model(). It dispatches to one of three covariance routines, depending on the type of the estimator.
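A minimal C sketch of the wrapper's dispatch logic is shown below. Only the names of the four functions are taken from the text; the enum, the argument lists, and the return convention are assumptions made for illustration and do not reproduce the package's actual interface.

```c
/* Hypothetical sketch of the dispatch logic in the wrapper cov_reg_model().
 * The enum, argument lists, and return convention are illustrative only.  */
typedef enum { M_EST, MALLOWS_GM_EST, SCHWEPPE_GM_EST } est_type;

/* workhorse functions (signatures hypothetical; see the sketches below)   */
int cov_m_est(const double *x, const double *w, int n, int p, double *cov);
int cov_mallows_gm_est(const double *x, const double *w, const double *h,
                       int n, int p, double *cov);
int cov_schweppe_gm_est(const double *x, const double *w, const double *h,
                        const double *r, const double *v, double sigma,
                        int n, int p, double *cov);

int cov_reg_model(est_type type, const double *x, const double *w,
                  const double *h, const double *r, const double *v,
                  double sigma, int n, int p, double *cov)
{
    switch (type) {
    case M_EST:
        return cov_m_est(x, w, n, p, cov);
    case MALLOWS_GM_EST:
        return cov_mallows_gm_est(x, w, h, n, p, cov);
    case SCHWEPPE_GM_EST:
        return cov_schweppe_gm_est(x, w, h, r, v, sigma, n, p, cov);
    default:
        return -1;  /* unknown estimator type */
    }
}
```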

The functions cov_m_est(), cov_mallows_gm_est(), and cov_schweppe_gm_est() implement the covariance estimators; see below. All functions are based on subroutines from BLAS (Blackford et al., 2002) and LAPACK (Anderson et al., 1999).

To fix notation, denote the Hadamard product of the matrices $A$ and $B$ by $A \odot B$ and suppose that it is applied element by element.

M-estimator (cov_m_est). The covariance matrix is (up to a scalar) equal to

$$\big(X^T W X\big)^{-1} \tag{2}$$

and is computed by means of a QR factorization (see the Note at the end of this section).
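A compilable C sketch of one way to carry out this computation, namely a QR factorization of $\sqrt{W}X$ followed by inversion of the triangular factor, is given below. The function name, the signature, the column-major storage convention, and the assumption of non-negative weights are illustrative choices, not the package's actual interface.

```c
#include <stdlib.h>
#include <math.h>

/* Fortran LAPACK prototypes (column-major storage) */
extern void dgeqrf_(int *m, int *n, double *a, int *lda, double *tau,
                    double *work, int *lwork, int *info);
extern void dtrtri_(const char *uplo, const char *diag, int *n, double *a,
                    int *lda, int *info);
extern void dlauum_(const char *uplo, int *n, double *a, int *lda, int *info);

/* Sketch of (X^T W X)^{-1}; the name and signature are illustrative.
 * x:   n-by-p design matrix, column-major (overwritten)
 * w:   weights, assumed non-negative
 * cov: on exit, the upper triangle holds (X^T W X)^{-1} (p-by-p)          */
int cov_m_est_sketch(double *x, const double *w, int n, int p, double *cov)
{
    int info, lwork = -1;
    double wquery;
    double *tau = malloc((size_t) p * sizeof(double));

    /* 1. scale the rows of X by sqrt(w_i): X <- sqrt(W) X                 */
    for (int j = 0; j < p; j++)
        for (int i = 0; i < n; i++)
            x[j * n + i] *= sqrt(w[i]);

    /* 2. QR factorization sqrt(W) X = Q R; R is stored in the upper
     *    triangle of x (workspace query first)                            */
    dgeqrf_(&n, &p, x, &n, tau, &wquery, &lwork, &info);
    lwork = (int) wquery;
    double *work = malloc((size_t) lwork * sizeof(double));
    dgeqrf_(&n, &p, x, &n, tau, work, &lwork, &info);

    /* 3. invert the triangular factor in place: R <- R^{-1}               */
    if (info == 0)
        dtrtri_("U", "N", &p, x, &n, &info);

    /* 4. (X^T W X)^{-1} = (R^T R)^{-1} = R^{-1} R^{-T}; dlauum forms
     *    U U^T for the upper triangular U = R^{-1}                        */
    if (info == 0)
        dlauum_("U", &p, x, &n, &info);

    /* 5. copy the p-by-p upper triangle into cov                          */
    for (int j = 0; j < p; j++)
        for (int i = 0; i <= j; i++)
            cov[j * p + i] = x[j * n + i];

    free(work);
    free(tau);
    return info;
}
```

Since $(X^T W X)^{-1} = (R^T R)^{-1} = R^{-1} R^{-T}$, where $R$ is the triangular factor of $\sqrt{W}X = QR$, only the $p \times p$ triangle has to be formed; the orthogonal factor $Q$ is never needed.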

Mallows GM-estimator (cov_mallows_gm_est). The covariance matrix is (up to a scalar) equal to

$$\big(X^T W H X\big)^{-1}\, X^T W H^2 X\, \big(X^T W H X\big)^{-1} \tag{3}$$

and is computed by means of a QR factorization (see the Note at the end of this section).
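The sandwich form (3) can be evaluated along the same lines. The following C sketch (again with a hypothetical name and signature, column-major storage, and assuming $w_i \ge 0$ and $h(\mathbf{x}_i) \ge 0$) obtains $B = X^T W H X$ implicitly from a QR factorization of $\sqrt{WH}\,X$ and then computes the sandwich as $K^T K$ with $K = \sqrt{W} H X\, B^{-1}$.

```c
#include <stdlib.h>
#include <math.h>

/* Fortran BLAS/LAPACK prototypes (column-major storage) */
extern void dgeqrf_(int *m, int *n, double *a, int *lda, double *tau,
                    double *work, int *lwork, int *info);
extern void dtrsm_(const char *side, const char *uplo, const char *transa,
                   const char *diag, int *m, int *n, double *alpha,
                   double *a, int *lda, double *b, int *ldb);
extern void dsyrk_(const char *uplo, const char *trans, int *n, int *k,
                   double *alpha, double *a, int *lda, double *beta,
                   double *c, int *ldc);

/* Sketch of (X^T W H X)^{-1} X^T W H^2 X (X^T W H X)^{-1}; the name and
 * signature are illustrative. x: n-by-p, column-major; w, h >= 0.
 * On exit, the upper triangle of cov (p-by-p) holds the sandwich matrix.  */
int cov_mallows_gm_est_sketch(const double *x, const double *w,
                              const double *h, int n, int p, double *cov)
{
    int info, lwork = -1;
    double one = 1.0, zero = 0.0, wquery;
    double *x1 = malloc((size_t) n * p * sizeof(double)); /* sqrt(WH) X   */
    double *x2 = malloc((size_t) n * p * sizeof(double)); /* sqrt(W) H X  */
    double *tau = malloc((size_t) p * sizeof(double));

    /* 1. scale the rows of X: X1^T X1 = X^T W H X, X2^T X2 = X^T W H^2 X  */
    for (int j = 0; j < p; j++)
        for (int i = 0; i < n; i++) {
            x1[j * n + i] = sqrt(w[i] * h[i]) * x[j * n + i];
            x2[j * n + i] = sqrt(w[i]) * h[i] * x[j * n + i];
        }

    /* 2. QR factorization X1 = Q R, hence X^T W H X = R^T R               */
    dgeqrf_(&n, &p, x1, &n, tau, &wquery, &lwork, &info);   /* query      */
    lwork = (int) wquery;
    double *work = malloc((size_t) lwork * sizeof(double));
    dgeqrf_(&n, &p, x1, &n, tau, work, &lwork, &info);

    /* 3. K = X2 (R^T R)^{-1} by two triangular right-solves               */
    dtrsm_("R", "U", "N", "N", &n, &p, &one, x1, &n, x2, &n); /* K R^{-1}  */
    dtrsm_("R", "U", "T", "N", &n, &p, &one, x1, &n, x2, &n); /* K R^{-T}  */

    /* 4. sandwich = K^T K (upper triangle)                                */
    dsyrk_("U", "T", &p, &n, &one, x2, &n, &zero, cov, &p);

    free(work); free(tau); free(x2); free(x1);
    return info;
}
```

Forming the sandwich as $K^T K$ avoids explicitly inverting $X^T W H X$ and keeps the result symmetric by construction.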

Schweppe GM-estimator (cov_schweppe_gm_est). The covariance matrix is (up to a scalar) equal to

$$\big(X^T W S_1 X\big)^{-1}\, X^T W S_2 X\, \big(X^T W S_1 X\big)^{-1}. \tag{4}$$

Put $\mathbf{s}_1 = \operatorname{diag}(S_1)$ and $\mathbf{s}_2 = \operatorname{diag}(S_2)$, and let $/$ denote elemental division (i.e., the inverse of the Hadamard product). The covariance matrix in (4) is computed from these quantities, again by means of a QR factorization (see the Note below).
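The entries $s_{1i}$ and $s_{2i}$ involve, for every $i$, a weighted average over all observations $j \in s$. A direct C transcription of their definition (with hypothetical names and with $\psi$ and $\psi'$ passed as function pointers) is sketched below.

```c
/* Sketch of the diagonal entries s_{1i} and s_{2i}; the name, signature,
 * and the use of function pointers for psi and its derivative psi' are
 * illustrative assumptions.
 * r[j]: residuals r_j = y_j - x_j^T theta_hat
 * w[j]: sampling weights,  v[j]: heteroscedasticity constants
 * h[i]: weights h(x_i),    sigma: estimate of sigma                       */
void schweppe_s1_s2_sketch(const double *r, const double *w, const double *v,
                           const double *h, double sigma, int n,
                           double (*psi)(double), double (*psi_prime)(double),
                           double *s1, double *s2)
{
    double Nhat = 0.0;                       /* N_hat = sum of the weights */
    for (int j = 0; j < n; j++)
        Nhat += w[j];

    for (int i = 0; i < n; i++) {            /* i-th diagonal entry        */
        double a1 = 0.0, a2 = 0.0;
        for (int j = 0; j < n; j++) {        /* weighted average over j    */
            double z = r[j] / (h[i] * sigma * v[j]);
            a1 += w[j] * psi_prime(z);
            a2 += w[j] * psi(z) * psi(z);
        }
        s1[i] = a1 / Nhat;
        s2[i] = a2 / Nhat;
    }
}
```

With $\mathbf{s}_1$ and $\mathbf{s}_2$ in hand, (4) has the same sandwich structure as (3), with $S_1$ and $S_2$ taking the roles of $H$ and $H^2$; the QR-based computation sketched for the Mallows case can therefore be reused with the row scalings $\sqrt{w_i s_{1i}}$ and $\sqrt{w_i s_{2i}}$ (provided $s_{1i} \ge 0$, which holds, e.g., for the Huber $\psi$-function). Note that the double sum makes the computation of $\mathbf{s}_1$ and $\mathbf{s}_2$ an $O(n^2)$ operation.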

Note. Marazzi (1987) uses the Cholesky factorization (see subroutines RTASKV and RTASKW), which is computationally a bit cheaper than our QR factorization.

Literature

ANDERSON, E., BAI, Z., BISCHOF, C., BLACKFORD, L. S., DEMMEL, J., DONGARRA, J., DU CROZ, J., GREENBAUM, A., HAMMARLING, S., MCKENNEY, A. AND SORENSEN, D. (1999). LAPACK Users' Guide. Philadelphia: Society for Industrial and Applied Mathematics (SIAM), 3rd edition. DOI: 10.1137/1.9780898719604

BLACKFORD, L. S., PETITET, A., POZO, R., REMINGTON, K., WHALEY, R. C., DEMMEL, J., DONGARRA, J., DUFF, I., HAMMARLING, S., HENRY, G., HEROUX, M., KAUFMAN, L. AND LUMSDAINE, A. (2002). An Updated Set of Basic Linear Algebra Subprograms (BLAS). ACM Transactions on Mathematical Software 28, 135–151. DOI: 10.1145/567806.567807

HAMPEL, F. R., RONCHETTI, E. M., ROUSSEEUW, P. J. AND STAHEL, W. A. (1986). Robust Statistics: The Approach Based on Influence Functions, New York: John Wiley and Sons. DOI: 10.1002/9781118186435

HUBER, P. J. (1981). Robust Statistics, New York: John Wiley and Sons. DOI: 10.1002/0471725250

MARAZZI, A. (1987). Subroutines for robust and bounded influence regression in ROBETH, Cahiers de Recherches et de Documentation, 3 ROBETH 2, Division de Statistique et Informatique, Institut Universitaire de Médecine Sociale et Préventive, Lausanne, ROBETH-85 Document No. 2, August 1985, revised April 1987.

MARAZZI, A. (1993). Algorithms, Routines, and S Functions for Robust Statistics: The FORTRAN Library ROBETH with an interface to S-PLUS, New York: Chapman and Hall.

MARAZZI, A. (2020). robeth: R Functions for Robust Statistics, R package version 2.7-6.

VENABLES, W. N. AND RIPLEY, B. D. (2002). Modern Applied Statistics with S, New York: Springer, 4th edition. DOI: 10.1007/978-0-387-21706-2