Robust Generalized Regression Predictor

Tobias Schoch

1 Introduction

The population regression model is given by

$$\xi: \; Y_i = \mathbf{x}_i^T \boldsymbol{\theta} + \sigma \sqrt{v_i}\, E_i, \qquad \boldsymbol{\theta} \in \mathbb{R}^p, \quad \sigma > 0, \quad i \in U, \tag{1}$$

where $\mathbf{x}_i \in \mathbb{R}^p$ is the vector of explanatory variables, $\boldsymbol{\theta}$ is the unknown parameter vector, $\sigma$ is the (unknown) scale parameter, the $v_i > 0$ are known heteroscedasticity constants, and the $E_i$ are independent and identically distributed (i.i.d.) random variables with zero mean and unit variance.

Remarks. The i.i.d. assumption on the errors $E_i$ is rather strict. This assumption can be replaced by the assumption that the $E_i$ are identically distributed random variables such that $E_\xi(E_i \mid \mathbf{x}_i, \ldots, \mathbf{x}_N) = 0$ and $E_\xi(E_i E_j \mid \mathbf{x}_i, \ldots, \mathbf{x}_N) = 1$ if $i = j$ and zero otherwise, for all $i, j \in U$, where $E_\xi$ denotes expectation with respect to model $\xi$ in (1). Another generalization obtains by requiring that $E_\xi(E_i \mathbf{x}_i) = E_\xi(\mathbf{x}_i E_i) = \mathbf{0}$ in place of the conditional expectation. If the distribution of the errors $E_i$ is asymmetric with non-zero mean, the regression intercept and the errors are confounded. The slope parameters, however, are identifiable under asymmetric distributions (Carroll and Welsh, 1988). In the context of GREG prediction, however, we deal with prediction under the model; thus, identifiability is not an issue.

It is assumed that a sample s is drawn from U with sampling design p(s) such that the independence (orthogonality) structure of the model errors in (1) is maintained. The sample regression M- and GM-estimators of $\boldsymbol{\theta}$ are defined as the roots of the following estimating equations (cf. Hampel et al., 1986, Chapter 6.3):

$$\begin{aligned}
&\sum_{i \in s} \frac{w_i}{\sqrt{v_i}}\, \psi_k(r_i)\, \mathbf{x}_i = \mathbf{0} &&\text{(M-estimator)},\\
&\sum_{i \in s} \frac{w_i}{\sqrt{v_i}}\, h(\mathbf{x}_i)\, \psi_k(r_i)\, \mathbf{x}_i = \mathbf{0} &&\text{(Mallows GM-estimator)},\\
&\sum_{i \in s} \frac{w_i}{\sqrt{v_i}}\, h(\mathbf{x}_i)\, \psi_k\!\left(\frac{r_i}{h(\mathbf{x}_i)}\right) \mathbf{x}_i = \mathbf{0} &&\text{(Schweppe GM-estimator)},
\end{aligned}$$

where

$$r_i = \frac{y_i - \mathbf{x}_i^T \boldsymbol{\theta}}{\sigma \sqrt{v_i}} \tag{2}$$

denotes the standardized residual, and $h: \mathbb{R}^p \to \mathbb{R}_+$ is a weight function that downweights leverage points (i.e., outlying values in the design space).

The Huber and Tukey bisquare (biweight) $\psi$-functions are denoted by $\psi_k^{\mathrm{hub}}$ and $\psi_k^{\mathrm{tuk}}$, respectively. The sample-based estimators of $\boldsymbol{\theta}$ can be written as a weighted least squares problem

$$\sum_{i \in s} \frac{w_i}{v_i}\, u_i(r_i, k) \left(y_i - \mathbf{x}_i^T \hat{\boldsymbol{\theta}}_n\right) \mathbf{x}_i = \mathbf{0},$$

where

$$u_i(r_i, k) = \begin{cases}
\dfrac{\psi_k(r_i)}{r_i} & \text{M-estimator},\\[1.5ex]
h(\mathbf{x}_i)\, \dfrac{\psi_k(r_i)}{r_i} & \text{Mallows GM-estimator},\\[1.5ex]
\dfrac{\psi_k(r_i^*)}{r_i^*}, \quad \text{where } r_i^* = \dfrac{r_i}{h(\mathbf{x}_i)} & \text{Schweppe GM-estimator},
\end{cases} \tag{3}$$

and k denotes the robustness tuning constant.
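
To make the weighted least squares representation concrete, the following base R sketch computes the Huber M-estimator of $\boldsymbol{\theta}$ by iteratively reweighted least squares. It is a minimal illustration, not the package's implementation; the data objects x, y, w, and v, the function name huber_m_irls, and the MAD-based scale estimate are assumptions made for the purpose of the example.

    ## Minimal IRLS sketch for the Huber M-estimator of model (1); the data
    ## x (n x p matrix), y, the sampling weights w, the heteroscedasticity
    ## constants v, and the tuning constant k are assumed to be given.
    huber_m_irls <- function(x, y, w, v, k = 1.345, maxit = 50, tol = 1e-6) {
        psi_huber <- function(r, k) pmin(k, pmax(-k, r))     # Huber psi-function
        # weighted least squares starting value
        theta <- solve(crossprod(x, (w / v) * x), crossprod(x, (w / v) * y))
        for (iter in seq_len(maxit)) {
            res   <- as.numeric(y - x %*% theta)
            sigma <- mad(res / sqrt(v))                      # crude robust scale (proxy)
            r     <- res / (sigma * sqrt(v))                 # standardized residuals, eq. (2)
            u     <- ifelse(r == 0, 1, psi_huber(r, k) / r)  # u_i(r_i, k), eq. (3)
            theta_new <- solve(crossprod(x, (w * u / v) * x),
                               crossprod(x, (w * u / v) * y))
            if (max(abs(theta_new - theta)) < tol) {
                theta <- theta_new
                break
            }
            theta <- theta_new
        }
        list(theta = as.numeric(theta), sigma = sigma, u = u, resid = res)
    }

The M-estimator corresponds to $h(\mathbf{x}_i) \equiv 1$; the Mallows and Schweppe variants obtain by modifying the weights u according to (3).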

 

2 Representation of the robust GREG as a QR-predictor

The robust GREG predictor of the population y-total can be written in terms of the g-weights (see e.g., Särndal et al., 1992, Chapter 6) as

$$\hat{t}_y^{\mathrm{rob}} = \sum_{i \in s} g_i\, y_i, \tag{4}$$

where the g-weights are defined as (Duchesne, 1999)

$$g_i = b_i + \left(\mathbf{t}_x - \hat{\mathbf{t}}_{bx}\right)^T \left(\sum_{i \in s} q_i \mathbf{x}_i \mathbf{x}_i^T\right)^{-1} q_i \mathbf{x}_i, \tag{5}$$

where $\hat{\mathbf{t}}_{bx} = \sum_{i \in s} b_i \mathbf{x}_i$ and $\mathbf{t}_x = \sum_{i \in U} \mathbf{x}_i$. The sampling weights $w_i$ are "embedded" into the g-weights in (4).

In contrast to the non-robust "standard" GREG predictor, the g-weights in (5) depend on the study variable $y_i$ through the choice of the constants $(q_i, b_i) = \{(q_i, b_i): i \in s\}$. This is easily recognized once we have defined the sets of constants. The predictors of the population $y$-total that are defined in terms of the constants $(q_i, r_i)$ form the class of QR-predictors due to Wright (1983).

Important. We denote the constants by $(q_i, b_i)$ instead of $(q_i, r_i)$ because $r_i$ is our notation for the residuals.

In passing, we note that $\hat{t}_y^{\mathrm{rob}}$ can be expressed in a "standard" GREG representation. Let

$$\hat{\boldsymbol{\theta}} = \left(\sum_{i \in s} q_i \mathbf{x}_i \mathbf{x}_i^T\right)^{-1} \sum_{i \in s} q_i \mathbf{x}_i y_i,$$

then $\hat{t}_y^{\mathrm{rob}}$ in (4) can be written as

$$\hat{t}_y^{\mathrm{rob}} = \sum_{i \in s} b_i y_i + \left(\mathbf{t}_x - \hat{\mathbf{t}}_{bx}\right)^T \hat{\boldsymbol{\theta}} = \mathbf{t}_x^T \hat{\boldsymbol{\theta}} + \sum_{i \in s} b_i \left(y_i - \mathbf{x}_i^T \hat{\boldsymbol{\theta}}\right).$$
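
Both representations are algebraically identical, which is easy to verify numerically. The following base R sketch uses made-up toy data; the constants q and b and the population $x$-total tx are placeholders chosen for illustration only.

    ## numerical check of the two representations of the robust GREG total
    set.seed(1)
    n  <- 20
    X  <- cbind(1, runif(n))                  # toy matrix of explanatory variables
    y  <- as.numeric(X %*% c(2, 3) + rnorm(n))
    q  <- runif(n, 0.5, 2)                    # placeholder constants q_i, cf. (6)
    b  <- runif(n, 0.5, 2)                    # placeholder constants b_i, cf. Section 2.2
    tx <- c(100, 55)                          # assumed population x-total
    theta_hat <- solve(crossprod(X, q * X), crossprod(X, q * y))   # q-weighted LS
    t_bx  <- colSums(b * X)
    form1 <- sum(b * y) + sum((tx - t_bx) * theta_hat)
    form2 <- sum(tx * theta_hat) + sum(b * (y - X %*% theta_hat))
    all.equal(form1, form2)                   # both representations coincide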

In the next two sections, we define the constants $(q_i, b_i)$ of the QR-predictor.

2.1 Constants $q_i$ of the QR-predictor

The set of constants $\{q_i\}$ is defined as

$$q_i = \frac{w_i\, u_i(r_i, k)}{v_i}, \qquad i = 1, \ldots, n, \tag{6}$$

where $v_i$ is given in (1) and $u_i(r_i, k)$ is defined in (3). The tuning constant $k$ in $u_i(r_i, k)$ is the one used to estimate $\boldsymbol{\theta}$.
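
Continuing the (hypothetical) Huber M-estimator sketch of Section 1, the constants in (6) are obtained directly from the weights $u_i(r_i, k)$ returned by that fitting routine; the objects x, y, w, and v are again assumed to be available.

    ## constants q_i of eq. (6), computed from the IRLS sketch of Section 1
    fit <- huber_m_irls(x, y, w, v, k = 1.345)
    q   <- w * fit$u / v                      # q_i = w_i u_i(r_i, k) / v_i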

2.2 Constants $b_i$ of the QR-predictor

The constants $\{b_i\}$ are predictor-specific: they depend on the argument type of svymean_reg() and svytotal_reg(). Moreover, the $b_i$'s depend on a robustness tuning constant (also an argument of svymean_reg() and svytotal_reg()) that controls the robustness of the prediction. To distinguish it from the tuning constant $k$ used in fitting model $\xi$ in (1), it is denoted by $\kappa$. Seven sets $\{b_i\}$ are available.

2.3 Implementation

Let $\mathbf{q} = (q_1, \ldots, q_n)^T$ and $\mathbf{b} = (b_1, \ldots, b_n)^T$, where $q_i$ and $b_i$ are defined in, respectively, (6) and Section 2.2, and let $\mathbf{X} = (\mathbf{x}_1, \ldots, \mathbf{x}_n)^T$ denote the $(n \times p)$ matrix of the explanatory variables. Put $\mathbf{Z} = \sqrt{\mathbf{q}} \circ \mathbf{X}$, where $\circ$ denotes Hadamard multiplication (applied to each column of $\mathbf{X}$) and the square root is applied element by element. The vector of g-weights, $\mathbf{g} = (g_1, \ldots, g_n)^T$, in (5) can be written as

$$\mathbf{g}^T = \mathbf{b}^T + \left(\mathbf{t}_x - \hat{\mathbf{t}}_{bx}\right)^T \underbrace{\left(\mathbf{Z}^T\mathbf{Z}\right)^{-1}\mathbf{Z}^T}_{=\,\mathbf{H}, \text{ say}} \circ \left(\sqrt{\mathbf{q}}\right)^T.$$

Define the QR factorization $\mathbf{Z} = \mathbf{Q}\mathbf{R}$, where $\mathbf{Q}$ is a matrix with orthonormal columns and $\mathbf{R}$ is an upper triangular matrix (both of conformable size). Note that the matrix QR factorization and Wright's QR-estimators have nothing in common besides the name; in particular, $\mathbf{q}$ and $\mathbf{Q}$ are unrelated. With this, we have

$$\mathbf{H} = \left(\mathbf{Z}^T\mathbf{Z}\right)^{-1}\mathbf{Z}^T = \mathbf{R}^{-1}\mathbf{Q}^T,$$

and multiplying both sides from the left by $\mathbf{R}$, we get $\mathbf{R}\mathbf{H} = \mathbf{Q}^T$, which can be solved easily for $\mathbf{H}$ because $\mathbf{R}$ is an upper triangular matrix (see base::backsolve()). Thus, the g-weights can be computed as

$$\mathbf{g} = \mathbf{b} + \mathbf{H}^T\left(\mathbf{t}_x - \hat{\mathbf{t}}_{bx}\right) \circ \sqrt{\mathbf{q}},$$

where the $(p \times n)$ matrix $\mathbf{H}$ need not be explicitly transposed when using base::crossprod(). The terms $\mathbf{b}$ and $\hat{\mathbf{t}}_{bx}$ are easy to compute. Thus,

$$\hat{t}_y^{\mathrm{rob}} = \mathbf{g}^T\mathbf{y}, \qquad \text{where } \mathbf{y} = (y_1, \ldots, y_n)^T.$$
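
The steps above translate almost directly into base R. The following sketch is for illustration only; it assumes that the matrix X, the response vector y, the constants q and b, and the population $x$-total tx are available (e.g., the toy objects of the sketch in Section 2).

    ## g-weights (5) and robust GREG total via the QR factorization (Section 2.3);
    ## X (n x p), y, q, b, and the population x-total tx are assumed to be given
    Z    <- sqrt(q) * X                          # Z = sqrt(q) o X (Hadamard product)
    t_bx <- colSums(b * X)                       # \hat{t}_bx = sum_{i in s} b_i x_i
    qrZ  <- qr(Z)                                # QR factorization Z = QR
    H    <- backsolve(qr.R(qrZ), t(qr.Q(qrZ)))   # solves RH = Q^T, i.e., H = R^{-1} Q^T
    g    <- b + as.numeric(crossprod(H, tx - t_bx)) * sqrt(q)   # g-weights, eq. (5)
    t_y_rob <- sum(g * y)                        # robust GREG predictor of the y-total

As a quick check, colSums(g * X) reproduces tx for any choice of q and b, which is the calibration property used in Section 3.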

3 Variance estimation

Important. Inference for the regression estimator is implemented only under the assumption of representative outliers (in the sense of Chambers, 1986). We do not cover inference in the presence of nonrepresentative outliers.

Our discussion of variance estimation follows the line of reasoning in Särndal et al. (1992, 233–234) on the variance of the non-robust GREG estimator. To this end, denote by $E_i = y_i - \mathbf{x}_i^T\boldsymbol{\theta}_N$, $i \in U$, the census residuals, where $\boldsymbol{\theta}_N$ is the census parameter. With this, any g-weighted predictor can be written as

$$\hat{t}_y^{\mathrm{rob}} = \sum_{i \in s} g_i y_i = \sum_{i \in s} g_i \left(\mathbf{x}_i^T\boldsymbol{\theta}_N + E_i\right) = \sum_{i \in U} \mathbf{x}_i^T\boldsymbol{\theta}_N + \sum_{i \in s} g_i E_i, \tag{7}$$

where we have used the fact that the g-weights in (5) satisfy the calibration property

$$\sum_{i \in s} g_i \mathbf{x}_i = \sum_{i \in U} \mathbf{x}_i.$$

The first term on the r.h.s. of the last equality in (7) is a population quantity and therefore does not contribute to the variance of $\hat{t}_y^{\mathrm{rob}}$. Thus, we can calculate the variance of the robust GREG predictor as

$$\mathrm{var}\left(\hat{t}_y^{\mathrm{rob}}\right) = \mathrm{var}\left(\sum_{i \in s} g_i E_i\right) \tag{8}$$

under the assumptions that (1) the $E_i$ are known quantities and (2) the $g_i$ do not depend on the $y_i$.

Disregarding the fact that the g-weights are sample dependent and substituting the sample residual $r_i$ for $E_i$ in (8), Särndal et al. (1992, 233–234 and Result 6.6.1) propose to estimate the variance of the GREG predictor by the g-weighted variance of the total $\sum_{i \in s} g_i r_i$. Following the same train of thought and disregarding, in addition, that the $g_i$ depend on the $y_i$, the variance of $\hat{t}_y^{\mathrm{rob}}$ can be approximated by

$$\widehat{\mathrm{var}}\left(\hat{t}_y^{\mathrm{rob}}\right) \approx \widehat{\mathrm{var}}\left(\sum_{i \in s} g_i r_i\right),$$

where $\widehat{\mathrm{var}}(\cdot)$ denotes a variance estimator of a total under the sampling design $p(s)$.
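
To make the approximation concrete, consider simple random sampling without replacement as an illustrative special case; the population size N, the common sampling weight w = N/n, and the residuals r are assumptions of this example, and the SRSWOR formula stands in for the design-specific variance estimator.

    ## variance approximation under SRSWOR (illustration only); g (g-weights with
    ## embedded sampling weights), r (residuals), w = N/n, and N are assumed given
    n  <- length(g)
    z  <- g * r / w                           # variable whose weighted total is sum(g * r)
    v_hat  <- N^2 * (1 - n / N) * var(z) / n  # SRSWOR variance estimator of a total
    se_hat <- sqrt(v_hat)                     # standard error of the robust GREG total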

Literature

BEAUMONT, J.-F. AND A. ALAVI (2004). Robust Generalized Regression Estimation, Survey Methodology 30, 195–208.

BEAUMONT, J.-F. AND L.-P. RIVEST (2009). Dealing with outliers in survey data, in Sample Surveys: Theory, Methods and Inference, ed. by D. Pfeffermann and C. R. Rao, Amsterdam: Elsevier, vol. 29A of Handbook of Statistics, chap. 11, 247–280. DOI:10.1016/S0169-7161(08)00011-4

CARROLL, R. J. AND A. H. WELSH (1988). A Note on Asymmetry and Robustness in Linear Regression. The American Statistician 42, 285–287. DOI:10.1080/00031305.1988.10475591

CHAMBERS, R. (1986). Outlier Robust Finite Population Estimation. Journal of the American Statistical Association 81, 1063–1069. DOI:10.1080/01621459.1986.10478374

DUCHESNE, P. (1999). Robust calibration estimators, Survey Methodology 25, 43–56.

HAMPEL, F. R., E. M. RONCHETTI, P. J. ROUSSEEUW, AND W. A. STAHEL (1986). Robust Statistics: The Approach Based on Influence Functions, New York: John Wiley & Sons. DOI:10.1002/9781118186435

HULLIGER, B. (1995). Outlier Robust Horvitz–Thompson Estimators. Survey Methodology 21, 79–87.

LEE, H. (1995). Outliers in business surveys, in Business survey methods, ed. by B. G. Cox, D. A. Binder, B. N. Chinnappa, A. Christianson, M. J. Colledge, and P. S. Kott, New York: John Wiley and Sons, chap. 26, 503–526. DOI:10.1002/9781118150504.ch26

SÄRNDAL, C.-E., B. SWENSSON, AND J. WRETMAN (1992). Model Assisted Survey Sampling, New York: Springer.

SÄRNDAL, C.-E. AND R. L. WRIGHT (1984). Cosmetic Form of Estimators in Survey Sampling. Scandinavian Journal of Statistics 11, 146–156.

WRIGHT, R. L. (1983). Finite population sampling with multivariate auxiliary information, Journal of the American Statistical Association 78, 879–884. DOI:10.1080/01621459.1983.10477035.