Simex estimator for polynomial errors-in-variables model

For the polynomial errors-in-variables model, the Simex estimator is constructed in such a way that it is consistent as the sample size grows while the size of the auxiliary sample is fixed. Then the estimator is modified in such a way that it shows good results for small samples without losing its asymptotic p...

Bibliographic Details
Date: 2007
Main Authors: Gontar, O., Malenko, A.
Format: Article
Language: English
Published: Інститут математики НАН України 2007
Online Access: https://nasplib.isofts.kiev.ua/handle/123456789/4478
Journal Title: Digital Library of Periodicals of National Academy of Sciences of Ukraine
Cite this:Simex estimator for polynomial errors-in-variables model / O. Gontar, A. Malenko // Theory of Stochastic Processes. — 2007. — Т. 13 (29), № 1-2. — С. 57-65. — Бібліогр.: 6 назв.— англ.

Institution

Digital Library of Periodicals of National Academy of Sciences of Ukraine
_version_ 1859659537832738816
author Gontar, O.
Malenko, A.
author_facet Gontar, O.
Malenko, A.
citation_txt Simex estimator for polynomial errors-in-variables model / O. Gontar, A. Malenko // Theory of Stochastic Processes. — 2007. — Т. 13 (29), № 1-2. — С. 57-65. — Бібліогр.: 6 назв.— англ.
collection DSpace DC
description For the polynomial errors-in-variables model, the Simex estimator is constructed in such a way that it is consistent as the sample size grows while the size of the auxiliary sample is fixed. Then the estimator is modified in such a way that it shows good results for small samples without losing its asymptotic properties for large samples. Simulation studies corroborate the theoretical findings.
first_indexed 2025-11-30T09:12:47Z
format Article
fulltext Theory of Stochastic Processes, Vol. 13 (29), no. 1-2, 2007, pp. 57-65

OLENA GONTAR AND ANDRII MALENKO

SIMEX ESTIMATOR FOR POLYNOMIAL ERRORS-IN-VARIABLES MODEL

For the polynomial errors-in-variables model, the Simex estimator is constructed in such a way that it is consistent as the sample size grows while the size of the auxiliary sample is fixed. Then the estimator is modified in such a way that it shows good results for small samples without losing its asymptotic properties for large samples. Simulation studies corroborate the theoretical findings.

2000 Mathematics Subject Classification. 62J02, 62F12, 62-07.
Key words and phrases. Simex estimator, errors-in-variables models, Hermite polynomials.

1. Introduction

We consider the polynomial measurement error model

$$
\begin{cases}
y_i = \sum_{j=0}^{m} \beta_j \xi_i^{j} + \varepsilon_i, \\
x_i = \xi_i + \delta_i,
\end{cases}
\qquad (1)
$$

where $y_i$, $x_i$ are observed and $\xi_i$ are unobservable independent random variables, $i = 1, \dots, n$. Suppose that $\delta_i$ are i.i.d. normal random variables and their variance $\sigma_\delta^2$ is known.

It is well known that the naive estimator of the regression parameters $\beta_0, \beta_1, \dots, \beta_m$, which ignores measurement errors, is inconsistent. Cheng and Schneeweiss (1998) proposed the adjusted least squares estimator $\hat\beta_{ALS}$ in the model (1), which is consistent. This estimator can be viewed as resulting from the principle of corrected score due to Stefanski (1989) and Nakamura (1990). A small sample modification of the $\hat\beta_{ALS}$ estimator was proposed in Cheng et al. (2000); it shows good results for small samples without losing its asymptotic properties for large samples.

Another estimator was introduced by Cook and Stefanski (1994) and is called Simex. The key idea underlying Simex is that the effect of measurement error on an estimator can be determined experimentally via simulation. This is achieved by studying the naive regression estimator as a function $f$ of the measurement error variance in the regressors.
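The simulation idea can be illustrated by a minimal Python sketch for the simplest case $m = 1$ (a straight line). It is not the authors' code (their simulations were made in R); the function names, the sample size, and the parameter values are assumptions chosen only for illustration. Extra normal noise of variance $\lambda$ is added to the observed regressor, the replicates are pooled, and the naive slope is recomputed, which traces the function $f$ on a grid of $\lambda$ values.

```python
import random


def simulate_linear_eiv(n, beta0, beta1, var_delta, var_eps, rng):
    """Draw (x_i, y_i) from y = beta0 + beta1*xi + eps, x = xi + delta, xi ~ N(0, 1)."""
    x, y = [], []
    for _ in range(n):
        xi = rng.gauss(0.0, 1.0)
        x.append(xi + rng.gauss(0.0, var_delta ** 0.5))
        y.append(beta0 + beta1 * xi + rng.gauss(0.0, var_eps ** 0.5))
    return x, y


def naive_slope(x, y):
    """Ordinary least squares slope of y on x, ignoring measurement error."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def naive_slope_curve(x, y, lambdas, B, rng):
    """Naive slope as a function of the added error variance lambda.

    For each lambda, B noisy copies x_i + sqrt(lambda) * eta_ib are pooled
    (averaging over i and b) before the naive slope is computed.
    """
    curve = []
    for lam in lambdas:
        xs, ys = [], []
        for _ in range(B):
            xs.extend(xi + rng.gauss(0.0, 1.0) * lam ** 0.5 for xi in x)
            ys.extend(y)
        curve.append(naive_slope(xs, ys))
    return curve


if __name__ == "__main__":
    rng = random.Random(2007)
    x, y = simulate_linear_eiv(2000, 1.0, 2.0, 0.25, 0.25, rng)
    lambdas = [0.0, 0.25, 0.5, 1.0]
    for lam, slope in zip(lambdas, naive_slope_curve(x, y, lambdas, 20, rng)):
        print(f"lambda = {lam:4.2f}   naive slope = {slope:.3f}")
```

In this linear special case the naive slope is attenuated roughly like $\beta_1 \sigma_\xi^2 / (\sigma_\xi^2 + \sigma_\delta^2 + \lambda)$, so the printed values decrease in $\lambda$; extrapolating such a curve back to $\lambda = -\sigma_\delta^2$ is what motivates Simex.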

The purpose of this paper is to construct a consistent Simex estimator of the regression parameter. The observed variables are used for modeling the function $f$. This idea is close to the idea of Polzehl and Zwanzig (2005). Simulation studies show that, for finite samples, the Simex estimator in polynomial regression can sometimes produce extremely large estimation errors, as can the ALS estimator. We propose a modification of this estimator for small samples that preserves its asymptotic properties.

The paper is organized as follows. In the next section the polynomial errors-in-variables model is introduced and auxiliary lemmas are proved. Section 3 is devoted to the construction of the Simex estimator and the proof of its consistency. The small sample modification is proposed in Section 4. Section 5 gives some simulation results and shows the effect of the modification, and Section 6 concludes.

In the paper, expectation is denoted by $E$, almost sure convergence by $\xrightarrow{P1}$, and convergence in probability by $\xrightarrow{P}$.

2. Model and additional lemmas

We consider the polynomial errors-in-variables model of order $m \ge 1$,

$$
\begin{cases}
y_i = \beta_0 + \beta_1 \xi_i + \dots + \beta_m \xi_i^{m} + \varepsilon_i, \\
x_i = \xi_i + \delta_i,
\end{cases}
\qquad i = 1, \dots, n.
$$

Here $\{\xi_i, i \ge 1\}$, $\{\varepsilon_i, i \ge 1\}$, $\{\delta_i, i \ge 1\}$ are mutually independent sequences of i.i.d. random variables. We assume that $E|\xi_1|^{m} < \infty$, $\delta_1 \sim N(0, \sigma_\delta^2)$, $\sigma_\delta^2$ is known, $E\varepsilon_1 = 0$, $E\varepsilon_1^2 < \infty$. The variances of $\xi_1$, $\varepsilon_1$, and $\delta_1$ are supposed to be positive.

Denote $X_i = (1, x_i, \dots, x_i^{m})^t$. The naive, or ordinary least squares, estimator of $\beta$ is

$$\hat\beta_{naive} = M_{XX}^{-1} M_{XY}, \quad \text{where } M_{XX} := \overline{XX^t}, \quad M_{XY} := \overline{Xy}.$$

Here the bar means averaging over $n$.

To introduce the $\hat\beta_{ALS}$ estimator, consider the Hermite polynomials $h_k(x, t)$ of $x$, which possess the following properties:

$$h_{-1}(x, t) = h_0(x, t) = 1, \qquad h_{k+1}(x, t) = x\,h_k(x, t) + t\,k\,h_{k-1}(x, t), \quad k \ge 0,$$

and let $H(x, t)$ be the matrix of the following structure:

$$H_{rs}(x, t) = h_{r+s}(x, t), \quad r, s = 0, \dots, m.$$

Denote the matrix $\frac{1}{n}\sum_{i=1}^{n} H(x_i, -\sigma_\delta^2)$ by $M_H$ and the vector $(h_0(x_i, -\sigma_\delta^2), h_1(x_i, -\sigma_\delta^2), \dots, h_m(x_i, -\sigma_\delta^2))^t$ by $h_i$. Then $\hat\beta_{ALS}$ is defined as a solution to the linear equation

$$M_H \hat\beta_{ALS} = \frac{1}{n}\sum_{i=1}^{n} h_i y_i. \qquad (2)$$

To construct the Simex estimator, fix a number $B$. Consider a standard normal i.i.d. sequence $\{\eta_{ib}, i \ge 1, b = 1, \dots, B\}$, which is independent of the other random variables in the model. Denote $x_{ib}(\lambda) = x_i + \eta_{ib}\sqrt{\lambda}$, $i \ge 1$, $b = 1, \dots, B$, and $X_{ib}(\lambda) = (1, x_{ib}(\lambda), \dots, x_{ib}^{m}(\lambda))^t$. Introduce

$$M_{XX}(\lambda) = \overline{X(\lambda)X^t(\lambda)}, \qquad M_{XY}(\lambda) = \overline{X(\lambda)y}.$$

Hereafter the bar means averaging over $n$ and $b$, e.g.,

$$\overline{X(\lambda)y} = \frac{1}{nB}\sum_{i=1}^{n}\sum_{b=1}^{B} X_{ib}(\lambda)\,y_i.$$

The corresponding naive estimate of $\beta$ is

$$\hat\beta_{naive}(\lambda) = M_{XX}^{-1}(\lambda)\,M_{XY}(\lambda).$$

For each $\lambda$ introduce the matrix

$$M_H(\lambda) = \frac{1}{n}\sum_{i=1}^{n} H(x_i, \lambda).$$

From Lemma 1 below it follows that

$$M_{XX}(\lambda) = M_H(\lambda) + o(1), \quad \text{as } n \to \infty, \text{ a.s.} \qquad (3)$$

Lemma 1. Let $\xi$ and $\delta$ be independent random variables with $\delta \sim N(0, \sigma_\delta^2)$. Then

$$E((\xi + \delta)^{n} \mid \xi) = h_n(\xi, \sigma_\delta^2).$$

Proof. Integration by parts gives

$$E((\xi + \delta)^{n+1} \mid \xi) = \xi\,E((\xi + \delta)^{n} \mid \xi) + E(\delta(\xi + \delta)^{n} \mid \xi) = \xi\,E((\xi + \delta)^{n} \mid \xi) + n\,\sigma_\delta^2\,E((\xi + \delta)^{n-1} \mid \xi).$$

This is the recurrence that defines the polynomials $h_k$, so the statement follows by induction. $\square$

Lemma 2. Let $X = (1, x, \dots, x^{m})^t$ and $h(x, t) = (h_0(x, t), \dots, h_m(x, t))^t$, and let $T$ be the transition matrix: $h(x, t) = T(t)X$. Then $T(t + s) = T(t)T(s)$ and $T(-t) = T^{-1}(t)$, $t, s \in \mathbb{R}$.

Proof. Assume that $s$ and $t$ are positive real numbers. Let $x = \xi + \delta + \gamma$, where $\delta \sim N(0, s)$, $\gamma \sim N(0, t)$, $s > 0$, $t > 0$, and $\xi$, $\delta$, $\gamma$ are mutually independent. Let $\rho = (1, \xi, \dots, \xi^{m})^t$, $\psi = (1, \xi + \delta, \dots, (\xi + \delta)^{m})^t$. By Lemma 1 we can write

$$E(X \mid \xi) = h(\xi, s + t) = T(s + t)\rho.$$

But

$$E(X \mid \xi) = E(E(X \mid \xi, \delta) \mid \xi) = E(h(\xi + \delta, t) \mid \xi) = E(T(t)\psi \mid \xi) = T(t)\,h(\xi, s) = T(t)T(s)\rho.$$

Thus for positive real numbers we have proved that $T(t + s) = T(t)T(s)$. This equality extends to arbitrary real numbers, because the Hermite polynomials can be constructed for any real parameter $t$ and the entries of $T(t)$ are polynomials in $t$. Then $T(-t)T(t) = T(0) = I$. $\square$
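
The recurrence for $h_k$ and the statements of Lemmas 1 and 2 can be checked numerically. The following Python sketch is only an illustration under assumptions: the helper names (`hermite`, `normal_moment`, `shifted_moment`, `transition_matrix`, `matmul`) are hypothetical and do not come from the paper. It evaluates $h_k(x, t)$ by the recurrence, computes $E((\xi + \delta)^n \mid \xi)$ from the closed-form moments of a normal variable, and builds $T(t)$ from the coefficients of $h_0, \dots, h_m$.

```python
from math import comb


def hermite(k, x, t):
    """h_k(x, t) from h_0 = 1 and h_{j+1} = x*h_j + t*j*h_{j-1}, j >= 0."""
    prev, cur = 0.0, 1.0  # h_{-1} is multiplied by j = 0 first, so its value is irrelevant
    for j in range(k):
        prev, cur = cur, x * cur + t * j * prev
    return cur


def normal_moment(j, var):
    """E[delta^j] for delta ~ N(0, var): zero for odd j, var^(j/2) * (j-1)!! for even j."""
    if j % 2:
        return 0.0
    out = 1.0
    for odd in range(1, j, 2):
        out *= odd * var
    return out


def shifted_moment(n, xi, var):
    """E[(xi + delta)^n | xi] by the binomial expansion with exact normal moments."""
    return sum(comb(n, j) * xi ** (n - j) * normal_moment(j, var) for j in range(n + 1))


def transition_matrix(m, t):
    """T(t): row k holds the coefficients of h_k(x, t) in the basis 1, x, ..., x^m."""
    rows = [[1.0] + [0.0] * m]
    for k in range(m):
        prev = rows[k - 1] if k > 0 else [0.0] * (m + 1)
        shifted = [0.0] + rows[k][:-1]  # coefficients of x * h_k
        rows.append([s + t * k * p for s, p in zip(shifted, prev)])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


if __name__ == "__main__":
    # Lemma 1: both columns should agree for every n.
    for n in range(6):
        print(n, hermite(n, 1.3, 0.25), shifted_moment(n, 1.3, 0.25))
    # Lemma 2: T(-t) T(t) should be the identity matrix.
    for row in matmul(transition_matrix(3, -0.25), transition_matrix(3, 0.25)):
        print(row)
```

The check of $T(-t)T(t) = I$ is the property that later lets the Simex estimator be evaluated at the negative argument $\lambda = -\sigma_\delta^2$.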
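
Now using Lemmas 1 and 2 we get that $E(M_{XY}(\lambda) \mid X, y) = T(\lambda)M_{XY}$, therefore a.s.

$$M_{XY}(\lambda) = T(\lambda)M_{XY} + o(1), \quad \text{as } n \to \infty. \qquad (4)$$

3. Simex estimator

The following model is proposed for fitting the naive estimators:

$$\hat\beta(\lambda, \theta) = M_H^{-1}(\lambda)\,T(\lambda)\,\theta.$$

Let $K \ge 1$, $0 = \lambda_1 < \lambda_2 < \dots < \lambda_K$. The parameter $\theta$ is estimated by the least squares method as

$$\hat\theta = \arg\min_{\theta} \sum_{k=1}^{K} \|\hat\beta_{naive}(\lambda_k) - \hat\beta(\lambda_k, \theta)\|^2.$$

Thus $\hat\theta$ equals

$$\hat\theta^t = \left(\sum_{k=1}^{K} M_{XY}^t(\lambda_k)\,M_{XX}^{-1}(\lambda_k)\,M_H^{-1}(\lambda_k)\,T(\lambda_k)\right)\left(\sum_{k=1}^{K} T^t(\lambda_k)\,M_H^{-2}(\lambda_k)\,T(\lambda_k)\right)^{-1}.$$

Using (3) and (4) it is easy to see that a.s.

$$\hat\theta = M_{XY} + o(1), \quad \text{as } n \to \infty. \qquad (5)$$

We define the Simex estimator as

$$\hat\beta_{Simex} := \hat\beta(-\sigma_\delta^2, \hat\theta) = M_H^{-1}(-\sigma_\delta^2)\,T(-\sigma_\delta^2)\,\hat\theta.$$

Theorem 1. Under the model assumptions, the Simex estimator is strongly consistent:

$$\hat\beta_{Simex} \xrightarrow{P1} \beta, \quad \text{as } n \to \infty.$$

Proof. Using Lemmas 1 and 2 we can prove that

$$\frac{1}{n}\sum_{i=1}^{n} h_k(x_i, \lambda) \xrightarrow{P1} E h_k(x, \lambda) = E\,E((x + \sqrt{\lambda}\,\eta)^{k} \mid x) = E(x + \sqrt{\lambda}\,\eta)^{k} = E(\xi + \delta + \sqrt{\lambda}\,\eta)^{k} = E\,E((\xi + \delta + \sqrt{\lambda}\,\eta)^{k} \mid \xi) = E h_k(\xi, \lambda + \sigma_\delta^2),$$

where $\eta$ is a standard normal variable independent of $\xi$ and $\delta$. Thus substituting $-\sigma_\delta^2$ for $\lambda$ we obtain

$$\frac{1}{n}\sum_{i=1}^{n} h_k(x_i, -\sigma_\delta^2) \to E h_k(\xi, -\sigma_\delta^2 + \sigma_\delta^2) = E h_k(\xi, 0) = E\xi^{k}.$$

Hence $M_H(-\sigma_\delta^2) \xrightarrow{P1} E\rho\rho^t$, where $\rho := (1, \xi, \dots, \xi^{m})^t$. Using (5) and Lemma 1 again, we obtain

$$\hat\theta \xrightarrow{P1} E M_{XY} = E\,E(M_{XY} \mid \xi) = E\,T(\sigma_\delta^2)\rho\rho^t\beta = T(\sigma_\delta^2)\,E\rho\rho^t\,\beta.$$

The consistency of the Simex estimator now follows from Lemma 2:

$$\hat\beta_{Simex} = \hat\beta(-\sigma_\delta^2, \hat\theta) \xrightarrow{P1} (E\rho\rho^t)^{-1}\,T(-\sigma_\delta^2)\,T(\sigma_\delta^2)\,E\rho\rho^t\,\beta = \beta. \qquad \square$$

Remark. In the special case $K = 1$ we have $\lambda_1 = 0$ and the transition matrix $T(\lambda_1) = T(0) = I_{m+1}$ (the identity matrix). The matrix $M_H(\lambda_1) = M_H(0) = M_{XX}$. We notice that $M_{XX}(\lambda_1) = M_{XX}(0) = M_{XX}$ and $M_{XY}(\lambda_1) = M_{XY}(0) = M_{XY}$. Hence we obtain that $\hat\theta = M_{XY}$ and $\hat\beta_{Simex} = M_H^{-1}(-\sigma_\delta^2)\,T(-\sigma_\delta^2)\,M_{XY}$. Therefore

$$M_H(-\sigma_\delta^2)\,\hat\beta_{Simex} = T(-\sigma_\delta^2)\,M_{XY}. \qquad (6)$$

Thus $\hat\beta_{Simex}$ is the solution to the equation (6). But this equation (in the current notation) is the same as the equation (2) for the ALS estimator of $\beta$. So in the case $K = 1$ the Simex estimator coincides with the ALS estimator.

4. Small sample modification

From $\hat\beta_{Simex} = M_H^{-1}(-\sigma_\delta^2)\,T(-\sigma_\delta^2)\,\hat\theta$ we see that $\hat\beta_{Simex}$ is the solution to the following equation:

$$M_H(-\sigma_\delta^2)\,\hat\beta_{Simex} = T(-\sigma_\delta^2)\,\hat\theta. \qquad (7)$$

We have $M_H(-\sigma_\delta^2) \xrightarrow{P1} E\rho\rho^t$, as $n \to \infty$, and therefore it is positive definite for $n \ge n_0(\omega)$ a.s. For small samples, however, $M_H(-\sigma_\delta^2)$ can be indefinite, and this can cause significant bias in the Simex estimator. Introduce

$$V_i = h(x_i, -\sigma_\delta^2)\,h^t(x_i, -\sigma_\delta^2) - H(x_i, -\sigma_\delta^2).$$

Taking the average over $n$ we can write

$$V = \overline{h(-\sigma_\delta^2)\,h^t(-\sigma_\delta^2)} - M_H(-\sigma_\delta^2).$$

Using this relation, the estimating equation (7) can be rewritten as

$$\left(\overline{h(-\sigma_\delta^2)\,h^t(-\sigma_\delta^2)} - V\right)\hat\beta_{Simex} = T(-\sigma_\delta^2)\,\hat\theta. \qquad (8)$$

Define $\lambda$ as the smallest positive root of the equation $\det(A - \lambda B) = 0$, where

$$
A = \begin{pmatrix} \overline{y^2} & \overline{y\,h^t(-\sigma_\delta^2)} \\ \overline{h(-\sigma_\delta^2)\,y} & \overline{h(-\sigma_\delta^2)\,h^t(-\sigma_\delta^2)} \end{pmatrix},
\qquad
B = \begin{pmatrix} 0 & 0 \\ 0 & V \end{pmatrix}.
$$

We assume that $A$ is positive definite.

To construct the small sample modification of the Simex estimator we use the same approach as in Cheng et al. (2000). The proofs of the next two theorems are similar to those in that paper.

The modified Simex estimator can be found as a solution to the equation

$$\left(\overline{h(-\sigma_\delta^2)\,h^t(-\sigma_\delta^2)} - aV\right)\hat\beta_{MSimex} = T(-\sigma_\delta^2)\,\hat\theta. \qquad (9)$$

Here $a$ is defined as

$$
a = \begin{cases}
(n - \alpha)/n, & \text{if } \lambda > 1 + \frac{1}{n}, \\
\lambda(n - \alpha)/(n + 1), & \text{if } \lambda \le 1 + \frac{1}{n},
\end{cases}
\qquad (10)
$$

with some $\alpha < n$ to be chosen so that the resulting estimator possesses better small sample properties. The number $\alpha = m + 1$ is the lowest $\alpha$ that one should choose; see the discussion in Cheng et al. (2000).

Theorem 2. The following inequality holds a.s.:

$$\overline{h(-\sigma_\delta^2)\,h^t(-\sigma_\delta^2)} - aV \ge \frac{\alpha + 1}{n + 1}\,\overline{h(-\sigma_\delta^2)\,h^t(-\sigma_\delta^2)} > 0. \qquad (11)$$

(Hereafter inequalities for matrices are understood in the Loewner order.)

Proof. As $A$ is positive definite, it can be decomposed as $A = CC^t$ with a nonsingular matrix $C$. Define $\tilde B = C^{-1}BC^{-t}$. Let $d$ be the largest eigenvalue of $\tilde B$. As the second diagonal element of $V$, $\overline{h_1^2(-\sigma_\delta^2)} - \overline{h_2(-\sigma_\delta^2)} = \sigma_\delta^2$, is positive, $\tilde B$ has at least one positive eigenvalue, and therefore $d > 0$.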
It follows that $\lambda = 1/d$. Let $D$ be the diagonal matrix of eigenvalues of $\tilde B$ and $E$ be a matrix whose columns are the corresponding normalized eigenvectors. Then $\tilde B = EDE^t$, $EE^t = I$. It follows that $B = CEDE^tC^t$; with $T = CE$ we have $A = TT^t$ and $B = TDT^t$. Hence for any scalar $c$,

$$A - cB = T(I - cD)T^t, \qquad (12)$$

with a nonsingular matrix $T$.

In the first case, when $\lambda > 1 + \frac{1}{n}$, we see that $d < \frac{n}{n+1}$ and therefore $D < \frac{n}{n+1}I$. Hence

$$I - aD = I - \frac{n - \alpha}{n}D > \frac{\alpha + 1}{n + 1}I.$$

In the second case $\lambda \le 1 + \frac{1}{n}$. In general $d^{-1}D \le I$. This implies that

$$I - aD = I - \frac{\lambda(n - \alpha)}{n + 1}D = I - \frac{n - \alpha}{n + 1}\,d^{-1}D \ge \frac{\alpha + 1}{n + 1}I.$$

Thus in both cases we obtain $A - aB \ge \frac{\alpha + 1}{n + 1}A > 0$. Deleting the first row and column of these matrices results in (11). $\square$

Theorem 3. The modified estimator $\hat\beta_{MSimex}$ is asymptotically equivalent to the unmodified one, $\hat\beta_{Simex}$:

$$\sqrt{n}\,(\hat\beta_{MSimex} - \hat\beta_{Simex}) \xrightarrow{P} 0, \quad \text{as } n \to \infty.$$

Proof. First we prove that $P(\lambda > 1)$ converges to 1, as $n \to \infty$. The condition $\lambda > 1$ is equivalent to $d < 1$, or $D < I$. According to (12) this is equivalent to $A > B$. As in the proof of Theorem 2, it can be shown that $A > B$ is equivalent to $\overline{h(-\sigma_\delta^2)\,h^t(-\sigma_\delta^2)} - V > 0$. Since $\overline{h(-\sigma_\delta^2)\,h^t(-\sigma_\delta^2)} - V = M_H(-\sigma_\delta^2)$ converges with probability 1 to the matrix $E(\rho\rho^t)$, which is positive definite, one can state that $P(A > B) = P(\lambda > 1)$ converges to 1, as $n \to \infty$.

Now for $\lambda > 1$ we have, by the definition of $a$, that

$$\frac{n - \alpha}{n + 1} < a \le \frac{n - \alpha}{n},$$

and after some algebra

$$\frac{\alpha + 1}{\sqrt{n}} > (1 - a)\sqrt{n} \ge \frac{\alpha}{\sqrt{n}}.$$

This inequality holds with probability tending to 1, as $n \to \infty$. Since the outer parts of this inequality converge to 0, we have $(1 - a)\sqrt{n} \xrightarrow{P} 0$, as $n \to \infty$. By subtracting equation (9) from (8) we derive after some algebra

$$\left(\overline{h(-\sigma_\delta^2)\,h^t(-\sigma_\delta^2)} - aV\right)(\hat\beta_{Simex} - \hat\beta_{MSimex})\sqrt{n} = (1 - a)\sqrt{n}\,V\hat\beta_{Simex}.$$

The right-hand side converges to 0, whereas $\overline{h(-\sigma_\delta^2)\,h^t(-\sigma_\delta^2)} - aV > 0$; therefore

$$\sqrt{n}\,(\hat\beta_{MSimex} - \hat\beta_{Simex}) \xrightarrow{P} 0, \quad \text{as } n \to \infty. \qquad \square$$

5. Simulation results

The simulations were carried out in R.
We studied the quadratic model

$$y_i = b_0 + b_1\xi_i + b_2\xi_i^2 + \varepsilon_i, \qquad x_i = \xi_i + \delta_i.$$

We specified $\varepsilon_i$ and $\delta_i$ as normally distributed variables with $E\varepsilon_i\delta_i = 0$, $\sigma_\delta^2 = \sigma_\varepsilon^2 = 0.25$, and $\sigma_\xi^2 = 1$. The sample size $n$ equals 20. For the Simex estimator the following values were used: $B = 100$, $K = 11$, $\lambda_k = k\sigma_\delta^2$, $k = 0, \dots, 10$. The true values were $b_0 = 5$, $b_1 = 6$, $b_2 = 3$.

The simulation results are plotted below for the parameter $b_2$. The naive estimator is denoted by a solid circle, the ALS estimator by a square, the Simex estimator by a star, and the modified Simex estimator by a triangle. Circles correspond to the naive estimators with larger measurement error variance. The solid line describes the behavior of the fitted model, and the dashed line denotes the true value of the parameter.

In the first picture $M_H(-\sigma_\delta^2)$ is not positive definite, and as a result the Simex estimator has an extremely large estimation error ($\hat\beta_{Simex} = 35.03$, while $\hat\beta_{MSimex} = 2.91$).

[Figure: values of the naive estimator for b2 plotted against lambda values; legend: Naive, SIMEX, ALS, Modified Simex.]

In the second picture $M_H(-\sigma_\delta^2)$ is positive definite, and the Simex estimator is a good one ($\hat\beta_{Simex} = 3.19$ and $\hat\beta_{MSimex} = 3.04$).

[Figure: values of the naive estimator for b2 plotted against lambda values; legend: Naive, SIMEX, ALS, Modified Simex.]

6. Conclusion

In this article the Simex estimator for the polynomial errors-in-variables model is constructed. It differs from the classical Simex estimator proposed by Cook and Stefanski (1994) in that the observed variables are used to model the naive estimator as a function of the extra variance. The consistency of the constructed Simex estimator is proved. Then this estimator is modified so that it shows good results for small samples without losing its asymptotic properties for large samples. Simulation studies made in the statistical package R corroborate the theoretical results.

Bibliography

1. Cheng, C.-L., and Schneeweiss, H. (1998). Polynomial regression with errors in the variables. J. R. Statist. Soc. B, 60, 189-199.
2. Cheng, C.-L., Schneeweiss, H., and Thamerus, M. (2000). A small sample estimator for a polynomial regression with errors in the variables. J. R. Statist. Soc. B, 62, 699-709.
3. Cook, J., and Stefanski, L. A. (1994). A simulation extrapolation method for parametric measurement error models. Journal of the American Statistical Association, 89, 1314-1328.
4. Nakamura, T. (1990). Corrected score functions for errors-in-variables models: methodology and application to generalized linear models. Biometrika, 77, 127-137.
5. Polzehl, J., and Zwanzig, S. (2005). Simex and TLS: an equivalence result. WIAS, Technical Report 999, Berlin.
6. Stefanski, L. A. (1989). Unbiased estimation of a nonlinear function of a normal mean with application to measurement error models. Communications in Statistics, Series A, 18, 4335-4358.

Department of Probability Theory and Mathematical Statistics, Kyiv National Taras Shevchenko University, Kyiv, Ukraine
E-mail address: gontaro@ukr.net

Department of Probability Theory and Mathematical Statistics, Kyiv National Taras Shevchenko University, Kyiv, Ukraine
E-mail address: exipilis@yandex.ru
id nasplib_isofts_kiev_ua-123456789-4478
institution Digital Library of Periodicals of National Academy of Sciences of Ukraine
issn 0321-3900
language English
last_indexed 2025-11-30T09:12:47Z
publishDate 2007
publisher Інститут математики НАН України
record_format dspace
spelling Gontar, O.
Malenko, A.
2009-11-19T10:11:19Z
2009-11-19T10:11:19Z
2007
Simex estimator for polynomial errors-in-variables model / O. Gontar, A. Malenko // Theory of Stochastic Processes. — 2007. — Т. 13 (29), № 1-2. — С. 57-65. — Бібліогр.: 6 назв.— англ.
0321-3900
https://nasplib.isofts.kiev.ua/handle/123456789/4478
For the polynomial errors-in-variables model, the Simex estimator is constructed in such a way that it is consistent as the sample size grows while the size of the auxiliary sample is fixed. Then the estimator is modified in such a way that it shows good results for small samples without losing its asymptotic properties for large samples. Simulation studies corroborate the theoretical findings.
en
Інститут математики НАН України
Simex estimator for polynomial errors-in-variables model
Article
published earlier
title Simex estimator for polynomial errors-in-variables model
url https://nasplib.isofts.kiev.ua/handle/123456789/4478