Simex estimator for polynomial errors-in-variables model
For the polynomial errors-in-variables model, the Simex estimator is constructed in such a way that it is consistent as the sample size grows while the size of the auxiliary sample is fixed. The estimator is then modified so that it shows good results for small samples without losing its asymptotic p...
| Date: | 2007 |
|---|---|
| Main Authors: | Gontar, O., Malenko, A. |
| Format: | Article |
| Language: | English |
| Published: | Інститут математики НАН України, 2007 |
| Online Access: | https://nasplib.isofts.kiev.ua/handle/123456789/4478 |
| Journal Title: | Digital Library of Periodicals of National Academy of Sciences of Ukraine |
| Cite this: | Simex estimator for polynomial errors-in-variables model / O. Gontar, A. Malenko // Theory of Stochastic Processes. — 2007. — Т. 13 (29), № 1-2. — С. 57-65. — Бібліогр.: 6 назв.— англ. |
Institution
Digital Library of Periodicals of National Academy of Sciences of Ukraine

| _version_ | 1859659537832738816 |
|---|---|
| author | Gontar, O. Malenko, A. |
| author_facet | Gontar, O. Malenko, A. |
| citation_txt | Simex estimator for polynomial errors-in-variables model / O. Gontar, A. Malenko // Theory of Stochastic Processes. — 2007. — Т. 13 (29), № 1-2. — С. 57-65. — Бібліогр.: 6 назв.— англ. |
| collection | DSpace DC |
| description | For the polynomial errors-in-variables model, the Simex estimator is constructed in such a way that it is consistent as the sample size grows while the size of the auxiliary sample is fixed. The estimator is then modified so that it shows good results for small samples without losing its asymptotic properties for large samples. Simulation studies corroborate the theoretical findings.
|
| first_indexed | 2025-11-30T09:12:47Z |
| format | Article |
| fulltext |
Theory of Stochastic Processes
Vol.13 (29), no.1-2, 2007, pp.57-65
OLENA GONTAR AND ANDRII MALENKO
SIMEX ESTIMATOR FOR POLYNOMIAL
ERRORS-IN-VARIABLES MODEL
For the polynomial errors-in-variables model, the Simex estimator is constructed in such a way that it is consistent as the sample size grows while the size of the auxiliary sample is fixed. The estimator is then modified so that it shows good results for small samples without losing its asymptotic properties for large samples. Simulation studies corroborate the theoretical findings.
1. Introduction
We consider the polynomial measurement error model
$$
\begin{cases}
y_i = \sum_{j=0}^{m} \beta_j \xi_i^j + \varepsilon_i, \\
x_i = \xi_i + \delta_i,
\end{cases} \tag{1}
$$
where $y_i$, $x_i$ are observed and $\xi_i$ are unobservable independent random variables, $i = 1, \dots, n$. Suppose that $\delta_i$ are i.i.d. normal random variables and their variance $\sigma_\delta^2$ is known.
It is well known that the naive estimator of the regression parameters $\beta_0, \beta_1, \dots, \beta_m$, which ignores measurement errors, is inconsistent. Cheng and Schneeweiss (1998) proposed the adjusted least squares estimator $\hat\beta_{ALS}$ in the model (1), which is consistent. This estimator can be viewed as resulting from the principle of corrected score due to Stefanski (1989) and Nakamura (1990). A small sample modification of the $\hat\beta_{ALS}$ estimator was proposed in Cheng et al. (2000); it shows good results for small samples without losing its asymptotic properties for large samples.
Another estimator was introduced by Cook and Stefanski (1994) and is called Simex. The key idea underlying Simex is the fact that the effect of measurement error on an estimator can be determined experimentally via simulation. This is achieved by studying the naive regression estimator as a function $f$ of the measurement error variance in the regressors.
2000 Mathematics Subject Classification. 62J02, 62F12, 62-07.
Key words and phrases. Simex estimator, errors-in-variables models, Hermite polynomials.
The purpose of this paper is to construct a consistent Simex estimator of the regression parameter. The observed variables are used for modeling the function $f$. This idea is close to the idea of Polzehl and Zwanzig (2005). Simulation studies show that for finite samples the Simex estimator in polynomial regression can sometimes produce extremely large estimation errors, as can the ALS estimator. We propose a modification of this estimator for small samples that preserves its asymptotic properties.
The paper is organized as follows. In the next section the polynomial errors-in-variables model is introduced and auxiliary lemmas are proved. Section 3 is devoted to the construction of the Simex estimator and the proof of its consistency. The small sample modification is proposed in Section 4. Section 5 gives some simulation results and shows the effect of the modification, and Section 6 concludes. Throughout the paper, expectation is denoted by $E$, almost sure convergence by $\xrightarrow{P1}$, and convergence in probability by $\xrightarrow{P}$.
2. Model and additional lemmas
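We consider the polynomial errors-in-variables model of order $m \ge 1$,
$$
\begin{cases}
y_i = \beta_0 + \beta_1 \xi_i + \dots + \beta_m \xi_i^m + \varepsilon_i, \\
x_i = \xi_i + \delta_i,
\end{cases} \qquad i = 1, \dots, n.
$$
Here $\{\xi_i, i \ge 1\}$, $\{\varepsilon_i, i \ge 1\}$, $\{\delta_i, i \ge 1\}$ are i.i.d. and mutually independent sequences. We assume that $E|\xi_1|^m < \infty$, $\delta_1 \sim N(0, \sigma_\delta^2)$, $\sigma_\delta^2$ is known, $E\varepsilon_1 = 0$, $E\varepsilon_1^2 < \infty$. The variances of $\xi_1$, $\varepsilon_1$, and $\delta_1$ are supposed to be positive.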
Denote $X_i = (1, x_i, \dots, x_i^m)^t$. The naive, or ordinary least squares, estimator of $\beta$ is $\hat\beta_{naive} = M_{XX}^{-1} M_{XY}$, where $M_{XX} := \overline{X X^t}$, $M_{XY} := \overline{X y}$. Here the bar means averaging over $n$.
To introduce the $\hat\beta_{ALS}$ estimator, consider the Hermite polynomials $h_k(x, t)$ in $x$, which are defined by the recursion
$$
h_{-1}(x, t) = h_0(x, t) = 1, \qquad h_{k+1}(x, t) = x\,h_k(x, t) + t\,k\,h_{k-1}(x, t), \quad k \ge 0,
$$
so that, for example, $h_1(x, t) = x$ and $h_2(x, t) = x^2 + t$. Let $H(x, t)$ be the matrix with entries $H_{rs}(x, t) = h_{r+s}(x, t)$, $r, s = 0, \dots, m$. Denote the matrix $\frac{1}{n}\sum_{i=1}^{n} H(x_i, -\sigma_\delta^2)$ by $M_H$ and the vector $(h_0(x_i, -\sigma_\delta^2), h_1(x_i, -\sigma_\delta^2), \dots, h_m(x_i, -\sigma_\delta^2))^t$ by $h_i$. Then $\hat\beta_{ALS}$ is defined as the solution to the linear equation
$$
M_H \hat\beta_{ALS} = \frac{1}{n}\sum_{i=1}^{n} h_i y_i. \tag{2}
$$
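The recursion for $h_k(x, t)$ and the matrix $H(x, t)$ can be sketched in a few lines of Python. This is only an illustration: the function names `hermite` and `hankel_H` are hypothetical, and the recursion is started at $k = 0$ so that $h_1(x, t) = x$.

```python
def hermite(k, x, t):
    """Return h_k(x, t) from h_{k+1} = x*h_k + t*k*h_{k-1}, with h_{-1} = h_0 = 1."""
    if k < 0:
        raise ValueError("k must be non-negative")
    prev, curr = 1.0, 1.0  # h_{-1} and h_0; h_{-1} is multiplied by k = 0 in the first step
    for j in range(k):
        prev, curr = curr, x * curr + t * j * prev
    return curr


def hankel_H(x, t, m):
    """Return the (m+1) x (m+1) matrix with entries H_rs = h_{r+s}(x, t)."""
    values = [hermite(k, x, t) for k in range(2 * m + 1)]
    return [[values[r + s] for s in range(m + 1)] for r in range(m + 1)]
```

For instance, `hermite(2, x, -s2)` evaluates $x^2 - \sigma_\delta^2$ when `s2` holds $\sigma_\delta^2$, which is the corrected second moment used by the ALS estimator.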
To construct the Simex estimator, fix a number $B$. Consider a sequence $\{\eta_{ib},\ i \ge 1,\ b = 1, \dots, B\}$ of i.i.d. standard normal random variables, which is independent of the other random variables in the model. Denote $x_{ib}(\lambda) = x_i + \eta_{ib}\sqrt{\lambda}$, $i \ge 1$, $b = 1, \dots, B$, and $X_{ib}(\lambda) = (1, x_{ib}(\lambda), \dots, x_{ib}^m(\lambda))^t$.
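Introduce $M_{XX}(\lambda) = \overline{X(\lambda) X^t(\lambda)}$, $M_{XY}(\lambda) = \overline{X(\lambda) y}$. Hereafter the bar means averaging over $n$ and $b$, e.g., $\overline{X(\lambda) y} = \frac{1}{nB}\sum_{i=1}^{n}\sum_{b=1}^{B} X_{ib}(\lambda) y_i$. The corresponding naive estimate of $\beta$ is $\hat\beta_{naive}(\lambda) = M_{XX}^{-1}(\lambda) M_{XY}(\lambda)$. For each $\lambda$ introduce the matrix $M_H(\lambda) = \frac{1}{n}\sum_{i=1}^{n} H(x_i, \lambda)$. From Lemma 1 below it follows that
$$
M_{XX}(\lambda) = M_H(\lambda) + o(1), \quad \text{as } n \to \infty, \text{ a.s.} \tag{3}
$$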
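The simulation step and relation (3) can be illustrated with a short Python sketch. It is a minimal sketch under stated assumptions: the names `hermite` and `moment_matrices` are hypothetical, pure-Python lists stand in for matrices, and the pseudo-errors $\eta_{ib}$ are drawn with the standard library generator.

```python
import random


def hermite(k, x, t):
    """h_k(x, t) from h_{k+1} = x*h_k + t*k*h_{k-1}, with h_{-1} = h_0 = 1."""
    prev, curr = 1.0, 1.0
    for j in range(k):
        prev, curr = curr, x * curr + t * j * prev
    return curr


def moment_matrices(xs, lam, m, B, rng):
    """Return (M_XX(lam), M_H(lam)) for observed regressors xs and lam >= 0."""
    if lam < 0:
        raise ValueError("the simulation step needs lam >= 0")
    n, size = len(xs), m + 1
    m_xx = [[0.0] * size for _ in range(size)]
    m_h = [[0.0] * size for _ in range(size)]
    for x in xs:
        for _ in range(B):
            # x_ib(lam) = x_i + eta_ib * sqrt(lam), with eta_ib standard normal
            x_ib = x + rng.gauss(0.0, 1.0) * lam ** 0.5
            for r in range(size):
                for s in range(size):
                    m_xx[r][s] += x_ib ** (r + s) / (n * B)
        for r in range(size):
            for s in range(size):
                m_h[r][s] += hermite(r + s, x, lam) / n
    return m_xx, m_h
```

At `lam = 0` the two matrices agree up to rounding; for `lam > 0` they differ by simulation noise that shrinks as the sample size grows, which is what (3) expresses.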
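Lemma 1. Let $\xi$ and $\delta$ be independent random variables with $\delta \sim N(0, \sigma_\delta^2)$. Then for every integer $n \ge 0$
$$
E((\xi + \delta)^n \mid \xi) = h_n(\xi, \sigma_\delta^2).
$$
Proof. Integration by parts gives
$$
E((\xi + \delta)^{n+1} \mid \xi) = \xi E((\xi + \delta)^n \mid \xi) + E(\delta(\xi + \delta)^n \mid \xi) = \xi E((\xi + \delta)^n \mid \xi) + n\sigma_\delta^2 E((\xi + \delta)^{n-1} \mid \xi),
$$
which is the recursion defining the Hermite polynomials with $t = \sigma_\delta^2$. The statement then follows by induction. □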
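Lemma 1 can be checked numerically without simulation by expanding $(\xi + \delta)^n$ with the binomial theorem and using the known moments of a centered normal variable. The sketch below is an illustration only; the function names are hypothetical.

```python
from math import comb


def hermite(n, x, t):
    """h_n(x, t) from h_{k+1} = x*h_k + t*k*h_{k-1}, with h_{-1} = h_0 = 1."""
    prev, curr = 1.0, 1.0
    for j in range(n):
        prev, curr = curr, x * curr + t * j * prev
    return curr


def normal_moment(j, var):
    """E(delta^j) for delta ~ N(0, var): 0 for odd j, var^(j/2) * (j-1)!! for even j."""
    if j % 2 == 1:
        return 0.0
    double_factorial = 1
    for i in range(j - 1, 0, -2):
        double_factorial *= i
    return var ** (j // 2) * double_factorial


def conditional_moment(n, xi, var):
    """E((xi + delta)^n | xi) via the binomial expansion and Gaussian moments."""
    return sum(comb(n, j) * xi ** (n - j) * normal_moment(j, var) for j in range(n + 1))
```

Comparing `conditional_moment(n, xi, var)` with `hermite(n, xi, var)` for several `n` reproduces the identity of the lemma up to rounding error.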
Lemma 2. Let $X = (1, x, \dots, x^m)^t$ and $h(x, t) = (h_0(x, t), \dots, h_m(x, t))^t$, and let $T(t)$ be the transition matrix defined by $h(x, t) = T(t)X$. Then $T(t + s) = T(t)T(s)$ and $T(-t) = T^{-1}(t)$ for all $t, s \in \mathbb{R}$.
Proof. Assume first that $s$ and $t$ are positive real numbers. Let $x = \xi + \delta + \gamma$, where $\delta \sim N(0, s)$, $\gamma \sim N(0, t)$, and $\xi$, $\delta$, $\gamma$ are mutually independent. Let
$$
\rho = (1, \xi, \dots, \xi^m)^t, \qquad \psi = (1, \xi + \delta, \dots, (\xi + \delta)^m)^t.
$$
By Lemma 1 we can write $E(X \mid \xi) = h(\xi, s + t) = T(s + t)\rho$. On the other hand,
$$
E(X \mid \xi) = E(E(X \mid \xi, \delta) \mid \xi) = E(h(\xi + \delta, t) \mid \xi) = E(T(t)\psi \mid \xi) = T(t)h(\xi, s) = T(t)T(s)\rho.
$$
Thus for positive real numbers we have proved that $T(t + s) = T(t)T(s)$. This equality extends to arbitrary real numbers, because the Hermite polynomials are defined for any real parameter $t$ and the entries of $T(t)$ are polynomials in $t$. Then $T(-t)T(t) = T(0) = I$. □
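The group property of Lemma 2 can be verified numerically for a small order. In the sketch below the row $k$ of `transition_matrix(t, m)` stores the coefficients of $h_k(x, t)$ in the powers $1, x, \dots, x^m$; both function names are hypothetical and the matrices are plain nested lists.

```python
def transition_matrix(t, m):
    """Return T(t) with h(x, t) = T(t) X, where X = (1, x, ..., x^m)^t."""
    rows = [[0.0] * (m + 1) for _ in range(m + 1)]
    rows[0][0] = 1.0  # h_0 = 1
    for k in range(m):
        for p in range(m + 1):
            shifted = rows[k][p - 1] if p >= 1 else 0.0  # contribution of x * h_k
            lower = rows[k - 1][p] if k >= 1 else 0.0    # contribution of h_{k-1}
            rows[k + 1][p] = shifted + t * k * lower
    return rows


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    size = len(a)
    return [[sum(a[i][l] * b[l][j] for l in range(size)) for j in range(size)]
            for i in range(size)]
```

For example, `matmul(transition_matrix(0.3, 4), transition_matrix(0.7, 4))` agrees with `transition_matrix(1.0, 4)` up to rounding, and `matmul(transition_matrix(-0.3, 4), transition_matrix(0.3, 4))` is numerically the identity.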
Now using Lemmas 1 and 2 we get that $E(M_{XY}(\lambda) \mid X, y) = T(\lambda)M_{XY}$, therefore a.s.
$$
M_{XY}(\lambda) = T(\lambda)M_{XY} + o(1), \quad \text{as } n \to \infty. \tag{4}
$$
3. Simex estimator
The following model is proposed for fitting the naive estimators:
$$
\hat\beta(\lambda, \theta) = M_H^{-1}(\lambda)T(\lambda)\theta.
$$
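Let $K \ge 1$, $0 = \lambda_1 < \lambda_2 < \dots < \lambda_K$. The parameter $\theta$ is estimated by the least squares method as
$$
\hat\theta = \arg\min_{\theta} \sum_{k=1}^{K} \|\hat\beta_{naive}(\lambda_k) - \hat\beta(\lambda_k, \theta)\|^2.
$$
Thus $\hat\theta$ equals
$$
\hat\theta^t = \left(\sum_{k=1}^{K} M_{XY}^t(\lambda_k) M_{XX}^{-1}(\lambda_k) M_H^{-1}(\lambda_k) T(\lambda_k)\right)\left(\sum_{k=1}^{K} T^t(\lambda_k) M_H^{-2}(\lambda_k) T(\lambda_k)\right)^{-1}.
$$
Using (3) and (4) it is easy to see that a.s.
$$
\hat\theta = M_{XY} + o(1), \quad \text{as } n \to \infty. \tag{5}
$$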
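The closed form of $\hat\theta$ and relation (5) can be traced through the normal equations of the least squares problem. The following worked derivation is a sketch: the shorthand $G_k$ is introduced here for illustration only, and it assumes that $M_H(\lambda_k)$ and $M_{XX}(\lambda_k)$ are symmetric and nonsingular and that the sum $\sum_k G_k^t G_k$ is invertible.

```latex
\begin{align*}
G_k &:= M_H^{-1}(\lambda_k)\,T(\lambda_k), \qquad k = 1,\dots,K,\\
0 &= \sum_{k=1}^{K} G_k^t\bigl(\hat\beta_{naive}(\lambda_k) - G_k\hat\theta\bigr)
\;\Longrightarrow\;
\hat\theta = \Bigl(\sum_{k=1}^{K} G_k^t G_k\Bigr)^{-1}
             \sum_{k=1}^{K} G_k^t\,\hat\beta_{naive}(\lambda_k),\\
G_k^t G_k &= T^t(\lambda_k)\,M_H^{-2}(\lambda_k)\,T(\lambda_k),\\
\hat\beta_{naive}(\lambda_k) &= M_{XX}^{-1}(\lambda_k)\,M_{XY}(\lambda_k)
  = M_H^{-1}(\lambda_k)\,T(\lambda_k)\,M_{XY} + o(1) = G_k M_{XY} + o(1)\\
&\Longrightarrow\; \hat\theta = M_{XY} + o(1).
\end{align*}
```

Transposing the expression for $\hat\theta$ and using the symmetry of $M_{XX}(\lambda_k)$ and $M_H(\lambda_k)$ gives the row-vector form stated in the text.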
We define the Simex estimator as
$$
\hat\beta_{Simex} := \hat\beta(-\sigma_\delta^2, \hat\theta) = M_H^{-1}(-\sigma_\delta^2)T(-\sigma_\delta^2)\hat\theta.
$$
Theorem 1. Under the model assumptions, the Simex estimator is strongly consistent:
$$
\hat\beta_{Simex} \xrightarrow{P1} \beta, \quad \text{as } n \to \infty.
$$
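Proof. Using Lemmas 1 and 2 we can prove that
$$
\frac{1}{n}\sum_{i=1}^{n} h_k(x_i, \lambda) \xrightarrow{P1} E h_k(x, \lambda) = E\,E((x + \sqrt{\lambda}\,\varepsilon)^k \mid x) = E(x + \sqrt{\lambda}\,\varepsilon)^k = E(\xi + \delta + \sqrt{\lambda}\,\varepsilon)^k = E\,E((\xi + \delta + \sqrt{\lambda}\,\varepsilon)^k \mid \xi) = E h_k(\xi, \lambda + \sigma_\delta^2),
$$
where in this chain $\varepsilon$ denotes a standard normal variable independent of $\xi$ and $\delta$. Both sides are polynomials in $\lambda$, so the identity extends to negative $\lambda$. Thus substituting $-\sigma_\delta^2$ for $\lambda$ we obtain
$$
\frac{1}{n}\sum_{i=1}^{n} h_k(x_i, -\sigma_\delta^2) \xrightarrow{P1} E h_k(\xi, -\sigma_\delta^2 + \sigma_\delta^2) = E h_k(\xi, 0) = E\xi^k.
$$
Hence $M_H(-\sigma_\delta^2) \xrightarrow{P1} E\rho\rho^t$, where $\rho := (1, \xi, \dots, \xi^m)^t$. Using (5) and Lemma 1 again, we obtain
$$
\hat\theta \xrightarrow{P1} E M_{XY} = E\,E(M_{XY} \mid \xi) = E\,T(\sigma_\delta^2)\rho\rho^t\beta = T(\sigma_\delta^2)E\rho\rho^t\beta.
$$
The consistency of the Simex estimator now follows from Lemma 2:
$$
\hat\beta_{Simex} = \hat\beta(-\sigma_\delta^2, \hat\theta) \xrightarrow{P1} (E\rho\rho^t)^{-1}T(-\sigma_\delta^2)T(\sigma_\delta^2)E\rho\rho^t\beta = \beta. \qquad \square
$$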
Remark. In the special case $K = 1$ we have $\lambda_1 = 0$ and the transition matrix $T(\lambda_1) = T(0) = I$ (the identity matrix). Moreover, $M_H(\lambda_1) = M_H(0) = M_{XX}$, $M_{XX}(\lambda_1) = M_{XX}(0) = M_{XX}$, and $M_{XY}(\lambda_1) = M_{XY}(0) = M_{XY}$. Hence we obtain that $\hat\theta = M_{XY}$ and $\hat\beta_{Simex} = M_H^{-1}(-\sigma_\delta^2)T(-\sigma_\delta^2)M_{XY}$. Therefore
$$
M_H(-\sigma_\delta^2)\hat\beta_{Simex} = T(-\sigma_\delta^2)M_{XY}. \tag{6}
$$
Thus $\hat\beta_{Simex}$ is the solution to equation (6). But this equation (in the current notation) is the same as equation (2) for the ALS estimator of $\beta$. So in the case $K = 1$ the Simex estimator coincides with the ALS estimator.
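The case $K = 1$ can be sketched in Python by solving equation (6) directly, using $T(-\sigma_\delta^2)M_{XY} = \frac{1}{n}\sum_i h(x_i, -\sigma_\delta^2) y_i$. This is a minimal illustration, not the authors' implementation: the names `hermite`, `solve` and `als_estimate` are hypothetical, and a small Gaussian elimination routine stands in for a linear algebra library.

```python
def hermite(k, x, t):
    """h_k(x, t) from h_{k+1} = x*h_k + t*k*h_{k-1}, with h_{-1} = h_0 = 1."""
    prev, curr = 1.0, 1.0
    for j in range(k):
        prev, curr = curr, x * curr + t * j * prev
    return curr


def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (aug[r][n] - sum(aug[r][c] * z[c] for c in range(r + 1, n))) / aug[r][r]
    return z


def als_estimate(xs, ys, m, sigma2_delta):
    """Solve M_H(-sigma2) beta = mean of h(x_i, -sigma2) * y_i (Simex with K = 1)."""
    n, size = len(xs), m + 1
    m_h = [[0.0] * size for _ in range(size)]
    rhs = [0.0] * size
    for x, y in zip(xs, ys):
        h = [hermite(k, x, -sigma2_delta) for k in range(2 * m + 1)]
        for r in range(size):
            rhs[r] += h[r] * y / n
            for s in range(size):
                m_h[r][s] += h[r + s] / n
    return solve(m_h, rhs)
```

With `sigma2_delta = 0` the routine reduces to ordinary least squares, so error-free data generated from a quadratic return the generating coefficients up to rounding. The sketch does not guard against a singular or indefinite $M_H(-\sigma_\delta^2)$, which is exactly the small-sample problem discussed in the next section.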
4. Small sample modification
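From $\hat\beta_{Simex} = M_H^{-1}(-\sigma_\delta^2)T(-\sigma_\delta^2)\hat\theta$ it follows that $\hat\beta_{Simex}$ is the solution to the following equation:
$$
M_H(-\sigma_\delta^2)\hat\beta_{Simex} = T(-\sigma_\delta^2)\hat\theta. \tag{7}
$$
We have $M_H(-\sigma_\delta^2) \xrightarrow{P1} E\rho\rho^t$, as $n \to \infty$, and therefore it is positive definite for $n \ge n_0(\omega)$ a.s. For small samples, however, $M_H(-\sigma_\delta^2)$ can be indefinite, and this can cause significant bias for the Simex estimator. Introduce $V_i = h(x_i, -\sigma_\delta^2)h^t(x_i, -\sigma_\delta^2) - H(x_i, -\sigma_\delta^2)$. Averaging over $i = 1, \dots, n$ we can write $V = \overline{h(-\sigma_\delta^2)h^t(-\sigma_\delta^2)} - M_H(-\sigma_\delta^2)$, where the bar denotes the sample average and $h(-\sigma_\delta^2)$ stands for $h(x_i, -\sigma_\delta^2)$. Using this relation, the estimating equation (7) can be rewritten as
$$
\left(\overline{h(-\sigma_\delta^2)h^t(-\sigma_\delta^2)} - V\right)\hat\beta_{Simex} = T(-\sigma_\delta^2)\hat\theta. \tag{8}
$$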
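Define $\lambda$ as the smallest positive root of the equation $\det(A - \lambda B) = 0$, where
$$
A = \begin{pmatrix} \overline{y^2} & \overline{y\,h^t(-\sigma_\delta^2)} \\ \overline{h(-\sigma_\delta^2)\,y} & \overline{h(-\sigma_\delta^2)h^t(-\sigma_\delta^2)} \end{pmatrix}, \qquad
B = \begin{pmatrix} 0 & 0 \\ 0 & V \end{pmatrix}.
$$
We assume that $A$ is positive definite.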
To construct a small sample modification of the Simex estimator we use the same approach as in Cheng et al. (2000). The proofs of the next two theorems are similar to those in that paper.
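The modified Simex estimator can be found as a solution to the equation
$$
\left(\overline{h(-\sigma_\delta^2)h^t(-\sigma_\delta^2)} - aV\right)\hat\beta_{MSimex} = T(-\sigma_\delta^2)\hat\theta. \tag{9}
$$
Here $a$ is defined as
$$
a = \begin{cases}
(n - \alpha)/n, & \text{if } \lambda > 1 + \frac{1}{n}, \\
\lambda(n - \alpha)/(n + 1), & \text{if } \lambda \le 1 + \frac{1}{n},
\end{cases} \tag{10}
$$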
with some $\alpha < n$ to be chosen so that the resulting estimator possesses better small sample properties. The number $\alpha = m + 1$ is the lowest $\alpha$ that one should choose; see the discussion in Cheng et al. (2000).
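The case distinction in (10) is simple to code. The sketch below is illustrative only; the function name `shrinkage_factor` and the argument name `root` (the smallest positive root $\lambda$ of $\det(A - \lambda B) = 0$) are hypothetical.

```python
def shrinkage_factor(root, n, alpha):
    """Return the factor a from (10) for sample size n and tuning constant alpha < n."""
    if not alpha < n:
        raise ValueError("alpha must be smaller than n")
    if root > 1.0 + 1.0 / n:
        return (n - alpha) / n
    return root * (n - alpha) / (n + 1)
```

The two branches agree at `root = 1 + 1/n`, so `a` is a continuous function of the root. With `n = 20` and `alpha = 3` (that is, $\alpha = m + 1$ for a quadratic model) the first branch gives `a = 0.85`.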
Theorem 2. The following inequality holds a.s.:
$$
\overline{h(-\sigma_\delta^2)h^t(-\sigma_\delta^2)} - aV \ge \frac{\alpha + 1}{n + 1}\,\overline{h(-\sigma_\delta^2)h^t(-\sigma_\delta^2)} > 0. \tag{11}
$$
(Hereafter inequalities for matrices are understood in the Loewner order.)
Proof. As $A$ is positive definite, it can be decomposed as $A = CC^t$ with a nonsingular matrix $C$. Define $\tilde B = C^{-1}BC^{-t}$. Let $d$ be the largest eigenvalue of $\tilde B$. As the second diagonal element of $V$, $\overline{h_1^2(-\sigma_\delta^2)} - \overline{h_2(-\sigma_\delta^2)} = \sigma_\delta^2$, is positive, $\tilde B$ has at least one positive eigenvalue, and therefore $d > 0$. It follows that $\lambda = \frac{1}{d}$. Let $D$ be the diagonal matrix of eigenvalues of $\tilde B$ and $E$ be a matrix whose columns are the corresponding normalized eigenvectors. Then $\tilde B = EDE^t$, $EE^t = I$. It follows that $B = CEDE^tC^t$; with $T = CE$ we have $A = TT^t$ and $B = TDT^t$. Hence for any scalar $c$,
$$
A - cB = T(I - cD)T^t, \tag{12}
$$
with a nonsingular matrix $T$.
In the first case, when $\lambda > 1 + \frac{1}{n}$, we see that $d < \frac{n}{n+1}$ and therefore $D < \frac{n}{n+1} I$. Hence
$$
I - aD = I - \frac{n - \alpha}{n} D > \frac{\alpha + 1}{n + 1} I.
$$
In the second case $\lambda \le 1 + \frac{1}{n}$. In general $d^{-1}D \le I$. This implies that
$$
I - aD = I - \frac{\lambda(n - \alpha)}{n + 1} D = I - \frac{n - \alpha}{n + 1}\, d^{-1}D \ge \frac{\alpha + 1}{n + 1} I.
$$
Thus in both cases we obtain $A - aB \ge \frac{\alpha + 1}{n + 1} A > 0$. Deleting the first row and column of these matrices results in (11). □
Theorem 3. The modified estimator $\hat\beta_{MSimex}$ is asymptotically equivalent to the unmodified one $\hat\beta_{Simex}$:
$$
\sqrt{n}\,(\hat\beta_{MSimex} - \hat\beta_{Simex}) \xrightarrow{P} 0, \quad \text{as } n \to \infty.
$$
Proof. First we prove that $P(\lambda > 1)$ converges to 1, as $n \to \infty$. The condition $\lambda > 1$ is equivalent to $d < 1$, or $D < I$. According to (12) this is equivalent to $A > B$. As in the proof of Theorem 2, it can be shown that $A > B$ is equivalent to $\overline{h(-\sigma_\delta^2)h^t(-\sigma_\delta^2)} - V > 0$.
Since $\overline{h(-\sigma_\delta^2)h^t(-\sigma_\delta^2)} - V = M_H(-\sigma_\delta^2)$ converges with probability 1 to the matrix $E(\rho\rho^t)$, which is positive definite, we conclude that $P(A > B) = P(\lambda > 1)$ converges to 1, as $n \to \infty$. Now for $\lambda > 1$ we have, by the definition of $a$, that
$$
\frac{n - \alpha}{n + 1} < a \le \frac{n - \alpha}{n},
$$
and after some algebra
$$
\frac{\alpha + 1}{\sqrt{n}} > (1 - a)\sqrt{n} \ge \frac{\alpha}{\sqrt{n}}.
$$
This inequality holds with probability tending to 1, as $n \to \infty$. Since the outer parts of this inequality converge to 0, we have $(1 - a)\sqrt{n} \xrightarrow{P} 0$, as $n \to \infty$.
By subtracting equation (9) from (8) we derive after some algebra
$$
\left(\overline{h(-\sigma_\delta^2)h^t(-\sigma_\delta^2)} - aV\right)(\hat\beta_{Simex} - \hat\beta_{MSimex})\sqrt{n} = (1 - a)\sqrt{n}\, V \hat\beta_{Simex}.
$$
The right-hand side converges to 0, whereas $\overline{h(-\sigma_\delta^2)h^t(-\sigma_\delta^2)} - aV > 0$, therefore
$$
\sqrt{n}\,(\hat\beta_{MSimex} - \hat\beta_{Simex}) \xrightarrow{P} 0, \quad \text{as } n \to \infty. \qquad \square
$$
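The two steps labelled "after some algebra" in the proof of Theorem 3 can be written out as follows. This is a worked sketch; the shorthand $\bar M$ for the averaged matrix $\overline{h(-\sigma_\delta^2)h^t(-\sigma_\delta^2)}$ is introduced here for brevity only.

```latex
\begin{align*}
% Step 1: bounds on (1 - a)\sqrt{n} on the event \lambda > 1
\frac{n-\alpha}{n+1} < a \le \frac{n-\alpha}{n}
&\;\Longrightarrow\;
\frac{\alpha}{n} \le 1 - a < \frac{\alpha+1}{n+1} < \frac{\alpha+1}{n}
\;\Longrightarrow\;
\frac{\alpha}{\sqrt{n}} \le (1-a)\sqrt{n} < \frac{\alpha+1}{\sqrt{n}}.\\[4pt]
% Step 2: difference of the estimating equations (8) and (9)
(\bar M - aV)\bigl(\hat\beta_{Simex} - \hat\beta_{MSimex}\bigr)
&= (\bar M - aV)\hat\beta_{Simex} - T(-\sigma_\delta^2)\hat\theta\\
&= (\bar M - aV)\hat\beta_{Simex} - (\bar M - V)\hat\beta_{Simex}
 = (1-a)\,V\,\hat\beta_{Simex}.
\end{align*}
```

In Step 1 the bounds on $a$ cover both branches of (10): for $\lambda > 1 + \frac{1}{n}$ the upper bound is attained, and for $1 < \lambda \le 1 + \frac{1}{n}$ the value $\lambda(n-\alpha)/(n+1)$ lies between the two bounds. Multiplying Step 2 by $\sqrt{n}$ gives the identity used at the end of the proof.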
5. Simulation results
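The simulations were carried out in the statistical package R. We studied the quadratic model
$$
y_i = b_0 + b_1\xi_i + b_2\xi_i^2 + \varepsilon_i, \qquad x_i = \xi_i + \delta_i.
$$
We specified $\varepsilon_i$ and $\delta_i$ as normally distributed variables with $E\varepsilon_i\delta_i = 0$, $\sigma_\delta^2 = \sigma_\varepsilon^2 = 0.25$ and $\sigma_\xi^2 = 1$. The sample size was $n = 20$. For the Simex estimator the following values were used: $B = 100$, $K = 11$, $\lambda_k = k\sigma_\delta^2$, $k = 0, \dots, 10$. The true values were $b_0 = 5$, $b_1 = 6$, $b_2 = 3$.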
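The simulation design can be reproduced in outline as follows. The original study used R; this Python sketch only mirrors the stated settings. The function names are hypothetical, and the distribution of the latent $\xi_i$ (here normal with mean 0) is an assumption, since the text specifies only its variance.

```python
import random


def simulate_sample(n=20, beta=(5.0, 6.0, 3.0), sigma2_xi=1.0,
                    sigma2_delta=0.25, sigma2_eps=0.25, seed=0):
    """Draw one sample (xs, ys) from the quadratic errors-in-variables model."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        xi = rng.gauss(0.0, sigma2_xi ** 0.5)        # assumption: xi ~ N(0, sigma2_xi)
        eps = rng.gauss(0.0, sigma2_eps ** 0.5)      # regression error
        delta = rng.gauss(0.0, sigma2_delta ** 0.5)  # measurement error
        ys.append(beta[0] + beta[1] * xi + beta[2] * xi ** 2 + eps)
        xs.append(xi + delta)
    return xs, ys


def lambda_grid(sigma2_delta=0.25, K=11):
    """Return lambda_k = k * sigma2_delta for k = 0, ..., K - 1."""
    return [k * sigma2_delta for k in range(K)]
```

Calling `simulate_sample()` returns 20 observed pairs, and `lambda_grid()` returns the 11 grid points from 0 to 2.5 at which the naive estimator would be recomputed with `B = 100` pseudo-samples each.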
The simulation results are plotted below for the parameter $b_2$. The naive estimator is denoted by a solid circle, the ALS estimator by a square, the Simex estimator by a star, and the modified Simex estimator by a triangle. Open circles correspond to naive estimators with larger error variance. The solid line describes the behavior of the fitted model and the dashed line denotes the true value of the parameter.
In the first picture $M_H(-\sigma_\delta^2)$ is not positive definite, and as a result the Simex estimator has an extremely large estimation error ($\hat\beta_{Simex} = 35.03$, while $\hat\beta_{MSimex} = 2.91$).
[Figure, first picture: values of the naive estimator for b2 (vertical axis) against lambda values (horizontal axis); legend: Naive, SIMEX, ALS, Modified Simex.]
In the second picture $M_H(-\sigma_\delta^2)$ is positive definite, and the Simex estimator is a good one ($\hat\beta_{Simex} = 3.19$ and $\hat\beta_{MSimex} = 3.04$).
[Figure, second picture: values of the naive estimator for b2 (vertical axis) against lambda values (horizontal axis); legend: Naive, SIMEX, ALS, Modified Simex.]
6. Conclusion
In this article the Simex estimator for the polynomial errors-in-variables model is constructed. It differs from the classical Simex estimator proposed by Cook and Stefanski (1994) in that the observed variables are used to model the naive estimator as a function of the extra variance. The consistency of the constructed Simex estimator is proved. The estimator is then modified so that it shows good results for small samples without losing its asymptotic properties for large samples. Simulation studies carried out in the statistical package R corroborate the theoretical results.
Bibliography
1. Cheng, C.-L., and Schneeweiss, H. (1998). Polynomial regression with errors in the variables. J. R. Statist. Soc. B, 60, 189-199.
2. Cheng, C.-L., Schneeweiss, H., and Thamerus, M. (2000). A small sample estimator for a polynomial regression with errors in the variables. J. R. Statist. Soc. B, 62, 699-709.
3. Cook, J., and Stefanski, L. A. (1994). A simulation extrapolation method for parametric measurement error models. Journal of the American Statistical Association, 89, 1314-1328.
4. Nakamura, T. (1990). Corrected score functions for errors-in-variables models: methodology and application to generalized linear models. Biometrika, 77, 127-137.
5. Polzehl, J., and Zwanzig, S. (2005). Simex and TLS: an equivalence result. WIAS, Technical Report 999, Berlin.
6. Stefanski, L. A. (1989). Unbiased estimation of a nonlinear function of a normal mean with application to measurement error model. Communications in Statistics, Series A, 18, 4335-4358.
Department of Probability Theory and Mathematical Statistics,
Kyiv National Taras Shevchenko University, Kyiv, Ukraine
E-mail address: gontaro@ukr.net.
Department of Probability Theory and Mathematical Statistics,
Kyiv National Taras Shevchenko University, Kyiv, Ukraine
E-mail address: exipilis@yandex.ru.
|
| id | nasplib_isofts_kiev_ua-123456789-4478 |
| institution | Digital Library of Periodicals of National Academy of Sciences of Ukraine |
| issn | 0321-3900 |
| language | English |
| last_indexed | 2025-11-30T09:12:47Z |
| publishDate | 2007 |
| publisher | Інститут математики НАН України |
| record_format | dspace |
| title | Simex estimator for polynomial errors-in-variables model |
| url | https://nasplib.isofts.kiev.ua/handle/123456789/4478 |