Estimation of a distribution function by an indirect sample
The problem of estimation of a distribution function is considered in the case where the observer has access only to a part of the indicator random values. Some basic asymptotic properties of the constructed estimates are studied. The limit theorems are proved for continuous functionals related to t...
| Date: | 2010 |
|---|---|
| Main Authors: | Babilua, P.; Nadaraya, E.; Sokhadze, G. A. |
| Format: | Article |
| Language: | English |
| Published: |
Institute of Mathematics, NAS of Ukraine
2010
|
| Online Access: | https://umj.imath.kiev.ua/index.php/umj/article/view/2989 |
| Journal Title: | Ukrains’kyi Matematychnyi Zhurnal |
Institution: Ukrains’kyi Matematychnyi Zhurnal

| _version_ | 1860509000581251072 |
|---|---|
| author | Babilua, P. Nadaraya, E. Sokhadze, G. A. Бабілуа, П. К. Надарая, Е. А. Сохадзе, Г. А. |
| author_facet | Babilua, P. Nadaraya, E. Sokhadze, G. A. Бабілуа, П. К. Надарая, Е. А. Сохадзе, Г. А. |
| author_sort | Babilua, P. |
| baseUrl_str | https://umj.imath.kiev.ua/index.php/umj/oai |
| collection | OJS |
| datestamp_date | 2020-03-18T19:41:53Z |
| description | The problem of estimation of a distribution function is considered in the case where the observer has access only to a part of the indicator random values. Some basic asymptotic properties of the constructed estimates are studied. The limit theorems are proved for continuous functionals related to the estimation of $F^n(x)$ in the space $C[a,\; 1 - a], 0 |
| first_indexed | 2026-03-24T02:34:09Z |
| format | Article |
| fulltext |
UDC 519.21
E. Nadaraya, P. Babilua, G. Sokhadze (Iv. Javakhishvili Tbilisi State Univ., Georgia)
THE ESTIMATION OF A DISTRIBUTION FUNCTION
BY AN INDIRECT SAMPLE
ОЦIНЮВАННЯ ФУНКЦIЇ РОЗПОДIЛУ
З ВИКОРИСТАННЯМ НЕПРЯМОЇ ВИБIРКИ
The problem of estimating a distribution function is considered when the observer has access only to some indicator random variables. Some basic asymptotic properties of the constructed estimates are studied. Limit theorems are proved for continuous functionals related to the estimate $\hat F_n(x)$ in the space $C[a, 1-a]$, $0 < a < 1/2$.
Розглянуто задачу оцiнювання функцiї розподiлу у випадку, коли спостерiгач має доступ лише до деяких
iндикаторних випадкових значень. Вивчено деякi базовi асимптотичнi властивостi побудованих оцiнок.
У статтi доведено граничнi теореми для неперервних функцiоналiв щодо оцiнки F̂n(x) у просторi
C[a, 1− a], 0 < a < 1/2.
Let $X_1, X_2, \dots, X_n$ be a sample of independent observations of a non-negative random variable $X$ with distribution function $F(x)$. In the theory of censored observations, the sample values are pairs $Y_i = X_i \wedge t_i$ and $Z_i = I(Y_i = X_i)$, $i = 1, \dots, n$, where the $t_i$ are given numbers ($t_i \ne t_j$ for $i \ne j$) or random variables independent of $X_i$, $i = 1, \dots, n$. Throughout the paper, $I(A)$ denotes the indicator of the set $A$.

The present study deals with a somewhat different case: the observer has access only to the values of the random variables $\xi_i = I(X_i < t_i)$ with $t_i = c_F \frac{2i-1}{2n}$, $i = 1, \dots, n$, where $c_F = \inf\{x \ge 0 : F(x) = 1\} < \infty$.

The problem is to estimate the distribution function $F(x)$ from the sample $\xi_1, \xi_2, \dots, \xi_n$. Such a problem arises, for example, in corrosion studies; see [1], where an experiment related to corrosion is described.
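As an estimate for $F(x)$ we consider an expression of the form
$$
\hat F_n(x) = \begin{cases} 0, & x \le 0, \\ F_{1n}(x)\, F_{2n}^{-1}(x), & 0 < x < c_F, \\ 1, & x \ge c_F, \end{cases} \tag{1}
$$
$$
F_{1n}(x) = \frac{1}{nh} \sum_{j=1}^{n} K\left(\frac{x - t_j}{h}\right) \xi_j, \qquad F_{2n}(x) = \frac{1}{nh} \sum_{j=1}^{n} K\left(\frac{x - t_j}{h}\right),
$$
where $K(x)$ is a probability density (kernel), $K(x) = K(-x)$, $x \in (-\infty, \infty)$, and $\{h = h(n)\}$ is a sequence of positive numbers converging to zero.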
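The definition (1) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the Epanechnikov kernel, the uniform law for $X$, the sample size and the bandwidth are our choices, and the names `epanechnikov`, `design_points` and `f_hat` are hypothetical, not taken from the paper.

```python
import random


def epanechnikov(u):
    """A symmetric probability density supported on [-1, 1] (assumed kernel)."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0


def design_points(n, c_f=1.0):
    """Design points t_i = c_F * (2i - 1) / (2n), i = 1, ..., n."""
    return [c_f * (2 * i - 1) / (2 * n) for i in range(1, n + 1)]


def f_hat(x, xi, t, h, kernel=epanechnikov, c_f=1.0):
    """Estimate (1): F_1n(x) / F_2n(x) on (0, c_F), 0 to the left, 1 to the right."""
    if x <= 0.0:
        return 0.0
    if x >= c_f:
        return 1.0
    weights = [kernel((x - tj) / h) for tj in t]
    denom = sum(weights)              # equals n * h * F_2n(x)
    if denom == 0.0:                  # no design point inside the kernel window
        return float("nan")
    # The common factor 1 / (n h) cancels in the ratio F_1n / F_2n.
    return sum(w * s for w, s in zip(weights, xi)) / denom


random.seed(0)
n, h = 2000, 0.1
t = design_points(n)
x_sample = [random.random() for _ in range(n)]   # X ~ Uniform(0, 1), so F(x) = x
xi = [1 if x_i < t_i else 0 for x_i, t_i in zip(x_sample, t)]  # only xi is observed
estimate = f_hat(0.5, xi, t, h)
```

Only the indicators `xi` enter the estimate; the latent values `x_sample` are generated here solely to simulate them.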
1. In this section we give conditions for asymptotic unbiasedness and consistency, together with a theorem on the limiting distribution of $\hat F_n(x)$.
© E. NADARAYA, P. BABILUA, G. SOKHADZE, 2010
1642 ISSN 1027-3190. Укр. мат. журн., 2010, т. 62, № 12
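Lemma 1. Assume that

1°. $K(x)$ is a function of bounded variation.

If $nh \to \infty$, then
$$
\frac{1}{nh} \sum_{j=1}^{n} K^{m_1-1}\left(\frac{x - t_j}{h}\right) F^{m_2-1}(t_j) = \frac{1}{c_F h} \int_0^{c_F} K^{m_1-1}\left(\frac{x - u}{h}\right) F^{m_2-1}(u)\, du + O\left(\frac{1}{nh}\right) \tag{2}
$$
uniformly with respect to $x \in [0, c_F]$; here $m_1$, $m_2$ are natural numbers.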
Proof. Let $P(x)$ be the uniform distribution function on $[0, c_F]$, and let $P_n(x)$ be the empirical distribution function of “the sample” $t_1, t_2, \dots, t_n$, i.e., $P_n(x) = n^{-1} \sum_{j=1}^{n} I(t_j < x)$. It is obvious that
$$
\sup_{0 \le x \le c_F} |P_n(x) - P(x)| = \sup_{0 \le x \le c_F} \left| \frac{1}{n} \left[ n \frac{x}{c_F} + \frac{1}{2} \right] - \frac{x}{c_F} \right| \le \frac{1}{2n}, \tag{3}
$$
where $[\,\cdot\,]$ denotes the integer part. We have
$$
\frac{1}{nh} \sum_{i=1}^{n} K^{m_1-1}\left(\frac{x - t_i}{h}\right) F^{m_2-1}(t_i) - \frac{1}{c_F h} \int_0^{c_F} K^{m_1-1}\left(\frac{x - u}{h}\right) F^{m_2-1}(u)\, du = \frac{1}{h} \int_0^{c_F} K^{m_1-1}\left(\frac{x - u}{h}\right) F^{m_2-1}(u)\, d(P_n(u) - P(u)). \tag{4}
$$
Integrating by parts in the integral on the right-hand side of (4) and taking (3) into account, we obtain (2).

Lemma 1 is proved.
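A quick numerical sanity check of (2) can be sketched as follows, for $m_1 = m_2 = 2$ and $c_F = 1$. The Epanechnikov kernel, the choice $F(u) = u$ and the values of $n$, $h$, $x$ are assumptions made only for this illustration; the function name `lemma1_sum` is hypothetical. For a symmetric kernel supported on $[-1, 1]$ and $h \le x \le 1 - h$, the integral on the right-hand side of (2) equals $x$ exactly when $F(u) = u$.

```python
def epanechnikov(u):
    """A symmetric probability density supported on [-1, 1] (assumed kernel)."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0


def lemma1_sum(x, n, h, F, kernel=epanechnikov):
    """Left-hand side of (2) with m1 = m2 = 2 and c_F = 1."""
    total = 0.0
    for j in range(1, n + 1):
        t_j = (2 * j - 1) / (2 * n)          # design point t_j
        total += kernel((x - t_j) / h) * F(t_j)
    return total / (n * h)


n, h, x = 1000, 0.1, 0.5
lhs = lemma1_sum(x, n, h, F=lambda u: u)     # F(u) = u on [0, 1]
rhs = x                                      # value of (1/h) * int K((x-u)/h) u du
gap = abs(lhs - rhs)                         # Lemma 1 bounds this by O(1 / (n h))
```

The sum over the points $t_j$ is a midpoint Riemann sum, which is why the gap is well inside the $O(1/(nh))$ bound in this smooth example.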
Below it is assumed without loss of generality that $[0, c_F] = [0, 1]$.

Theorem 1. Let $F(x)$ be continuous and let the conditions of Lemma 1 be fulfilled. Then the estimate (1) is asymptotically unbiased and consistent at all points $x \in [0, 1]$. Moreover, $\hat F_n(x)$ is asymptotically normal, i.e.,
$$
\sqrt{nh}\, \bigl( \hat F_n(x) - E\hat F_n(x) \bigr)\, \sigma^{-1}(x) \xrightarrow{d} N(0, 1), \qquad \sigma^2(x) = F(x)(1 - F(x)) \int K^2(u)\, du,
$$
where $\xrightarrow{d}$ denotes convergence in distribution and $N(0, 1)$ is a random variable having the normal distribution with mean 0 and variance 1.
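As a hedged illustration of the variance constant in Theorem 1, take the Epanechnikov kernel $K(u) = \tfrac{3}{4}(1 - u^2)$ for $|u| \le 1$ and $K(u) = 0$ otherwise. This kernel is our choice for the example, not one fixed by the paper. Then
$$
\int K^2(u)\, du = \frac{9}{16} \int_{-1}^{1} (1 - 2u^2 + u^4)\, du = \frac{9}{16} \left( 2 - \frac{4}{3} + \frac{2}{5} \right) = \frac{9}{16} \cdot \frac{16}{15} = \frac{3}{5},
$$
so that $\sigma^2(x) = \tfrac{3}{5} F(x)(1 - F(x)) \le \tfrac{3}{20}$, with the maximum attained where $F(x) = \tfrac{1}{2}$.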
Proof. By Lemma 1 we have
$$
E F_{1n}(x) = \int_{\frac{x-1}{h}}^{\frac{x}{h}} K(t) F(x + ht)\, dt + O\left(\frac{1}{nh}\right), \qquad F_{2n}(x) = \frac{1}{h} \int_0^1 K\left(\frac{x - u}{h}\right) du + O\left(\frac{1}{nh}\right), \tag{5}
$$
and, as $n \to \infty$,
$$
\frac{1}{h} \int_0^1 K\left(\frac{x - u}{h}\right) du \longrightarrow F_2(x) = \begin{cases} 1, & x \in (0, 1), \\ \dfrac{1}{2}, & x = 0, \ x = 1, \end{cases}
$$
$$
\int_{\frac{x-1}{h}}^{\frac{x}{h}} K(t) F(x + th)\, dt \longrightarrow F(x) F_2(x).
$$
Hence it follows that $E\hat F_n(x) \to F(x)$, $x \in [0, 1]$, as $n \to \infty$.
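Analogously, it is not difficult to show that
$$
\operatorname{Var} \hat F_n(x) = \left[ \frac{1}{nh^2} \int_0^1 K^2\left(\frac{x - u}{h}\right) F(u)(1 - F(u))\, du + O\left(\frac{1}{(nh)^2}\right) \right] F_{2n}^{-2}(x).
$$
Hence we readily derive
$$
nh \operatorname{Var} \hat F_n(x) \sim \sigma^2(x) = F(x)(1 - F(x)) \int K^2(u)\, du \tag{6}
$$
for $x \in [0, 1]$.

Thus $\hat F_n(x)$ is a consistent estimate for $F(x)$, $x \in [0, 1]$, and therefore
$$
P\bigl\{ \hat F_n(x_1) \le \hat F_n(x_2) \bigr\} \longrightarrow 1 \quad \text{as } n \to \infty, \quad x_1 < x_2, \quad x_1, x_2 \in [0, 1].
$$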
Let us now establish that $\hat F_n(x)$ is asymptotically normal. Since, by virtue of (5), $F_{2n}(x) \to F_2(x)$, it remains to verify the condition of the Liapunov central limit theorem for $F_{1n}(x)$.

Let us denote
$$
\eta_i = \eta_i(x) = (nh)^{-1} K\left(\frac{x - t_i}{h}\right) \xi_i
$$
and show that
$$
L_n = \sum_{j=1}^{n} E|\eta_j - E\eta_j|^{2+\delta} \bigl( \operatorname{Var} F_{1n}(x) \bigr)^{-1-\frac{\delta}{2}} \longrightarrow 0, \quad \delta > 0. \tag{7}
$$
We have
$$
\sum_{j=1}^{n} E|\eta_j - E\eta_j|^{2+\delta} \le 2 M^{1+\delta} (nh)^{-(2+\delta)} \sum_{j=1}^{n} K\left(\frac{x - t_j}{h}\right) F(t_j), \qquad M = \max_{x \in R} K(x).
$$
Hence, taking (2) into account, we find
$$
\sum_{i=1}^{n} E|\eta_i - E\eta_i|^{2+\delta} \le c_1 (nh)^{-(1+\delta)}. \tag{8}
$$
Using the relation (6) and the inequality (8), we establish that $L_n = O\bigl((nh)^{-\frac{\delta}{2}}\bigr)$, i.e., (7) holds.

Theorem 1 is proved.
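2. Uniform consistency. In this section we give conditions under which the estimate $\hat F_n(x)$ converges uniformly in probability (a. s.) to the true $F(x)$.

Let us introduce the Fourier transform of the function $K(x)$,
$$
\varphi(t) = \int_{-\infty}^{\infty} e^{itx} K(x)\, dx,
$$
and assume that

2°. $\varphi(t)$ is absolutely integrable.

Following E. Parzen [2], $F_{1n}(x)$ can be represented as
$$
F_{1n}(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-iu\frac{x}{h}} \varphi(u)\, \frac{1}{nh} \sum_{j=1}^{n} \xi_j e^{iu\frac{t_j}{h}}\, du.
$$
Thus
$$
F_{1n}(x) - E F_{1n}(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-iu\frac{x}{h}} \varphi(u)\, \frac{1}{nh} \sum_{j=1}^{n} (\xi_j - F(t_j)) e^{iu\frac{t_j}{h}}\, du.
$$
Denote
$$
d_n = \sup_{x \in \Omega_n} |\hat F_n(x) - E\hat F_n(x)|, \qquad \Omega_n = [h^{\alpha}, 1 - h^{\alpha}], \quad 0 < \alpha < 1.
$$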
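Theorem 2. Let $K(x)$ satisfy conditions 1° and 2°.

(a) If $F(x)$ is continuous and $n^{\frac{1}{2}} h_n \to \infty$, then
$$
D_n = \sup_{x \in \Omega_n} |\hat F_n(x) - F(x)| \xrightarrow{P} 0.
$$
(b) If $\sum_{n=1}^{\infty} n^{-\frac{p}{2}} h^{-p} < \infty$, $p > 2$, then $D_n \to 0$ a. s.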
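Proof. We have
$$
\sup_{x \in \Omega_n} \left( 1 - \frac{1}{h} \int_0^1 K\left(\frac{x - u}{h}\right) du \right) \le \int_{-\infty}^{-h^{\alpha-1}} K(u)\, du + \int_{h^{\alpha-1}}^{\infty} K(u)\, du \longrightarrow 0. \tag{9}
$$
This and (5) imply that
$$
\sup_{x \in \Omega_n} |F_{2n}(x) - 1| \longrightarrow 0, \tag{10}
$$
i.e., by the uniform convergence, for any $\varepsilon_0$, $0 < \varepsilon_0 < 1$, and all sufficiently large $n \ge n_0$ we have $F_{2n}(x) \ge 1 - \varepsilon_0$ uniformly with respect to $x \in \Omega_n$. Therefore,
$$
d_n \le (1 - \varepsilon_0)^{-1} \sup_{x \in \Omega_n} |F_{1n}(x) - E F_{1n}(x)| \le (1 - \varepsilon_0)^{-1} \frac{1}{2\pi} \int_{-\infty}^{\infty} |\varphi(u)|\, \frac{1}{nh} \left| \sum_{j=1}^{n} \eta_j e^{iu\frac{t_j}{h}} \right| du, \qquad \eta_j = \xi_j - F(t_j).
$$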
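Hence, by Hölder’s inequality, we obtain
$$
d_n^p \le (1 - \varepsilon_0)^{-p} \frac{1}{(2\pi)^p} \frac{1}{(nh)^p} \int_{-\infty}^{\infty} |\varphi(u)| \left| \sum_{j=1}^{n} \eta_j e^{iu\frac{t_j}{h}} \right|^p du \left( \int_{-\infty}^{\infty} |\varphi(u)|\, du \right)^{\frac{p}{q}}, \qquad \frac{1}{p} + \frac{1}{q} = 1, \quad p > 2.
$$
Thus
$$
E d_n^p \le c(\varepsilon_0, p, \varphi) \frac{1}{(nh)^p} \int_{-\infty}^{\infty} |\varphi(u)|\, E \left| \sum_{j,k} \cos\left( \left(\frac{t_j - t_k}{h}\right) u \right) \eta_j \eta_k \right|^{\frac{p}{2}} du, \tag{11}
$$
where
$$
c(\varepsilon_0, p, \varphi) = (1 - \varepsilon_0)^{-p} \frac{1}{(2\pi)^p} \left( \int_{-\infty}^{\infty} |\varphi(u)|\, du \right)^{\frac{p}{q}}.
$$
Denote
$$
A(u) = \sum_{j,k} \cos\left( \left(\frac{t_j - t_k}{h}\right) u \right) \eta_j \eta_k.
$$
Then by (11) we write
$$
E d_n^p \le 2^{\frac{p}{2}-1} c(\varepsilon_0, p, \varphi) \frac{1}{(nh)^p} \left[ \int_{-\infty}^{\infty} |\varphi(u)|\, |E A(u)|^{\frac{p}{2}}\, du + \int_{-\infty}^{\infty} |\varphi(u)|\, E|A(u) - E A(u)|^{\frac{p}{2}}\, du \right]. \tag{12}
$$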
Using Whittle’s inequality [3] for moments of quadratic forms, we obtain
$$
E|A(u) - E A(u)|^{\frac{p}{2}} \le 2^{\frac{3}{2}p}\, c\left(\frac{p}{2}\right) [c(p)]^{\frac{1}{2}} \left( \sum_{j,k} \cos^2\left( \left(\frac{t_j - t_k}{h}\right) u \right) \gamma_j^2(p) \gamma_k^2(p) \right)^{\frac{p}{4}},
$$
where
$$
\gamma_k(p) = \bigl( E|\eta_k|^p \bigr)^{\frac{1}{p}} \le 1, \qquad c(p) = \frac{2^{\frac{p}{2}}}{\sqrt{\pi}}\, \Gamma\left(\frac{p + 1}{2}\right).
$$
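The constant $c(p)$ is easy to evaluate numerically. The sketch below uses only `math.gamma` from the Python standard library; the function name `whittle_c` is hypothetical. As a side remark that is ours rather than the paper's, $c(p)$ coincides with the $p$-th absolute moment of a standard normal variable, which gives two simple check values, $c(2) = 1$ and $c(4) = 3$.

```python
import math


def whittle_c(p):
    """c(p) = 2^(p/2) * Gamma((p + 1) / 2) / sqrt(pi)."""
    return 2.0 ** (p / 2.0) * math.gamma((p + 1.0) / 2.0) / math.sqrt(math.pi)


c2 = whittle_c(2)   # second absolute moment of N(0, 1)
c4 = whittle_c(4)   # fourth absolute moment of N(0, 1)
```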
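Hence it follows that
$$
E|A(u) - E A(u)|^{\frac{p}{2}} = O\bigl(n^{\frac{p}{2}}\bigr) \tag{13}
$$
uniformly with respect to $u \in (-\infty, \infty)$. It is also clear that
$$
|E A(u)|^{\frac{p}{2}} = O\bigl(n^{\frac{p}{2}}\bigr) \tag{14}
$$
uniformly with respect to $u \in (-\infty, \infty)$.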
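Combining the relations (12), (13) and (14), we obtain
$$
E d_n^p = O\left( \frac{1}{(\sqrt{n}\, h)^p} \right), \quad p > 2.
$$
Therefore,
$$
P\left\{ \sup_{x \in \Omega_n} \bigl| \hat F_n(x) - E\hat F_n(x) \bigr| \ge \varepsilon \right\} \le \frac{c_3}{\varepsilon^p (\sqrt{n}\, h)^p}. \tag{15}
$$
Furthermore, we have
$$
\sup_{x \in \Omega_n} \bigl| E\hat F_n(x) - F(x) \bigr| \le \frac{1}{1 - \varepsilon_0} \left( \sup_{x \in \Omega_n} |E F_{1n}(x) - F(x)| + \sup_{x \in \Omega_n} |1 - F_{2n}(x)| \right). \tag{16}
$$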
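By virtue of (10), the second summand on the right-hand side of (16) tends to 0, whereas the first summand is estimated as follows:
$$
\sup_{x \in \Omega_n} |E F_{1n}(x) - F(x)| \le S_{1n} + S_{2n} + O\left(\frac{1}{nh}\right), \tag{17}
$$
$$
S_{1n} = \sup_{0 \le x \le 1} \left| \frac{1}{h} \int_0^1 (F(y) - F(x)) K\left(\frac{x - y}{h}\right) dy \right|, \qquad S_{2n} = \sup_{x \in \Omega_n} \left( 1 - \frac{1}{h} \int_0^1 K\left(\frac{x - y}{h}\right) dy \right),
$$
and, by virtue of (9),
$$
S_{2n} \longrightarrow 0 \tag{18}
$$
as $n \to \infty$.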
Let us now consider $S_{1n}$. Note that
$$
S_{1n} \le \sup_{0 \le x \le 1} \int_0^1 |F(y) - F(x)|\, \frac{1}{h} K\left(\frac{x - y}{h}\right) dy = \sup_{0 \le x \le 1} \int_{x-1}^{x} |F(x - u) - F(x)|\, \frac{1}{h} K\left(\frac{u}{h}\right) du \le \sup_{0 \le x \le 1} \int_{-\infty}^{\infty} |F(x - u) - F(x)|\, \frac{1}{h} K\left(\frac{u}{h}\right) du. \tag{19}
$$
Let $\delta > 0$ and divide the integration domain in (19) into the two domains $|u| \le \delta$ and $|u| > \delta$. Then
$$
S_{1n} \le \sup_{0 \le x \le 1} \int_{|u| \le \delta} |F(x - u) - F(x)|\, \frac{1}{h} K\left(\frac{u}{h}\right) du + \sup_{0 \le x \le 1} \int_{|u| > \delta} |F(x - u) - F(x)|\, \frac{1}{h} K\left(\frac{u}{h}\right) du \le \sup_{x \in R} \sup_{|u| \le \delta} |F(x - u) - F(x)| + 2 \int_{|u| \ge \frac{\delta}{h}} K(u)\, du. \tag{20}
$$
By the choice of $\delta > 0$, the first summand on the right-hand side of (20) can be made arbitrarily small. Fixing such a $\delta > 0$ and letting $n \to \infty$, we find that the second summand tends to zero. Therefore,
$$
\lim_{n \to \infty} S_{1n} = 0. \tag{21}
$$
Finally, the proof of the theorem follows from the relations (15)-(18) and (21).
Remark 1. 1. If $K(x) = 0$ for $|x| \ge 1$ and $\alpha = 1$, i. e., $\Omega_n = [h, 1 - h]$, then $S_{2n} = 0$.

2. Under the conditions of Theorem 2,
$$
\sup_{x \in [a, b]} |\hat F_n(x) - F(x)| \longrightarrow 0
$$
in probability (a. s.) for any fixed interval $[a, b] \subset [0, 1]$ for which there exists $n_0$ such that $[a, b] \subset \Omega_n$, $n \ge n_0$.

Assume that $h = n^{-\gamma}$, $\gamma > 0$. The conditions of Theorem 2 are fulfilled: $n^{\frac{1}{2}} h_n \to \infty$ if $0 < \gamma < \frac{1}{2}$, and $\sum_{n=1}^{\infty} n^{-\frac{p}{2}} h_n^{-p} < \infty$ if $0 < \gamma < \frac{p - 2}{2p}$, $p > 2$.
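The two ranges for $\gamma$ can be verified directly; the short computation below is a routine check added for convenience. With $h_n = n^{-\gamma}$,
$$
n^{\frac{1}{2}} h_n = n^{\frac{1}{2} - \gamma} \to \infty \iff \gamma < \frac{1}{2},
$$
$$
\sum_{n=1}^{\infty} n^{-\frac{p}{2}} h_n^{-p} = \sum_{n=1}^{\infty} n^{-p\left(\frac{1}{2} - \gamma\right)} < \infty \iff p\left(\frac{1}{2} - \gamma\right) > 1 \iff \gamma < \frac{1}{2} - \frac{1}{p} = \frac{p - 2}{2p}.
$$
For instance, $p = 4$ allows any $0 < \gamma < \frac{1}{4}$ in part (b) of Theorem 2.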
3. Estimation of moments. In this setting there naturally arises the question of estimating integral functionals of $F(x)$, for example, the moments $\mu_m$, $m \ge 1$:
$$
\mu_m = m \int_0^1 t^{m-1} (1 - F(t))\, dt.
$$
As estimates for $\mu_m$ we consider the statistics
$$
\hat\mu_{nm} = 1 - \frac{m}{n} \sum_{j=1}^{n} \xi_j\, \frac{1}{h} \int_h^{1-h} t^{m-1} K\left(\frac{t - t_j}{h}\right) F_{2n}^{-1}(t)\, dt.
$$
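Interchanging the finite sum and the integral shows that the statistic above equals $1 - m \int_h^{1-h} t^{m-1} \hat F_n(t)\, dt$, since $\frac{1}{nh} \sum_j K\bigl(\frac{t - t_j}{h}\bigr) \xi_j = F_{1n}(t)$. The sketch below evaluates it with a midpoint rule. Everything concrete in it is an assumption for illustration only: the Epanechnikov kernel, the uniform law for $X$, the values of $n$ and $h$, the grid size, and the hypothetical names `f_hat` and `moment_estimate`.

```python
import random


def epanechnikov(u):
    """A symmetric probability density supported on [-1, 1] (assumed kernel)."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0


def f_hat(x, xi, t, h):
    """Ratio F_1n(x) / F_2n(x) for x in [h, 1 - h]; the factor 1/(nh) cancels."""
    weights = [epanechnikov((x - tj) / h) for tj in t]
    return sum(w * s for w, s in zip(weights, xi)) / sum(weights)


def moment_estimate(m, xi, t, h, grid=200):
    """1 - m * integral over [h, 1 - h] of s^(m-1) * F_hat(s) ds (midpoint rule)."""
    step = (1.0 - 2.0 * h) / grid
    integral = 0.0
    for k in range(grid):
        s = h + (k + 0.5) * step
        integral += s ** (m - 1) * f_hat(s, xi, t, h) * step
    return 1.0 - m * integral


random.seed(1)
n, h = 500, 0.05
t = [(2 * i - 1) / (2 * n) for i in range(1, n + 1)]
xi = [1 if random.random() < t_i else 0 for t_i in t]   # xi_i = I(X_i < t_i), X_i ~ U(0, 1)
mu1_hat = moment_estimate(1, xi, t, h)                  # true first moment is 0.5
```

For finite $n$ the truncation of the integral to $[h, 1 - h]$ leaves a bias of order $h$, so in this example the estimate is expected to sit somewhat above 0.5.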
Theorem 3. Let $K(x)$ satisfy condition 1° and, in addition, let $K(x) = 0$ outside the interval $[-1, 1]$. If $nh \to \infty$ as $n \to \infty$, then $\hat\mu_{nm}$ is an asymptotically unbiased, consistent estimate for $\mu_m$ and, moreover,
$$
\frac{\sqrt{n}\, (\hat\mu_{nm} - E\hat\mu_{nm})}{\sigma} \xrightarrow{d} N(0, 1), \qquad \sigma^2 = m^2 \int_0^1 t^{2m-2} F(t)(1 - F(t))\, dt.
$$
Proof. Since $K(x)$ has $[-1, 1]$ as its support, we establish from (5) that $F_{2n}(x) = 1 + O\left(\frac{1}{nh}\right)$ uniformly with respect to $x \in [h, 1 - h]$.
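Hence, by Lemma 1 we have
$$
E\hat\mu_{nm} = 1 - \frac{m}{n} \sum_{j=1}^{n} F(t_j)\, \frac{1}{h} \int_h^{1-h} t^{m-1} K\left(\frac{t - t_j}{h}\right) F_{2n}^{-1}(t)\, dt =
$$
$$
= 1 - m \int_h^{1-h} \left[ \frac{1}{h} \int_0^1 K\left(\frac{t - u}{h}\right) F(u)\, du \right] t^{m-1}\, dt + O\left(\frac{1}{nh}\right) =
$$
$$
= 1 - m \int_h^{1-h} \left[ \int_{-1}^{1} K(v) F(t + vh)\, dv \right] t^{m-1}\, dt + O\left(\frac{1}{nh}\right) =
$$
$$
= 1 - m \int_0^1 t^{m-1} \left[ \int_{-1}^{1} K(v) F(t + vh)\, dv \right] dt + O(h) + O\left(\frac{1}{nh}\right). \tag{22}
$$
By the Lebesgue dominated convergence theorem, from (22) we establish that
$$
E\hat\mu_{nm} \longrightarrow 1 - m \int_0^1 F(t) t^{m-1}\, dt = m \int_0^1 t^{m-1} (1 - F(t))\, dt = \mu_m, \quad m \ge 1. \tag{23}
$$
Therefore, $\hat\mu_{nm}$ is an asymptotically unbiased estimate for $\mu_m$.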
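Further, analogously to (22), it can be shown that
$$
\operatorname{Var} \hat\mu_{nm} = \frac{m^2}{n} \int_0^1 F(t)(1 - F(t))\, t^{2m-2} \left[ \mathcal{K}\left(\frac{1 - t}{h} - 1\right) - \mathcal{K}\left(\frac{1 - t}{h}\right) \right]^2 dt + O\left(\frac{h}{n}\right) + O\left(\frac{1}{(nh)^2}\right),
$$
where $\mathcal{K}$ denotes the integrated kernel,
$$
\mathcal{K}(v) = \int_{-\infty}^{v} K(u)\, du.
$$
By the same Lebesgue theorem we see that
$$
n \operatorname{Var} \hat\mu_{nm} \sim \sigma^2 = m^2 \int_0^1 t^{2m-2} F(t)(1 - F(t))\, dt. \tag{24}
$$
Therefore, (23) and (24) imply that $\hat\mu_{nm} \xrightarrow{P} \mu_m$.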
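To complete the proof of the theorem it remains to show that the statistics $\sqrt{n}\, (\hat\mu_{nm} - E\hat\mu_{nm})$ are asymptotically normal with mean 0 and variance $\sigma^2$. For this it suffices to show that the Liapunov fraction $L_n \to 0$. Indeed,
$$
L_n = n^{-(2+\delta)} m^{2+\delta} \sum_{j=1}^{n} E|\xi_j - F(t_j)|^{2+\delta} \left| \frac{1}{h} \int_h^{1-h} t^{m-1} K\left(\frac{t - t_j}{h}\right) F_{2n}^{-1}(t)\, dt \right|^{2+\delta} (\operatorname{Var} \hat\mu_{nm})^{-\left(1 + \frac{\delta}{2}\right)} \le
$$
$$
\le c_6\, n^{-(2+\delta)} \sum_{j=1}^{n} E|\xi_j - F(t_j)|^{2+\delta} (\operatorname{Var} \hat\mu_{nm})^{-\left(1 + \frac{\delta}{2}\right)} \le c_7\, n^{-1-\delta} (\operatorname{Var} \hat\mu_{nm})^{-1-\frac{\delta}{2}} = O\bigl(n^{-\frac{\delta}{2}}\bigr).
$$
Theorem 3 is proved.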
4. Limit theorems for functionals related to the estimate $\hat F_n(x)$. In this section the kernel $K(x) \ge 0$ is chosen to be a function of bounded variation satisfying the conditions
$$
K(-u) = K(u), \qquad \int K(u)\, du = 1, \qquad K(u) = 0 \ \text{for } |u| \ge 1.
$$
Theorem 4. Let $g(x) \ge 0$, $x \in [a, 1 - a]$, $0 < a < \frac{1}{2}$, be a measurable and bounded function.

(a) If $F(a) > 0$ and $nh^2 \to \infty$ as $n \to \infty$, then
$$
T_n = \sqrt{n} \int_a^{1-a} g_1(x) \bigl[ \hat F_n(x) - E\hat F_n(x) \bigr]\, dx \xrightarrow{d} N(0, \sigma^2), \tag{25}
$$
where
$$
g_1(x) = g(x) \psi(F(x)), \qquad \psi(t) = \frac{1}{\sqrt{t(1 - t)}}.
$$
(b) If $F(a) > 0$, $nh^2 \to \infty$, $nh^4 \to 0$ as $n \to \infty$ and $F(x)$ has bounded derivatives up to the second order, then, as $n \to \infty$,
$$
\overline{T}_n = \sqrt{n} \int_a^{1-a} g_1(x) \bigl[ \hat F_n(x) - F(x) \bigr]\, dx \xrightarrow{d} N(0, \sigma^2), \qquad \sigma^2 = \int_a^{1-a} g^2(u)\, du.
$$
Remark 2. We have introduced $a > 0$ in (25) in order to avoid the boundary effect of the estimate $\hat F_n(x)$: being a kernel-type estimate, $\hat F_n(x)$ behaves worse near the boundary of the interval, in the sense of the rate at which the bias tends to zero, than on any inner interval $[a, 1 - a] \subset [0, 1]$, $0 < a < \frac{1}{2}$.
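Proof of Theorem 4. We have
$$
T_n = \frac{1}{\sqrt{n}} \sum_{j=1}^{n} (\xi_j - F(t_j))\, \frac{1}{h} \int_a^{1-a} K\left(\frac{u - t_j}{h}\right) g_{2n}(u)\, du,
$$
where
$$
g_{2n}(u) = g_1(u) F_{2n}^{-1}(u).
$$
Hence
$$
\sigma_n^2 = \operatorname{Var} T_n = \frac{1}{n} \sum_{j=1}^{n} \psi^{-2}(F(t_j)) \left( \frac{1}{h} \int_a^{1-a} K\left(\frac{u - t_j}{h}\right) g_{2n}(u)\, du \right)^2. \tag{26}
$$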
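Since $K(u)$ has $[-1, 1]$ as its support and $0 < a \le u \le 1 - a$, it can be easily verified that
$$
F_{2n}(u) = 1 + O\left(\frac{1}{nh}\right) \quad \text{and} \quad g_{2n}(u) = g_1(u) + O\left(\frac{1}{nh}\right)
$$
uniformly in $u \in [a, 1 - a]$. Therefore, from (26) we have
$$
\sigma_n^2 = \frac{1}{n} \sum_{j=1}^{n} \psi^{-2}(F(t_j)) \left( \frac{1}{h} \int_a^{1-a} K\left(\frac{u - t_j}{h}\right) g_1(u)\, du \right)^2 + O\left(\frac{1}{nh}\right).
$$
By virtue of Lemma 1, we can easily show that
$$
\frac{1}{n} \sum_{j=1}^{n} \psi^{-2}(F(t_j)) \left( \frac{1}{h} \int_a^{1-a} K\left(\frac{u - t_j}{h}\right) g_1(u)\, du \right)^2 = \int_0^1 \psi^{-2}(F(t)) \left( \frac{1}{h} \int_a^{1-a} K\left(\frac{u - t}{h}\right) g_1(u)\, du \right)^2 dt + O\left(\frac{1}{nh^2}\right).
$$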
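Therefore,
$$
\sigma_n^2 = \int_a^{1-a} \psi^{-2}(F(t)) \left( \frac{1}{h} \int_a^{1-a} K\left(\frac{u - t}{h}\right) g_1(u)\, du \right)^2 dt + \varepsilon_n^{(1)} + \varepsilon_n^{(2)} + O\left(\frac{1}{nh^2}\right), \tag{27}
$$
$$
\varepsilon_n^{(1)} = \int_0^a \psi^{-2}(F(t)) \left( \frac{1}{h} \int_a^{1-a} K\left(\frac{u - t}{h}\right) g_1(u)\, du \right)^2 dt,
$$
$$
\varepsilon_n^{(2)} = \int_{1-a}^{1} \psi^{-2}(F(t)) \left( \frac{1}{h} \int_a^{1-a} K\left(\frac{u - t}{h}\right) g_1(u)\, du \right)^2 dt.
$$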
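Since $F(u)(1 - F(u)) \le \frac{1}{4}$, $g(u) \le c_8$ and
$$
\psi(F(u)) \le \frac{1}{F(a)(1 - F(1 - a))}, \quad a \le u \le 1 - a,
$$
it follows that $g_1(u) \le c_9$, and we have
$$
\varepsilon_n^{(1)} \le c_{10} \int_0^a \left( \int_{\frac{a-t}{h}}^{\frac{1-a-t}{h}} K(u)\, du \right)^2 dt, \tag{28}
$$
where $a - t \ge 0$ and $1 - a - t \ge 0$. The first inequality is obvious, whereas the second one follows from the inequalities $0 \le t \le a$ and $0 < a < \frac{1}{2}$.

Therefore,
$$
\lim_{n \to \infty} \int_{\frac{a-t}{h}}^{\frac{1-a-t}{h}} K(u)\, du = \begin{cases} 0, & 0 \le t < a, \\ \dfrac{1}{2}, & t = a. \end{cases}
$$
By the Lebesgue bounded convergence theorem, from the latter expression and (28) we obtain
$$
\varepsilon_n^{(1)} \to 0 \quad \text{as } n \to \infty. \tag{29}
$$
Analogously,
$$
\varepsilon_n^{(2)} \to 0 \quad \text{as } n \to \infty. \tag{30}
$$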
Now let us establish that
$$
\int_a^{1-a} \psi^{-2}(F(t)) \left( \frac{1}{h} \int_a^{1-a} K\left(\frac{u - t}{h}\right) g_1(u)\, du \right)^2 dt \longrightarrow \sigma^2 = \int_a^{1-a} g^2(u)\, du
$$
as $n \to \infty$.

We have
$$
\left| \int_a^{1-a} \psi^{-2}(F(t)) \left( \frac{1}{h} \int_a^{1-a} g_1(u) K\left(\frac{u - t}{h}\right) du \right)^2 dt - \int_a^{1-a} \psi^{-2}(F(t))\, g_1^2(t)\, dt \right| \le
$$
$$
\le c_{11} \int_a^{1-a} \psi^{-2}(F(t)) \left| \frac{1}{h} \int_a^{1-a} g_1(u) K\left(\frac{u - t}{h}\right) du - g_1(t) \right| dt \le
$$
$$
\le c_{12} \int_a^{1-a} \left| \frac{1}{h} \int_a^{1-a} g_1(u) K\left(\frac{u - t}{h}\right) du - g_1(t) \int_a^{1-a} \frac{1}{h} K\left(\frac{u - t}{h}\right) du \right| dt + c_{13} \int_a^{1-a} \left| \int_a^{1-a} \frac{1}{h} K\left(\frac{u - t}{h}\right) du - 1 \right| dt = A_{1n} + A_{2n}. \tag{31}
$$
Since
$$
\int_a^{1-a} \frac{1}{h} K\left(\frac{u - t}{h}\right) du \longrightarrow 1
$$
for all $t \in (a, 1 - a)$, we have
$$
A_{2n} \to 0 \quad \text{as } n \to \infty. \tag{32}
$$
Further, we extend the function $g_1(u)$ by zero outside $[a, 1 - a]$ and denote the extended function by $\bar g_1(u)$. Then
$$
A_{1n} \le c_{14} \int_0^1 \left( \int_{-\infty}^{\infty} |\bar g_1(x + y) - \bar g_1(y)|\, dy \right) \frac{1}{h} K\left(\frac{x}{h}\right) dx \le c_{15} \int_{-1}^{1} \left( \int_{-\infty}^{\infty} |\bar g_1(y + uh) - \bar g_1(y)|\, dy \right) K(u)\, du = c_{15} \int_{-1}^{1} \omega(uh) K(u)\, du \longrightarrow 0 \quad \text{as } n \to \infty, \tag{33}
$$
where
$$
\omega(y) = \int_{-\infty}^{\infty} |\bar g_1(y + x) - \bar g_1(x)|\, dx.
$$
Relation (33) holds by virtue of the Lebesgue dominated convergence theorem and the facts that $\omega(uh) \le 2 \|\bar g_1\|_{L_1(-\infty, \infty)}$ and $\omega(uh) \to 0$ as $n \to \infty$. Thus, taking (27)-(33) into account, we have proved that
$$
\sigma_n^2 \longrightarrow \sigma^2 = \int_a^{1-a} g^2(u)\, du. \tag{34}
$$
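Now let us verify that the conditions of the central limit theorem are fulfilled for the sums
$$
T_n = \frac{1}{\sqrt{n}} \sum_{j=1}^{n} a_{jn} (\xi_j - F(t_j)), \qquad a_{jn} = \int_a^{1-a} \frac{1}{h} K\left(\frac{x - t_j}{h}\right) g_{2n}(x)\, dx.
$$
We have
$$
L_n = \frac{n^{-\left(1 + \frac{\delta}{2}\right)} \sum_{j=1}^{n} a_{jn}^{2+\delta}\, E|\xi_j - F(t_j)|^{2+\delta}}{\bigl( \sqrt{\operatorname{Var} T_n} \bigr)^{2+\delta}} = O\bigl(n^{-\frac{\delta}{2}}\bigr),
$$
since $a_{jn} \le c_{16}$, $E|\xi_j - F(t_j)|^{2+\delta} \le 1$ for all $1 \le j \le n$ and $\operatorname{Var} T_n \to \sigma^2$.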
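Finally, statement (b) of the theorem follows from (a) if we take into account that
$$
\sqrt{n} \int_a^{1-a} g_1(x) \bigl[ E\hat F_n(x) - F(x) \bigr]\, dx = \sqrt{n} \int_a^{1-a} g_1(x) \left( \int_{-1}^{1} K(u) \bigl( F(x - uh) - F(x) \bigr)\, du \right) dx = O\bigl(\sqrt{n}\, h^2\bigr) + O\left(\frac{1}{\sqrt{n}\, h}\right). \tag{35}
$$
Theorem 4 is proved.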
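Lemma 2. 1. Under the conditions of item (a) of Theorem 4,
$$
E|T_n|^s \le c_{17} \left( \int_a^{1-a} g(u)\, du \right)^{\frac{s}{2}}, \quad s > 2. \tag{36}
$$
2. Under the conditions of item (b) of Theorem 4,
$$
E|\overline{T}_n|^s \le c_{18} \left( \int_a^{1-a} g(u)\, du \right)^{\frac{s}{2}}, \quad s > 2. \tag{37}
$$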
Proof. $T_n$ is a linear form in $\eta_j = \xi_j - F(t_j)$, $E\eta_j = 0$, $1 \le j \le n$. Hence, to prove (36), we use Whittle’s inequality [3].

It is obvious that $E|\eta_j|^s \le 1$, $j = 1, \dots, n$. Therefore, by Whittle’s inequality,
$$
E|T_n|^s \le c(s)\, 2^s \left[ \frac{1}{nh^2} \sum_{j=1}^{n} \left( \int_a^{1-a} K\left(\frac{u - t_j}{h}\right) g_{2n}(u)\, du \right)^2 \right]^{\frac{s}{2}},
$$
where $g_{2n}(u) = g_1(u) F_{2n}^{-1}(u)$.

This, by virtue of Lemma 1, yields
$$
E|T_n|^s \le c(s)\, 2^s \left[ \int_0^1 \left( \frac{1}{h} \int_a^{1-a} K\left(\frac{u - t}{h}\right) g_{2n}(u)\, du \right)^2 dt + O\left(\frac{1}{nh^2}\right) \left( \int_a^{1-a} g_{2n}(u)\, du \right)^2 \right]^{\frac{s}{2}}. \tag{38}
$$
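Further, since
$$
g_{2n}(u) \le g(u) \left[ \frac{1}{F(a)(1 - F(1 - a))} \right] \left[ 1 + O\left(\frac{1}{nh}\right) \right] \le c_{19}\, g(u), \quad a \le u \le 1 - a,
$$
from (38) it follows that
$$
E|T_n|^s \le c_{20} \left\{ \left[ \sup_{0 \le t \le 1} \frac{1}{h} \int_a^{1-a} K\left(\frac{u - t}{h}\right) g_{2n}(u)\, du \times \int_0^1 dt \int_a^{1-a} \frac{1}{h} K\left(\frac{u - t}{h}\right) g_{2n}(u)\, du \right]^{\frac{s}{2}} + O\left( \left(\frac{1}{nh^2}\right)^{\frac{s}{2}} \right) \left( \int_a^{1-a} g_{2n}(u)\, du \right)^{\frac{s}{2}} \right\} \le
$$
$$
\le c_{21} \left( \int_a^{1-a} g(u)\, du \right)^{\frac{s}{2}} [1 + o(1)] \le c_{22} \left( \int_a^{1-a} g(u)\, du \right)^{\frac{s}{2}}, \quad s > 2.
$$
Next we obtain
$$
E|\overline{T}_n|^s \le 2^{s-1} \left[ E|T_n|^s + \left| \sqrt{n} \int_a^{1-a} g_1(u) \bigl[ E\hat F_n(u) - F(u) \bigr]\, du \right|^s \right] \le
$$
$$
\le c_{23} \left( \int_a^{1-a} g(u)\, du \right)^{\frac{s}{2}} + \left| O\bigl(\sqrt{n}\, h^2\bigr) \int_a^{1-a} g(u)\, du \right|^s \le c_{24} \left( \int_a^{1-a} g(u)\, du \right)^{\frac{s}{2}}.
$$
Lemma 2 is proved.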
Let us introduce the following random processes:
$$
T_n(t) = \sqrt{n} \int_a^t \bigl( \hat F_n(u) - E\hat F_n(u) \bigr) \psi(F(u))\, du,
$$
$$
\overline{T}_n(t) = \sqrt{n} \int_a^t \bigl( \hat F_n(u) - F(u) \bigr) \psi(F(u))\, du.
$$
Theorem 5. 1°. Let the conditions of item (a) of Theorem 4 be fulfilled. Then for all continuous functionals $f(\cdot)$ on $C[a, 1 - a]$, the distribution of $f(T_n(t))$ converges to the distribution of $f(W(t - a))$, where $W(t - a)$, $a \le t \le 1 - a$, is a Wiener process with correlation function $r(s, t) = \min(t - a, s - a)$, $W(t - a) = 0$ for $t = a$.

2°. Let the conditions of item (b) of Theorem 4 be fulfilled. Then for all continuous functionals $f(\cdot)$ on $C[a, 1 - a]$, the distribution of $f(\overline{T}_n(t))$ converges to the distribution of $f(W(t - a))$.
Proof. First we show that the finite-dimensional distributions of the processes $T_n(t)$ converge to the finite-dimensional distributions of the process $W(t - a)$, $t \ge a$. Let us consider one moment of time $t_1$. We have to show that
$$
T_n(t_1) \xrightarrow{d} W(t_1 - a). \tag{39}
$$
To prove (39), it suffices to take $g(x) = I_{[a, t_1)}(x)$ in (25). Then, by virtue of Theorem 4,
$$
T_n(t_1) \xrightarrow{d} N\left( 0, \int_a^{1-a} I_{[a, t_1)}(x)\, dx \right) = N(0, t_1 - a).
$$
Let us now consider two moments of time $t_1$, $t_2$, $t_1 < t_2$. We have to show that
$$
\bigl( T_n(t_1), T_n(t_2) \bigr) \xrightarrow{d} \bigl( W(t_1 - a), W(t_2 - a) \bigr). \tag{40}
$$
To prove (40), it suffices to take in (25)
$$
g(x) = (\lambda_1 + \lambda_2) I_{[a, t_1)}(x) + \lambda_2 I_{[t_1, t_2)}(x),
$$
where $\lambda_1$ and $\lambda_2$ are arbitrary finite numbers. Then, by virtue of Theorem 4,
$$
\lambda_1 T_n(t_1) + \lambda_2 T_n(t_2) \xrightarrow{d} N\bigl( 0, (\lambda_1 + \lambda_2)^2 (t_1 - a) + \lambda_2^2 (t_2 - t_1) \bigr).
$$
On the other hand,
$$
\lambda_1 W(t_1 - a) + \lambda_2 W(t_2 - a) = (\lambda_1 + \lambda_2) \bigl[ W(t_1 - a) - W(0) \bigr] + \lambda_2 \bigl[ W(t_2 - a) - W(t_1 - a) \bigr]
$$
is distributed as $N\bigl( 0, (\lambda_1 + \lambda_2)^2 (t_1 - a) + \lambda_2^2 (t_2 - t_1) \bigr)$. Therefore (40) holds. The case of three or more moments of time is considered analogously. Therefore the finite-dimensional distributions of the processes $T_n(t)$ converge to the finite-dimensional distributions of the Wiener process $W(t - a)$, $a \le t \le 1 - a$, with correlation function $r(t_1, t_2) = \min(t_1 - a, t_2 - a)$, $W(t - a) = 0$ for $t = a$.
Now we show that the sequence $\{T_n(t)\}$ is tight, i. e., the sequence of the corresponding distributions is tight. For this it suffices to show that for any $t_1, t_2 \in [a, 1 - a]$ and all $n$
$$
E\bigl| T_n(t_1) - T_n(t_2) \bigr|^s \le c_{25} |t_1 - t_2|^{\frac{s}{2}}, \quad s > 2.
$$
Indeed, this inequality is obtained from (36) for $g(x) = I_{[t_1, t_2]}(x)$.

Further, taking (35), (37) and statement (b) of Theorem 4 into account, we easily ascertain that the finite-dimensional distributions of the processes $\overline{T}_n(t)$ converge to the finite-dimensional distributions of the Wiener process $W(t - a)$, and also that
$$
E\bigl| \overline{T}_n(t_1) - \overline{T}_n(t_2) \bigr|^s \le c_{26} |t_1 - t_2|^{\frac{s}{2}}, \quad s > 2.
$$
Hence, the proof of the theorem follows from Theorem 2 of the monograph [4, p. 583].
Application. By virtue of Theorem 5 and the Corollary of Theorem 1 from [4, p. 371] we can write that
$$
P\left\{ T_n^+ = \max_{a \le t \le 1-a} T_n(t) > \lambda \right\} \longrightarrow G(\lambda) = \frac{2}{\sqrt{2\pi(1 - 2a)}} \int_{\lambda}^{\infty} \exp\left\{ -\frac{x^2}{2(1 - 2a)} \right\} dx
$$
as $n \to \infty$ (here $a$ is a prescribed number, $0 < a < \frac{1}{2}$).

This result makes it possible to construct tests of level $\alpha$, $0 < \alpha < 1$, for testing the hypothesis
$$
H_0: \ \lim_{n \to \infty} E\hat F_n(x) = F_0(x), \quad a \le x \le 1 - a,
$$
against the alternative hypothesis
$$
H_1: \ \int_a^{1-a} \psi(F_0(x)) \left( \lim_{n \to \infty} E\hat F_n(x) - F_0(x) \right) dx > 0.
$$
Let $\lambda_\alpha$ be the critical value, $G(\lambda_\alpha) = \alpha$. If the experiment yields $T_n^+ \ge \lambda_\alpha$, then the hypothesis $H_0$ must be rejected.
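The limit $G(\lambda)$ is twice the upper tail of a normal law with mean 0 and variance $1 - 2a$, so that $G(\lambda) = 2\bigl(1 - \Phi(\lambda / \sqrt{1 - 2a})\bigr)$ and $\lambda_\alpha = \sqrt{1 - 2a}\; \Phi^{-1}(1 - \alpha/2)$, where $\Phi$ is the standard normal distribution function. A minimal sketch of this computation, using `statistics.NormalDist` from the Python standard library (Python 3.8 or later), is given below; the function names `critical_value` and `tail_g` and the values $\alpha = 0.05$, $a = 0.1$ are illustrative assumptions.

```python
import math
from statistics import NormalDist


def critical_value(alpha, a):
    """lambda_alpha solving G(lambda) = alpha, for 0 < alpha < 1 and 0 < a < 1/2."""
    return math.sqrt(1.0 - 2.0 * a) * NormalDist().inv_cdf(1.0 - alpha / 2.0)


def tail_g(lam, a):
    """G(lambda): twice the upper tail of N(0, 1 - 2a) at lambda."""
    return 2.0 * (1.0 - NormalDist(0.0, math.sqrt(1.0 - 2.0 * a)).cdf(lam))


lam = critical_value(0.05, 0.1)   # level alpha = 0.05, a = 0.1
check = tail_g(lam, 0.1)          # should reproduce alpha
```

With these inputs the critical value is approximately 1.753, and $H_0$ would be rejected when the observed $T_n^+$ is at least that large.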
Remark 3. Let $t_i$ be the partition points of the interval $[0, c_F]$, $c_F = \inf\{x \ge 0 : F(x) = 1\} < \infty$, chosen from the relation $H(t_j) = \frac{2j - 1}{2n}$, $j = 1, \dots, n$, where
$$
H(x) = \int_0^x h(u)\, du,
$$
$h(u)$ is some known density of a distribution on $[0, c_F]$ and $h(x) \ge \mu > 0$ for all $x \in [0, c_F]$. In that case, by reasoning analogous to that used above, we can obtain a generalization of the results of the present study.
Remark 4. Some ideas of the proof of Theorem 4 are borrowed from the interesting
paper by A. V. Ivanov [5].
1. Mandzhgaladze K. V. On an estimator of a distribution function and its moments // Soobshch. Akad. Nauk Gruz. SSR. – 1986. – 124, № 2. – S. 261–263.
2. Parzen E. On estimation of a probability density function and mode // Ann. Math. Statist. – 1962. – 33. – P. 1065–1076.
3. Whittle P. Bounds for the moments of linear and quadratic forms in independent variables // Teor. Ver. i Prim. – 1960. – 5. – S. 331–335.
4. Gikhman I. I., Skorokhod A. V. Introduction to the theory of random processes (in Russian). – Moskva: Gosudarstv. Izdat. Fiz.-Mat. Lit., 1965.
5. Ivanov A. V. Properties of a nonparametric estimate of the regression function (in Russian) // Dokl. Akad. Nauk Ukr. SSR. Ser. A. – 1979. – № 7. – S. 499–502, 589.
Received 23.07.10
|
| id | umjimathkievua-article-2989 |
| institution | Ukrains’kyi Matematychnyi Zhurnal |
| keywords_txt_mv | keywords |
| language | English |
| last_indexed | 2026-03-24T02:34:09Z |
| publishDate | 2010 |
| publisher | Institute of Mathematics, NAS of Ukraine |
| record_format | ojs |
| resource_txt_mv | umjimathkievua/f3/305debf24b5dc0c8e7a74cba61005af3.pdf |
| spelling | umjimathkievua-article-29892020-03-18T19:41:53Z Estimation of a distribution function by an indirect sample Оцінювання функції розподілу з використанням непрямої вибірки Babilua, P. Nadaraya, E. Sokhadze, G. A. Бабілуа, П. К. Надарая, Е. А. Сохадзе, Г. А. The problem of estimation of a distribution function is considered in the case where the observer has access only to a part of the indicator random values. Some basic asymptotic properties of the constructed estimates are studied. The limit theorems are proved for continuous functionals related to the estimation of $F^n(x)$ in the space $C[a,\; 1 - a], 0 Розглянуто задачу оцінювання функції розподілу у випадку, коли спостерігач має доступ лише до деяких індикаторних випадкових значень. Вивчено деякі базові асимптотичні властивості побудованих оцінок. У статгі доведено граничні теореми для неперервних функціоналів щодо оцінки $F^n(x)$ у просторі $C[a,\; 1 - a], 0 Institute of Mathematics, NAS of Ukraine 2010-12-25 Article Article application/pdf https://umj.imath.kiev.ua/index.php/umj/article/view/2989 Ukrains’kyi Matematychnyi Zhurnal; Vol. 62 No. 12 (2010); 1642–1658 Український математичний журнал; Том 62 № 12 (2010); 1642–1658 1027-3190 en https://umj.imath.kiev.ua/index.php/umj/article/view/2989/2728 https://umj.imath.kiev.ua/index.php/umj/article/view/2989/2729 Copyright (c) 2010 Babilua P.; Nadaraya E.; Sokhadze G. A. |
| spellingShingle | Babilua, P. Nadaraya, E. Sokhadze, G. A. Бабілуа, П. К. Надарая, Е. А. Сохадзе, Г. А. Estimation of a distribution function by an indirect sample |
| title | Estimation of a distribution function by an indirect sample |
| title_alt | Оцінювання функції розподілу з використанням непрямої вибірки |
| title_full | Estimation of a distribution function by an indirect sample |
| title_fullStr | Estimation of a distribution function by an indirect sample |
| title_full_unstemmed | Estimation of a distribution function by an indirect sample |
| title_short | Estimation of a distribution function by an indirect sample |
| title_sort | estimation of a distribution function by an indirect sample |
| url | https://umj.imath.kiev.ua/index.php/umj/article/view/2989 |
| work_keys_str_mv | AT babiluap estimationofadistributionfunctionbyanindirectsample AT nadarayae estimationofadistributionfunctionbyanindirectsample AT sokhadzega estimationofadistributionfunctionbyanindirectsample AT babíluapk estimationofadistributionfunctionbyanindirectsample AT nadaraâea estimationofadistributionfunctionbyanindirectsample AT sohadzega estimationofadistributionfunctionbyanindirectsample AT babiluap ocínûvannâfunkcíírozpodíluzvikoristannâmneprâmoívibírki AT nadarayae ocínûvannâfunkcíírozpodíluzvikoristannâmneprâmoívibírki AT sokhadzega ocínûvannâfunkcíírozpodíluzvikoristannâmneprâmoívibírki AT babíluapk ocínûvannâfunkcíírozpodíluzvikoristannâmneprâmoívibírki AT nadaraâea ocínûvannâfunkcíírozpodíluzvikoristannâmneprâmoívibírki AT sohadzega ocínûvannâfunkcíírozpodíluzvikoristannâmneprâmoívibírki |