Estimation of a distribution function by an indirect sample

The problem of estimation of a distribution function is considered in the case where the observer has access only to a part of the indicator random values. Some basic asymptotic properties of the constructed estimates are studied. The limit theorems are proved for continuous functionals related to t...

Bibliographic Details
Date: 2010
Main Authors: Babilua, P., Nadaraya, E., Sokhadze, G. A., Бабілуа, П. К., Надарая, Е. А., Сохадзе, Г. А.
Format: Article
Language: English
Published: Institute of Mathematics, NAS of Ukraine 2010
Online Access: https://umj.imath.kiev.ua/index.php/umj/article/view/2989
Journal Title: Ukrains’kyi Matematychnyi Zhurnal

Institution

Ukrains’kyi Matematychnyi Zhurnal
_version_ 1860509000581251072
author Babilua, P.
Nadaraya, E.
Sokhadze, G. A.
Бабілуа, П. К.
Надарая, Е. А.
Сохадзе, Г. А.
author_facet Babilua, P.
Nadaraya, E.
Sokhadze, G. A.
Бабілуа, П. К.
Надарая, Е. А.
Сохадзе, Г. А.
author_sort Babilua, P.
baseUrl_str https://umj.imath.kiev.ua/index.php/umj/oai
collection OJS
datestamp_date 2020-03-18T19:41:53Z
description The problem of estimation of a distribution function is considered in the case where the observer has access only to a part of the indicator random values. Some basic asymptotic properties of the constructed estimates are studied. The limit theorems are proved for continuous functionals related to the estimation of $\hat F_n(x)$ in the space $C[a,\; 1 - a]$, $0 < a < 1/2$.
first_indexed 2026-03-24T02:34:09Z
format Article
fulltext UDC 519.21

E. Nadaraya, P. Babilua, G. Sokhadze (Iv. Javakhishvili Tbilisi State Univ., Georgia)

THE ESTIMATION OF A DISTRIBUTION FUNCTION BY AN INDIRECT SAMPLE
(ОЦІНЮВАННЯ ФУНКЦІЇ РОЗПОДІЛУ З ВИКОРИСТАННЯМ НЕПРЯМОЇ ВИБІРКИ)

The problem of estimating a distribution function is considered when the observer has access only to some indicator random values. Some basic asymptotic properties of the constructed estimates are studied. Limit theorems are proved for continuous functionals related to the estimate $\hat F_n(x)$ in the space $C[a, 1-a]$, $0 < a < 1/2$.

Let $X_1, X_2, \dots, X_n$ be a sample of independent observations of a non-negative random variable $X$ with distribution function $F(x)$. In problems of the theory of censored observations, the sample values are pairs $Y_i = X_i \wedge t_i$ and $Z_i = I(Y_i = X_i)$, $i = \overline{1, n}$, where the $t_i$ are given numbers ($t_i \ne t_j$ for $i \ne j$) or random values independent of $X_i$, $i = \overline{1, n}$. Throughout the paper, $I(A)$ denotes the indicator of the set $A$.

The present study deals with a somewhat different case: the observer has access only to the values of the random variables $\xi_i = I(X_i < t_i)$ with

$$t_i = c_F\,\frac{2i-1}{2n}, \quad i = \overline{1, n}, \qquad c_F = \inf\{x \ge 0 : F(x) = 1\} < \infty.$$

The problem consists in estimating the distribution function $F(x)$ from the sample $\xi_1, \xi_2, \dots, \xi_n$. Such a problem arises, for example, in corrosion investigations; see [1], where an experiment related to corrosion is described.
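The observation scheme described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function names `design_points` and `indirect_sample` are hypothetical, and the choice $X \sim \mathrm{Uniform}(0, 1)$ (so that $c_F = 1$) is made purely for illustration and is not taken from the paper.

```python
import random


def design_points(n, c_F=1.0):
    """Observation points t_i = c_F * (2i - 1) / (2n), i = 1..n."""
    return [c_F * (2 * i - 1) / (2 * n) for i in range(1, n + 1)]


def indirect_sample(xs, ts):
    """Indicators xi_i = I(X_i < t_i); the X_i themselves are not observed."""
    return [1 if x < t else 0 for x, t in zip(xs, ts)]


rng = random.Random(0)
n = 8
ts = design_points(n)
xs = [rng.random() for _ in range(n)]  # assumption: X ~ Uniform(0, 1), c_F = 1
xis = indirect_sample(xs, ts)          # the only data available to the observer
```

Only `ts` and `xis` would be passed on to an estimation routine; `xs` exists here solely to generate the indicators.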
As an estimate for $F(x)$ we consider an expression of the form

$$\hat F_n(x) = \begin{cases} 0, & x \le 0, \\ F_{1n}(x)\,F_{2n}^{-1}(x), & 0 < x < c_F, \\ 1, & x \ge c_F, \end{cases} \tag{1}$$

$$F_{1n}(x) = \frac{1}{nh}\sum_{j=1}^{n} K\!\left(\frac{x - t_j}{h}\right)\xi_j, \qquad F_{2n}(x) = \frac{1}{nh}\sum_{j=1}^{n} K\!\left(\frac{x - t_j}{h}\right),$$

where $K(x)$ is a probability density (kernel), $K(x) = K(-x)$, $x \in (-\infty, \infty)$, and $\{h = h(n)\}$ is a sequence of positive numbers converging to zero.

1. In this subsection we give the conditions of asymptotic unbiasedness and consistency and the theorems on the limiting distribution of $\hat F_n(x)$.

© E. Nadaraya, P. Babilua, G. Sokhadze, 2010. ISSN 1027-3190. Ukr. Mat. Zh., 2010, Vol. 62, No. 12, pp. 1642–1658.

Lemma 1. Assume that

1°. $K(x)$ is a function of bounded variation.

If $nh \to \infty$, then

$$\frac{1}{nh}\sum_{j=1}^{n} K^{m_1-1}\!\left(\frac{x - t_j}{h}\right) F^{m_2-1}(t_j) = \frac{1}{c_F h}\int_0^{c_F} K^{m_1-1}\!\left(\frac{x - u}{h}\right) F^{m_2-1}(u)\,du + O\!\left(\frac{1}{nh}\right) \tag{2}$$

uniformly with respect to $x \in [0, c_F]$; $m_1$, $m_2$ are natural numbers.

Proof. Let $P(x)$ be the uniform distribution function on $[0, c_F]$, and let $P_n(x)$ be the empirical distribution function of "the sample" $t_1, t_2, \dots, t_n$, i.e., $P_n(x) = n^{-1}\sum_{j=1}^{n} I(t_j < x)$. It is obvious that

$$\sup_{0 \le x \le c_F} |P_n(x) - P(x)| = \sup_{0 \le x \le c_F} \left|\frac{1}{n}\left[n\frac{x}{c_F} + \frac{1}{2}\right] - \frac{x}{c_F}\right| \le \frac{1}{2n}. \tag{3}$$

We have

$$\frac{1}{nh}\sum_{i=1}^{n} K^{m_1-1}\!\left(\frac{x - t_i}{h}\right) F^{m_2-1}(t_i) - \frac{1}{c_F h}\int_0^{c_F} K^{m_1-1}\!\left(\frac{x - u}{h}\right) F^{m_2-1}(u)\,du = \frac{1}{h}\int_0^{c_F} K^{m_1-1}\!\left(\frac{x - u}{h}\right) F^{m_2-1}(u)\,d\big(P_n(u) - P(u)\big). \tag{4}$$

Applying the integration by parts formula to the integral on the right-hand side of (4) and taking (3) into account, we obtain (2). Lemma 1 is proved.

Below it is assumed without loss of generality that $[0, c_F] = [0, 1]$.

Theorem 1. Let $F(x)$ be continuous and let the conditions of Lemma 1 be fulfilled. Then the estimate (1) is asymptotically unbiased and consistent at all points $x \in [0, 1]$.
Moreover, $\hat F_n(x)$ has an asymptotically normal distribution, i.e.,

$$\sqrt{nh}\,\big(\hat F_n(x) - E\hat F_n(x)\big)\,\sigma^{-1}(x) \xrightarrow{d} N(0, 1), \qquad \sigma^2(x) = F(x)(1 - F(x))\int K^2(u)\,du,$$

where $\xrightarrow{d}$ denotes convergence in distribution and $N(0, 1)$ is a random variable having a normal distribution with mean 0 and variance 1.

Proof. By Lemma 1 we have

$$EF_{1n}(x) = \int_{\frac{x-1}{h}}^{\frac{x}{h}} K(t)F(x + ht)\,dt + O\!\left(\frac{1}{nh}\right), \qquad F_{2n}(x) = \frac{1}{h}\int_0^1 K\!\left(\frac{x - u}{h}\right)du + O\!\left(\frac{1}{nh}\right), \tag{5}$$

and for $n \to \infty$

$$\frac{1}{h}\int_0^1 K\!\left(\frac{x - u}{h}\right)du \to F_2(x) = \begin{cases} 1, & x \in (0, 1), \\ \dfrac{1}{2}, & x = 0,\ x = 1, \end{cases} \qquad \int_{\frac{x-1}{h}}^{\frac{x}{h}} K(t)F(x + th)\,dt \to F(x)F_2(x).$$

Hence it follows that $E\hat F_n(x) \to F(x)$, $x \in [0, 1]$, as $n \to \infty$. Analogously, it is not difficult to show that

$$\operatorname{Var}\hat F_n(x) = \left[\frac{1}{nh^2}\int_0^1 K^2\!\left(\frac{x - u}{h}\right)F(u)(1 - F(u))\,du + O\!\left(\frac{1}{(nh)^2}\right)\right]F_{2n}^{-2}(x).$$

Hence we readily derive

$$nh\operatorname{Var}\hat F_n(x) \sim \sigma^2(x) = F(x)(1 - F(x))\int K^2(u)\,du \tag{6}$$

for $x \in [0, 1]$. Thus $\hat F_n(x)$ is a consistent estimate for $F(x)$, $x \in [0, 1]$, and therefore

$$P\big\{\hat F_n(x_1) \le \hat F_n(x_2)\big\} \to 1 \quad \text{as } n \to \infty, \quad x_1 < x_2, \quad x_1, x_2 \in [0, 1].$$

Let us now establish that $\hat F_n(x)$ has an asymptotically normal distribution. Since, by virtue of (5), $F_{2n}(x) \to F_2(x)$, it remains to verify the condition of the Liapunov central limit theorem for $F_{1n}(x)$. Let us denote

$$\eta_i = \eta_i(x) = (nh)^{-1}K\!\left(\frac{x - t_i}{h}\right)\xi_i$$

and show that

$$L_n = \sum_{j=1}^{n} E|\eta_j - E\eta_j|^{2+\delta}\,\big(\operatorname{Var}F_{1n}(x)\big)^{-1-\frac{\delta}{2}} \to 0, \quad \delta > 0. \tag{7}$$

We have

$$\sum_{j=1}^{n} E|\eta_j - E\eta_j|^{2+\delta} \le 2M^{1+\delta}(nh)^{-(2+\delta)}\sum_{j=1}^{n} K\!\left(\frac{x - t_j}{h}\right)F(t_j), \qquad M = \max_{x \in R} K(x).$$

Hence, taking (2) into account, we find

$$\sum_{i=1}^{n} E|\eta_i - E\eta_i|^{2+\delta} \le c_1(nh)^{-(1+\delta)}. \tag{8}$$

Using the relation (6) and the inequality (8), we establish that $L_n = O\big((nh)^{-\frac{\delta}{2}}\big)$, i.e., (7) holds. Theorem 1 is proved.

2. Uniform consistency. In this subsection we define the conditions under which the estimate $\hat F_n(x)$ converges uniformly in probability (a.s.) to the true $F(x)$.
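Before turning to uniform bounds, the pointwise estimate (1) and the asymptotic variance of Theorem 1 can be sketched numerically. The following is an illustration only, not the authors' implementation: the Epanechnikov kernel, the bandwidth $h = n^{-1/3}$, the uniform law for $X$, and the names `f_hat` and `asymptotic_sd` are all assumptions introduced here.

```python
import math
import random


def epanechnikov(u):
    """Symmetric density supported on [-1, 1]; an illustrative choice of K."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0


def f_hat(x, ts, xis, h, kernel=epanechnikov, c_F=1.0):
    """Estimate (1): F_1n(x) / F_2n(x) on (0, c_F); the 1/(nh) factors cancel."""
    if x <= 0.0:
        return 0.0
    if x >= c_F:
        return 1.0
    weights = [kernel((x - t) / h) for t in ts]
    denom = sum(weights)
    if denom == 0.0:
        raise ValueError("no design point within one bandwidth of x")
    return sum(w * xi for w, xi in zip(weights, xis)) / denom


def asymptotic_sd(F_x, n, h, k2=0.6):
    """sqrt(sigma^2(x) / (n h)), sigma^2(x) = F(x)(1 - F(x)) * int K^2(u) du.

    k2 = 3/5 is the value of int K^2 for the Epanechnikov kernel.
    """
    return math.sqrt(F_x * (1.0 - F_x) * k2 / (n * h))


rng = random.Random(1)
n = 2000
h = n ** (-1.0 / 3.0)  # assumption: h = n^(-1/3), so h -> 0 and nh -> infinity
ts = [(2 * i - 1) / (2 * n) for i in range(1, n + 1)]
xis = [1 if rng.random() < t else 0 for t in ts]  # X ~ Uniform(0, 1), F(x) = x
estimate = f_hat(0.5, ts, xis, h)
sd = asymptotic_sd(0.5, n, h)
```

With these settings `estimate` should lie within a few multiples of `sd` of the true value $F(0.5) = 0.5$; near the endpoints 0 and 1 the kernel window is truncated and the behaviour is worse, which is why the later results are stated on inner intervals.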
Let us introduce the Fourier transform of the function $K(x)$,

$$\varphi(t) = \int_{-\infty}^{\infty} e^{itx}K(x)\,dx,$$

and assume that

2°. $\varphi(t)$ is absolutely integrable.

Following E. Parzen [2], $F_{1n}(x)$ can be represented as

$$F_{1n}(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-iu\frac{x}{h}}\varphi(u)\,\frac{1}{nh}\sum_{j=1}^{n}\xi_j e^{iu\frac{t_j}{h}}\,du.$$

Thus

$$F_{1n}(x) - EF_{1n}(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-iu\frac{x}{h}}\varphi(u)\,\frac{1}{nh}\sum_{j=1}^{n}\big(\xi_j - F(t_j)\big)e^{iu\frac{t_j}{h}}\,du.$$

Denote

$$d_n = \sup_{x \in \Omega_n}|\hat F_n(x) - E\hat F_n(x)|, \qquad \Omega_n = [h^{\alpha}, 1 - h^{\alpha}], \quad 0 < \alpha < 1.$$

Theorem 2. Let $K(x)$ satisfy conditions 1° and 2°.

(a) Let $F(x)$ be continuous and $n^{\frac{1}{2}}h_n \to \infty$; then

$$D_n = \sup_{x \in \Omega_n}|\hat F_n(x) - F(x)| \xrightarrow{P} 0.$$

(b) If $\sum_{n=1}^{\infty} n^{-\frac{p}{2}}h^{-p} < \infty$, $p > 2$, then $D_n \to 0$ a.s.

Proof. We have

$$\sup_{x \in \Omega_n}\left(1 - \frac{1}{h}\int_0^1 K\!\left(\frac{x - u}{h}\right)du\right) \le \int_{-\infty}^{-h^{\alpha-1}} K(u)\,du + \int_{h^{\alpha-1}}^{\infty} K(u)\,du \to 0. \tag{9}$$

This and (5) imply that

$$\sup_{x \in \Omega_n}|F_{2n}(x) - 1| \to 0, \tag{10}$$

i.e., due to the uniform convergence, for any $\varepsilon_0$, $0 < \varepsilon_0 < 1$, and sufficiently large $n \ge n_0$, we have $F_{2n}(x) \ge 1 - \varepsilon_0$ uniformly with respect to $x \in \Omega_n$. Therefore,

$$d_n \le (1 - \varepsilon_0)^{-1}\sup_{x \in \Omega_n}|F_{1n}(x) - EF_{1n}(x)| \le (1 - \varepsilon_0)^{-1}\frac{1}{2\pi}\int_{-\infty}^{\infty}|\varphi(u)|\,\frac{1}{nh}\left|\sum_{j=1}^{n}\eta_j e^{iu\frac{t_j}{h}}\right|du, \qquad \eta_j = \xi_j - F(t_j).$$

Hence, by Hölder's inequality, we obtain

$$d_n^p \le (1 - \varepsilon_0)^{-p}\frac{1}{(2\pi)^p}\,\frac{1}{(nh)^p}\int_{-\infty}^{\infty}|\varphi(u)|\left|\sum_{j=1}^{n}\eta_j e^{iu\frac{t_j}{h}}\right|^p du\left[\int_{-\infty}^{\infty}|\varphi(u)|\,du\right]^{\frac{p}{q}}, \qquad \frac{1}{p} + \frac{1}{q} = 1, \quad p > 2.$$

Thus

$$Ed_n^p \le c(\varepsilon_0, p, \varphi)\frac{1}{(nh)^p}\int_{-\infty}^{\infty}|\varphi(u)|\,E\left|\sum_{j,k}\cos\!\left(\Big(\frac{t_j - t_k}{h}\Big)u\right)\eta_j\eta_k\right|^{\frac{p}{2}}du, \tag{11}$$

where

$$c(\varepsilon_0, p, \varphi) = (1 - \varepsilon_0)^{-p}\frac{1}{(2\pi)^p}\left[\int_{-\infty}^{\infty}|\varphi(u)|\,du\right]^{\frac{p}{q}}.$$

Denote

$$A(u) = \sum_{j,k}\cos\!\left(\Big(\frac{t_j - t_k}{h}\Big)u\right)\eta_j\eta_k.$$

Then by (11) we write

$$Ed_n^p \le 2^{\frac{p}{2}-1}c(\varepsilon_0, p, \varphi)\frac{1}{(nh)^p}\left[\int_{-\infty}^{\infty}|\varphi(u)|\,|EA(u)|^{\frac{p}{2}}\,du + \int_{-\infty}^{\infty}|\varphi(u)|\,E|A(u) - EA(u)|^{\frac{p}{2}}\,du\right]. \tag{12}$$

Using Whittle's inequality [3] for moments of quadratic forms, we obtain

$$E|A(u) - EA(u)|^{\frac{p}{2}} \le 2^{\frac{3}{2}p}\,c\Big(\frac{p}{2}\Big)[c(p)]^{\frac{1}{2}}\left(\sum_{j,k}\cos^2\!\left(\Big(\frac{t_j - t_k}{h}\Big)u\right)\gamma_j^2(p)\gamma_k^2(p)\right)^{\frac{p}{4}},$$

where

$$\gamma_k(p) = \big(E|\eta_k|^p\big)^{\frac{1}{p}} \le 1, \qquad c(p) = \frac{2^{\frac{p}{2}}}{\sqrt{\pi}}\,\Gamma\!\left(\frac{p + 1}{2}\right).$$

Hence it follows that

$$E|A(u) - EA(u)|^{\frac{p}{2}} = O\big(n^{\frac{p}{2}}\big) \tag{13}$$

uniformly with respect to $u \in (-\infty, \infty)$. It is also clear that

$$|EA(u)|^{\frac{p}{2}} = O\big(n^{\frac{p}{2}}\big) \tag{14}$$

uniformly with respect to $u \in (-\infty, \infty)$. Combining the relations (12), (13) and (14), we obtain

$$Ed_n^p = O\!\left(\frac{1}{(\sqrt{n}\,h)^p}\right), \quad p > 2.$$

Therefore,

$$P\Big\{\sup_{x \in \Omega_n}\big|\hat F_n(x) - E\hat F_n(x)\big| \ge \varepsilon\Big\} \le \frac{c_3}{\varepsilon^p(\sqrt{n}\,h)^p}. \tag{15}$$

Furthermore, we have

$$\sup_{x \in \Omega_n}\big|E\hat F_n(x) - F(x)\big| \le \frac{1}{1 - \varepsilon_0}\Big(\sup_{x \in \Omega_n}|EF_{1n}(x) - F(x)| + \sup_{x \in \Omega_n}|1 - F_{2n}(x)|\Big). \tag{16}$$

By virtue of (10), the second summand on the right-hand side of (16) tends to 0, whereas the first summand is estimated as follows:

$$\sup_{x \in \Omega_n}|EF_{1n}(x) - F(x)| \le S_{1n} + S_{2n} + O\!\left(\frac{1}{nh}\right), \tag{17}$$

$$S_{1n} = \sup_{0 \le x \le 1}\left|\frac{1}{h}\int_0^1\big(F(y) - F(x)\big)K\!\left(\frac{x - y}{h}\right)dy\right|, \qquad S_{2n} = \sup_{x \in \Omega_n}\left(1 - \frac{1}{h}\int_0^1 K\!\left(\frac{x - y}{h}\right)dy\right),$$

and, by virtue of (9),

$$S_{2n} \to 0 \tag{18}$$

as $n \to \infty$. Let us now consider $S_{1n}$. Note that

$$S_{1n} \le \sup_{0 \le x \le 1}\int_0^1|F(y) - F(x)|\,\frac{1}{h}K\!\left(\frac{x - y}{h}\right)dy = \sup_{0 \le x \le 1}\int_{x-1}^{x}|F(x - u) - F(x)|\,\frac{1}{h}K\!\left(\frac{u}{h}\right)du \le \sup_{0 \le x \le 1}\int_{-\infty}^{\infty}|F(x - u) - F(x)|\,\frac{1}{h}K\!\left(\frac{u}{h}\right)du. \tag{19}$$

Let $\delta > 0$ and divide the integration domain in (19) into the two domains $|u| \le \delta$ and $|u| > \delta$. Then

$$S_{1n} \le \sup_{0 \le x \le 1}\int_{|u| \le \delta}|F(x - u) - F(x)|\,\frac{1}{h}K\!\left(\frac{u}{h}\right)du + \sup_{0 \le x \le 1}\int_{|u| > \delta}|F(x - u) - F(x)|\,\frac{1}{h}K\!\left(\frac{u}{h}\right)du \le \sup_{x \in R}\sup_{|u| \le \delta}|F(x - u) - F(x)| + 2\int_{|u| \ge \frac{\delta}{h}}K(u)\,du. \tag{20}$$

By the choice of $\delta > 0$ the first summand on the right-hand side of (20) can be made arbitrarily small. Fixing such a $\delta > 0$ and letting $n \to \infty$, we find that the second summand tends to zero. Therefore,

$$\lim_{n \to \infty}S_{1n} = 0. \tag{21}$$

Finally, the proof of the theorem follows from the relations (15)–(18) and (21).

Remark 1. 1. If $K(x) = 0$ for $|x| \ge 1$ and $\alpha = 1$, i.e., $\Omega_n = [h, 1 - h]$, then $S_{2n} = 0$.

2. Under the conditions of Theorem 2, $\sup_{x \in [a, b]}|\hat F_n(x) - F(x)| \to 0$ in probability (a.s.) for any fixed interval $[a, b] \subset [0, 1]$, since there exists $n_0$ such that $[a, b] \subset \Omega_n$, $n \ge n_0$.

Assume that $h = n^{-\gamma}$, $\gamma > 0$. The conditions of Theorem 2 are fulfilled: $n^{\frac{1}{2}}h_n \to \infty$ if $0 < \gamma < \frac{1}{2}$, and $\sum_{n=1}^{\infty}n^{-\frac{p}{2}}h_n^{-p} < \infty$ if $0 < \gamma < \frac{p - 2}{2p}$, $p > 2$.

3. Estimation of moments. In considering the problem, there naturally arises the question of estimating integral functionals of $F(x)$, for example, the moments $\mu_m$, $m \ge 1$:

$$\mu_m = m\int_0^1 t^{m-1}(1 - F(t))\,dt.$$

As estimates for $\mu_m$ we consider the statistics

$$\hat\mu_{nm} = 1 - \frac{m}{n}\sum_{j=1}^{n}\xi_j\,\frac{1}{h}\int_h^{1-h}t^{m-1}K\!\left(\frac{t - t_j}{h}\right)F_{2n}^{-1}(t)\,dt.$$

Theorem 3. Let $K(x)$ satisfy condition 1° and, in addition, let $K(x) = 0$ outside the interval $[-1, 1]$. If $nh \to \infty$ as $n \to \infty$, then $\hat\mu_{nm}$ is an asymptotically unbiased, consistent estimate for $\mu_m$, and moreover

$$\frac{\sqrt{n}\,(\hat\mu_{nm} - E\hat\mu_{nm})}{\sigma} \xrightarrow{d} N(0, 1), \qquad \sigma^2 = m^2\int_0^1 t^{2m-2}F(t)(1 - F(t))\,dt.$$

Proof. Since $K(x)$ has $[-1, 1]$ as its support, we establish from (5) that $F_{2n}(x) = 1 + O\!\left(\frac{1}{nh}\right)$ uniformly with respect to $x \in [h, 1 - h]$. Hence, by Lemma 1 we have

$$E\hat\mu_{nm} = 1 - \frac{m}{n}\sum_{j=1}^{n}F(t_j)\,\frac{1}{h}\int_h^{1-h}t^{m-1}K\!\left(\frac{t - t_j}{h}\right)F_{2n}^{-1}(t)\,dt$$

$$= 1 - m\int_h^{1-h}\left[\frac{1}{h}\int_0^1 K\!\left(\frac{t - u}{h}\right)F(u)\,du\right]t^{m-1}\,dt + O\!\left(\frac{1}{nh}\right)$$

$$= 1 - m\int_h^{1-h}\left[\int_{-1}^{1}K(v)F(t + vh)\,dv\right]t^{m-1}\,dt + O\!\left(\frac{1}{nh}\right)$$

$$= 1 - m\int_0^1 t^{m-1}\left[\int_{-1}^{1}K(v)F(t + vh)\,dv\right]dt + O(h) + O\!\left(\frac{1}{nh}\right). \tag{22}$$

By the Lebesgue dominated convergence theorem, from (22) we establish that

$$E\hat\mu_{nm} \to 1 - m\int_0^1 F(t)t^{m-1}\,dt = m\int_0^1 t^{m-1}(1 - F(t))\,dt = \mu_m, \quad m \ge 1. \tag{23}$$

Therefore, $\hat\mu_{nm}$ is an asymptotically unbiased estimate for $\mu_m$. Further, analogously to (22), it can be shown that

$$\operatorname{Var}\hat\mu_{nm} = \frac{m^2}{n}\int_0^1 F(t)(1 - F(t))\,t^{2m-2}\left[\mathcal{K}\!\left(\frac{1 - t}{h} - 1\right) - \mathcal{K}\!\left(1 - \frac{t}{h}\right)\right]^2dt + O\!\left(\frac{h}{n}\right) + O\!\left(\frac{1}{(nh)^2}\right),$$

where $\mathcal{K}(v) = \int_{-\infty}^{v}K(u)\,du$.
By the same Lebesgue theorem we see that

$$n\operatorname{Var}\hat\mu_{nm} \sim \sigma^2 = m^2\int_0^1 t^{2m-2}F(t)(1 - F(t))\,dt. \tag{24}$$

Therefore (23) and (24) imply that $\hat\mu_{nm} \xrightarrow{P} \mu_m$.

To complete the proof of the theorem it remains to show that the statistics $\sqrt{n}\,(\hat\mu_{nm} - E\hat\mu_{nm})$ have an asymptotically normal distribution with mean 0 and variance $\sigma^2$. For this it suffices to show that the Liapunov fraction $L_n \to 0$. Indeed,

$$L_n = n^{-(2+\delta)}m^{2+\delta}\sum_{j=1}^{n}E|\xi_j - F(t_j)|^{2+\delta}\left|\frac{1}{h}\int_h^{1-h}t^{m-1}K\!\left(\frac{t - t_j}{h}\right)F_{2n}^{-1}(t)\,dt\right|^{2+\delta}(\operatorname{Var}\hat\mu_{nm})^{-(1+\frac{\delta}{2})}$$

$$\le c_6 n^{-(2+\delta)}\sum_{j=1}^{n}E|\xi_j - F(t_j)|^{2+\delta}(\operatorname{Var}\hat\mu_{nm})^{-(1+\frac{\delta}{2})} \le c_7 n^{-1-\delta}(\operatorname{Var}\hat\mu_{nm})^{-1-\frac{\delta}{2}} = O\big(n^{-\frac{\delta}{2}}\big).$$

Theorem 3 is proved.

4. Limit theorems for functionals related to the estimate $\hat F_n(x)$. In this subsection the kernel $K(x) \ge 0$ is chosen to be a function of bounded variation satisfying the conditions

$$K(-u) = K(u), \qquad \int K(u)\,du = 1, \qquad K(u) = 0 \ \text{ for } |u| \ge 1.$$

Theorem 4. Let $g(x) \ge 0$, $x \in [a, 1 - a]$, $0 < a < \frac{1}{2}$, be a measurable and bounded function.

(a) If $F(a) > 0$ and $nh^2 \to \infty$ as $n \to \infty$, then

$$T_n = \sqrt{n}\int_a^{1-a}g_1(x)\big[\hat F_n(x) - E\hat F_n(x)\big]dx \xrightarrow{d} N(0, \sigma^2), \tag{25}$$

where

$$g_1(x) = g(x)\psi(F(x)), \qquad \psi(t) = \frac{1}{\sqrt{t(1 - t)}}.$$

(b) If $F(a) > 0$, $nh^2 \to \infty$, $nh^4 \to 0$ as $n \to \infty$ and $F(x)$ has bounded derivatives up to second order, then as $n \to \infty$

$$\overline{T}_n = \sqrt{n}\int_a^{1-a}g_1(x)\big[\hat F_n(x) - F(x)\big]dx \xrightarrow{d} N(0, \sigma^2), \qquad \sigma^2 = \int_a^{1-a}g^2(u)\,du.$$

Remark 2. We have introduced $a > 0$ in (25) in order to avoid the boundary effect of the estimate $\hat F_n(x)$: near the boundary of the interval, $\hat F_n(x)$, being a kernel-type estimate, has a bias that tends to zero more slowly than on any inner interval $[a, 1 - a] \subset [0, 1]$, $0 < a < \frac{1}{2}$.

Proof of Theorem 4. We have

$$T_n = \frac{1}{\sqrt{n}}\sum_{j=1}^{n}\big(\xi_j - F(t_j)\big)\frac{1}{h}\int_a^{1-a}K\!\left(\frac{u - t_j}{h}\right)g_{2n}(u)\,du,$$

where $g_{2n}(u) = g_1(u)F_{2n}^{-1}(u)$. Hence

$$\sigma_n^2 = \operatorname{Var}T_n = \frac{1}{n}\sum_{j=1}^{n}\psi^{-2}(F(t_j))\left[\frac{1}{h}\int_a^{1-a}K\!\left(\frac{u - t_j}{h}\right)g_{2n}(u)\,du\right]^2. \tag{26}$$

Since $K(u)$ has $[-1, 1]$ as its support and $0 < a \le u \le 1 - a$, it can be easily verified that

$$F_{2n}(u) = 1 + O\!\left(\frac{1}{nh}\right) \quad \text{and} \quad g_{2n}(u) = g_1(u) + O\!\left(\frac{1}{nh}\right)$$

uniformly on $u \in [a, 1 - a]$. Therefore, from (26) we have

$$\sigma_n^2 = \frac{1}{n}\sum_{j=1}^{n}\psi^{-2}(F(t_j))\left[\frac{1}{h}\int_a^{1-a}K\!\left(\frac{u - t_j}{h}\right)g_1(u)\,du\right]^2 + O\!\left(\frac{1}{nh}\right).$$

By virtue of Lemma 1, we can easily show that

$$\frac{1}{n}\sum_{j=1}^{n}\psi^{-2}(F(t_j))\left[\frac{1}{h}\int_a^{1-a}K\!\left(\frac{u - t_j}{h}\right)g_1(u)\,du\right]^2 = \int_0^1\psi^{-2}(F(t))\left[\frac{1}{h}\int_a^{1-a}K\!\left(\frac{u - t}{h}\right)g_1(u)\,du\right]^2dt + O\!\left(\frac{1}{nh^2}\right).$$

Therefore,

$$\sigma_n^2 = \int_a^{1-a}\psi^{-2}(F(t))\left[\frac{1}{h}\int_a^{1-a}K\!\left(\frac{u - t}{h}\right)g_1(u)\,du\right]^2dt + \varepsilon_n^{(1)} + \varepsilon_n^{(2)} + O\!\left(\frac{1}{nh^2}\right), \tag{27}$$

$$\varepsilon_n^{(1)} = \int_0^a\psi^{-2}(F(t))\left[\frac{1}{h}\int_a^{1-a}K\!\left(\frac{u - t}{h}\right)g_1(u)\,du\right]^2dt, \qquad \varepsilon_n^{(2)} = \int_{1-a}^1\psi^{-2}(F(t))\left[\frac{1}{h}\int_a^{1-a}K\!\left(\frac{u - t}{h}\right)g_1(u)\,du\right]^2dt.$$

Since $F(u)(1 - F(u)) \le \frac{1}{4}$, $g(u) \le c_8$ and $\psi(F(u)) \le \frac{1}{F(a)(1 - F(1 - a))}$, $a \le u \le 1 - a$, it follows that $g_1(u) \le c_9$, and we have

$$\varepsilon_n^{(1)} \le c_{10}\int_0^a\left[\int_{\frac{a-t}{h}}^{\frac{1-a-t}{h}}K(u)\,du\right]^2dt, \tag{28}$$

where $a - t \ge 0$ and $1 - a - t \ge 0$. The first inequality is obvious, whereas the second one follows from the inequalities $0 \le t \le a$ and $0 < a < \frac{1}{2}$. Therefore,

$$\lim_{n \to \infty}\int_{\frac{a-t}{h}}^{\frac{1-a-t}{h}}K(u)\,du = \begin{cases} 0, & 0 \le t < a, \\ \dfrac{1}{2}, & t = a. \end{cases}$$

By the Lebesgue bounded convergence theorem, from the latter expression and (28) we obtain

$$\varepsilon_n^{(1)} \to 0 \quad \text{as } n \to \infty. \tag{29}$$

Analogously,

$$\varepsilon_n^{(2)} \to 0 \quad \text{as } n \to \infty. \tag{30}$$

Now let us establish that

$$\int_a^{1-a}\psi^{-2}(F(t))\left[\frac{1}{h}\int_a^{1-a}K\!\left(\frac{u - t}{h}\right)g_1(u)\,du\right]^2dt \to \sigma^2 = \int_a^{1-a}g^2(u)\,du \quad \text{as } n \to \infty.$$

We have

$$\left|\int_a^{1-a}\psi^{-2}(F(t))\left[\frac{1}{h}\int_a^{1-a}g_1(u)K\!\left(\frac{u - t}{h}\right)du\right]^2dt - \int_a^{1-a}\psi^{-2}(F(t))g_1^2(t)\,dt\right|$$

$$\le c_{11}\int_a^{1-a}\psi^{-2}(F(t))\left|\frac{1}{h}\int_a^{1-a}g_1(u)K\!\left(\frac{u - t}{h}\right)du - g_1(t)\right|dt$$

$$\le c_{12}\int_a^{1-a}\left|\frac{1}{h}\int_a^{1-a}g_1(u)K\!\left(\frac{u - t}{h}\right)du - g_1(t)\int_a^{1-a}\frac{1}{h}K\!\left(\frac{u - t}{h}\right)du\right|dt + c_{13}\int_a^{1-a}\left|\int_a^{1-a}\frac{1}{h}K\!\left(\frac{u - t}{h}\right)du - 1\right|dt = A_{1n} + A_{2n}. \tag{31}$$

Since

$$\int_a^{1-a}\frac{1}{h}K\!\left(\frac{u - t}{h}\right)du \to 1$$

for all $t \in (a, 1 - a)$, we have

$$A_{2n} \to 0 \quad \text{as } n \to \infty. \tag{32}$$

Further, we extend the function $g_1(u)$ by zero outside $[a, 1 - a]$ and denote the extended function by $\bar g_1(u)$. Then

$$A_{1n} \le c_{14}\int_0^1\left[\int_{-\infty}^{\infty}|\bar g_1(x + y) - \bar g_1(y)|\,dy\right]\frac{1}{h}K\!\left(\frac{x}{h}\right)dx \le c_{15}\int_{-1}^{1}\left[\int_{-\infty}^{\infty}|\bar g_1(y + uh) - \bar g_1(y)|\,dy\right]K(u)\,du = c_{15}\int_{-1}^{1}\omega(uh)K(u)\,du \to 0 \quad \text{as } n \to \infty, \tag{33}$$

where

$$\omega(y) = \int_{-\infty}^{\infty}|\bar g_1(y + x) - \bar g_1(x)|\,dx.$$

The relation (33) holds by virtue of the Lebesgue dominated convergence theorem and the fact that $\omega(uh) \le 2\|\bar g_1\|_{L_1(-\infty, \infty)}$ and $\omega(uh) \to 0$ as $n \to \infty$. Thereby, taking (27)–(33) into account, we have proved that

$$\sigma_n^2 \to \sigma^2 = \int_a^{1-a}g^2(u)\,du. \tag{34}$$

Now let us verify that the conditions of the central limit theorem are fulfilled for the sums

$$T_n = \frac{1}{\sqrt{n}}\sum_{j=1}^{n}a_{jn}\big(\xi_j - F(t_j)\big), \qquad a_{jn} = \int_a^{1-a}\frac{1}{h}K\!\left(\frac{x - t_j}{h}\right)g_{2n}(x)\,dx.$$

We have

$$L_n = n^{-(1+\frac{\delta}{2})}\,\frac{\sum_{j=1}^{n}a_{jn}^{2+\delta}E|\xi_j - F(t_j)|^{2+\delta}}{\big(\sqrt{\operatorname{Var}T_n}\big)^{2+\delta}} = O\big(n^{-\frac{\delta}{2}}\big),$$

since $a_{jn} \le c_{16}$, $E|\xi_j - F(t_j)|^{2+\delta} \le 1$ for all $1 \le j \le n$, and $\operatorname{Var}T_n \to \sigma^2$.

Finally, the statement (b) of the theorem follows from (a) if we take into account that

$$\sqrt{n}\int_a^{1-a}g_1(x)\big[E\hat F_n(x) - F(x)\big]dx = \sqrt{n}\int_a^{1-a}g_1(x)\left[\int_{-1}^{1}K(u)\big(F(x - uh) - F(x)\big)\,du\right]dx = O(\sqrt{n}\,h^2) + O\!\left(\frac{1}{\sqrt{n}\,h}\right). \tag{35}$$

Theorem 4 is proved.

Lemma 2. 1. Under the conditions of item (a) of Theorem 4,

$$E|T_n|^s \le c_{17}\left[\int_a^{1-a}g(u)\,du\right]^{\frac{s}{2}}, \quad s > 2. \tag{36}$$

2. Under the conditions of item (b) of Theorem 4,

$$E|\overline{T}_n|^s \le c_{18}\left[\int_a^{1-a}g(u)\,du\right]^{\frac{s}{2}}, \quad s > 2. \tag{37}$$

Proof. $T_n$ is a linear form in $\eta_j = \xi_j - F(t_j)$, $E\eta_j = 0$, $1 \le j \le n$. Hence to prove (36) we use Whittle's inequality [3]. It is obvious that $E|\eta_j|^s \le 1$, $j = \overline{1, n}$.
Therefore by Whittle's inequality

$$E|T_n|^s \le c(s)2^s\left[\frac{1}{nh^2}\sum_{j=1}^{n}\left(\int_a^{1-a}K\!\left(\frac{u - t_j}{h}\right)g_{2n}(u)\,du\right)^2\right]^{\frac{s}{2}},$$

where $g_{2n}(u) = g_1(u)F_{2n}^{-1}(u)$. This, by virtue of Lemma 1, yields

$$E|T_n|^s \le c(s)2^s\left[\int_0^1\left(\frac{1}{h}\int_a^{1-a}K\!\left(\frac{u - t}{h}\right)g_{2n}(u)\,du\right)^2dt + O\!\left(\frac{1}{nh^2}\right)\left(\int_a^{1-a}g_{2n}(u)\,du\right)^2\right]^{\frac{s}{2}}. \tag{38}$$

Further, since

$$g_{2n}(u) \le g(u)\left[\frac{1}{F(a)(1 - F(1 - a))}\right]\left[1 + O\!\left(\frac{1}{nh}\right)\right] \le c_{19}\,g(u), \quad a \le u \le 1 - a,$$

from (38) it follows that

$$E|T_n|^s \le c_{20}\left\{\left[\sup_{0 \le t \le 1}\left(\frac{1}{h}\int_a^{1-a}K\!\left(\frac{u - t}{h}\right)g_{2n}(u)\,du\right)\int_0^1dt\int_a^{1-a}\frac{1}{h}K\!\left(\frac{u - t}{h}\right)g_{2n}(u)\,du\right]^{\frac{s}{2}} + O\!\left(\Big(\frac{1}{nh^2}\Big)^{\frac{s}{2}}\right)\left[\int_a^{1-a}g_{2n}(u)\,du\right]^{\frac{s}{2}}\right\}$$

$$\le c_{21}\left[\int_a^{1-a}g(u)\,du\right]^{\frac{s}{2}}[1 + o(1)] \le c_{22}\left[\int_a^{1-a}g(u)\,du\right]^{\frac{s}{2}}, \quad s > 2.$$

Next we obtain

$$E|\overline{T}_n|^s \le 2^{s-1}\left[E|T_n|^s + \left|\sqrt{n}\int_a^{1-a}g_1(u)\big[E\hat F_n(u) - F(u)\big]du\right|^s\right] \le c_{23}\left[\int_a^{1-a}g(u)\,du\right]^{\frac{s}{2}} + \left|O\big(\sqrt{n}\,h^2\big)\int_a^{1-a}g(u)\,du\right|^s \le c_{24}\left[\int_a^{1-a}g(u)\,du\right]^{\frac{s}{2}}.$$

Lemma 2 is proved.

Let us introduce the following random processes:

$$T_n(t) = \sqrt{n}\int_a^t\big(\hat F_n(u) - E\hat F_n(u)\big)\psi(F(u))\,du, \qquad \overline{T}_n(t) = \sqrt{n}\int_a^t\big(\hat F_n(u) - F(u)\big)\psi(F(u))\,du.$$

Theorem 5. 1°. Let the conditions of item (a) of Theorem 4 be fulfilled. Then for all continuous functionals $f(\cdot)$ on $C[a, 1 - a]$, the distribution of $f(T_n(t))$ converges to the distribution of $f(W(t - a))$, where $W(t - a)$, $a \le t \le 1 - a$, is a Wiener process with correlation function $r(s, t) = \min(t - a, s - a)$, $W(t - a) = 0$ for $t = a$.

2°. Let the conditions of item (b) of Theorem 4 be fulfilled. Then for all continuous functionals $f(\cdot)$ on $C[a, 1 - a]$, the distribution of $f(\overline{T}_n(t))$ converges to the distribution of $f(W(t - a))$.

Proof. First we show that the finite-dimensional distributions of the processes $T_n(t)$ converge to the finite-dimensional distributions of the process $W(t - a)$, $t \ge a$. Let us consider one moment of time $t_1$. We have to show that

$$T_n(t_1) \xrightarrow{d} W(t_1 - a). \tag{39}$$

To prove (39), it suffices to take $g(x) = I_{[a, t_1)}(x)$ in (25).
Then, by virtue of Theorem 4,

$$T_n(t_1) \xrightarrow{d} N\left(0, \int_a^{1-a}I_{[a, t_1)}(x)\,dx\right) = N(0, t_1 - a).$$

Let us now consider two moments of time $t_1$, $t_2$, $t_1 < t_2$. We have to show that

$$\big(T_n(t_1), T_n(t_2)\big) \xrightarrow{d} \big(W(t_1 - a), W(t_2 - a)\big). \tag{40}$$

To prove (40), it suffices to take in (25)

$$g(x) = (\lambda_1 + \lambda_2)I_{[a, t_1)}(x) + \lambda_2 I_{[t_1, t_2)}(x),$$

where $\lambda_1$ and $\lambda_2$ are arbitrary finite numbers. Then, by virtue of Theorem 4,

$$\lambda_1T_n(t_1) + \lambda_2T_n(t_2) \xrightarrow{d} N\big(0, (\lambda_1 + \lambda_2)^2(t_1 - a) + \lambda_2^2(t_2 - t_1)\big).$$

On the other hand,

$$\lambda_1W(t_1 - a) + \lambda_2W(t_2 - a) = (\lambda_1 + \lambda_2)\big[W(t_1 - a) - W(0)\big] + \lambda_2\big[W(t_2 - a) - W(t_1 - a)\big]$$

is distributed as $N\big(0, (\lambda_1 + \lambda_2)^2(t_1 - a) + \lambda_2^2(t_2 - t_1)\big)$. Therefore (40) holds. The case of three or more moments of time is considered analogously. Therefore the finite-dimensional distributions of the processes $T_n(t)$ converge to the finite-dimensional distributions of a Wiener process $W(t - a)$, $a \le t \le 1 - a$, with correlation function $r(t_1, t_2) = \min(t_1 - a, t_2 - a)$, $W(t - a) = 0$ for $t = a$.

Now we show that the sequence $\{T_n(t)\}$ is dense, i.e., the sequence of the corresponding distributions is dense. For this it suffices to show that for any $t_1, t_2 \in [a, 1 - a]$ and all $n$

$$E\big|T_n(t_1) - T_n(t_2)\big|^s \le c_{25}|t_1 - t_2|^{\frac{s}{2}}, \quad s > 2.$$

Indeed, this inequality is obtained from (36) for $g(x) = I_{[t_1, t_2]}(x)$. Further, taking (35), (37) and the statement (b) of Theorem 4 into account, we easily ascertain that the finite-dimensional distributions of the processes $\overline{T}_n(t)$ converge to the finite-dimensional distributions of a Wiener process $W(t - a)$, and also that

$$E\big|\overline{T}_n(t_1) - \overline{T}_n(t_2)\big|^s \le c_{26}|t_1 - t_2|^{\frac{s}{2}}, \quad s > 2.$$

Hence the proof of the theorem follows from Theorem 2 of the monograph [4, p. 583].

Application. By virtue of Theorem 5 and the Corollary of Theorem 1 from [4, p. 371] we can write that

$$P\Big\{T_n^+ = \max_{a \le t \le 1-a}T_n(t) > \lambda\Big\} \to G(\lambda) = \frac{2}{\sqrt{2\pi(1 - 2a)}}\int_{\lambda}^{\infty}\exp\left\{-\frac{x^2}{2(1 - 2a)}\right\}dx$$

as $n \to \infty$ ($a$ is a prescribed number, $0 < a < \frac{1}{2}$).

This result makes it possible to construct tests of level $\alpha$, $0 < \alpha < 1$, for testing the hypothesis

$$H_0 : \lim_{n \to \infty}E\hat F_n(x) = F_0(x), \quad a \le x \le 1 - a,$$

against the alternative hypothesis

$$H_1 : \int_a^{1-a}\psi(F_0(x))\Big(\lim_{n \to \infty}E\hat F_n(x) - F_0(x)\Big)dx > 0.$$

Let $\lambda_\alpha$ be the critical value, $G(\lambda_\alpha) = \alpha$. If as a result of the experiment it turns out that $T_n^+ \ge \lambda_\alpha$, then the hypothesis $H_0$ must be rejected.

Remark 3. Let $t_j$ be the partitioning points of an interval $[0, c_F]$, $c_F = \inf\{x \ge 0 : F(x) = 1\} < \infty$, chosen from the relation

$$H(t_j) = \frac{2j - 1}{2n}, \quad j = \overline{1, n},$$

where $H(x) = \int_0^x h(u)\,du$, $h(u)$ is some known density of a distribution on $[0, c_F]$ and $h(x) \ge \mu > 0$ for all $x \in [0, c_F]$. In that case, by reasoning analogous to that used above, we can obtain a generalization of the results of the present study.

Remark 4. Some ideas of the proof of Theorem 4 are borrowed from the interesting paper by A. V. Ivanov [5].

1. Mandzhgaladze K. V. On an estimator of a distribution function and its moments // Soobshch. Akad. Nauk Gruz. SSR. – 1986. – 124, № 2. – P. 261 – 263.
2. Parzen E. On estimation of a probability density function and mode // Ann. Math. Statist. – 1962. – 33. – P. 1065 – 1076.
3. Whittle P. Bounds for the moments of linear and quadratic forms in independent variables // Teor. Ver. i Prim. – 1960. – 5. – P. 331 – 335.
4. Gikhman I. I., Skorokhod A. V. Introduction to the theory of random processes (in Russian). – Moskva: Gosudarstv. Izdat. Fiz.-Mat. Lit., 1965.
5. Ivanov A. V. Properties of a nonparametric estimate of the regression function (in Russian) // Dokl. Akad. Nauk Ukr. SSR. Ser. A. – 1979. – № 7. – P. 499 – 502, 589.

Received 23.07.10
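As a numerical footnote to the Application paragraph: for $\lambda \ge 0$ the limit $G(\lambda)$ equals $2\big(1 - \Phi(\lambda/\sqrt{1 - 2a})\big)$, where $\Phi$ is the standard normal distribution function, so the critical value has the closed form $\lambda_\alpha = \sqrt{1 - 2a}\,\Phi^{-1}(1 - \alpha/2)$. The sketch below evaluates both; the function names `limit_tail` and `critical_value` are hypothetical, and the values $\alpha = 0.05$, $a = 0.1$ are chosen only for illustration.

```python
import math
from statistics import NormalDist


def limit_tail(lam, a):
    """G(lambda) = 2 * (1 - Phi(lambda / sqrt(1 - 2a))), valid for lambda >= 0."""
    return math.erfc(lam / math.sqrt(2.0 * (1.0 - 2.0 * a)))


def critical_value(alpha, a):
    """Solve G(lambda_alpha) = alpha for 0 < alpha < 1 and 0 < a < 1/2."""
    return math.sqrt(1.0 - 2.0 * a) * NormalDist().inv_cdf(1.0 - alpha / 2.0)


lam = critical_value(0.05, 0.1)  # illustrative level and trimming constant
```

Under $H_0$ one would reject when the observed $T_n^+$ is at least `lam`; computing $T_n^+$ itself requires the estimate $\hat F_n$ and the hypothesised $F_0$ and is not shown here.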
id umjimathkievua-article-2989
institution Ukrains’kyi Matematychnyi Zhurnal
keywords_txt_mv keywords
language English
last_indexed 2026-03-24T02:34:09Z
publishDate 2010
publisher Institute of Mathematics, NAS of Ukraine
record_format ojs
resource_txt_mv umjimathkievua/f3/305debf24b5dc0c8e7a74cba61005af3.pdf
spelling umjimathkievua-article-29892020-03-18T19:41:53Z Estimation of a distribution function by an indirect sample Оцінювання функції розподілу з використанням непрямої вибірки Babilua, P. Nadaraya, E. Sokhadze, G. A. Бабілуа, П. К. Надарая, Е. А. Сохадзе, Г. А. The problem of estimation of a distribution function is considered in the case where the observer has access only to a part of the indicator random values. Some basic asymptotic properties of the constructed estimates are studied. The limit theorems are proved for continuous functionals related to the estimation of $\hat F_n(x)$ in the space $C[a,\; 1 - a]$, $0 < a < 1/2$. Розглянуто задачу оцінювання функції розподілу у випадку, коли спостерігач має доступ лише до деяких індикаторних випадкових значень. Вивчено деякі базові асимптотичні властивості побудованих оцінок. У статті доведено граничні теореми для неперервних функціоналів щодо оцінки $\hat F_n(x)$ у просторі $C[a,\; 1 - a]$, $0 < a < 1/2$. Institute of Mathematics, NAS of Ukraine 2010-12-25 Article Article application/pdf https://umj.imath.kiev.ua/index.php/umj/article/view/2989 Ukrains’kyi Matematychnyi Zhurnal; Vol. 62 No. 12 (2010); 1642–1658 Український математичний журнал; Том 62 № 12 (2010); 1642–1658 1027-3190 en https://umj.imath.kiev.ua/index.php/umj/article/view/2989/2728 https://umj.imath.kiev.ua/index.php/umj/article/view/2989/2729 Copyright (c) 2010 Babilua P.; Nadaraya E.; Sokhadze G. A.
spellingShingle Babilua, P.
Nadaraya, E.
Sokhadze, G. A.
Бабілуа, П. К.
Надарая, Е. А.
Сохадзе, Г. А.
Estimation of a distribution function by an indirect sample
title Estimation of a distribution function by an indirect sample
title_alt Оцінювання функції розподілу з використанням непрямої вибірки
title_full Estimation of a distribution function by an indirect sample
title_fullStr Estimation of a distribution function by an indirect sample
title_full_unstemmed Estimation of a distribution function by an indirect sample
title_short Estimation of a distribution function by an indirect sample
title_sort estimation of a distribution function by an indirect sample
url https://umj.imath.kiev.ua/index.php/umj/article/view/2989
work_keys_str_mv AT babiluap estimationofadistributionfunctionbyanindirectsample
AT nadarayae estimationofadistributionfunctionbyanindirectsample
AT sokhadzega estimationofadistributionfunctionbyanindirectsample
AT babíluapk estimationofadistributionfunctionbyanindirectsample
AT nadaraâea estimationofadistributionfunctionbyanindirectsample
AT sohadzega estimationofadistributionfunctionbyanindirectsample
AT babiluap ocínûvannâfunkcíírozpodíluzvikoristannâmneprâmoívibírki
AT nadarayae ocínûvannâfunkcíírozpodíluzvikoristannâmneprâmoívibírki
AT sokhadzega ocínûvannâfunkcíírozpodíluzvikoristannâmneprâmoívibírki
AT babíluapk ocínûvannâfunkcíírozpodíluzvikoristannâmneprâmoívibírki
AT nadaraâea ocínûvannâfunkcíírozpodíluzvikoristannâmneprâmoívibírki
AT sohadzega ocínûvannâfunkcíírozpodíluzvikoristannâmneprâmoívibírki