A new test for unimodality
A distribution function (d.f.) of a random variable is unimodal if there exists a number such that d.f. is convex left from this number and is concave right from this number. This number is called a mode of d.f. Since one may have more than one mode, a mode is not necessarily unique. The purpose of...
| Date: | 2008 |
|---|---|
| Authors: | Andrushkiw, R.I., Klyushin, D.D., Petunin, Y.I. |
| Format: | Article |
| Language: | English |
| Published: | Institute of Mathematics of the National Academy of Sciences of Ukraine, 2008 |
| Online access: | https://nasplib.isofts.kiev.ua/handle/123456789/4530 |
| Journal title: | Digital Library of Periodicals of National Academy of Sciences of Ukraine |
| Cite: | A new test for unimodality / R.I. Andrushkiw, D.D. Klyushin, Y.I. Petunin // Theory of Stochastic Processes. 2008. Vol. 14 (30), No. 1. P. 1–6. Bibliogr.: 12 titles. In English. |
Repositories
Digital Library of Periodicals of National Academy of Sciences of Ukraine

| _version_ | 1860027327220547584 |
|---|---|
| author | Andrushkiw, R.I. Klyushin, D.D. Petunin, Y.I. |
| author_facet | Andrushkiw, R.I. Klyushin, D.D. Petunin, Y.I. |
| citation_txt | A new test for unimodality / R.I. Andrushkiw, D.D. Klyushin, Y.I. Petunin // Theory of Stochastic Processes. 2008. Vol. 14 (30), No. 1. P. 1–6. Bibliogr.: 12 titles. In English. |
| collection | DSpace DC |
| description | A distribution function (d.f.) of a random variable is unimodal if there exists a number such that d.f. is convex left from this number and is concave right from this number. This number is called a mode of d.f. Since one may have more than one mode, a mode is not necessarily unique. The purpose of this paper is to construct nonparametric tests for the unimodality of d.f. based on a sample obtained from the general population of values of the random variable by simple sampling. The tests proposed are significance tests such that the unimodality of d.f. can be guaranteed with some probability (confidence level).
|
| first_indexed | 2025-12-07T16:50:16Z |
| format | Article |
| fulltext |
Theory of Stochastic Processes
Vol. 14 (30), no. 1, 2008, pp. 1–6
UDC 519.21
ROMAN I. ANDRUSHKIW, DMITRY D. KLYUSHIN, AND YURIY I. PETUNIN
A NEW TEST FOR UNIMODALITY
A distribution function (d.f.) of a random variable is unimodal if there exists a number such that the d.f. is convex to the left of this number and concave to the right of it. This number is called a mode of the d.f. A mode is not necessarily unique, since a d.f. may have more than one mode. The purpose of this paper is to construct nonparametric tests for the unimodality of a d.f. based on a sample obtained from the general population of values of the random variable by simple sampling. The proposed tests are significance tests, so the unimodality of the d.f. can be guaranteed with some probability (confidence level).
1. Introduction
Testing the unimodality of a distribution function is a widely investigated problem. The most popular tests include the dip test proposed by J.A. Hartigan and the kernel density estimation test proposed by B.W. Silverman [1–4]. However, all of these tests are computationally quite complex and asymptotic. It is therefore useful to develop elementary tests that are based on simple computational procedures and are non-asymptotic.
According to A.Ya. Khinchin, a distribution function $F(u)$ of a random variable $x$ is unimodal if there exists a number $M$ such that $F(u)$ is convex on $(-\infty, M)$ and concave on $(M, \infty)$. The number $M$ is said to be a mode of the d.f. $F(u)$. A mode need not be unique, since $F(u)$ can have several modes. Also, $F(u)$ can have a jump at $M$ while being continuous on $(-\infty, M)$ and $(M, \infty)$.
The purpose of the paper is to construct nonparametric tests for the unimodality of the d.f. $F(u)$ based on a sample $x_1, x_2, \ldots, x_n$ obtained by simple sampling from the general population of values of a random variable $x$. The proposed tests are significance tests, so the unimodality of $F(u)$ can be guaranteed with some probability $\alpha$ (confidence level), where $\beta = 1 - \alpha$ is the significance level of the test. To formulate the tests, we introduce new estimates of the probability density (d.p.) and of the distribution function based on the sample $x_1, x_2, \ldots, x_n$.
2. Unified histogram and modified empirical d.f.
Let $x_1, x_2, \ldots, x_n$ be a sample obtained by simple sampling from a general population with d.f. $F(u)$ and the corresponding d.p. Since these functions are unknown, we call them hypothetical. To estimate the d.p., we use the relation
$$p\left(x_{n+1} \in \left(x_{(i)}, x_{(i+1)}\right)\right) = \frac{1}{n+1}, \tag{1}$$
where $x_{(i)}$ is the $i$-th order statistic ($i = 1, 2, \ldots, n$). Using estimate (1), we can define the estimate $h_n(u)$ of the hypothetical d.p.:
$$h_n(u) = \begin{cases} \dfrac{1}{(n-1)\left(x_{(i+1)} - x_{(i)}\right)}, & \text{if } u \in \left(x_{(i)}, x_{(i+1)}\right), \\ 0, & \text{otherwise.} \end{cases}$$
2000 AMS Mathematics Subject Classification. Primary 62G05.
Key words and phrases. Unimodality, distribution function, significance test.
In this case, the probability that a random variable $\tilde{x}$ with d.p. $h_n(u)$ falls in $\left[x_{(i)}, x_{(i+1)}\right)$ is equal to
$$p\left(\tilde{x} \in \left[x_{(i)}, x_{(i+1)}\right)\right) = \frac{1}{n-1},$$
where the $x_{(i)}$ are considered as constants. For large $n$, this probability is close to probability (1), so we refer to $h_n(u)$ as the unified histogram constructed on $x_1, x_2, \ldots, x_n$. This histogram has an advantage over other histograms: it is unambiguously defined by the sample $x_1, x_2, \ldots, x_n$. Also, the integral
$$\tilde{F}^*_n(u) = \int_{x_{(1)}}^{u} h_n(v)\,dv = \frac{u + (i-1)\,x_{(i+1)} - i\,x_{(i)}}{(n-1)\left(x_{(i+1)} - x_{(i)}\right)}, \quad x_{(i)} \le u < x_{(i+1)}, \tag{2}$$
is a linear spline, which is a more precise estimate of the hypothetical d.f. $F(u)$ than the piecewise constant empirical d.f.
$$F^*_n(u) = \frac{i}{n}, \quad \text{if } x_{(i)} \le u < x_{(i+1)}.$$
We refer to the function $F^*_n(u)$ as the e.d.f. and to $\tilde{F}^*_n(u)$ defined by (2) as the modified e.d.f. (m.e.d.f.). Its advantages over the conventional e.d.f. are as follows: 1) when the d.f. $F(u)$ is continuous, linear splines are more precise approximations than the piecewise constant e.d.f. $F^*_n(u)$, and 2) $\tilde{F}^*_n(u)$ is continuous, so it is possible to estimate quantiles of any order and to construct an inverse d.f. (Quetelet curve), which is impossible with the piecewise constant e.d.f. $F^*_n(u)$. However, for large $n$, the e.d.f. $F^*_n(u)$ and the m.e.d.f. $\tilde{F}^*_n(u)$ are close.
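The three estimates of this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `edf`, `unified_histogram` and `medf` are hypothetical (they do not come from the paper), the sample is assumed to contain at least two distinct values with no ties, and the m.e.d.f. is extended by 0 to the left of $x_{(1)}$ and by 1 to the right of $x_{(n)}$.

```python
from bisect import bisect_right


def edf(sample, u):
    """Conventional e.d.f.: the fraction of sample points not exceeding u."""
    xs = sorted(sample)
    return bisect_right(xs, u) / len(xs)


def unified_histogram(sample, u):
    """h_n(u): 1/((n-1)*gap) strictly inside a gap between adjacent
    order statistics, and 0 elsewhere (including at the sample points)."""
    xs = sorted(sample)
    n = len(xs)
    i = bisect_right(xs, u)  # 1-based i with x_(i) <= u < x_(i+1)
    if i == 0 or i == n or xs[i - 1] == u:
        return 0.0
    return 1.0 / ((n - 1) * (xs[i] - xs[i - 1]))


def medf(sample, u):
    """Modified e.d.f. from formula (2); assumed 0 left of x_(1), 1 right of x_(n)."""
    xs = sorted(sample)
    n = len(xs)
    if u <= xs[0]:
        return 0.0
    if u >= xs[-1]:
        return 1.0
    i = bisect_right(xs, u)  # x_(i) = xs[i-1], x_(i+1) = xs[i]
    return (u + (i - 1) * xs[i] - i * xs[i - 1]) / ((n - 1) * (xs[i] - xs[i - 1]))
```

For the sample 1, 2, 4, 7 and $u = 3$, the point lies halfway through the second of three gaps, so the m.e.d.f. equals $1/3 + 1/6 = 1/2$, while the histogram takes the value $1/(3 \cdot 2) = 1/6$.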
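Let us prove that
$$\left|F^*_n(u) - \tilde{F}^*_n(u)\right| \le \frac{1}{n}. \tag{3}$$
Indeed, for all $u \in \left[x_{(i)}, x_{(i+1)}\right)$, $1 \le i \le n-1$, the following relation holds:
$$\left|F^*_n(u) - \tilde{F}^*_n(u)\right| = \left|\frac{u + (i-1)\,x_{(i+1)} - i\,x_{(i)}}{(n-1)\left(x_{(i+1)} - x_{(i)}\right)} - \frac{i}{n}\right| = \left|\frac{nu - (n-i)\,x_{(i+1)} - i\,x_{(i)}}{n(n-1)\left(x_{(i+1)} - x_{(i)}\right)}\right|.$$
Granting that
$$\tilde{F}^*_n(u) = \frac{u - x_{(i+1)} + i\left(x_{(i+1)} - x_{(i)}\right)}{(n-1)\left(x_{(i+1)} - x_{(i)}\right)} = \frac{1}{n-1}\left[\frac{u - x_{(i+1)}}{x_{(i+1)} - x_{(i)}} + i\right]$$
and that the fraction in brackets lies in $[-1, 0)$, we have
$$\frac{i-1}{n-1} \le \tilde{F}^*_n(u) \le \frac{i}{n-1}$$
and, since $F^*_n(u) = i/n$ on this interval,
$$\frac{i-1}{n-1} - F^*_n(u) \le \tilde{F}^*_n(u) - F^*_n(u) \le \frac{i}{n-1} - F^*_n(u),$$
$$\frac{n(i-1) - (n-1)\,i}{n(n-1)} \le \tilde{F}^*_n(u) - F^*_n(u) \le \frac{ni - (n-1)\,i}{n(n-1)},$$
$$\frac{i-n}{n(n-1)} \le \tilde{F}^*_n(u) - F^*_n(u) \le \frac{i}{n(n-1)} \le \frac{1}{n};$$
hence, because $1 \le i \le n-1$,
$$-\frac{1}{n} \le \tilde{F}^*_n(u) - F^*_n(u) \le \frac{1}{n},$$
i.e.
$$\left|\tilde{F}^*_n(u) - F^*_n(u)\right| \le \frac{1}{n}. \tag{4}$$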
Estimate (4) implies that the m.e.d.f. has the same asymptotic properties as the conventional e.d.f., i.e. it is consistent, asymptotically unbiased, etc.
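Bound (4) can also be checked numerically. The sketch below is an illustration only, not part of the paper's argument: the helper name `max_gap`, the grid size and the Gaussian test sample are all assumptions made for the example. It evaluates both estimates on a grid over $[x_{(1)}, x_{(n)})$ and returns the largest absolute difference.

```python
import random
from bisect import bisect_right


def max_gap(sample, grid_size=1000):
    """Largest |m.e.d.f. - e.d.f.| over a uniform grid on [x_(1), x_(n))."""
    xs = sorted(sample)
    n = len(xs)
    worst = 0.0
    for k in range(grid_size):
        u = xs[0] + (xs[-1] - xs[0]) * k / grid_size
        i = bisect_right(xs, u)  # 1-based i with x_(i) <= u < x_(i+1)
        e = i / n                # e.d.f. value
        # m.e.d.f. value, written as in the bracketed form of formula (2)
        m = (i - 1 + (u - xs[i - 1]) / (xs[i] - xs[i - 1])) / (n - 1)
        worst = max(worst, abs(m - e))
    return worst


random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(50)]
gap = max_gap(sample)
```

The largest difference is attained at $u = x_{(1)}$, where the m.e.d.f. is 0 and the e.d.f. is $1/n$, so the bound in (4) cannot be improved.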
3. Confidence limits for hypothetical d.f.
Let us define the lower and upper bounds of a hypothetical d.f. $F(u)$ by means of the empirical d.f., under the assumption that $F(u)$ is continuous and strictly increasing. This problem was solved in [5–10]. Hence, given a significance level $\beta^*$ (e.g., $\beta^* = 0.05$), we can define $\varepsilon$ so that
$$p\left(\Delta = \max_{x_{(1)} \le u \le x_{(n)}} \left|F(u) - F^*_n(u)\right| > \varepsilon\right) = \beta^*.$$
It follows that, for a given $\beta^*$, we can find $\varepsilon$ from statistical tables [11] and construct a strip $\Pi_{\beta^*}$ whose bounds are the step functions $y = F^*_n(u) + \varepsilon$ and $y = F^*_n(u) - \varepsilon$. The strip $\Pi_{\beta^*}$ completely covers the true d.f. $y = F(u)$ with the confidence probability $\alpha^* = 1 - \beta^*$. Hereinafter, we refer to the strip $\Pi_{\beta^*}$ as the confidence strip for the d.f. with significance level $\beta^*$ constructed for the empirical d.f.
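When the tables of [11] are not at hand, a conservative value of $\varepsilon$ can be obtained in closed form. The sketch below is an assumption of this illustration, not the paper's procedure: it replaces the tabulated value by the Dvoretzky–Kiefer–Wolfowitz bound $p(\sup_u |F(u) - F^*_n(u)| > \varepsilon) \le 2e^{-2n\varepsilon^2}$, which gives $\varepsilon = \sqrt{\ln(2/\beta^*)/(2n)}$ and a significance level not exceeding $\beta^*$. The names `dkw_epsilon` and `confidence_strip` are hypothetical.

```python
import math
from bisect import bisect_right


def dkw_epsilon(n, beta):
    """Half-width eps with P(sup|F - F_n*| > eps) <= beta (DKW inequality)."""
    return math.sqrt(math.log(2.0 / beta) / (2.0 * n))


def confidence_strip(sample, beta):
    """Return (psi, phi, eps): the lower and upper step bounds of the strip
    around the e.d.f., and the half-width eps used to build them."""
    xs = sorted(sample)
    n = len(xs)
    eps = dkw_epsilon(n, beta)

    def edf(u):
        return bisect_right(xs, u) / n

    def psi(u):  # lower bound y = F_n*(u) - eps
        return edf(u) - eps

    def phi(u):  # upper bound y = F_n*(u) + eps
        return edf(u) + eps

    return psi, phi, eps
```

For $n = 100$ and $\beta^* = 0.05$ this gives $\varepsilon \approx 0.136$. The bounds are deliberately not clipped to $[0, 1]$, so that they match the equations $y = F^*_n(u) \pm \varepsilon$ above.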
4. Test for unimodality based on e.d.f.
Let $x_1, x_2, \ldots, x_n$ be a sample obtained by simple sampling from a general population $G$ with a continuous and strictly monotone d.f. $F(u)$. Using this sample, we construct the empirical d.f. $F^*_n(u)$ and the strip $\Pi_{\beta^*}$. Denote by $\varphi(u)$ the upper bound of $\Pi_{\beta^*}$ described by the equation $y = F^*_n(u) + \varepsilon$, and let $\psi(u)$ be the lower bound of $\Pi_{\beta^*}$ described by the equation $y = F^*_n(u) - \varepsilon$. Then
$$p\left(\psi(u) \le F(u) \le \varphi(u)\right) = \alpha^* = 1 - \beta^*.$$
Definition 1. Let $y = \varphi(u)$ be an arbitrary function defined on $[a, b]$. Then the set
$$G_U = \{(u, y) : y \ge \varphi(u),\ a \le u \le b\}$$
is the epigraph of $\varphi(u)$, and the set $G_L = \{(u, y) : y \le \varphi(u),\ a \le u \le b\}$ is the subgraph of $\varphi(u)$.
Definition 2. The lower bound of the convex hull of the epigraph of a function $\varphi(u)$ is the convex minorant of $\varphi(u)$,
$$\varphi_{\inf}(u) = \inf\left\{v : (u, v) \in \operatorname{conv} G_U\right\}, \quad a \le u \le b,$$
where $\operatorname{conv} G_U$ is the convex hull of $G_U$. Analogously, the upper bound of the convex hull of the subgraph $G_L$ of a function $\psi(u)$ is the concave majorant of $\psi(u)$:
$$\psi_{\sup}(u) = \sup\left\{v : (u, v) \in \operatorname{conv} G_L\right\}, \quad a \le u \le b.$$
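For a continuous piecewise-linear function given by its vertices, the convex minorant of Definition 2 is the lower convex hull of those vertices, and the concave majorant is the upper one. The following sketch computes both with the monotone-chain scan. It is an illustration under assumptions: the names `convex_minorant`, `concave_majorant` and `evaluate` are hypothetical, and the vertices are assumed to have distinct abscissas. For the step bounds of the e.d.f. strip, the relevant corner points would have to be supplied by the caller.

```python
def _cross(o, a, b):
    """Cross product of vectors o->a and o->b (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_minorant(points):
    """Vertices of the lower convex hull of the points, left to right."""
    hull = []
    for p in sorted(points):
        # Drop the last vertex while it lies on or above the chord to p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull


def concave_majorant(points):
    """Vertices of the upper convex hull, obtained by reflecting in the u-axis."""
    flipped = [(x, -y) for x, y in points]
    return [(x, -y) for x, y in convex_minorant(flipped)]


def evaluate(vertices, u):
    """Value at u of the piecewise-linear function through the vertices."""
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:]):
        if x0 <= u <= x1:
            return y0 + (y1 - y0) * (u - x0) / (x1 - x0)
    raise ValueError("u lies outside the range of the vertices")
```

For the points (0, 0), (1, 2), (2, 1), (3, 3), (4, 2.5), the minorant keeps the vertices (0, 0), (2, 1), (4, 2.5), and the majorant keeps (0, 0), (1, 2), (3, 3), (4, 2.5).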
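Theorem 1. Let $\varphi_{\inf}(u)$ and $\psi_{\sup}(u)$ be the convex minorant of $\varphi(u)$ and the concave majorant of $\psi(u)$, respectively, and let
$$c = \sup\left\{u : \varphi_{\inf}(u) \le \psi(u),\ x_{(1)} \le u \le x_{(n)}\right\},$$
$$d = \inf\left\{u : \psi_{\sup}(u) \ge \varphi(u),\ x_{(1)} \le u \le x_{(n)}\right\}.$$
Then the hypothetical distribution $F(u)$ is unimodal iff
1) $\varphi_{\inf}(u) \ge \psi(u)$ or $\psi_{\sup}(u) \le \varphi(u)$ $\forall u \in \left[x_{(1)}, x_{(n)}\right]$;
or
2) $c \ge d$.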
Fig. 1. If c < d, the unimodality is absent.
Moreover, the significance level of this criterion is β∗.
Proof. Necessity. Suppose that the hypothetical d.f. $F(u)$ is unimodal, and $M$ is its mode. If $M \le x_{(1)}$ or $M \ge x_{(n)}$, then $F(u)$ is concave or convex, respectively, on $\left[x_{(1)}, x_{(n)}\right]$, and condition 1) holds.
If $x_{(1)} < M < x_{(n)}$, then $F(u)$ is convex on $\left[x_{(1)}, M\right]$ and concave on $\left[M, x_{(n)}\right]$. In this case, it follows from Definition 2 (see Fig. 1) that $\varphi_{\inf}(u) \ge F(u)$ on $\left[x_{(1)}, M\right]$. Also, $F(u) \ge \psi(u)$ $\forall u \in \left[x_{(1)}, M\right]$, so $\varphi_{\inf}(u) \ge \psi(u)$ there, and $c \ge M$.
On the other hand, since $F(u)$ is concave on $\left[M, x_{(n)}\right]$, Definition 2 implies that $F(u) \ge \psi_{\sup}(u)$ on $\left[M, x_{(n)}\right]$. Also, $\varphi(u) \ge F(u)$ $\forall u \in \left[M, x_{(n)}\right]$. Thus, $\varphi(u) \ge \psi_{\sup}(u)$ $\forall u \in \left[M, x_{(n)}\right]$ and $d \le M$. Consequently, $c \ge d$, and condition 2) holds.
Sufficiency. Note that $\varphi(u)$ and $\psi(u)$ are increasing. If condition 1) holds, then
$$\psi(u) \le \varphi_{\inf}(u) \le \varphi(u) \quad \forall u \in \left[x_{(1)}, x_{(n)}\right]$$
or
$$\psi(u) \le \psi_{\sup}(u) \le \varphi(u) \quad \forall u \in \left[x_{(1)}, x_{(n)}\right].$$
Thus, $\varphi_{\inf}(u)$ (or $\psi_{\sup}(u)$) lies in the strip $\Pi_{\beta^*}$. Therefore, $\varphi_{\inf}(u)$ (or $\psi_{\sup}(u)$) can be used as an estimate of the hypothetical d.f. $F(u)$ of the general population $G$. Taking $F(u) = \varphi_{\inf}(u)$ or $F(u) = \psi_{\sup}(u)$, the hypothetical d.f. increases, is convex or concave on $\left[x_{(1)}, x_{(n)}\right]$, and is therefore unimodal. The significance level of this test is $\beta^*$.
Now, suppose that condition 2) holds, i.e. $c \ge d$. Put $\hat{F}(u) = \varphi_{\inf}(u)$ if $u \in \left[x_{(1)}, c\right]$, and $\hat{F}(u) = \psi_{\sup}(u)$ if $u \in \left(c, x_{(n)}\right]$. It is easy to see that $\hat{F}(u)$ lies in $\Pi_{\beta^*}$, because $c \ge d$. Also, $\hat{F}(u)$ is convex on $\left[x_{(1)}, c\right]$ and concave on $\left(c, x_{(n)}\right]$. Let us prove that $\hat{F}(c) \le \hat{F}(c+0) = \lim_{u \to c,\, u > c} \hat{F}(u) = \gamma$. Indeed, if $\gamma < \hat{F}(c)$, then $\gamma \notin \Pi_{\beta^*}$. Therefore, the abscissa of the first exit point $d$, where $\varphi_{\inf}(u)$ leaves the bounds of $\Pi_{\beta^*}$ while moving from $x_{(n)}$ to $x_{(1)}$, is greater than $c$. This contradicts condition 2). Thus, $\hat{F}(u)$ increases, is convex on $\left[x_{(1)}, c\right]$, and concave on $\left(c, x_{(n)}\right]$. But $\hat{F}(u)$ can have a discontinuity at $c$. To remove this discontinuity, we take $\varepsilon > 0$ sufficiently small so that the segment with the ends $\left(c - \varepsilon, \hat{F}(c - \varepsilon)\right)$ and $(c, \gamma)$ lies completely in $\Pi_{\beta^*}$. Then the function
$$\hat{F}_\varepsilon(u) = \begin{cases} \hat{F}(u), & \text{if } u \in \left[x_{(1)}, c - \varepsilon\right], \\ u\,\dfrac{\gamma - \hat{F}(c - \varepsilon)}{\varepsilon} + \gamma - c\,\dfrac{\gamma - \hat{F}(c - \varepsilon)}{\varepsilon}, & \text{if } u \in (c - \varepsilon, c), \\ \hat{F}(u), & \text{if } u \in \left[c, x_{(n)}\right], \end{cases}$$
increases, is continuous, convex on $\left[x_{(1)}, c\right]$, concave on $\left[c, x_{(n)}\right]$, and its graph lies in $\Pi_{\beta^*}$. Thus, the d.f. $\hat{F}_\varepsilon(u)$ is unimodal, and we can consider it as an estimate of the hypothetical d.f. of the general population $G$. The significance level of this test is $\beta^*$. Theorem 1 is proved.
Remark 1. Theorem 1 has the following geometric meaning: let $c$ be the abscissa of the first exit point, where the convex minorant $\varphi_{\inf}(u)$ crosses the lower bound of $\Pi_{\beta^*}$ while moving from the maximal order statistic to the minimal one, and let $d$ be the abscissa of the first exit point, where the concave majorant $\psi_{\sup}(u)$ exceeds the upper bound of $\Pi_{\beta^*}$ while moving from the minimal order statistic to the maximal one. Then the hypothetical d.f. $F(u)$ is unimodal iff the exit points $c$ and $d$ lie outside $\left[x_{(1)}, x_{(n)}\right]$ or $c \ge d$.
5. Test for unimodality based on m.e.d.f.
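The confidence strip $\tilde{\Pi}_{\beta^*}$ for a hypothetical d.f. can be constructed on the m.e.d.f. $\tilde{F}^*_n(u)$ in the following way: let the significance level $\beta^*$ be given, let $\varepsilon$ be the half-width of $\Pi_{\beta^*}$, and let
$$p\left(\Delta = \max_{x_{(1)} \le u \le x_{(n)}} \left|F(u) - F^*_n(u)\right| > \varepsilon\right) = \beta^*.$$
We put $\tilde{\varphi}(u) = \tilde{F}^*_n(u) + \varepsilon + \frac{1}{n}$ and $\tilde{\psi}(u) = \tilde{F}^*_n(u) - \varepsilon - \frac{1}{n}$. It is easy to see that the strip $\tilde{\Pi}_{\beta^*}$ with lower bound $\tilde{\psi}(u)$ and upper bound $\tilde{\varphi}(u)$ has a significance level not exceeding $\beta^*$. Indeed, by virtue of (4),
$$\left|F(u) - \tilde{F}^*_n(u)\right| = \left|F(u) - F^*_n(u) + F^*_n(u) - \tilde{F}^*_n(u)\right| \le \left|F(u) - F^*_n(u)\right| + \frac{1}{n}.$$
Therefore,
$$\tilde{\Delta} = \max_{x_{(1)} \le u \le x_{(n)}} \left|F(u) - \tilde{F}^*_n(u)\right| \le \Delta + \frac{1}{n}.$$
Hence,
$$p\left(\tilde{\Delta} - \frac{1}{n} \ge \varepsilon\right) \le p\left(\Delta \ge \varepsilon\right) = \beta^*,$$
that is, with $\tilde{\varepsilon} = \varepsilon + \frac{1}{n}$,
$$p\left(\tilde{\Delta} \ge \varepsilon + \frac{1}{n}\right) = p\left(\tilde{\Delta} \ge \tilde{\varepsilon}\right) \le \beta^*.$$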
Thus, the significance level of $\tilde{\Pi}_{\beta^*}$ does not exceed $\beta^*$. Since quoting $\beta^*$ as the significance level is therefore conservative, we can use the m.e.d.f. $\tilde{F}^*_n(u)$ to construct $\tilde{\Pi}_{\beta^*}$ without weakening the guarantee. However, in doing so, we move each bound of $\tilde{\Pi}_{\beta^*}$ outward by $\frac{1}{n}$ relative to $\Pi_{\beta^*}$. For moderate samples ($30 \le n \le 200$), this increment varies from 7 to 13 percent.
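Because the m.e.d.f. is piecewise linear with vertices at the order statistics, the bounds $\tilde{\varphi}(u)$ and $\tilde{\psi}(u)$ are fully described by their values at $x_{(1)}, \ldots, x_{(n)}$. A minimal sketch, assuming a sample without ties and a half-width $\varepsilon$ that has already been chosen for the given $\beta^*$; the function name `medf_strip_vertices` is hypothetical.

```python
def medf_strip_vertices(sample, eps):
    """Vertices of the lower and upper bounds of the m.e.d.f. strip.

    At the order statistic x_(i) the m.e.d.f. equals (i-1)/(n-1); the
    bounds are shifted by eps + 1/n as in the construction above.
    """
    xs = sorted(sample)
    n = len(xs)
    shift = eps + 1.0 / n
    # k = i - 1 is the zero-based index of the order statistic x_(i)
    lower = [(x, k / (n - 1) - shift) for k, x in enumerate(xs)]
    upper = [(x, k / (n - 1) + shift) for k, x in enumerate(xs)]
    return lower, upper
```

The two vertex lists can be passed directly to any routine that computes lower and upper convex hulls of point sets, which is what the convex minorant and concave majorant of these bounds amount to.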
Now, we can formulate the test for the unimodality of a hypothetical d.f. based on
m.e.d.f.
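Theorem 2. Let $\tilde{\varphi}_{\inf}(u)$ and $\tilde{\psi}_{\sup}(u)$ be the convex minorant of $\tilde{\varphi}(u)$ and the concave majorant of $\tilde{\psi}(u)$, respectively, and let
$$c = \sup\left\{u : \tilde{\varphi}_{\inf}(u) \le \tilde{\psi}(u),\ x_{(1)} \le u \le x_{(n)}\right\},$$
$$d = \inf\left\{u : \tilde{\psi}_{\sup}(u) \ge \tilde{\varphi}(u),\ x_{(1)} \le u \le x_{(n)}\right\}.$$
Then the hypothetical distribution $F(u)$ is unimodal iff
1) $\tilde{\varphi}_{\inf}(u) \ge \tilde{\psi}(u)$ or $\tilde{\psi}_{\sup}(u) \le \tilde{\varphi}(u)$ $\forall u \in \left[x_{(1)}, x_{(n)}\right]$;
or
2) $c \ge d$.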
Moreover, the significance level of this criterion is β∗.
The proof of Theorem 2 is similar to that of Theorem 1.
6. Conclusion
It is shown in [12] that if the distribution function of a general population is unimodal, then the confidence interval $(m(x) - 3\sigma(x), m(x) + 3\sigma(x))$, where $m(x)$ is the mathematical expectation of $G$ and $\sigma(x)$ is the standard deviation of $G$, has a significance level which does not exceed 0.05. Therefore, this nonparametric test for unimodality can be used to justify constructing the confidence interval for the bulk of the general population $G$.
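The figure 0.05 can be reproduced from the Vysochanskij–Petunin inequality, which for a unimodal distribution with finite variance states that $p(|x - m(x)| \ge \lambda\sigma(x)) \le 4/(9\lambda^2)$ whenever $\lambda \ge \sqrt{8/3}$. The short sketch below evaluates this bound and compares it with Chebyshev's bound $1/\lambda^2$, which needs no unimodality; the function names are hypothetical.

```python
import math


def vysochanskij_petunin_bound(lam):
    """Upper bound on P(|x - m| >= lam*sigma) for a unimodal distribution.

    Only the branch lam >= sqrt(8/3) is implemented here.
    """
    if lam < math.sqrt(8.0 / 3.0):
        raise ValueError("this sketch covers only lam >= sqrt(8/3)")
    return 4.0 / (9.0 * lam ** 2)


def chebyshev_bound(lam):
    """Upper bound on P(|x - m| >= lam*sigma) for any finite-variance distribution."""
    return 1.0 / lam ** 2


three_sigma_unimodal = vysochanskij_petunin_bound(3.0)  # 4/81
three_sigma_general = chebyshev_bound(3.0)              # 1/9
```

For $\lambda = 3$ the unimodal bound is $4/81 \approx 0.049$, just below 0.05, whereas Chebyshev's bound is $1/9 \approx 0.111$. This is why establishing unimodality first makes the 3σ interval usable at the 0.05 level.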
Bibliography
1. B.W. Silverman, Using kernel density estimates to investigate multimodality, J. of the Royal
Statistical Society B 43 (1981), 97-99.
2. J.A. Hartigan, Computation of the dip statistics to test for unimodality, Applied Statistics 34
(1985), 320-325.
3. J.A. Hartigan, The span test of multimodality, Classification and Related Methods of Data
Analysis, (H. H. Bock, ed.), North-Holland, Amsterdam, 1988, pp. 229-236.
4. J.A. Hartigan, S. Mohanty, The RUNT test for multimodality, Applied Statistics 9 (1992),
63-70.
5. A.N. Kolmogoroff, Determinatione empirica di una legge di distributione, Giornale Instit. Ital.
Attuari 4 (1933), 83-91.
6. N.V. Smirnov, Sur les ecarts de la courbe de distribution empiric, Mat. Sb. 6 (1939), 3-26.
7. A. Wald, J. Wolfowitz, Confidence limits for continuous distribution functions, Ann. Math.
Statist. 10 (1939), 199-326.
8. W. Feller, On the Kolmogorov–Smirnov limit theorems for empirical distributions, Ann. Math.
Statist. 19 (1948), 177-189.
9. F.J. Massey, A note on the estimation of a distribution function by confidence limits, Ann.
Math. Statist. 21 (1950), 125-128.
10. Z.W. Birnbaum, F.H. Tingey, One-sided confidence contours for distribution functions, Ann.
Math. Statist. 22 (1951), 592-596.
11. B.L. Van der Waerden, Mathematische Statistik, Springer, Berlin, 1957.
12. D.F. Vysochanskij, Yu.I. Petunin, Justification of the 3-σ rule for unimodal distribution, Theor.
Probability Math. Stat. 21 (1980), 25-36.
E-mail : vm214@dcp.kiev.ua
|
| id | nasplib_isofts_kiev_ua-123456789-4530 |
| institution | Digital Library of Periodicals of National Academy of Sciences of Ukraine |
| issn | 0321-3900 |
| language | English |
| last_indexed | 2025-12-07T16:50:16Z |
| publishDate | 2008 |
| publisher | Інститут математики НАН України |
| record_format | dspace |
| spelling | Andrushkiw, R.I. Klyushin, D.D. Petunin, Y.I. 2009-11-25T10:59:23Z 2009-11-25T10:59:23Z 2008 A new test for unimodality / R.I. Andrushkiw, D.D. Klyushin, Y.I. Petunin // Theory of Stochastic Processes. — 2008. — Т. 14 (30), № 1. — С. 1–6. — Бібліогр.: 12 назв.— англ. 0321-3900 https://nasplib.isofts.kiev.ua/handle/123456789/4530 519.21 A distribution function (d.f.) of a random variable is unimodal if there exists a number such that d.f. is convex left from this number and is concave right from this number. This number is called a mode of d.f. Since one may have more than one mode, a mode is not necessarily unique. The purpose of this paper is to construct nonparametric tests for the unimodality of d.f. based on a sample obtained from the general population of values of the random variable by simple sampling. The tests proposed are significance tests such that the unimodality of d.f. can be guaranteed with some probability (confidence level). en Інститут математики НАН України A new test for unimodality Article published earlier |
| spellingShingle | A new test for unimodality Andrushkiw, R.I. Klyushin, D.D. Petunin, Y.I. |
| title | A new test for unimodality |
| url | https://nasplib.isofts.kiev.ua/handle/123456789/4530 |