Overview of the detection and tracking methods of the lab animals
| Date: | 2022 |
|---|---|
| Authors: | Shvandt, Maksym; Moroz, Volodymyr |
| Format: | Article |
| Language: | English |
| Published: | The National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute", 2022 |
| Online access: | https://journal.iasa.kpi.ua/article/view/237497 |
| Journal title: | System research and information technologies |
Repositories
System research and information technologies

| _version_ | 1866302749281353728 |
|---|---|
| author | Shvandt, Maksym Moroz, Volodymyr |
| author_facet | Shvandt, Maksym Moroz, Volodymyr |
| author_sort | Shvandt, Maksym |
| baseUrl_str | http://journal.iasa.kpi.ua/oai |
| collection | OJS |
| datestamp_date | 2022-06-21T10:27:50Z |
| description | This article presents an overview of several most common techniques and approaches for object detection and tracking. Today, the tracking task is a very common problem and it can appear in many aspects of our life. One particular case of using object tracking techniques can appear during a lab animal behavior study. Different experimental conditions and the need of certain data collection can require some special tracking techniques. Thus, a set of general approaches to object tracking techniques were considered, and their functionality and possibilities were tested in a real life experiment. In this paper, their basis and main aspects are presented. The experiment has demonstrated the advantages and disadvantages of the studied methods. Considering this, conclusions and recommendations to their usage cases were made. |
| doi_str_mv | 10.20535/SRIT.2308-8893.2022.1.10 |
| first_indexed | 2025-07-17T10:27:20Z |
| format | Article |
| fulltext |
M.A. Shvandt, V.V. Moroz, 2022
124 ISSN 1681–6048 System Research & Information Technologies, 2022, № 1
UDC: 004.932:519.652
DOI: 10.20535/SRIT.2308-8893.2022.1.10
OVERVIEW OF THE DETECTION AND
TRACKING METHODS OF THE LAB ANIMALS
M.A. SHVANDT, V.V. MOROZ
Abstract. This article presents an overview of several of the most common techniques
and approaches for object detection and tracking. Tracking is now a very common task
that appears in many areas of life. One particular application is the study of lab
animal behavior. Different experimental conditions and the need to collect specific
data can require special tracking techniques. Thus, a set of general approaches to
object tracking was considered, and their functionality and capabilities were tested
in a real-life experiment. This paper presents their basis and main aspects. The
experiment demonstrated the advantages and disadvantages of the studied methods.
Based on this, conclusions and recommendations on their use cases were made.
Keywords: object tracking, object detection, algorithm, video, frame, image,
background, foreground, experiment, color space, thresholding, background estimation,
segmentation.
INTRODUCTION
Over the last two decades, the rapid development of technology has led to its
widespread adoption in all areas of human life. One such technology is image
processing and visual analysis [1]. Many processes in the world around us, including
street traffic control, terrorism prevention and military operations, need to be
monitored, analyzed and, very often, controlled. In most such cases photo and video
analysis comes in handy, and object detection and tracking usually play important
roles in it. A certain object has to be detected in a live video stream or a recorded
video, and then it is necessary to observe and trace the object's movements and
position and, presumably, perform some analysis of these movements.
Object detection and tracking are in fact key tasks of computer vision, as they allow
one to gather consecutive information about the object, which can later be
analyzed [2]. A person receives most information through the eyes, which is why
computer vision also plays an important role in data analysis. The tasks of computer
vision include information acquisition, processing of the acquired information,
analysis of the processed data and extraction of useful data.
Computer vision is focused on the processing of two- and three-dimensional images.
One of the tasks of 2D processing is optical flow processing (video processing). It
includes three key steps:
1) detection of moving objects;
2) object tracking from frame to frame;
3) analysis of an object to determine its characteristics.
Put simply, object tracking can be defined as the task of estimating an object's
trajectory in the image plane.
The detection and tracking methods rely on many features, but the key ones are the
object shape and the state of the video background. The problem of tracking/detection
can be quite challenging due to several factors. For example, visual information can
be lost when a three-dimensional object is projected onto a two-dimensional plane.
The object can be partially occluded at the detection phase or become occluded later
during tracking. The tracked object can also change its shape or scale, which can
lead to tracking errors, and sometimes its movements are complicated and hard to
predict. Changes in lighting can cause tracking errors, as can image noise. The image
background can be a major source of difficulties. The easy situation is a static or
only slightly changing background: then it is simple enough to pick out the tracked
object. But if the background changes considerably from frame to frame, the tracking
algorithm may fail to pick out the necessary object correctly and lose it. In
addition, the tracking algorithm must be applicable to real-time video processing.
Different approaches have been suggested to solve all these problems.
In biology there is often a need to study the life processes and behavior of lab
animals, for example mice or fish. Such studies have been carried out for a long time
and are still of considerable scientific interest [3–5]. Different lab conditions may
require specific approaches to automatic examination of animal behavior. Thus the
problem of object detection and tracking can be studied well on the particular
example of such an activity study (Fig. 1). In the first case (Fig. 1, a and 1, b)
the test environment is a box with circular holes in the bottom (the holes denote the
center of the test stand). Here the task is to track the lab mice and record their
movements between holes, the time spent in the center of the test stand and the
moments when several mice come into contact with each other. In the second case there
is an aquarium (Fig. 1, c). The task is quite similar: to track fish movements and
record their contacts. In both cases the camera is placed above the test environment.
To solve these particular tasks, extensive research was carried out to find the most
suitable object detection and tracking approaches that could be used separately or in
combination. Various methods were examined, and their advantages, disadvantages and
algorithmic aspects were considered. The complete analysis is presented further in
this paper.
THE PROBLEM OF OBJECT DETECTION
The key task that arises when tracking objects in a video stream is their detection.
Some methods require full object detection only on the first frame, while others
perform full detection on every frame.
Fig. 1. Lab animals behavior study: a — lab rats; b — lab mice; c — fish
Segmentation-based detection. Image segmentation is the process of dividing a digital
image into multiple sets of pixels. This process can also assign a marker to each
pixel, so that pixels with the same marker share common visual characteristics [6].
For example, such an approach can be based on the Watershed Transform [7], which
results in the Watershed Algorithm combined with the Distance Transform. Image
segmentation simplifies image analysis: it highlights the borders and the object
itself. Pixels belonging to the same segment are similar in some calculated feature
(color, brightness value, etc.), while the remaining pixels differ significantly in
that feature. The result of such segmentation can be seen in Fig. 2. This approach is
easy to use when the objects of interest differ significantly from the background in
some parameter. Its main problem is low versatility: the algorithm parameters have to
be tuned very accurately in each particular case. It is also very sensitive to
lighting conditions.
Fig. 2. Image segmentation: a — test 1: original image; b — test 1: markers; c — test 1:
segmentation result; d — test 2: original image; e — test 2: markers; f — test 2:
segmentation result
Another approach that works with image segments is Template Matching [8]. The
algorithm compares the given object template with sub-regions of the processed image.
To do this, it simply slides the template over the image and checks whether it
matches some region. The template (or patch) is moved one pixel at a time (left to
right, top to bottom) [9, 10]. At each location the algorithm calculates a metric
that shows how similar the patch is to that particular area of the source image. For
each location of the template T over the input image I it stores the metric in the
result matrix R, so each cell (x, y) of R contains the match metric. It is then
possible to find the best match by searching for the highest value (or the lowest,
depending on the matching method) in the matrix R.
It is worth noting that, while the patch must be a rectangle, not the whole area of
the rectangle may be relevant. In this case the algorithm uses a mask to isolate the
portion of the patch that should be used to find the match. The mask is a grayscale
image that masks the template image and must have the same dimensions and number of
channels. The match R(x, y) can be calculated in several ways:
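1. As a Square Difference:
$$R_{sq}(x,y)=\sum_{x',y'}\bigl(T(x',y')-I(x+x',y+y')\bigr)^{2}. \tag{1}$$
2. As Normed Square Difference:
$$R_{nsq}(x,y)=R_{sq}(x,y)\Big/\sqrt{\sum_{x',y'}T(x',y')^{2}\cdot\sum_{x',y'}I(x+x',y+y')^{2}}. \tag{2}$$
3. As Cross Correlation:
$$R_{ccorr}(x,y)=\sum_{x',y'}\bigl(T(x',y')\cdot I(x+x',y+y')\bigr). \tag{3}$$
4. As Normed Cross Correlation:
$$R_{nccorr}(x,y)=R_{ccorr}(x,y)\Big/\sqrt{\sum_{x',y'}T(x',y')^{2}\cdot\sum_{x',y'}I(x+x',y+y')^{2}}. \tag{4}$$
5. As Correlation Coefficient:
$$R_{ccoeff}(x,y)=\sum_{x',y'}\bigl(T'(x',y')\cdot I'(x+x',y+y')\bigr), \tag{5}$$
where
$$T'(x',y')=T(x',y')-\frac{\sum_{x'',y''}T(x'',y'')}{w\cdot h},$$
$$I'(x+x',y+y')=I(x+x',y+y')-\frac{\sum_{x'',y''}I(x+x'',y+y'')}{w\cdot h}.$$
6. As Normed Correlation Coefficient:
$$R_{nccoeff}(x,y)=R_{ccoeff}(x,y)\Big/\sqrt{\sum_{x',y'}T'(x',y')^{2}\cdot\sum_{x',y'}I'(x+x',y+y')^{2}}. \tag{6}$$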
The result of the Template Matching algorithm is presented in Fig. 3. This approach
can be used for scene analysis when the camera is static and the objects of interest
look almost identical, for example, for detecting products on a factory assembly
line. On the other hand, the method is not stable under rotation or scaling or when
an object is partially occluded. If the searched objects are scaled, the simplest way
is to enlarge the template image as much as possible and then scale it down step by
step at each search stage, hoping that at some point the template will have the
correct size. If the objects are rotated, the easiest way is to create a set of
template images rotated in 1-degree steps and then check each of them in turn. Both
approaches deliver poor performance, especially on high-resolution images. When the
objects are both scaled and rotated, the performance gets even worse.
Fig. 3. Template matching: a — Square Difference (1); b — Normed Square
Difference (2); c — Cross Correlation (3); d — Normed Cross Correlation (4); e —
Correlation Coefficient (5); f — Normed Correlation Coefficient (6)
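The sliding-window search with the Square Difference metric (1) can be sketched as follows. This is only an illustration written with NumPy, not the implementation used in the experiment; the function names `match_sqdiff` and `best_match` and the synthetic test image are assumptions introduced here.

```python
import numpy as np

def match_sqdiff(image, template):
    """Return R, where R[y, x] is the sum of squared differences (formula 1)
    between the template and the image patch whose top-left corner is (x, y)."""
    image = np.asarray(image, dtype=np.float64)
    template = np.asarray(template, dtype=np.float64)
    h, w = template.shape
    H, W = image.shape
    result = np.empty((H - h + 1, W - w + 1))
    for y in range(result.shape[0]):          # top to bottom
        for x in range(result.shape[1]):      # left to right, one pixel at a time
            patch = image[y:y + h, x:x + w]
            result[y, x] = np.sum((template - patch) ** 2)
    return result

def best_match(result):
    """For square-difference metrics the best match is the minimum of R."""
    y, x = np.unravel_index(np.argmin(result), result.shape)
    return int(x), int(y)

# Synthetic 8x8 image with a distinctive 2x3 block at x=4, y=2 (hypothetical data).
image = np.zeros((8, 8))
image[2:4, 4:7] = [[9, 8, 7], [6, 5, 4]]
template = image[2:4, 4:7].copy()

R = match_sqdiff(image, template)
print(best_match(R))
```

For the correlation-based metrics (3) to (6) the best match would be the maximum of R instead of the minimum.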
Feature-based detection. One more way to detect an object in an image is to find it
by its features. A feature is a local part of the image that is more distinctive than
the other parts. Simple examples of such features are corners and borders. The search
for an object in this case is based on comparing the characteristic features of the
processed frame with those of a template showing the object one is looking for [11].
Local features should be repeatable (stable under changes of viewing angle or
lighting during the video sequence), compact (their number should be much smaller
than the total number of pixels in the image) and unique (each feature must have its
own description).
To identify the characteristic features, special detectors are used. One of the most
common is the Harris (corner) detector (Fig. 4, c), which recognizes features of the
"corner" type in the image. As corner detectors are not robust to image scaling, the
concept of so-called blobs was introduced: drop-shaped neighborhoods with a special
point located in the center. One of the most common blob methods is LoG (the
Laplacian of Gaussian) [12]. LoG is a filter
$$LoG(x,y)=\frac{x^{2}+y^{2}-2\sigma^{2}}{\sigma^{4}}\,e^{-\frac{x^{2}+y^{2}}{2\sigma^{2}}}$$
that applies the Gaussian operator
$$L(x,y;\sigma)=G(x,y;\sigma)*I(x,y),\qquad G(x,y;\sigma)=\frac{1}{2\pi\sigma^{2}}\,e^{-\frac{x^{2}+y^{2}}{2\sigma^{2}}},$$
and the Laplace operator
$$\nabla^{2}I=\frac{\partial^{2}I}{\partial x^{2}}+\frac{\partial^{2}I}{\partial y^{2}}$$
to the image, respectively (Fig. 4, a). Here $\sigma$ is the standard deviation,
$L(x,y;\sigma)$ is the Gaussian scale-space representation of an image $I(x,y)$, and
$*$ is the convolution operator.
The DoG detector [13] is also common (Fig. 4, b). It is based on the Gaussian
difference
$$DoG=G_{\sigma_1}-G_{\sigma_2}=\frac{1}{2\pi}\left(\frac{1}{\sigma_1^{2}}\,e^{-\frac{x^{2}+y^{2}}{2\sigma_1^{2}}}-\frac{1}{\sigma_2^{2}}\,e^{-\frac{x^{2}+y^{2}}{2\sigma_2^{2}}}\right),\qquad G_{\sigma_i}=G(x,y;\sigma_i).$$
The difference between the two smoothings is as follows:
$$L(x,y;\sigma_1)=G(x,y;\sigma_1)*I(x,y),\qquad L(x,y;\sigma_2)=G(x,y;\sigma_2)*I(x,y);$$
$$L(x,y;\sigma_1)-L(x,y;\sigma_2)=G(x,y;\sigma_1)*I(x,y)-G(x,y;\sigma_2)*I(x,y)$$
$$=\bigl(G(x,y;\sigma_1)-G(x,y;\sigma_2)\bigr)*I(x,y)=DoG*I(x,y).$$
Fig. 4. Blob detection using Gaussian filter: a — Laplacian of Gaussian (LoG);
b — Gaussian difference (DoG); c — Harris detector
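The relation between the two Gaussian smoothings and the DoG filter can be checked numerically with a short NumPy sketch. The kernel radius, the two sigma values, the naive zero-padded convolution and the synthetic 5x5 blob are all assumptions chosen for illustration; a real detector would also search over several scales.

```python
import numpy as np

def gaussian_kernel(sigma, radius):
    """Sampled G(x, y; sigma) = exp(-(x^2 + y^2) / (2 sigma^2)) / (2 pi sigma^2)."""
    ax = np.arange(-radius, radius + 1, dtype=np.float64)
    xx, yy = np.meshgrid(ax, ax)
    return np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2)) / (2.0 * np.pi * sigma ** 2)

def dog_kernel(sigma1, sigma2, radius):
    """Difference of Gaussians: G(sigma1) - G(sigma2)."""
    return gaussian_kernel(sigma1, radius) - gaussian_kernel(sigma2, radius)

def convolve_same(image, kernel):
    """Naive same-size 2D convolution with zero padding (illustration only)."""
    r = kernel.shape[0] // 2
    padded = np.pad(image, r)
    flipped = kernel[::-1, ::-1]
    out = np.empty(image.shape, dtype=np.float64)
    for y in range(image.shape[0]):
        for x in range(image.shape[1]):
            out[y, x] = np.sum(padded[y:y + 2 * r + 1, x:x + 2 * r + 1] * flipped)
    return out

# Hypothetical test image: a bright 5x5 blob centered at row 10, column 10.
image = np.zeros((21, 21))
image[8:13, 8:13] = 1.0

sigma1, sigma2, radius = 1.5, 3.0, 7
response = convolve_same(image, dog_kernel(sigma1, sigma2, radius))
smooth1 = convolve_same(image, gaussian_kernel(sigma1, radius))
smooth2 = convolve_same(image, gaussian_kernel(sigma2, radius))

# Location of the strongest DoG response, as (row, column).
peak = tuple(int(v) for v in np.unravel_index(np.argmax(response), response.shape))
print(peak)
```

Here `response` equals `smooth1 - smooth2` up to rounding, which is the identity (G1 - G2) * I = G1 * I - G2 * I from the text, and the strongest response falls on the blob center.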
After the special points have been found, they have to be compared. This task
requires a compact representation of the characteristic features. In practical
tasks, the SIFT (Scale-Invariant Feature Transform) descriptor [14] and its
derivatives, such as SURF [15], are considered to be the best methods. Although it is
invariant to small rotations, scaling of objects and changes in scene lighting, the
feature-based approach makes it practically impossible to identify an object as an
instance of some class, and it also gives false results when the object changes its
shape dynamically (Fig. 5).
Categorical recognition. Methods for detecting characteristic features are well
suited to searching across a database of images [16]. However, in our particular case
it is necessary not simply to find an object corresponding to some template in the
video frames, but to recognize all objects of a certain class. The problem could be
solved by feature detection methods, but it would then be necessary to create a large
number of templates, and comparing the frames with each of them would take a long
time. The approach that avoids this is based on the classification of objects, i.e.
categorical recognition. It consists of two main elements: the definition of a set of
features or descriptors and machine learning of the classifier. The Histogram of
Oriented Gradients (HOG) or Haar features can be used as the set of features. The HOG
features [17] are based on counting the occurrences of gradient directions in local
areas of the image (Fig. 6).
Fig. 5. SIFT feature object detection: a — perfect match; b — mismatch case
Fig. 6. HOG features detection: a — input images; b — extracted HOG features;
c — HOG features (magn.)
Haar features [18], or primitives, are rectangles consisting of adjacent areas (see
Fig. 7, a). These areas are positioned on the image, the pixel intensities within
each area are summed, and the difference between the obtained sums is calculated.
This difference is the value of a feature of a certain type and size located at a
certain position in the image. An example of the use of Haar features is shown in
Fig. 7, b. The advantage of Haar features is a relatively high computational speed.
Machine learning is used to create a classifier, which indicates which features
belong to the object. A base of these features is therefore used for training.
HOG is calculated on a dense grid of evenly distributed cells (Fig. 6). This method
highlights objects with multiple details well, but when the object is mostly a single
piece without significant details, it will in most cases highlight only the borders,
which may not be enough for complete detection. The Haar approach, on the other hand,
is suitable for face detection and recognition. But it still requires a pre-trained
classifier, which can sometimes be problematic, and it will not work well with
objects that tend to change shape.
Fig. 7. Haar features: a — Haar features types (source: www.spiedigitallibrary.org);
b — general representation of training the Haar classifier (source: medium.com)
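The value of a two-rectangle Haar feature (sum over one area minus sum over the adjacent area) can be sketched as below. The summed-area table used to obtain each rectangle sum from four lookups is a common speed-up and is an assumption of this sketch, since the text does not describe how the sums are computed; the function names and the test image are also hypothetical.

```python
import numpy as np

def integral_image(img):
    """Summed-area table with an extra zero row and column for simple lookups."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of pixel intensities in the w x h rectangle with top-left corner (x, y)."""
    return int(ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x])

def haar_two_rect_horizontal(ii, x, y, w, h):
    """Left half minus right half of a w x h window (w is assumed to be even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)

# Hypothetical 4x6 image: bright left half, dark right half (a vertical edge).
img = np.zeros((4, 6), dtype=np.int64)
img[:, :3] = 10

ii = integral_image(img)
value = haar_two_rect_horizontal(ii, 0, 0, 6, 4)
print(value)
```

A large absolute value indicates a strong intensity contrast between the two areas; a classifier trained on many such values decides which of them belong to the object.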
THE PROBLEM OF OBJECT TRACKING
As mentioned above, tracking of moving objects is one of the components of many
real-time systems, such as surveillance systems, video analysis and others. The input
data of any tracking algorithm is a sequence of images (video frames)
$I_1, I_2, \ldots, I_n$, so the amount of information that needs to be processed and
analyzed keeps growing. The task of tracking is to construct the trajectories of the
target objects over the input sequence of frames. Let the position of the object in
the image numbered $k$ be denoted by $P_k$. Then the trajectory of the object is the
sequence of its positions $P_s, P_{s+1}, \ldots, P_{s+l-1}$, where $s$ is the number
of the first frame in which the object was detected and $l$ is the number of frames
in the sequence where the object is observed.
Some object detection methods make it possible to detect the entire object, but they
are usually not suitable for continuous work, especially for real-time video
processing. In most cases, performing the detection "from scratch" for each frame is
very costly in terms of performance and speed, so the detection process should be
optimized in some way, especially if some frames have already been processed and
provided additional information. That is why several different object tracking
approaches have been introduced. Note that, depending on the tracking method, the
position of the object can be defined differently (coordinates and side lengths of
the bounding rectangle, coordinates of the center of mass of the contour, etc.).
Color-based tracking approaches. Simple color-based tracking consists of the
following steps [17]. First, the algorithm takes each frame and converts it from the
RGB to the HSV color model. This is necessary because the RGB representation is not
very suitable for selecting a specific color range. The conversion can be performed
in the following way [1, 19]: the given $R, G, B$ values are scaled from the range
$[0, 255]$ to the range $[0, 1]$. Then one calculates
$$C_{\max}=\max(R',G',B'),\qquad C_{\min}=\min(R',G',B'),\qquad \Delta=C_{\max}-C_{\min},$$
where $R', G', B'$ are the scaled $R, G, B$ values. The Hue $(H)$ can be calculated
using the following formula:
$$H=\begin{cases}0^{\circ}, & \text{if } \Delta=0;\\[4pt] 60^{\circ}\cdot\dfrac{G'-B'}{\Delta}, & \text{if } C_{\max}=R';\\[4pt] 60^{\circ}\cdot\left(\dfrac{B'-R'}{\Delta}+2\right), & \text{if } C_{\max}=G';\\[4pt] 60^{\circ}\cdot\left(\dfrac{R'-G'}{\Delta}+4\right), & \text{if } C_{\max}=B';\end{cases}$$
$$H=H+360^{\circ}\ \text{ if } H<0.$$
The Saturation $(S)$ calculation:
$$S=\begin{cases}0, & C_{\max}=0;\\[4pt] \dfrac{\Delta}{C_{\max}}, & C_{\max}\neq 0.\end{cases}$$
The Value $(V)$ calculation:
$$V=C_{\max}.$$
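The conversion formulas above translate almost line by line into code. The following pure-Python sketch handles a single pixel; the function name `rgb_to_hsv` and the output units (hue in degrees, saturation and value in the range 0 to 1) are choices made for this illustration, and vision libraries typically use other output scalings.

```python
def rgb_to_hsv(r, g, b):
    """Convert one pixel with r, g, b in 0..255 to (H in degrees, S, V)."""
    # Scale the channels from [0, 255] to [0, 1].
    rp, gp, bp = r / 255.0, g / 255.0, b / 255.0
    c_max, c_min = max(rp, gp, bp), min(rp, gp, bp)
    delta = c_max - c_min

    # Hue: the branch depends on which channel is the maximum.
    if delta == 0:
        h = 0.0
    elif c_max == rp:
        h = 60.0 * ((gp - bp) / delta)
    elif c_max == gp:
        h = 60.0 * ((bp - rp) / delta + 2.0)
    else:
        h = 60.0 * ((rp - gp) / delta + 4.0)
    if h < 0:
        h += 360.0

    # Saturation and Value.
    s = 0.0 if c_max == 0 else delta / c_max
    v = c_max
    return h, s, v

print(rgb_to_hsv(255, 0, 255))
```

Pure red, green and blue map to hues of 0, 120 and 240 degrees, and any gray pixel has zero saturation, which is why a color range is easier to select in HSV than in RGB.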
It is worth noting that a Gaussian blur is applied before the RGB to HSV conversion,
as described in [20], to remove noise and obtain a better output. The result is shown
in Fig. 8, a, b. The second step, after a successful conversion, is color
thresholding. The lower and upper boundaries of the desired color are set in the HSV
color space, which makes it possible to filter out the rest of the colors from the
image. The thresholding process for an input image $I$ can be described as follows:
$$dst(I)=\bigl(lowerB(I)_1\le src(I)_1\le upperB(I)_1\bigr)\wedge\ldots\wedge\bigl(lowerB(I)_n\le src(I)_n\le upperB(I)_n\bigr),$$
where $lowerB(I)_i\le src(I)_i\le upperB(I)_i$ stands for the $i$-th input array
channel, $i=1,\ldots,n$. Thus:
For every element of a single-channel input array:
$$dst(I)=lowerB(I)_1\le src(I)_1\le upperB(I)_1;$$
For two-channel arrays:
$$dst(I)=\bigl(lowerB(I)_1\le src(I)_1\le upperB(I)_1\bigr)\wedge\bigl(lowerB(I)_2\le src(I)_2\le upperB(I)_2\bigr).$$
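The per-channel range test combined with a logical AND can be sketched with NumPy as follows. The function name `in_range`, the 0/1 output convention (taken from the description of the binary image in the text) and the sample HSV values and boundaries are assumptions for illustration.

```python
import numpy as np

def in_range(src, lower, upper):
    """Return 1 where every channel satisfies lower_i <= src_i <= upper_i, else 0."""
    src = np.asarray(src)
    if src.ndim == 2:                      # single-channel image
        src = src[..., np.newaxis]
    lower = np.asarray(lower).reshape(1, 1, -1)
    upper = np.asarray(upper).reshape(1, 1, -1)
    inside = (src >= lower) & (src <= upper)          # test each channel separately
    return np.all(inside, axis=2).astype(np.uint8)    # AND over the channels

# Hypothetical 2x2 HSV image and color boundaries.
hsv = np.array([[[100, 200, 200], [30, 200, 200]],
                [[110, 40, 200], [105, 150, 90]]], dtype=np.uint8)
mask = in_range(hsv, lower=(90, 100, 50), upper=(130, 255, 255))
print(mask)
```

Only the pixels whose hue, saturation and value all fall inside the chosen boundaries survive in the mask; a pixel that fails in a single channel is rejected.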
The resulting image is a binary image, i.e. all its pixel values are 1 or 0. The
operations of erosion and dilatation are then applied to it, as described in
[21, 22]. This removes most of the isolated areas that managed to pass the threshold
(Fig. 8, c, d, f). The final step is the calculation of a centroid for each blob
using the binary image moments [23]:
$$M_{ij}=\sum_{x}\sum_{y}x^{i}y^{j}I(x,y). \tag{7}$$
If one denotes the blob area as $M_{00}$, then the centroid can be calculated as
follows:
$$\{\bar{x};\bar{y}\}=\left\{\frac{M_{10}}{M_{00}};\frac{M_{01}}{M_{00}}\right\}. \tag{8}$$
For each blob, its point $\{\bar{x};\bar{y}\}$ can be used as the object position on
the current frame (Fig. 8, e).
Fig. 8. Color-based tracking: a — RGB frame; b — HSV frame; c — raw binary
image; d — after erosion & dilatation operations; e — detected object; f — HSV
thresholding result
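Formulas (7) and (8) can be illustrated with a short NumPy sketch. It computes the centroid of a binary image that contains a single blob; separating several blobs (for example by connected-component labeling) is outside this sketch, and the function names and test data are assumptions.

```python
import numpy as np

def raw_moment(binary, i, j):
    """M_ij = sum over x, y of x^i * y^j * I(x, y), formula (7)."""
    ys, xs = np.mgrid[0:binary.shape[0], 0:binary.shape[1]]
    return float(np.sum((xs ** i) * (ys ** j) * binary))

def centroid(binary):
    """Centroid (x, y) = (M10 / M00, M01 / M00), formula (8); None for an empty image."""
    m00 = raw_moment(binary, 0, 0)
    if m00 == 0:
        return None
    return raw_moment(binary, 1, 0) / m00, raw_moment(binary, 0, 1) / m00

# Hypothetical binary image with one blob in rows 1..3 and columns 2..6.
blob = np.zeros((6, 8), dtype=np.uint8)
blob[1:4, 2:7] = 1

print(raw_moment(blob, 0, 0), centroid(blob))
```

For a 0/1 image, M00 is simply the number of blob pixels, so the centroid is the mean of the blob pixel coordinates.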
The advantages of this tracking approach are that the target object is detected
automatically and that it works well with objects that change their shape. In
addition, the implementation is very simple and fast, which makes it suitable for
real-time video capture. But the method has some serious disadvantages. Firstly, it
is more of a detection technique than a tracking one, so if there are several objects
of interest and they occasionally get occluded, there is no guarantee that the
objects' positions are not mixed up after repeated detection (Fig. 9, a, b). This
requires additional control on top of the tracking technique. It also requires the
user to select the lower and upper HSV color threshold boundaries manually, which is
not an easy task in itself, and the tracked objects have to be distinguishable from
the background by color (Fig. 9, c). In addition, this technique works well mostly
with color images, because grayscale is much poorer for color differentiation; thus
it is not suitable for videos like the one in Fig. 1, c. And finally, if several
tracked objects of the same color occasionally come very close to each other, they
merge into one blob and the algorithm begins treating them as a single object. This
reduces the accuracy of the method.
Fig. 9. Color-based tracking errors: a — 4 mice; b — 2 mice are merged into a single
blob; c — identical color range case
Another approach is background subtraction. The idea of this method is similar to the
color tracking algorithm, as it also directly separates the tracked object from the
background [21, 24]. The main difference is that it requires an image of the observed
location without any moving objects in it. In the rat/mouse tracking task this
location is the test box. This method was implemented and tested in [20]. In brief,
it consists of the following steps:
1. The algorithm receives an image of the empty observed location; it is converted
from RGB to grayscale and cleared of noise with a Gaussian or median filter
(Fig. 10, a).
2. Each video frame is also converted from RGB to grayscale and cleared of noise
(Fig. 10, b).
3. For each frame the background subtraction operation is performed on the grayscale
frame and the grayscale image of the empty box. The main formula is:
$$d_{i,j}=\left|b_{i,j}-f_{i,j}\right|,\quad i=\overline{1,n},\ j=\overline{1,m},$$
where $d_{i,j}$ is the pixel value of the resulting image (i.e. the background
subtraction output / image difference), $b_{i,j}$ and $f_{i,j}$ are the pixel values
of the empty grayscale background image and of each grayscale video frame
respectively, and $i=\overline{1,n}$, $j=\overline{1,m}$ run over the dimensions of
the image/frames. Note that these dimensions must be equal for the empty background
image and the frame, since the subtraction is performed pixel by pixel. The result is
presented in Fig. 10, c.
4. The next step is thresholding [25]: all pixel values that are higher than some
threshold are set to 0, and the rest are set to 255. The result is a binary image
(Fig. 10, d), i.e. it is only black and white. If necessary, the operations of
erosion and dilatation are applied (Fig. 10, e).
5. Finally, as in color-based tracking, the centroids of the binary image blobs are
calculated [23] using image moments (formulas 7 and 8). The result can be seen in
Fig. 10, f (the circles marking the central area of the box were detected using the
Hough transform [25]).
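The steps above can be sketched in NumPy as follows. Several simplifications are assumptions of this sketch and not statements about the implementation in [20]: grayscale conversion is a plain channel average, smoothing and erosion/dilatation are omitted, the difference is taken as an absolute value, and the whole thresholded region is treated as one object. The threshold polarity follows step 4 (object pixels become 0, the rest 255).

```python
import numpy as np

def to_gray(rgb):
    """Grayscale as a plain channel average (the weighting is an assumption)."""
    return np.asarray(rgb, dtype=np.float64).mean(axis=2)

def subtract_background(background_gray, frame_gray, thresh):
    """Step 3: pixel-wise difference. Step 4: values above thresh -> 0, rest -> 255."""
    if background_gray.shape != frame_gray.shape:
        raise ValueError("background and frame must have equal dimensions")
    diff = np.abs(background_gray - frame_gray)
    binary = np.where(diff > thresh, 0, 255).astype(np.uint8)
    return diff, binary

def object_centroid(binary):
    """Step 5: centroid of the object pixels (the zeros), i.e. (M10/M00, M01/M00)."""
    ys, xs = np.nonzero(binary == 0)
    if xs.size == 0:
        return None
    return float(xs.mean()), float(ys.mean())

# Hypothetical data: a uniform empty "box" and a frame with a bright 3x3 "animal".
background = np.full((5, 7, 3), 40, dtype=np.uint8)
frame = background.copy()
frame[1:4, 2:5] = 220

diff, binary = subtract_background(to_gray(background), to_gray(frame), thresh=50)
print(object_centroid(binary))
```

Only one threshold value has to be chosen here, compared with a lower and an upper boundary per channel in color-based tracking.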
This method has advantages similar to those of color-based tracking. It also requires
a threshold value, but it is more convenient, as it needs only one such value instead
of a range, and is thus more stable. Its main disadvantage is that a background image
is mandatory: if such an image cannot be provided, this algorithm should not be used.
The other possible problems are also similar to those of color-based tracking.
Fig. 10. Background subtraction tracking: a — grayscale box image smoothed; b — gray-
scale frame smoothed; c — image difference; d — binary image; e — after ero-
sion/dilatation; f — background subtraction result
It can happen that no background image is provided. In that case the background
estimation method comes in handy. It can be used as an addition to the background
subtraction algorithm. The main idea is that the background image is calculated from
the input video. This can be done using the Approximation Median Algorithm [26, 27].
As described in [26], it finds the difference between the intensity of the current
pixel and the median of recent intensities of that pixel. For this task an n-size
buffer is used: it contains the last n frames, whose pixel values are used to
calculate the median value for the background image. The main formula for this method
can be written as follows:
$$\left|F_{i,j,k}-Med_{i,j,k}\right|>thresh,\quad i=\overline{1,side_1},\ j=\overline{1,side_2},\ k=\overline{1,frNum},$$
with $F$ representing the current frame and $Med$ being the median of the last n
frames. For each new frame $k$ and each pixel $(i,j)$, the difference between the
pixel of the current frame and the corresponding pixel of the median of the last n
frames decides whether this value is foreground or background. The median is then
updated using the n most recent pixel values.
The described method can also be performed in the following way [28]. Assuming that
the camera is static and that most of the time every pixel shows the same piece of
the background, every moving object occludes the background only temporarily. In this
case one can randomly sample n frames from the video, so that for every pixel there
are n estimates of the background. As long as a pixel is not occluded by a moving
object more than 50% of the time, the median of the pixel over these n frames is a
good estimate of the background at that pixel. Repeating this process for every pixel
recovers the entire background.
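The per-pixel median over sampled frames can be sketched with NumPy as below. The frames are assumed to be already sampled and converted to grayscale; the function names, the threshold and the synthetic frames (a one-pixel object that moves across a uniform background) are illustrative assumptions.

```python
import numpy as np

def estimate_background(frames):
    """Per-pixel median over n grayscale frames of equal size."""
    return np.median(np.stack(frames, axis=0), axis=0)

def foreground_mask(frame, background, thresh):
    """A pixel is foreground if it differs from the estimated background by more than thresh."""
    return np.abs(frame.astype(np.float64) - background) > thresh

# Hypothetical video: static background of intensity 10 and a bright one-pixel
# object that visits a different column of row 1 in each of the 5 frames.
frames = []
for k in range(5):
    f = np.full((3, 5), 10, dtype=np.uint8)
    f[1, k] = 200
    frames.append(f)

background = estimate_background(frames)
mask = foreground_mask(frames[2], background, thresh=30)
print(background)
print(mask)
```

Each pixel is occluded in only one of the five frames, well under half of the time, so the median recovers the clean background. An object that stayed on the same pixel in most frames would instead be absorbed into the estimate.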
As an alternative, Mixture of Gaussian can be used for background estimation
[26, 27]. It uses a Gaussian probability density function to evaluate the pixel
intensity value. This method calculates the difference between the intensity of the
current pixel and the cumulative average of its previous values. The algorithm keeps
a cumulative average $\mu_t$ of the recent pixel values, and if the difference
between the current pixel value and the cumulative average is greater than the
product of a constant $c$ and the standard deviation $\sigma_t$, the pixel is
classified as foreground. Thus, for each frame $t$ the pixel value $F_t$ can be
denoted as a foreground pixel if the following inequality holds:
$$\left|F_t-\mu_t\right|>c\,\sigma_t,\quad t=\overline{1,Numfr}.$$
Otherwise, the value is classified as background. The algorithm also updates the
background image as a running average using the formulas
$$\mu_{t+1}=\alpha F_t+(1-\alpha)\mu_t,$$
$$\sigma_{t+1}^{2}=\alpha(F_t-\mu_t)^{2}+(1-\alpha)\sigma_t^{2},$$
where $\alpha$ is the learning rate (typically 0.05), $F_t$ is the current pixel
value and $\mu_t$ is the previous average.
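One update step of the running average and variance, together with the foreground test, can be sketched as follows. The learning rate 0.05 comes from the text; the constant `c = 2.5`, the function name and the sample values are assumptions made for this illustration.

```python
import numpy as np

def update_background(mu, var, frame, alpha=0.05, c=2.5):
    """Classify pixels of `frame` and return the updated per-pixel mean and variance.

    mu, var : running mean and variance before the update
    frame   : current grayscale frame
    """
    frame = frame.astype(np.float64)
    # Foreground test: |F_t - mu_t| > c * sigma_t
    foreground = np.abs(frame - mu) > c * np.sqrt(var)
    # Running-average updates for the mean and the variance.
    new_mu = alpha * frame + (1.0 - alpha) * mu
    new_var = alpha * (frame - mu) ** 2 + (1.0 - alpha) * var
    return new_mu, new_var, foreground

# Hypothetical state: mean 10 and variance 4 at every pixel of a 2x2 image.
mu = np.full((2, 2), 10.0)
var = np.full((2, 2), 4.0)
frame = np.array([[100, 11], [11, 11]], dtype=np.uint8)

new_mu, new_var, fg = update_background(mu, var, frame)
print(fg)
print(new_mu)
```

With a small learning rate the model adapts slowly, so a briefly passing object barely changes the background, while an object that stays in place is gradually absorbed into it.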
The results of these techniques are shown in Fig. 11 and 12. In the case of mouse/rat
tracking there is an empty box image, so it can be compared with the resulting
background estimate (Fig. 11). Note that three white dots were static in every video
frame, so they passed into the estimated background (Fig. 11, b). Fig. 12, a shows
the estimated background image obtained from the corresponding aquarium video. In
this particular case no empty aquarium image had been provided, so this is exactly
the case where background estimation can be applied. Also note that one fish in the
top corner remained static during the whole video, so it was also classified as
background and was later missed by the algorithm. This case shows the main drawback
of the approach (Fig. 12, b).
Fig. 11. Background estimation: a — original empty box image; b — estimated box image;
c — tracking result
The positive side of this approach is that when background subtraction is used and
there is no empty background image, this technique can in most cases compensate for
it. But as shown in Fig. 12, a, if one of the tracked objects remains static during
the whole video, it will be classified as part of the background, and the tracking
method will not be able to detect and track it. The approach is also effective only
if the entire background is static.
Fig. 12. Background subtraction operation test with fish: a — empty aquarium estimation;
b — tracking result
Kernel tracking & optical flow. A kernel component is the shape of an object. In the
simplest case the component can be represented by a rectangular or oval shape, in
more complex ones by a three-dimensional model of the object projected onto the image
plane. The methods of this group are usually used when the motion can be described by
a simple translation, rotation or affine transformation. Component tracking is an
iterative localization procedure based on maximizing some similarity criterion. In
practice it is implemented using Mean Shift and its continuous modification
(Continuously Adaptive Mean Shift, CAM Shift).
The idea of Mean Shift [29, 30] is that for each special point (in the general case,
for each object) a search window is selected and the center of mass of the intensity
distribution (i.e. of the histogram) within it is calculated. The center of the
window is then shifted to the center of mass, which is taken as the position of the
point on the current frame. Determining the position of the point in the following
frames reduces to applying the next step of the mean shift method. The method stops
when the center of mass stops shifting (Fig. 13).
Fig. 13. Mean shift tracking
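The iteration described above can be sketched in a few lines. This is a minimal illustration and not the implementation used in the experiment: the weight image stands in for a histogram back-projection, and the function name, the `(x, y, w, h)` window layout and the stopping threshold are assumptions.

```python
import numpy as np


def mean_shift(weights, window, max_iter=20, eps=1.0):
    """Shift the window (x, y, w, h) onto the centre of mass of `weights`."""
    x, y, w, h = window
    rows, cols = weights.shape
    for _ in range(max_iter):
        roi = weights[y:y + h, x:x + w]
        m00 = roi.sum()
        if m00 == 0:
            break  # nothing to follow inside the window
        ys, xs = np.mgrid[0:roi.shape[0], 0:roi.shape[1]]
        cx = (xs * roi).sum() / m00
        cy = (ys * roi).sum() / m00
        # Move the window so that its centre sits on the centre of mass.
        new_x = int(np.floor(x + cx - (w - 1) / 2 + 0.5))
        new_y = int(np.floor(y + cy - (h - 1) / 2 + 0.5))
        new_x = min(max(new_x, 0), cols - w)
        new_y = min(max(new_y, 0), rows - h)
        shift = np.hypot(new_x - x, new_y - y)
        x, y = new_x, new_y
        if shift < eps:
            break  # the centre of mass has stopped moving
    return x, y, w, h


weights = np.zeros((60, 60))
weights[30:40, 35:45] = 1.0            # blob spans x 35..44, y 30..39
result = mean_shift(weights, (28, 24, 10, 10))
```

Starting from a window that only partly overlaps the blob, the window climbs onto it within a few iterations; the window size stays fixed throughout.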
The problem with Mean Shift is that the window (ROI) always has the same size,
whether the object is far from or close to the camera, whereas the window needs
to be adapted during the tracking process. The solution is CAM Shift
(Continuously Adaptive Mean Shift) [31]. This approach applies Mean Shift first
and, once it converges, updates the size of the window as
$$s = 2\sqrt{\frac{M_{00}}{256}},$$
where $M_{00}$ is the zeroth moment of the distribution inside the window.
CAM Shift also calculates the orientation of the best-fitting ellipse. It then
applies Mean Shift again with the newly scaled search window, starting from the
previous window location. This process continues until the required accuracy is
met (Fig. 14). This approach is fast and more stable than Mean Shift, but its
main weakness is that it relies on the color range, so it is sensitive to
lighting conditions and can fail with objects that change their shape.
M.A. Shvandt, V.V. Moroz
ISSN 1681–6048 System Research & Information Technologies, 2022, № 1 138
Fig. 14. CAM Shift tracking
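The two quantities that CAM Shift adds on top of Mean Shift can be computed from image moments. The sketch below assumes an 8-bit style probability image (hence the constant 256) and uses the standard second-central-moment formula for the ellipse orientation; the function names are hypothetical, and an empty window (zero total weight) is not handled.

```python
import numpy as np


def camshift_window_size(prob):
    """Side length s = 2 * sqrt(M00 / 256) from the zeroth moment of the window."""
    m00 = float(np.sum(prob))
    return 2.0 * np.sqrt(m00 / 256.0)


def ellipse_orientation(prob):
    """Orientation (radians) of the best-fitting ellipse from second central moments."""
    prob = np.asarray(prob, dtype=float)
    ys, xs = np.mgrid[0:prob.shape[0], 0:prob.shape[1]]
    m00 = prob.sum()
    cx, cy = (xs * prob).sum() / m00, (ys * prob).sum() / m00
    mu20 = ((xs - cx) ** 2 * prob).sum()
    mu02 = ((ys - cy) ** 2 * prob).sum()
    mu11 = ((xs - cx) * (ys - cy) * prob).sum()
    return 0.5 * np.arctan2(2.0 * mu11, mu20 - mu02)


window = np.full((10, 10), 64.0)       # hypothetical back-projection values
s = camshift_window_size(window)

bar = np.zeros((12, 20))
bar[5:7, 2:18] = 64.0                  # elongated horizontal object
theta = ellipse_orientation(bar)
```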
Optical flow estimation can be used as an alternative to all the previous
methods [32, 33]. Optical flow can be described as the pattern of apparent
object movement between two consecutive frames [34]. It can be caused by the
movement of the object itself or of the camera, and it is represented by a 2D
vector field in which each vector shows the displacement of a point from the
first frame to the second. Optical flow has several applications, notably
motion detection and video stabilization.
Optical flow relies on several assumptions [33, 34]: first, the pixel
intensities of an object do not change between consecutive frames; second,
neighboring pixels have similar motion. Let $I(x, y, t)$ be a pixel in the
first frame ($t$ is time) that moves by a distance $(dx, dy)$ in the next
frame, taken after time $dt$. Assuming that the pixel intensity does not
change, the following holds:
$$I(x, y, t) = I(x + dx, y + dy, t + dt).$$
Taking the Taylor series approximation of the right-hand side, removing common
terms and dividing by $dt$ gives the following equation:
$$f_x u + f_y v + f_t = 0 \qquad (9)$$
with
$$f_x = \frac{\partial f}{\partial x}; \quad f_y = \frac{\partial f}{\partial y}; \quad u = \frac{dx}{dt}; \quad v = \frac{dy}{dt},$$
where $f$ denotes the image intensity $I$.
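The step from brightness constancy to equation (9) can be written out explicitly. The lines below are the standard first-order Taylor expansion, with $f$ identified with the image intensity $I$ and $f_t = \partial f / \partial t$:

$$
\begin{aligned}
I(x + dx,\, y + dy,\, t + dt) &\approx I(x, y, t) + \frac{\partial I}{\partial x}\,dx + \frac{\partial I}{\partial y}\,dy + \frac{\partial I}{\partial t}\,dt, \\
0 &= \frac{\partial I}{\partial x}\,dx + \frac{\partial I}{\partial y}\,dy + \frac{\partial I}{\partial t}\,dt, \\
0 &= \frac{\partial I}{\partial x}\,\frac{dx}{dt} + \frac{\partial I}{\partial y}\,\frac{dy}{dt} + \frac{\partial I}{\partial t} = f_x u + f_y v + f_t .
\end{aligned}
$$

The second line follows from the first by substituting the brightness constancy assumption and cancelling $I(x, y, t)$; the third by dividing by $dt$.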
Equation (9) is called the optical flow equation, where $f_x$ and $f_y$ are the
image gradients, which can be computed, and $f_t$ is the gradient along time.
The components $u$ and $v$ are unknown, so the single equation (9) with two
unknowns cannot be solved on its own. There are several solutions to this
problem; one of them is the Lucas–Kanade method [32, 34]. It takes a
$3 \times 3$ patch around the point and assumes that all 9 points have the same
motion. Computing $f_x, f_y, f_t$ for these 9 points gives 9 equations with two
unknowns, an over-determined system that can be solved by the least squares
method:
$$
\begin{bmatrix} u \\ v \end{bmatrix} =
\begin{bmatrix}
\sum_i f_{x_i}^2 & \sum_i f_{x_i} f_{y_i} \\
\sum_i f_{x_i} f_{y_i} & \sum_i f_{y_i}^2
\end{bmatrix}^{-1}
\begin{bmatrix}
-\sum_i f_{x_i} f_{t_i} \\
-\sum_i f_{y_i} f_{t_i}
\end{bmatrix}.
$$
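The least-squares solution above maps directly onto a few array operations. The sketch below solves the system for a single patch; the function name is hypothetical, and the gradients are synthetic values generated so that the true motion is known.

```python
import numpy as np


def lucas_kanade_patch(fx, fy, ft):
    """Least-squares solution of fx*u + fy*v + ft = 0 over all points of a patch."""
    fx, fy, ft = (np.asarray(a, dtype=float).ravel() for a in (fx, fy, ft))
    A = np.array([[np.sum(fx * fx), np.sum(fx * fy)],
                  [np.sum(fx * fy), np.sum(fy * fy)]])
    b = -np.array([np.sum(fx * ft), np.sum(fy * ft)])
    u, v = np.linalg.solve(A, b)   # fails if A is singular (e.g. a flat patch)
    return u, v


# Synthetic 3x3 patch whose temporal gradient is consistent with (u, v) = (0.5, -0.25).
rng = np.random.default_rng(0)
fx = rng.normal(size=(3, 3))
fy = rng.normal(size=(3, 3))
ft = -(fx * 0.5 + fy * -0.25)
u, v = lucas_kanade_patch(fx, fy, ft)
```

The 2x2 matrix becomes singular on textureless patches, which is one reason why corner-like points are preferred for tracking.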
Note that the matrix being inverted is similar to the one used in the Harris
corner detector, which indicates that corners are better points to track. As
can also be seen, this approach detects only small motions, not large ones. To
solve this problem, image pyramids are used: when going up the pyramid, small
motions are removed and large motions become small. Applying Lucas–Kanade there
thus yields optical flow together with the scale. The result is presented in
Fig. 15, a. The Lucas–Kanade method computes optical flow for a sparse feature
set (sparse optical flow) obtained, for example, with the Shi–Tomasi corner
detection technique. Another approach, based on Gunnar Farneback's algorithm
(dense optical flow) [34, 35], computes the optical flow for all points in the
frame (Fig. 15, b). For the vectors $(u, v)$ it is possible to find the
magnitude and direction, which makes it possible to trace the moving object and
its direction of movement (the color shows the direction).
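Converting flow vectors to magnitude and direction is a polar transform. The sketch below is illustrative only: the function names and the motion threshold are assumptions, and angles are measured in image coordinates, where the y axis points down. Visualizations commonly map the direction to hue and the magnitude to brightness.

```python
import numpy as np


def flow_to_polar(u, v):
    """Return (magnitude, direction in degrees within [0, 360)) of flow vectors."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    magnitude = np.hypot(u, v)
    direction = np.degrees(np.arctan2(v, u)) % 360.0
    return magnitude, direction


def moving_mask(u, v, min_magnitude=1.0):
    """Points whose displacement is large enough to count as motion."""
    magnitude, _ = flow_to_polar(u, v)
    return magnitude >= min_magnitude


u = np.array([3.0, 0.0, -1.0, 0.0])
v = np.array([4.0, 2.0, 0.0, 0.0])
magnitude, direction = flow_to_polar(u, v)
mask = moving_mask(u, v)
```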
Both the Lucas–Kanade and Farneback algorithms perform well with a static
background. They also do not require manual object selection: the object of
interest can be found by its motion. However, after an object occlusion
redetection is required, which makes these techniques suitable mostly for
laboratory conditions, as in this particular case (Fig. 15). They also cope
well with changes in the object's shape.
Fig. 15. Optical flow: a — Lucas–Kanade method; b — Farneback’s method
Point tracking methods. In such approaches, it is assumed that the position
of the object is determined by the location of a set of characteristic points.
The same object in consecutive frames is represented by sets of corresponding
pairs of points. This group of methods is divided into two subgroups.
Deterministic methods [36] use qualitative motion heuristics (a small change in
velocity, the invariance of the distance in three-dimensional space between a
pair of points belonging to the object); in essence, the task is reduced to
minimizing a correspondence function between sets of points. Methods based on
the calculation of dense and sparse optical flow, as well as methods that match
key point descriptors, are typical representatives of deterministic methods.
Probabilistic methods use an approach based on the concept of state space. It
is assumed that a moving object has a certain internal state, which is measured
in each frame. To estimate the next state of the object, it is necessary to
generalize the received measurements as much as possible, that is, to determine
the new state given the set of measurements obtained for the states in previous
frames. Typical examples are methods based on the Kalman filter [37, 38] or the
particle filter [39].
The Kalman filter is used to track single objects in noisy images. Each state
of the system can be described by a vector of its parameters. Under some
influence the system passes from one state to another. The set of all states of
the system and the transitions between them form a model. There is also the
concept of an observation data vector: the set of system parameters that can be
extracted from observing the behavior of the system. In most cases, the
dimension of the system state vector exceeds the dimension of the observation
data vector. In this case, the Kalman filter is able to estimate, with a
certain probability, the complete internal state of the system.
The Kalman filter works with discrete-time linear dynamical systems. Such
systems are modeled by Markov chains with the help of linear operators and
normally distributed terms. At each discrete moment of time, the linear
operator acts on the state and transforms it into another state, adding a
random variable in the form of normal noise and, in the general case, a control
vector that simulates the influence of the control signal. The mathematical
model of this process in matrix form is
$$x_k = F_k x_{k-1} + B_k u_k + w_k,$$
$$z_k = H_k x_k + v_k,$$
where $F_k$ is an $n \times n$ matrix that describes how the system state
changes in the transition from $k-1$ to $k$ without control; $B_k$ is an
$n \times l$ matrix that describes how the control effect $u_k$ changes the
state from $k-1$ to $k$, $l$ is the dimension of the control effect; $H_k$ is a
$c \times n$ matrix that describes how the state $x_k$ is transformed into an
observation $z_k$, $c$ is the dimension of the observation vector; $w_k$, $v_k$
are random variables representing the normally distributed process and
measurement noise with the corresponding covariance matrices $Q_k$, $R_k$:
$w_k \sim N(0, Q_k)$, $v_k \sim N(0, R_k)$.
The algorithm consists of two repeating phases: the extrapolation phase and the
correction phase. In the first phase, the values of the state variables and
their uncertainty are predicted (extrapolated) from the state estimate of the
previous step. This estimate is often called a priori, because it is obtained
before any measurements are made and is based on the mathematical model only.
The second phase refines the result of the extrapolation using the
corresponding measurements, which may contain some error. This estimate is
called a posteriori.
In the classical operation of the algorithm these phases alternate: the
prediction is based on the result of the correction at the previous iteration,
and the correction refines the result of the extrapolation phase. However, in
some cases the correction phase may be skipped, and the prediction will then be
based on an uncorrected estimate. This can happen if, for some reason, no
information from the measuring sensors is available at this stage. The
following notation is used below:
$x_k$ is the actual state of the system at time $k$;
$\hat{x}_k$ is the estimated state at time $k$;
$\hat{x}_k^-$ is the predicted system state at time $k$;
$P_k$ is the estimated error covariance matrix of the state;
$P_k^-$ is the predicted error covariance matrix of the state.
Table 1. Kalman filter algorithm
| Extrapolation | Correction |
|---|---|
| 1) State extrapolation: $\hat{x}_k^- = F_k \hat{x}_{k-1} + B_k u_k$ | 1) Kalman gain: $K_k = P_k^- H_k^T (H_k P_k^- H_k^T + R_k)^{-1}$ |
| 2) Covariance matrix extrapolation: $P_k^- = F_k P_{k-1} F_k^T + Q_k$ | 2) State vector correction: $\hat{x}_k = \hat{x}_k^- + K_k (z_k - H_k \hat{x}_k^-)$ |
| | 3) Covariance matrix calculation: $P_k = (I - K_k H_k) P_k^-$ |
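The two phases of Table 1 translate almost line by line into matrix code. The sketch below is a minimal illustration with time-invariant matrices; the function names, the 1D constant-velocity model and all noise values are assumptions chosen for the example and are not taken from the experiment. The loop also shows the case mentioned in the text where the correction phase is skipped because no measurement is available.

```python
import numpy as np


def kalman_predict(x_est, P_est, F, Q, B=None, u=None):
    """Extrapolation phase: a priori state and covariance."""
    x_pred = F @ x_est
    if B is not None and u is not None:
        x_pred = x_pred + B @ u
    P_pred = F @ P_est @ F.T + Q
    return x_pred, P_pred


def kalman_correct(x_pred, P_pred, z, H, R):
    """Correction phase: a posteriori state and covariance."""
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)  # Kalman gain
    x_est = x_pred + K @ (z - H @ x_pred)
    P_est = (np.eye(len(x_pred)) - K @ H) @ P_pred
    return x_est, P_est


# Hypothetical 1D constant-velocity model: state = [position, velocity].
F = np.array([[1.0, 1.0], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])               # only the position is observed
Q = 1e-4 * np.eye(2)
R = np.array([[0.1]])

x_est, P_est = np.zeros(2), 10.0 * np.eye(2)
for k in range(1, 31):
    x_est, P_est = kalman_predict(x_est, P_est, F, Q)
    if k % 5 != 0:                       # every 5th measurement is "missing"
        z = np.array([2.0 * k])          # noise-free position, true velocity 2
        x_est, P_est = kalman_correct(x_est, P_est, z, H, R)
```

Although only the position is measured, the filter recovers the velocity as well, which illustrates how the full internal state is estimated from a lower-dimensional observation vector.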
Interactive tracking. The general idea behind this set of approaches is the use
of motion and appearance models [40]. Recall that the task of tracking is to
detect an object in the current frame given that the object was successfully
detected and tracked in all (or nearly all) previous frames. If the object has
been tracked up to the current frame, the parameters of its motion model are
known, that is, the object's location and velocity (speed and direction of
motion) in the previous frames. Even if there is no other information about the
object, its new location can be estimated from the current motion model, and
this estimate will be close to the object's real position.
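The simplest motion model of this kind is a constant-velocity extrapolation from the last two known positions. The sketch below assumes exactly that model; the function name and the `(x, y)` tuple layout are hypothetical.

```python
def predict_position(prev_pos, curr_pos, dt=1.0):
    """Extrapolate the next (x, y) position assuming constant velocity."""
    vx = (curr_pos[0] - prev_pos[0]) / dt
    vy = (curr_pos[1] - prev_pos[1]) / dt
    return (curr_pos[0] + vx * dt, curr_pos[1] + vy * dt)


# The object moved from (10, 20) to (13, 18), so the search starts near (16, 16).
predicted = predict_position((10, 20), (13, 18))
```

The appearance model is then only needed to refine this estimate within a small neighborhood of the predicted position.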
If the object is simple and its appearance does not change much, a simple
template can be used as an appearance model and searched for. However, since
the object's appearance can change considerably, the model can instead be
represented by a classifier that is trained throughout the tracking process.
The main task of the classifier is to classify a rectangular region of interest
(ROI) of an image as either object or background. It takes an image patch as
input and returns a score in the range [0, 1], which is the probability that
the image patch contains the object. This is a binary classification: a score
of 0 means that the classifier considers the image patch to be background, and
a score of 1 means that the patch is the object. The training (learning) is
performed during the tracking process, as the classifier “learns” to detect the
object. This approach is similar to how neural networks work, but in this
particular case the training set is quite small, as it is just the set of video
frames.
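For the simple fixed-template case mentioned above, a score in [0, 1] can be obtained without any training. The sketch below maps the normalized cross-correlation between a patch and a template to that range; it is a stand-in for the online-trained classifier, not an implementation of it, and the function name and the neutral value for flat patches are assumptions.

```python
import numpy as np


def template_score(patch, template):
    """Map the normalised cross-correlation of two equal-size patches to [0, 1]."""
    p = np.asarray(patch, dtype=float)
    t = np.asarray(template, dtype=float)
    p = p - p.mean()
    t = t - t.mean()
    denom = np.sqrt((p * p).sum() * (t * t).sum())
    if denom == 0:
        return 0.5  # flat patch: no evidence either way
    ncc = (p * t).sum() / denom
    return float(np.clip(0.5 * (ncc + 1.0), 0.0, 1.0))


template = np.arange(16, dtype=float).reshape(4, 4)
same = template_score(template, template)           # identical appearance
inverted = template_score(15.0 - template, template)  # contrast-inverted patch
flat = template_score(np.ones((4, 4)), template)
```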
Several interactive tracking methods [40–44] use this methodology. The first
group includes the BOOSTING, MIL and KCF trackers. The BOOSTING tracker is
based on an online version of the AdaBoost algorithm, the algorithm that the
Haar cascade based face detector uses internally. The user provides the initial
bounding box, which is used as a positive example of the object, and many image
patches outside this box are treated as background. This algorithm cannot
detect tracking failure. MIL (Multiple Instance Learning) is based on the same
idea, but instead of considering only the current location of the object as a
positive example, it looks in a small neighborhood around the current location
to generate several potential positive examples. The KCF (Kernelized
Correlation Filters) tracker builds on the ideas of BOOSTING and MIL. The
difference is that it exploits the fact that the multiple positive samples used
in the MIL tracker have large overlapping regions, and uses this overlap to
improve performance. This method reports tracking failure and can recover from
partial occlusion.
The experiment showed that the BOOSTING (Fig. 16) and MIL (Fig. 17, a)
trackers had similar performance and tracking quality. The main problem was
that under the current conditions they began failing and losing the objects
(Fig. 17, b). The KCF tracker indeed processed frames much faster owing to its
use of the overlapping regions, but this also resulted in much worse tracking
quality: the tracker tends to lose objects very quickly.
Fig. 16. BOOSTING tracker
Fig. 17. MIL tracker: a — successful tracking; b — tracker failure
The other set of trackers includes TLD, MEDIANFLOW, MOSSE and CSRT. TLD
(Tracking, Learning and Detection), as its name suggests, separates the
tracking process into three subtasks: tracking, learning and detection [40].
According to its creators, the algorithm tracks the object from frame to frame,
while its detector localizes all appearances of the object that have been
observed so far and corrects the tracker if required. During the learning
process the algorithm estimates the errors of the detector and updates it to
avoid them in the future. As a result, the tracker jumps around. On the one
hand, in case of sudden occlusions this allows the tracker to return to the
initial object. On the other hand, as a result of such jumps TLD quite often
loses its target and focuses on another object. Thus, although this tracker
performs well under occlusion over multiple frames and under scale changes, it
produces many false positives, which makes it almost unusable. The testing of
TLD is shown in Fig. 18.
The MEDIANFLOW tracker [40] follows the object in both forward and backward
directions and estimates the divergence between the two trajectories. It thus
calculates the forward-backward error and tries to minimize it. This technique
makes it possible to detect tracking failures and keep a more or less stable
trajectory. The test showed (in agreement with the earlier results [40]) that
this tracker works well only with predictable, small movements and no
occlusions. With lab animals, which tend to move unpredictably, it fails
almost immediately (Fig. 19).
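The forward-backward error itself is a simple distance between where a point started and where it ends up after being tracked forward and then backward. The sketch below assumes the two point sets are already available (for example, from a sparse optical flow step that is not shown) and keeps the points whose error does not exceed the median, which is one common filtering rule; the function names are hypothetical.

```python
import numpy as np


def forward_backward_error(points, points_back):
    """Distance between each start point and its position after forward+backward tracking."""
    points = np.asarray(points, dtype=float)
    points_back = np.asarray(points_back, dtype=float)
    return np.linalg.norm(points - points_back, axis=1)


def reliable_points(points, points_back):
    """Boolean mask of points whose forward-backward error is at most the median."""
    err = forward_backward_error(points, points_back)
    return err <= np.median(err)


start = [[0, 0], [10, 10], [20, 5]]
back = [[0, 0], [13, 14], [20, 6]]     # the second point drifted by 5 px
errors = forward_backward_error(start, back)
keep = reliable_points(start, back)
```

When an animal moves erratically, most points drift in this way at once, which is consistent with the immediate failures reported above.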
Fig. 18. TLD tracker: a — successful tracking; b — tracker jumps to other object
Fig. 19. MEDIANFLOW tracker: a — successful tracking; b — tracker failure due to
chaotic mice movements
The final two trackers are MOSSE (Minimum Output Sum of Squared Error) and CSRT
(Channel and Spatial Reliability Tracker). The MOSSE tracker is based on
adaptive correlation and produces stable correlation filters when initialized
using a single frame. This tracker can operate at very high frame rates, and it
copes well with changes in lighting, scale and pose and with non-rigid
deformations. However, its overall performance is lower than that of
learning-based trackers such as MIL or KCF. The CSRT tracker uses a spatial
reliability map to adjust the filter support to the part of the selected region
of the frame that is suitable for tracking [1]. This allows enlarging and
localizing the selected region, so it tracks non-rectangular regions or objects
well. But under the current conditions of a static background and unpredictable
movements it also tends to lose objects (Fig. 20).
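The single-frame initialization of a correlation filter can be illustrated in the Fourier domain. The sketch below follows the closed-form MOSSE solution for one training patch, H* = (G · conj(F)) / (F · conj(F) + eps), and deliberately omits the preprocessing and the running-average update of the full tracker; the function names, patch size, Gaussian width and regularization constant are assumptions.

```python
import numpy as np


def gaussian_target(shape, sigma=2.0):
    """Desired correlation output: a Gaussian peak at the patch centre."""
    rows, cols = shape
    ys, xs = np.mgrid[0:rows, 0:cols]
    cy, cx = rows // 2, cols // 2
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))


def train_filter(patch, target, eps=1e-5):
    """Conjugate filter H* learned from a single patch in the Fourier domain."""
    F = np.fft.fft2(patch)
    G = np.fft.fft2(target)
    return (G * np.conj(F)) / (F * np.conj(F) + eps)


def locate(patch, H_conj):
    """Return (row, col) of the correlation peak for a new patch."""
    response = np.real(np.fft.ifft2(np.fft.fft2(patch) * H_conj))
    peak = np.unravel_index(np.argmax(response), response.shape)
    return tuple(int(i) for i in peak)


rng = np.random.default_rng(0)
patch = rng.random((32, 32))                     # synthetic textured patch
H_conj = train_filter(patch, gaussian_target(patch.shape))
peak_same = locate(patch, H_conj)
peak_moved = locate(np.roll(patch, (3, 5), axis=(0, 1)), H_conj)
```

The peak of the response moves by the same offset as the (circularly shifted) patch, which is how the tracker reads off the displacement between frames.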
Fig. 20. CSRT tracker: a — successful tracking; b — tracker begins to lose objects;
c — tracker failure: object lost
CONCLUSIONS
As the experimental results show, the interactive trackers may perform well
under certain conditions, but in this particular case of lab animal tracking
none of them could track all objects up to the very end of the video. Their
main advantage is that they usually do not require additional data, such as an
empty background image, and do not need to estimate it (some noise reduction or
contrast change can be applied as minor preprocessing). Some of them can also
resolve minor occlusions. However, all of these methods require manual object
selection, and none of them managed to work stably, which suggests that they
are not fully suitable for the current task in their original form.
Simpler approaches, such as background subtraction (with or without background
estimation), can perform quite well in a static setting, as they detect the
objects automatically and their computational complexity is low. Kernel
tracking approaches, as well as optical flow and point tracking, can provide
additional information about object motion, which can be used in combination
with the simple or interactive tracking methods to improve performance and
acquire additional motion data.
Some object detection methods can also complement the tracking methods during
the tracking process itself. Different use cases can require different
detection techniques. For example, for observations of lab mice behavior under
specific environmental conditions (a box with a dark floor as a test stand),
automatic detection by color can be applied, but a more complex environment may
require a combination of several detection approaches. A detailed comparison of
the tested object detection and tracking approaches is presented in Tables 2
and 3.
The test results also suggest that, for the study of lab mice activity,
background subtraction in combination with image segmentation and an
interactive tracker can most likely be used. Fish tracking may require an
interactive tracker combined with optical flow and image segmentation. Our
further research will include the combined use of interactive trackers and the
simple approaches. The idea is to use the strengths of both sets of methods to
compensate for each other's disadvantages. Neural network based tracking is
also planned, in an attempt to create a completely stable tracker.
Table 2. Object detection approaches performance comparison
| Method | Advantages | Disadvantages / Features |
|---|---|---|
| Image segmentation | Easy to use when the objects of interest significantly differ from the background by some parameter; Can be used as an addition to other methods to highlight the main image parts | Low versatility; Requires too accurate algorithm parameters setup in each particular case; Sensitive to lighting conditions |
| Template Matching | Useful for scene analysis when camera is static and objects of interest look almost identical; Can be used for detection of some products on a factory assembly line | Can fail in case of rotation, scaling or partial occlusions; If rotation/scaling take place, additional search steps will be required |
| Feature-based detection | Invariant to minor turns, scaling of objects and changes in stage lighting; Suitable for rough object search | Requires a way of compact characteristic features representation; Impossible to define an object as instance of some class; Provides false results in case of object dynamic shape change |
| Categorical recognition | HOG highlights well the objects with multiple details; Haar approach is more suitable for face detection and recognition | When the object is mostly a single piece without any significant details, HOG will only highlight the borders; Both approaches still require a pretrained classifier; Fails when objects tend to change shape |
Table 3. Object tracking approaches performance comparison
| Method | Advantages | Disadvantages / Features |
|---|---|---|
| Color-based tracking | Object gets detected automatically; Simple realization; Good performance speed; Works well with objects which change their shape | Cannot handle occlusions; Cannot handle object ‘merging’; Requires additional algorithms for centroid tracking; Requires manual color threshold setup; Unstable if colors are too similar |
| Background subtraction | Same benefits as color tracking; Requires only one threshold value instead of a range | Mandatory existence of the background image; Other possible problems are similar to those of color-based tracking |
| Background estimation | Useful if no empty background image is provided; Can be used as an addition to background subtraction | Objects of interest that do not move actively can be classified as background; Effective only if the entire background is static |
| Kernel tracking (Mean Shift / CAM Shift) | Shows fine work speed; CAM Shift ROI can adjust its size during the process | The Mean Shift ROI has fixed size; CAM Shift is connected to color range; CAM Shift is sensitive to lighting conditions and object shape changes |
| Optical flow | Good work in case of static background; The object of interest can be found by its motion automatically; Performs fine in case of object’s shape change | In case of object occlusion redetection is required; Object gets lost when its movements are getting slower; Cannot handle ‘object merging’ problem |
| Point tracking (Kalman filter) | Good performance when tracking single objects on noisy images | Has complicated computations and implementation; Not good at handling object merging/occlusions |
| Interactive tracking (BOOSTING / MIL / KCF) | Fine work speed (BOOSTING / MIL); Best work speed (KCF); Suitable for non-static background (moving camera) | KCF tends to lose objects more often than MIL/BOOSTING; Not good at handling object merging/occlusions |
| Interactive tracking (TLD) | Can handle object occlusions/merging; Works with non-static background (moving camera) | Provides too many false positives, tends to lose the object of interest |
| Interactive tracking (MEDIANFLOW / MOSSE / CSRT) | CSRT has fine work speed at high framerates; CSRT handles lighting, scale, pose changes; Suitable for non-static background (moving camera) | MEDIANFLOW works well only with predictable and small movements with no occlusions; Methods work well mostly with predictable object movements; Methods cannot handle occlusions |
ACKNOWLEDGEMENTS
Special thanks to the Faculty of Biology of Odesa I.I. Mechnikov National
University for the research assistance and the provided test videos and images
of lab animals.
REFERENCES
1. A.R. Smith, “Color Gamut Transform Pairs”, in SIGGRAPH '78: Proceedings of the
5th annual conference on Computer graphics and interactive techniques, pp. 12–19,
1978. doi:10.1145/800248.807361.
2. W.K. Pratt, Digital Image Processing; 4th edition. Wiley-Interscience, A John Wiley
& Sons, Inc., Publication, 2007, 807 p.
3. X. Zhao, S. Yan, and Q. Gao, “An Algorithm for Tracking Multiple Fish Based on
Biological Water Quality Monitoring”, IEEE Access, vol. 7, pp. 15018–15026,
January 2019. doi:10.1109/ACCESS.2019.2895072.
4. J. Delcourt, M. Denoel, M. Ylieff, and P. Poncin, “Video multitracking of fish
behaviour: A synthesis and future perspectives”, Fish and Fisheries, vol. 14, no. 2,
pp. 186–204, June 2013. doi:10.1111/j.1467-2979.2012.00462.x.
5. H.E.-D. Mohamed et al. “MSR-YOLO: Method to Enhance Fish Detection and
Tracking”, The 11th International Conference on Ambient Systems, Networks and
Technologies (ANT), April 6–9, 2020, Warsaw, Poland.
6. P. Arbelaez, M. Maire, C. Fowlkes, and J. Malik, “Contour Detection and
Hierarchical Image Segmentation”, IEEE TPAMI, vol. 33, no. 5, pp. 898–916,
2011. doi: 10.1109/TPAMI.2010.161.
7. S. Beucher, “The Watershed Transformation Applied to Image Segmentation”,
Scanning microscopy, vol. 6, July 2000, 26 p.
8. R. Brunelli, Template Matching Techniques in Computer Vision. Theory and Prac-
tice. Wiley, 2009, 339 p.
9. Dr. A.S. Khedher, Dr. A.M. Alkababji, and O. Hadi, “Improving the Reliability of
Object Recognition Based On Template Matching”, AL Rafdain Engineering
Journal, Computer Science, pp. 81–88, 2015.
10. Template Matching. Available:
https://docs.opencv.org/3.4/de/da9/tutorial_template_matching.html
11. D.G. Lowe, “Object recognition from local scale-invariant features”, International
Conference on Computer Vision – ICCV, 1999, vol. 2, pp. 1150–1157.
12. H. Kong, H.C. Akakin, and S.E. Sarma, “A Generalized Laplacian of Gaussian Filter
for Blob Detection and Its Applications”, IEEE Transactions on Cybernetics, vol.
43, no. 6, pp. 1719–1733, January 2013. doi: 10.1109/TSMCB.2012.2228639.
13. L. Assirati, N.R. Silva, L. Berton, A.A. Lopes, and O.M. Bruno, “Performing edge
detection by Difference of Gaussians using q-Gaussian kernels”, 2nd International
Conference on Mathematical Modeling in Physical Sciences 2013, Journal of
Physics, Conference Series 490(2014) 012020, IOP Publishing, 2014, 4 p. doi:
10.1088/1742-6596/490/1/012020.
14. T. Lindeberg, “Scale Invariant Feature Transform”, Scholarpedia, vol. 7, no. 5:
10491, 2012, 17 p. doi: 10.4249/scholarpedia.10491.
15. H. Bay, T. Tuytelaars, and L.V. Gool, “SURF: Speeded up robust features”, in Pro-
ceedings of the 9th European conference on Computer Vision, vol. 1, 2006, 14 p.
doi: 10.1007/11744023_32.
16. N. Dalal and B. Triggs, “Histograms of oriented gradients for human detection”,
IEEE Computer Society Conference on Computer Vision and Pattern Recognition
(CVPR ’05), San Diego, United States, vol. 1, pp. 886–893, June 2005.
doi: 10.1109/CVPR.2005.177.
17. T. Kobayashi, A. Hidaka, and T. Kurita, “Selection of Histograms of Oriented Gra-
dients Features for Pedestrian Detection”, Neural Information Processing, 14th In-
ternational Conference, ICONIP 2007, Kitakyushu, Japan, November 13–16, 2007, Re-
vised Selected Papers, Part II, pp. 598–607. doi: 10.1007/978-3-540-69162-4_62.
18. P. Viola and M. Jones, “Rapid Object Detection using a Boosted Cascade of Simple
Features”, IEEE Conference on Computer Vision Pattern Recognition, vol. 1, 2001,
p. I-511. doi: 10.1109/CVPR.2001.990517.
19. H. Hunud, A. Kadouf, and Y.M. Mustafah, “Colour-based Object Detection and
Tracking for Autonomous Quadrotor UAV”, 2013 IOP Conf. Ser.: Mater. Sci. Eng.,
vol. 53, 2013, 9 p. doi:10.1088/1757-899X/53/1/012086.
20. V.V. Moroz and M.A. Shvandt, “Study of movement and behavior of laboratory
animals by methods of object detection and tracking”, Herald of the National Tech-
nical University “KhPI”, Series of “Informatics and Modeling”, Kharkiv: NTU
“KhPI”, Kharkiv, vol. 13, no. 1338, pp. 93–103, 2019. doi: 10.20998/2411-
0558.2019.13.09.
21. A.M. Raid, W.M. Khedr, M.A. El-dosuky, and Mona Aoud, “Image restoration
based on morphological operations”, International Journal of Computer Science,
Engineering and Information Technology (IJCSEIT), vol. 4, no. 3, 2014, pp. 9–21.
doi: 10.5121/ijcseit.2014.4302.
22. S. Ravi and A.M. Khan, “Morphological Operations for Image Processing: Under-
standing and its Applications”, in NCVSComs-13 Conference Proceedings, 2015,
pp. 17–19.
23. Y. Zhang, “Pathological Brain Detection based on wavelet entropy and Hu moment
invariants”, Bio-Medical Materials and Engineering, no. 26, pp. 1283–1290, 2015.
24. P. Joshi, D.M. Escrivá, and V. Godoy, OpenCV By Example. Birmingham: Packt
Publishing Ltd, 2016, 297 p.
25. M. Nixon and A. Aguado, Feature Extraction & Image Processing for Computer
Vision; 3d ed. London: Elsevier Ltd, 2012, 623 p.
26. S.-C.S. Cheung and C. Kamath, “Robust Background Subtraction with Foreground
Validation for Urban Traffic Video”, EURASIP Journal on Applied Signal Process-
ing, vol. 14, pp. 2330–2340, Hindawi Publishing Corporation, 2005.
27. M.A. Alawi, O.O. Khalifa, and M.D.R. Islam, “Performance Comparison of Back-
ground Estimation Algorithms for Detecting Moving Vehicle”, World Applied Sci-
ences Journal 21 (Mathematical Applications in Engineering), IDOSI Publications,
2013, pp. 109–114. doi: 10.5829/idosi.wasj.2013.21.mae.99934.
28. S. Mallick, Simple Background Estimation in Videos using OpenCV (C++/Python).
2019. Available: https://learnopencv.com/simple-background-estimation-in-videos-
using-opencv-c-python/.
29. Y. Cheng, “Mean Shift, Mode Seeking, and Clustering”, IEEE Transactions on Pat-
tern Analysis and Machine Intelligence, vol. 17, no. 8, pp. 790–799, August 1995.
30. D. Comaniciu and P. Meer, “Mean Shift: A Robust Approach Toward Feature Space
Analysis”, IEEE Transactions on Pattern Analysis and Machine Intelligence,
vol. 24, no. 5, pp. 603–619, May 2002. doi: 10.1109/34.1000236.
31. G. Bradski, “Computer Vision Face Tracking For Use in a Perceptual User Inter-
face”, Archived 2012-04-17 at the Wayback Machine, Intel Technology Journal,
no. Q2, 1998.
32. B. Lucas and T. Kanade, “An iterative image registration technique with an applica-
tion to stereo vision”, in Proceedings of the 7th International Joint Conference on
Artificial Intelligence (IJCAI ’81), pp. 674–679.
33. A. Radgui, C. Demonceaux, E. Mouaddib, D. Aboutajdine, and M. Rziza, “An
adapted Lucas–Kanade’s method for optical flow estimation in catadioptric images”,
The 8th Workshop on Omnidirectional Vision, Camera Networks and Non-classical
Cameras - OMNIVIS, Marseille: France, Marseille, 2008, 12 p.
34. Optical Flow. Available: https://docs.opencv.org/3.4/d4/dee/tutorial_optical_flow.html.
35. G. Farneback, “Two-Frame Motion Estimation Based on Polynomial Expansion”,
Lecture Notes in Computer Science, 8 p., 2003.
36. C. Veenman, M. Reinders, and E. Backer, “Resolving motion correspondence for
densely moving points”, IEEE Transactions on Pattern Analysis Machine Intelli-
gence, vol. 23, no. 1, pp. 54–72, 2001. doi: 10.1109/34.899946.
37. A. Salarpour, A. Salarpour, M. Fathi, and M.H. Dezfoulian, “Vehicle tracking using
Kalman filter and features”, Signal and Image Processing: An International Journal
(SIPIJ), vol. 2, no. 2, pp. 45–67, 2011. doi: 10.5121/sipij.2011.2201.
38. S. Dan, Zh. Baojun, and T. Linbo, “A Tracking Algorithm Based on SIFT and Kal-
man Filter”, in Proceedings The 2nd International Conference on Computer Appli-
cation and System Modeling, 2012, pp. 1563–1566.
39. F. Gunnarsson, N. Bergman, U. Forssell, J. Jansson, R. Karlsson, and P.J. Nordlund,
“Particle Filters for Positioning, Navigation and Tracking”, IEEE Transactions on
Signal Processing, vol. 2, no. 2, pp. 425–437, 2002.
40. S. Mallick, Object Tracking using OpenCV (C++/Python). 2017. Available:
https://learnopencv.com/object-tracking-using-opencv-cpp-python/.
41. Z. Kalal, K. Mikolajczyk, and J. Matas, “Forward-Backward Error: Automatic
Detection of Tracking Failures”, in Proceedings of International Conference on
Pattern Recognition, 23–26 August, 2010, Istanbul, Turkey, 4 p.
doi: 10.1109/ICPR.2010.675.
42. Z. Kalal, K. Mikolajczyk, and J. Matas, “Tracking-Learning-Detection”, IEEE
Transactions on Pattern Analysis and Machine Intelligence, vol. 6, no. 1, 2010, 14 p.
43. S. Zhou, Y. Peng, K. Gong, and L. Shu, “An Improved TLD Tracking Algorithm for
Fast-moving Object”, International Conference on Computer Science, Electronics
and Communication Engineering (CSECE 2018), Advances in Computer Science
Research, vol. 80, pp. 69–73.
44. D.S. Bolme, J.R. Beveridge, B.A. Draper, and Y.M. Lui, “Visual Object Tracking using
Adaptive Correlation Filters”, CVPR, 2010, 10 p. doi: 10.1109/CVPR.2010.5539960.
Received 07.07.2021
INFORMATION ON THE ARTICLE
Maksym A. Shvandt, ORCID: 0000-0002-4580-3961, Odesa I.I. Mechnikov National
University, Ukraine, e-mail: maxim.shvandt@gmail.com
Volodymyr V. Moroz, ORCID: 0000-0002-3240-4590, Odesa I.I. Mechnikov National
University, Ukraine, e-mail: v.moroz@onu.edu.ua
OVERVIEW OF THE DETECTION AND TRACKING METHODS OF THE LAB ANIMALS /
M.A. Shvandt, V.V. Moroz
Abstract. An overview and analysis of several of the most common methods and
algorithms for object detection and tracking is presented. One particular use case
for object tracking techniques arises in laboratory studies of animal behavior.
Different experimental conditions and the need to collect certain useful data may
require special tracking methods. Therefore, a set of general approaches to object
tracking is considered, and their functionality and capabilities are tested in a
real experiment. Their basis and main aspects are presented. The experiment
demonstrated the advantages and disadvantages of the studied methods. Conclusions
and recommendations regarding their use cases are given.
Keywords: object tracking, object detection, algorithm, video, frame, image,
background, foreground, experiment, color space, thresholding, background
estimation, segmentation.
|
| id | journaliasakpiua-article-237497 |
| institution | System research and information technologies |
| language | English |
| last_indexed | 2025-07-17T10:27:20Z |
| publishDate | 2022 |
| publisher | The National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute" |
| record_format | ojs |
| resource_txt_mv | journaliasakpiua/ad/3201861f0b309e981c20bf064a0988ad.pdf |
| spelling | journaliasakpiua-article-237497 2022-06-21T10:27:50Z Overview of the detection and tracking methods of the lab animals Shvandt, Maksym Moroz, Volodymyr The National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute" 2022-04-25 Article application/pdf https://journal.iasa.kpi.ua/article/view/237497 10.20535/SRIT.2308-8893.2022.1.10 System research and information technologies; No. 1 (2022); 124-148 2308-8893 1681-6048 en https://journal.iasa.kpi.ua/article/view/237497/255908 |
| title | Огляд методів виявлення та відстеження лабораторних тварин |
| title_alt | Overview of the detection and tracking methods of the lab animals Обзор методов обнаружения и отслеживания лабораторных животных |
| topic | object tracking; object detection; algorithm; video; frame; image; background; foreground; experiment; color space; thresholding; background estimation; segmentation |
| url | https://journal.iasa.kpi.ua/article/view/237497 |