Semi-supervised inverted file index approach for approximate nearest neighbor search
This paper introduces a novel modification to the Inverted File (IVF) index approach for approximate nearest neighbor search, incorporating supervised learning techniques to enhance the efficacy of intermediate clustering and achieve more balanced cluster sizes. The proposed method involves creating...
| Date: | 2023 |
|---|---|
| Main author: | |
| Format: | Article |
| Language: | English |
| Published: |
The National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute"
2023
|
| Subjects: | |
| Online access: | https://journal.iasa.kpi.ua/article/view/297400 |
| Journal title: | System research and information technologies |
| Download file: | |
Institution
System research and information technologies
| _version_ | 1866302945921859584 |
|---|---|
| author | Bazdyrev, Anton |
| author_facet | Bazdyrev, Anton |
| author_sort | Bazdyrev, Anton |
| baseUrl_str | http://journal.iasa.kpi.ua/oai |
| collection | OJS |
| datestamp_date | 2024-02-01T21:03:07Z |
| description | This paper introduces a novel modification to the Inverted File (IVF) index approach for approximate nearest neighbor search, incorporating supervised learning techniques to enhance the efficacy of intermediate clustering and achieve more balanced cluster sizes. The proposed method involves creating clusters using a neural network by solving a task to classify query vectors into the same bucket as their corresponding nearest neighbor vectors in the original dataset. When combined with minimizing the standard deviation of the bucket sizes, the indexing process becomes more efficient and accurate during the approximate nearest neighbor search. Through empirical evaluation on a test dataset, we demonstrate that the proposed semi-supervised IVF index approach outperforms the industry-standard IVF implementation with fixed parameters, including the total number of clusters and the number of clusters allocated to queries. This novel approach has promising implications for enhancing nearest-neighbor search efficiency in high-dimensional datasets across various applications, including information retrieval, natural language search, recommendation systems, etc. |
| doi_str_mv | 10.20535/SRIT.2308-8893.2023.4.05 |
| first_indexed | 2025-07-17T10:28:25Z |
| format | Article |
| fulltext |
A. Bazdyrev, 2023
Системні дослідження та інформаційні технології, 2023, № 4 69
UDC 004.424.4
DOI: 10.20535/SRIT.2308-8893.2023.4.05
SEMI-SUPERVISED INVERTED FILE INDEX APPROACH
FOR APPROXIMATE NEAREST NEIGHBOR SEARCH
A. BAZDYREV
Abstract. This paper introduces a novel modification to the Inverted File (IVF) index approach for approximate nearest neighbor search, incorporating supervised learning techniques to enhance the efficacy of intermediate clustering and achieve more balanced cluster sizes. The proposed method involves creating clusters using a neural network by solving a task to classify query vectors into the same bucket as their corresponding nearest neighbor vectors in the original dataset. When combined with minimizing the standard deviation of the bucket sizes, the indexing process becomes more efficient and accurate during the approximate nearest neighbor search. Through empirical evaluation on a test dataset, we demonstrate that the proposed semi-supervised IVF index approach outperforms the industry-standard IVF implementation with fixed parameters, including the total number of clusters and the number of clusters allocated to queries. This novel approach has promising implications for enhancing nearest-neighbor search efficiency in high-dimensional datasets across various applications, including information retrieval, natural language search, recommendation systems, etc.
Keywords: approximate nearest neighbor search, inverted file index, high-dimensional data, machine learning.
INTRODUCTION
Approximate Nearest Neighbor (ANN) [1] search is a fundamental problem in many data-driven applications, spanning domains such as information retrieval, image processing, natural language search, and recommendation systems. The efficient retrieval of similar data points from vast datasets is critical for tasks that involve high-dimensional data representations, where exhaustive search methods become computationally infeasible. As the dataset size grows, the computational cost of performing an exact nearest neighbor search using brute force algorithms becomes prohibitive. Brute force approaches involve comparing each query vector with every data point in the dataset, leading to computational inefficiencies and impractical execution times for large datasets. Approximate nearest neighbor algorithms offer a trade-off between search accuracy and efficiency, allowing for the retrieval of reasonably accurate results within a significantly reduced search space. By intelligently approximating the nearest neighbors, these algorithms enable faster exploration of large datasets, making them essential for real-world applications where timely responses are crucial, such as image and text search, recommendation systems, and similarity-based clustering.
One popular approach in ANN is the Inverted File (IVF) index method [2]. Originally, the IVF index was an inverted indexing technique that partitions the dataset into a set of Voronoi cells or "buckets" [3]. Each bucket corresponds to a cluster of data points, and the indices of data points within each bucket are stored efficiently. During the search process, queries are mapped to their corresponding buckets, and the search is constrained to the nearest neighbors within these buckets, significantly reducing the search space and accelerating the process.
The standard IVF index has shown remarkable performance gains in nearest
neighbor search tasks. However, it faces challenges in scenarios with unevenly
distributed data, leading to imbalanced bucket sizes [4]. These imbalances can
result in a suboptimal trade-off between search efficiency and accuracy, as some
buckets might be excessively populated, while others remain underutilized. In
addition to challenges posed by unevenly distributed data and imbalanced bucket
sizes, another significant issue that the standard IVF index may encounter relates
to the formation of centroid clusters. The standard approach typically relies on
unsupervised clustering techniques to create the centroids or representatives for
each bucket. This process can potentially lead to suboptimal cluster assignments,
especially when the training data for centroid formation is insufficient or poorly
representative of the underlying data distribution.
To address these limitations, we propose a novel modification to the IVF index method that leverages supervised learning techniques. Specifically, we train classification neural networks to assign query vectors to their most appropriate bucket, based on the similarity to vectors in the dataset. Moreover, we incorporate an optimization objective to minimize the standard deviation of the bucket sizes, further refining the indexing process. By doing so, we aim to achieve more balanced cluster sizes, effectively mitigating the impact of unevenly distributed data.
PRELIMINARIES
Let's formulate a general ANN problem. Let $X = \{x_i \in \mathbb{R}^d \mid i = \overline{1,N}\}$ be a set of N d-dimensional vectors representing the data points in the dataset. The objective of ANN search is to efficiently find, for a given query vector $q \in \mathbb{R}^d$, an approximate nearest neighbor $x^* \in X$ such that the distance between $q$ and $x^*$ is minimized.
In the Inverted File Index (IVF) approach, we partition the dataset X into K disjoint subsets or buckets, denoted as $B_1, B_2, \ldots, B_K$. Each bucket corresponds to a subset (cluster) of vectors in X, and $c_i$ denotes the centroid of the corresponding bucket $B_i$.
The ANN search with the IVF index can be formulated as follows. Given the metric function dist and a query vector $q \in \mathbb{R}^d$, the goal is to find the bucket $B_{query}$ with the corresponding centroid $c_{query}$ that minimizes the distance to the query vector:
$$c_{query} = \underset{c_i \in \{c_1, \ldots, c_K\}}{\arg\min}\; dist(q, c_i).$$
Once the bucket $B_{query}$ is identified, we need to find $x^*$, the approximate nearest neighbor within that bucket, using brute force search:
$$x^* = \underset{x \in B_{query}}{\arg\min}\; dist(q, x).$$
Optionally, to improve accuracy, it is possible to use several buckets $B_j$ adjoining $B_{query}$ in the last step, depending on the method's hyperparameter settings.
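The two-stage search above can be sketched in a few lines of plain Python. This is a minimal illustration and not the paper's or faiss's implementation: the centroids are supplied by hand instead of being learned, "adjoining buckets" is interpreted as the `nprobe` buckets with the closest centroids, and the helper names `build_buckets` and `ivf_search` are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def build_buckets(X, centroids, dist=euclidean):
    """Assign every data point index to the bucket of its nearest centroid."""
    buckets = [[] for _ in centroids]
    for idx, x in enumerate(X):
        nearest = min(range(len(centroids)), key=lambda k: dist(x, centroids[k]))
        buckets[nearest].append(idx)
    return buckets

def ivf_search(q, X, centroids, buckets, nprobe=1, dist=euclidean):
    """Brute-force search restricted to the nprobe buckets closest to q."""
    order = sorted(range(len(centroids)), key=lambda k: dist(q, centroids[k]))
    candidates = [i for k in order[:nprobe] for i in buckets[k]]
    if not candidates:
        return None  # all probed buckets are empty
    return min(candidates, key=lambda i: dist(q, X[i]))

# Toy data: two well-separated groups and hand-picked (assumed) centroids.
X = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centroids = [(0.0, 0.5), (10.0, 10.5)]
buckets = build_buckets(X, centroids)
nearest = ivf_search((9.0, 10.0), X, centroids, buckets, nprobe=1)
```

Raising `nprobe` enlarges the candidate set, which trades search time for a higher chance of finding the exact nearest neighbor.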
SEMI-SUPERVISED INVERTED FILE INDEX APPROACH
Let dist be some metric function (Euclidean, Manhattan, etc.).
Let $X = \{x_i \in \mathbb{R}^d \mid i = \overline{1,N}\}$ be the vectors representing the data points in the dataset.
Let $Q = \{q_i \in \mathbb{R}^d \mid i = \overline{1,M}\}$, a set of M d-dimensional vectors with a distribution similar to real-life production queries, be the query training set, $M \le N$.
Let $R = \{r_i \in X \mid r_i = \underset{x \in X}{\arg\min}\; dist(q_i, x),\ i = \overline{1,M}\}$ be the set of ground truth nearest neighbors (responses) from X for each $q_i \in Q$.
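As an illustration of how the response set R could be computed, the sketch below runs an exact brute-force search for every training query. It assumes the Euclidean metric and returns indices into X instead of the vectors themselves; the function name `ground_truth_neighbors` is hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def ground_truth_neighbors(X, Q, dist=euclidean):
    """For each query in Q, return the index in X of its exact nearest neighbor."""
    return [min(range(len(X)), key=lambda i: dist(q, X[i])) for q in Q]

# Toy example with 2-dimensional vectors.
X = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
Q = [(0.9, 1.2), (4.0, 4.5)]
R = ground_truth_neighbors(X, Q)
```

This costs one distance evaluation per (query, data point) pair, which is acceptable here because it is done once, offline, to prepare training labels.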
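Let $K \in \mathbb{N}$ be a method hyperparameter, the desired number of buckets $B_1, B_2, \ldots, B_K$, such that $X = \bigcup_{i=1}^{K} B_i$ and $B_i \cap B_j = \emptyset$ if $i \ne j$.
Let $NN: \mathbb{R}^d \to \mathbb{R}^K$ be some vector function:
$$NN(q_i)_j = P\{q_i \in B_j \mid r_i \in B_j\} \quad \text{for } j = \overline{1,K}, \quad (1)$$
where $P\{q_i \in B_j \mid r_i \in B_j\}$ is the conditional probability that $q_i \in B_j$ given $r_i \in B_j$.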
In our case, NN is a multi-layer perceptron [5] with a final softmax layer,
$$\mathrm{softmax}(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}} \quad \text{for } i = \overline{1,K},$$
that distributes query vectors $q_i$ into buckets $B_1, B_2, \ldots, B_K$. We also want this function to have a specific property: it should assign each query vector $q_i \in Q$ to the same bucket as its corresponding response $r_i \in R$.
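For reference, the softmax layer can be written directly from its definition. The sketch below is a plain-Python version that subtracts the maximum logit before exponentiating, a common numerical-stability trick that does not change the result; the example logits are arbitrary.

```python
import math

def softmax(z):
    """Map a list of K real scores to K probabilities that sum to 1."""
    m = max(z)  # shifting by the maximum avoids overflow in exp()
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

# Arbitrary example scores for K = 3 buckets.
probs = softmax([2.0, 1.0, 0.1])
# The bucket chosen for a vector is the index of the largest probability.
bucket = max(range(len(probs)), key=probs.__getitem__)
```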
We can estimate the NN's parameters using the maximum likelihood estimation method [6; 7] if we consider the task as a standard softmax multiclass classification with a cross-entropy loss function:
$$CE(y, \hat{y}) = -\sum_{i=1}^{K} y_i \log(\hat{y}_i).$$
We take Q as the input training set, and on each epoch step we calculate the actual training targets Y as follows:
$$Y = \{\underset{j = \overline{1,K}}{\arg\max}\; NN(r_i)_j \mid i = \overline{1,M}\},$$
that is, for each training query we assign its ground truth nearest neighbor's bucket as the target bucket. As a result of NN training, we can explicitly distribute input queries by buckets,
$$bucket(q) = \underset{i = \overline{1,K}}{\arg\max}\; NN(q)_i \quad \text{for } q \in \mathbb{R}^d,$$
and implicitly get the desired buckets $B_1, B_2, \ldots, B_K$:
$$B_j = \{x \in X \mid \underset{i = \overline{1,K}}{\arg\max}\; NN(x)_i = j\} \quad \text{for } j = \overline{1,K}. \quad (2)$$
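The target construction and the implicit bucket definition in equation (2) can be sketched as follows. The function `toy_nn` is a hypothetical stand-in for the trained network (it simply splits space by the sign of the first coordinate); only the argmax logic around it reflects the text.

```python
import math

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def toy_nn(v):
    """Hypothetical 2-bucket model: bucket 0 if v[0] > 0, bucket 1 otherwise."""
    return softmax([v[0], -v[0]])

def argmax(p):
    return max(range(len(p)), key=p.__getitem__)

def epoch_targets(R, nn):
    """Target for each query: the bucket the model assigns to its true neighbor r_i."""
    return [argmax(nn(r)) for r in R]

def implicit_buckets(X, nn, K):
    """Equation (2): bucket j holds the indices of points whose argmax output is j."""
    buckets = [[] for _ in range(K)]
    for idx, x in enumerate(X):
        buckets[argmax(nn(x))].append(idx)
    return buckets

X = [(2.0, 0.0), (-1.0, 3.0), (0.5, 0.5), (-4.0, 1.0)]
R = [X[0], X[3]]  # assumed ground-truth neighbors of two training queries
Y = epoch_targets(R, toy_nn)
B = implicit_buckets(X, toy_nn, K=2)
```

Because the targets depend on the current network, they have to be recomputed on every epoch, as the text notes.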
STANDARD DEVIATION-BASED BUCKET SIZE REGULARIZATION
The vanilla approach proposed in the previous section can produce imbalanced buckets $B_1, B_2, \ldots, B_K$. For example, the NN may assign all query items to a single bucket, in which case the IVF index gives no speedup over a brute force search. To get the most out of the IVF index computationally, we need buckets of as equal a size as possible, so that the expected time of a brute force search over a random bucket is minimized.
Let $S = \{s_i = |B_i| \mid i = \overline{1,K}\}$ be the set of bucket sizes after we have trained the NN that distributes query vectors by buckets. We can calculate the standard deviation of the set S:
$$\sigma(S) = \sqrt{\frac{\sum_{i=1}^{K} (s_i - \bar{s})^2}{K - 1}},$$
where $\bar{s}$ is the mean bucket size. If we want buckets of approximately equal sizes, then we need to minimize $\sigma(S)$. The problem is that this function is not differentiable with respect to the parameters of the NN model, because the bucket sizes are obtained through the argmax in equation (2), so we need to use a differentiable approximation of $\sigma(S)$.
Using equations (1) and (2), we can calculate the expected size of each bucket as follows:
$$\tilde{s}_j = \sum_{i=1}^{N} NN(x_i)_j, \quad x_i \in X, \quad \text{for } j = \overline{1,K}. \quad (3)$$
So we have $\tilde{S} = \{\tilde{s}_j \mid j = \overline{1,K}\}$, the set of expected bucket sizes after we have trained the NN that distributes query vectors by buckets, and $\sigma(\tilde{S})$, which is differentiable with respect to the parameters of the NN model.
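To make the difference between hard and expected bucket sizes concrete, the sketch below sums per-point bucket probabilities as in equation (3) and compares the resulting standard deviation with the one obtained from hard argmax counts. The probability rows are made-up illustrative values, and the sample standard deviation is taken over the K bucket sizes, which is an assumption about the intended normalization.

```python
import math

def expected_bucket_sizes(probabilities):
    """Equation (3): probabilities[i][j] = NN(x_i)_j; returns the K column sums."""
    K = len(probabilities[0])
    return [sum(row[j] for row in probabilities) for j in range(K)]

def sample_std(values):
    """Sample standard deviation (denominator len(values) - 1)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))

# Made-up model outputs for N = 4 points and K = 2 buckets.
probs = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.6, 0.4]]
soft_sizes = expected_bucket_sizes(probs)  # close to [3.0, 1.0]
hard_sizes = [4, 0]                        # every argmax picks bucket 0
soft_std = sample_std(soft_sizes)
hard_std = sample_std(hard_sizes)
```

The hard counts jump only when an argmax flips, while the soft sizes change smoothly with the probabilities, which is what allows a gradient to flow through the regularizer.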
Finally, we can introduce a combined multiclass cross-entropy loss function with std-based bucket size regularization:
$$L(y, \hat{y}, X) = -\frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{K} y_{ij} \log(\hat{y}_{ij}) + \lambda * \sigma(\tilde{S}), \quad (4)$$
where $-\frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{K} y_{ij} \log(\hat{y}_{ij})$ is the standard cross-entropy component, $\sigma(\tilde{S})$ is the approximated standard deviation of bucket sizes, and $\lambda \in [0, +\infty)$ is the regularization scale.
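A forward-pass-only sketch of the combined loss in equation (4) is given below. It is not the paper's PyTorch code: it evaluates the loss value in plain Python without gradients, takes integer class targets instead of one-hot vectors (so the inner sum reduces to the log-probability of the target bucket), and uses `lam` as an assumed name for the regularization scale.

```python
import math

def combined_loss(query_probs, targets, data_probs, lam):
    """Cross-entropy over training queries plus lam * std of expected bucket sizes.

    query_probs[i][j]: model probability of bucket j for training query i
    targets[i]:        target bucket index for training query i
    data_probs[i][j]:  model probability of bucket j for indexed point x_i
    """
    n = len(targets)
    # With one-hot targets, sum_j y_ij * log(p_ij) is log(p_i[target_i]).
    ce = -sum(math.log(p[t]) for p, t in zip(query_probs, targets)) / n
    K = len(data_probs[0])
    sizes = [sum(row[j] for row in data_probs) for j in range(K)]
    mean = sum(sizes) / K
    std = math.sqrt(sum((s - mean) ** 2 for s in sizes) / (K - 1))
    return ce + lam * std

# Made-up values: an uninformative model with balanced vs. collapsed buckets.
query_probs = [[0.5, 0.5], [0.5, 0.5]]
targets = [0, 1]
balanced = combined_loss(query_probs, targets, [[0.5, 0.5]] * 4, lam=0.1)
collapsed = combined_loss(query_probs, targets, [[1.0, 0.0]] * 4, lam=0.1)
```

With the same cross-entropy term, the collapsed assignment is penalized by `lam` times the standard deviation of the sizes [4, 0], so the optimizer is pushed toward the balanced solution.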
TRAINING ALGORITHM
1. Define K, the desired number of buckets, and M, the desired maximum bucket size.
2. Initialize the multiclass classification NN weights [8].
3. On each training epoch:
   1. Calculate the current epoch targets $Y = \{\underset{j = \overline{1,K}}{\arg\max}\; NN(r_i)_j\}$.
   2. Calculate the multiclass cross-entropy loss component using $q_i \in Q$ as inputs and $y_i \in Y$ as targets.
   3. Calculate the expected sizes of each cluster using equation (3).
   4. Calculate $\sigma(\tilde{S})$, the std-regularization component.
   5. Calculate the aggregated loss using equation (4).
   6. Do the backpropagation step using a stochastic gradient descent modification, for example, Adam [9], and update the NN's weights.
4. After the training process is complete, select the best checkpoint based on the desired performance metric, for example, precision, among the checkpoints where the actual maximum bucket size is below M. If no checkpoint has an actual maximum bucket size lower than the desired one, select the checkpoint with the size closest to the desired one and display the corresponding warning.
It could also be useful to apply dynamic scaling of the regularization parameter to achieve better precision.
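The checkpoint selection rule in step 4 can be sketched as below. The record layout (dictionaries with `epoch`, `precision`, and `max_bucket_size` keys) and all numbers are hypothetical; the paper does not specify how checkpoints are stored.

```python
def select_checkpoint(checkpoints, max_size):
    """Pick the most precise checkpoint whose largest bucket is below max_size.

    Returns (checkpoint, warning); warning is None when the size limit is met.
    """
    feasible = [c for c in checkpoints if c["max_bucket_size"] < max_size]
    if feasible:
        return max(feasible, key=lambda c: c["precision"]), None
    # No checkpoint meets the limit: fall back to the closest bucket size.
    closest = min(checkpoints, key=lambda c: abs(c["max_bucket_size"] - max_size))
    warning = "no checkpoint has a maximum bucket size below %d" % max_size
    return closest, warning

# Made-up checkpoint records for illustration only.
checkpoints = [
    {"epoch": 1, "precision": 0.40, "max_bucket_size": 900},
    {"epoch": 2, "precision": 0.52, "max_bucket_size": 480},
    {"epoch": 3, "precision": 0.47, "max_bucket_size": 300},
]
best, warning = select_checkpoint(checkpoints, max_size=500)
fallback, fallback_warning = select_checkpoint(checkpoints, max_size=100)
```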
EXPERIMENTAL RESULTS
We've used 3 different configurations in our experiments:
1. Both indexed and query data have a Normal distribution: $X \sim N(0,1)$; $Q \sim N(0,1)$.
2. Both indexed and query data have a skewed Exponential distribution: $X \sim Exponential(1)$; $Q \sim Exponential(1)$.
3. Indexed data has a Normal distribution and query data has an Exponential distribution, which can resemble various real-life scenarios: $X \sim N(0,1)$; $Q \sim Exponential(1)$.
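Synthetic data for the three configurations could be generated along the following lines. This sketch assumes that every coordinate is drawn independently from the stated distribution, uses the 64-dimensional setting and the equal train/test split of the queries described in this section, and relies on Python's standard `random` module; the configuration names are hypothetical labels.

```python
import random

def normal_vectors(n, dim, rng):
    """n vectors with i.i.d. N(0, 1) coordinates (independence is an assumption)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]

def exponential_vectors(n, dim, rng):
    """n vectors with i.i.d. Exponential(1) coordinates."""
    return [[rng.expovariate(1.0) for _ in range(dim)] for _ in range(n)]

def make_configuration(name, n_index, n_query, dim=64, seed=0):
    """Return (X, Q_train, Q_test) for 'normal', 'exponential' or 'mixed'."""
    rng = random.Random(seed)
    if name == "normal":
        X, Q = normal_vectors(n_index, dim, rng), normal_vectors(n_query, dim, rng)
    elif name == "exponential":
        X, Q = exponential_vectors(n_index, dim, rng), exponential_vectors(n_query, dim, rng)
    elif name == "mixed":
        X, Q = normal_vectors(n_index, dim, rng), exponential_vectors(n_query, dim, rng)
    else:
        raise ValueError("unknown configuration: " + name)
    half = len(Q) // 2  # equal split of the queries into train and test parts
    return X, Q[:half], Q[half:]

X, Q_train, Q_test = make_configuration("mixed", n_index=100, n_query=20)
```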
In all cases we use 64-dimensional vectors. We also split the query data Q equally into training and testing parts in order to minimize the risk of overfitting and getting incorrect results: we use the train part during the NN's weights optimization and the test part to calculate the final metrics. We use a three-layer perceptron with tanh activation functions and the Adam [9] optimization algorithm, implemented with the PyTorch framework [10]. We evaluate our algorithm against the faiss IVF implementation [11], which is a current industry standard, using the SMAPE and precision metrics:
$$SMAPE(A, F) = \frac{100}{n} \sum_{i=1}^{n} \frac{|F_i - A_i|}{(|A_i| + |F_i|)/2};$$
$$Precision = \frac{TP}{TP + FP}.$$
Here $A_i$ is the distance between the i-th query vector $q_i$ and its actual nearest neighbor from X, and $F_i$ is the distance between the i-th query vector $q_i$ and the approximate nearest neighbor from X suggested by the algorithm. In other words, the SMAPE metric shows how much the distances to the ground truth nearest neighbors and to the approximated neighbors differ on average.
For the precision metric, TP is the number of cases where the approximate nearest neighbor equals the actual nearest neighbor, and FP is the number of cases where the approximate nearest neighbor differs from the actual nearest neighbor. In other words, this metric shows how often our approximated nearest neighbors exactly coincide with the ground truth ones.
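Both metrics are simple to compute once the true and approximate neighbors are known. The sketch below follows the definitions above; the input lists are made-up illustrative values, and a term with $A_i = F_i = 0$ would divide by zero, which this minimal version does not guard against.

```python
def smape(actual, forecast):
    """Symmetric mean absolute percentage error between two distance lists."""
    n = len(actual)
    total = sum(abs(f - a) / ((abs(a) + abs(f)) / 2) for a, f in zip(actual, forecast))
    return 100.0 / n * total

def precision(true_ids, found_ids):
    """Share of queries whose approximate neighbor is the exact nearest neighbor."""
    tp = sum(1 for t, f in zip(true_ids, found_ids) if t == f)
    fp = len(true_ids) - tp
    return tp / (tp + fp)

# Made-up example: distances to true vs. returned neighbors for two queries,
# and neighbor ids for four queries.
error = smape([1.0, 2.0], [1.0, 3.0])            # about 20 (percent)
hit_rate = precision([3, 7, 9, 4], [3, 7, 1, 4])
```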
The final results are presented in Tables 1, 2, and 3. The columns of the result tables are:
– X-size: number of vectors in the indexed dataset;
– Q-size: number of vectors in the queries training set;
– K: number of buckets in the algorithm;
– Nprobe: number of adjoining buckets to use in the brute force phase in order to achieve a better precision;
– IVF Prec. / IVF SMAPE: precision and SMAPE metrics of the faiss IVF;
– SSIVF Prec. / SSIVF SMAPE: precision and SMAPE metrics of the novel semi-supervised IVF proposed in the paper.
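Table 1. Results for $X \sim N(0,1)$; $Q \sim N(0,1)$
| X-size | Q-size | K | Nprobe | IVF Prec. | IVF SMAPE | SSIVF Prec. | SSIVF SMAPE |
|---|---|---|---|---|---|---|---|
| 10K | 10K | 200 | 1 | 0.055 | 8.7% | 0.083 | 7.7% |
| 10K | 10K | 200 | 5 | 0.200 | 4.37% | 0.255 | 3.69% |
| 10K | 10K | 200 | 20 | 0.480 | 1.81% | 0.524 | 1.56% |
| 1M | 10K | 2000 | 1 | 0.063 | 7.41% | 0.071 | 7.1% |
| 1M | 10K | 2000 | 5 | 0.200 | 3.8% | 0.220 | 3.72% |
| 1M | 10K | 2000 | 20 | 0.435 | 1.79% | 0.491 | 1.65% |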
Table 2. Results for $X \sim Exponential(1)$; $Q \sim Exponential(1)$
| X-size | Q-size | K | Nprobe | IVF Prec. | IVF SMAPE | SSIVF Prec. | SSIVF SMAPE |
|---|---|---|---|---|---|---|---|
| 10K | 10K | 200 | 1 | 0.057 | 8.68% | 0.066 | 8.53% |
| 10K | 10K | 200 | 5 | 0.197 | 4.40% | 0.207 | 4.33% |
| 10K | 10K | 200 | 20 | 0.473 | 1.87% | 0.460 | 1.95% |
| 1M | 10K | 2000 | 1 | 0.061 | 8.16% | 0.069 | 7.99% |
| 1M | 10K | 2000 | 5 | 0.218 | 4.32% | 0.217 | 4.34% |
| 1M | 10K | 2000 | 20 | 0.490 | 1.77% | 0.498 | 1.77% |
Table 3. Results for $X \sim N(0,1)$; $Q \sim Exponential(1)$
| X-size | Q-size | K | Nprobe | IVF Prec. | IVF SMAPE | SSIVF Prec. | SSIVF SMAPE |
|---|---|---|---|---|---|---|---|
| 10K | 10K | 200 | 1 | 0.025 | 14.76% | 0.137 | 3.87% |
| 10K | 10K | 200 | 5 | 0.107 | 6.14% | 0.403 | 1.46% |
| 10K | 10K | 200 | 20 | 0.305 | 2.49% | 0.756 | 0.41% |
| 1M | 10K | 2000 | 1 | 0.035 | 11.68% | 0.141 | 3.65% |
| 1M | 10K | 2000 | 5 | 0.130 | 4.97% | 0.419 | 1.28% |
| 1M | 10K | 2000 | 20 | 0.341 | 2.44% | 0.766 | 0.40% |
CONCLUSION
The experimental results of our novel semi-supervised modification to the Inverted File (IVF) index approach for approximate nearest neighbor search look very promising: the SS-IVF approach outperforms the industry-standard implementation in most of the tested experiment configurations (16 of 18 by precision) in terms of the raw precision and SMAPE metrics, especially in scenarios where the query distribution differs significantly from that of the indexed dataset. However, the SS-IVF algorithm is still quite far from a production solution, since we have not yet built an efficient C/C++ implementation that would use parallelization and low-level optimizations.
REFERENCES
1. P. Indyk and R. Motwani, “Approximate nearest neighbors: towards removing the curse of dimensionality,” in Proceedings of the Annual ACM Symposium on Theory of Computing (STOC), 1998.
2. H. Jégou, M. Douze, and C. Schmid, “Product Quantization for Nearest Neighbor Search,” IEEE Xplore. [Online]. Available: https://ieeexplore.ieee.org/document/5432202
3. G. Voronoi, “Une méthode géométrique pour la détermination des régions de visibilité dans le voisinage d’un point de l’espace (A geometric method for determining regions of visibility in the vicinity of a point in space),” Journal de Mathématiques Pures et Appliquées (Journal of Pure and Applied Mathematics), 1908.
4. J. Johnson, M. Douze, and H. Jégou, “Optimizing Product Quantization for Nearest Neighbor Search,” IEEE Xplore. [Online]. Available: https://ieeexplore.ieee.org/document/6619223
5. D. E. Rumelhart and J. L. McClelland, “Learning Internal Representations by Error Propagation,” IEEE Xplore. [Online]. Available: https://ieeexplore.ieee.org/document/6302929
6. I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. 2016. [Online]. Available: https://www.deeplearningbook.org/
7. X. Glorot and Y. Bengio, Understanding the difficulty of training deep feedforward neural networks. 2010. [Online]. Available: https://proceedings.mlr.press/v9/glorot10a/glorot10a.pdf
8. D.P. Kingma, Adam: A Method for Stochastic Optimization. 2014. [Online]. Available: https://arxiv.org/abs/1412.6980
9. PyTorch. [Online]. Available: https://pytorch.org/
10. faiss::IndexIVF Class Reference. [Online]. Available: https://faiss.ai/cpp_api/struct/structfaiss_1_1IndexIVF.html
Received 05.09.2023
INFORMATION ON THE ARTICLE
Anton A. Bazdyrev, ORCID: 0000-0001-8191-897X, Educational and Research Institute for Applied System Analysis of the National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”, Ukraine, e-mail: bazdyrev.anton@gmail.com
SEMI-SUPERVISED INVERTED FILE INDEX APPROACH FOR APPROXIMATE NEAREST NEIGHBOR SEARCH / A.A. Bazdyrev
Abstract. An improvement of the inverted file index approach to approximate nearest neighbor search is proposed that uses semi-supervised and supervised learning to increase the effectiveness of intermediate clustering and achieve more balanced cluster sizes. The proposed method creates clusters with a neural network by solving the task of classifying query vectors into the same cluster as their corresponding nearest neighbor vectors in the original dataset. Combined with minimizing the standard deviation of cluster sizes, the indexing process becomes more efficient and accurate during approximate nearest neighbor search. An empirical evaluation on a test dataset demonstrates that the proposed index approach is more accurate than the industry-standard implementation with fixed parameters, including the total number of clusters and the number of clusters allocated to queries. The method is promising for improving the efficiency of nearest neighbor search in high-dimensional datasets in various applications, such as information retrieval, natural language search, recommendation systems, etc.
Keywords: approximate nearest neighbor search, inverted file index, high-dimensional data, machine learning.
|
| id | journaliasakpiua-article-297400 |
| institution | System research and information technologies |
| keywords_txt_mv | keywords |
| language | English |
| last_indexed | 2025-07-17T10:28:25Z |
| publishDate | 2023 |
| publisher | The National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute" |
| record_format | ojs |
| resource_txt_mv | journaliasakpiua/bf/3cb2d54dcf468bd50926ffd14a14ddbf.pdf |
| spelling | journaliasakpiua-article-2974002024-02-01T21:03:07Z Semi-supervised inverted file index approach for approximate nearest neighbor search Підхід з напівкерованим навчанням в інвертованому файловому індексі для пошуку наближеного найближчого сусіда Bazdyrev, Anton approximate nearest neighbor search inverted file index high-dimensional data machine learning пошук наближених найближчих сусідів інвертований файловий індекс дані високої розмірності машинне навчання This paper introduces a novel modification to the Inverted File (IVF) index approach for approximate nearest neighbor search, incorporating supervised learning techniques to enhance the efficacy of intermediate clustering and achieve more balanced cluster sizes. The proposed method involves creating clusters using a neural network by solving a task to classify query vectors into the same bucket as their corresponding nearest neighbor vectors in the original dataset. When combined with minimizing the standard deviation of the bucket sizes, the indexing process becomes more efficient and accurate during the approximate nearest neighbor search. Through empirical evaluation on a test dataset, we demonstrate that the proposed semi-supervised IVF index approach outperforms the industry-standard IVF implementation with fixed parameters, including the total number of clusters and the number of clusters allocated to queries. This novel approach has promising implications for enhancing nearest-neighbor search efficiency in high-dimensional datasets across various applications, including information retrieval, natural language search, recommendation systems, etc. Запропоновано удосконалення підходу з використанням інвертованого файлового індексу для пошуку наближених найближчих сусідів з використанням напівкерованого навчання та навчання з учителем з метою підвищення ефективності проміжної кластеризації та досягнення більш збалансованих розмірів кластерів. 
Запропонований метод полягає у створенні кластерів за допомогою нейронної мережі з розв’язанням завдання класифікації векторів запитів у той самий кластер, що і їхні відповідні найближчі сусідні вектори у вихідному наборі даних. У поєднанні з мінімізацією стандартного відхилення розмірів кластерів процес індексування стає більш ефективним і точним під час наближеного пошуку найближчих сусідів. Через емпіричну оцінку на тестовому наборі даних продемонстровано, що запропонований підхід до індексу виявився більш точним порівняно з індустрійно-стандартною реалізацією із фіксованими параметрами, включаючи загальну кількість кластерів та кількість кластерів, що виділяються для запитів. Метод перспективний для підвищення ефективності пошуку найближчих сусідів у великорозмірних наборах даних у різних застосуваннях, таких як інформаційний пошук, пошук за природною мовою, рекомендаційні системи тощо. The National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute" 2023-12-26 Article Article application/pdf https://journal.iasa.kpi.ua/article/view/297400 10.20535/SRIT.2308-8893.2023.4.05 System research and information technologies; No. 4 (2023); 69-75 Системные исследования и информационные технологии; № 4 (2023); 69-75 Системні дослідження та інформаційні технології; № 4 (2023); 69-75 2308-8893 1681-6048 en https://journal.iasa.kpi.ua/article/view/297400/290386 |
| spellingShingle | пошук наближених найближчих сусідів інвертований файловий індекс дані високої розмірності машинне навчання Bazdyrev, Anton Підхід з напівкерованим навчанням в інвертованому файловому індексі для пошуку наближеного найближчого сусіда |
| title | Підхід з напівкерованим навчанням в інвертованому файловому індексі для пошуку наближеного найближчого сусіда |
| title_alt | Semi-supervised inverted file index approach for approximate nearest neighbor search |
| title_full | Підхід з напівкерованим навчанням в інвертованому файловому індексі для пошуку наближеного найближчого сусіда |
| title_fullStr | Підхід з напівкерованим навчанням в інвертованому файловому індексі для пошуку наближеного найближчого сусіда |
| title_full_unstemmed | Підхід з напівкерованим навчанням в інвертованому файловому індексі для пошуку наближеного найближчого сусіда |
| title_short | Підхід з напівкерованим навчанням в інвертованому файловому індексі для пошуку наближеного найближчого сусіда |
| title_sort | підхід з напівкерованим навчанням в інвертованому файловому індексі для пошуку наближеного найближчого сусіда |
| topic | пошук наближених найближчих сусідів інвертований файловий індекс дані високої розмірності машинне навчання |
| topic_facet | approximate nearest neighbor search inverted file index high-dimensional data machine learning пошук наближених найближчих сусідів інвертований файловий індекс дані високої розмірності машинне навчання |
| url | https://journal.iasa.kpi.ua/article/view/297400 |
| work_keys_str_mv | AT bazdyrevanton semisupervisedinvertedfileindexapproachforapproximatenearestneighborsearch AT bazdyrevanton pídhídznapívkerovanimnavčannâmvínvertovanomufajlovomuíndeksídlâpošukunabliženogonajbližčogosusída |