Analysis of the Site Structure Using the Concept of Modularity

Bibliographic details
Date: 2020
Authors: Гук, Наталія; Диханов, Станіслав; Долотов, Іван
Format: Article
Language: Ukrainian
Published: Кам'янець-Подільський національний університет імені Івана Огієнка, 2020
Online access: http://mcm-math.kpnu.edu.ua/article/view/224943
Journal title: Mathematical and computer modelling. Series: Physical and mathematical sciences

Description
Abstract: The structure of a website with a hierarchical organization of sections is analyzed. A hierarchical structure involves dividing all information into separate categories by topic. The hypertext model of the website is represented by a mathematical model in the form of a directed unweighted web graph: web pages are the vertices of the graph, and hyperlinks between them are its edges. It is hypothesized that pages linking to each other are thematically coherent, and that groups of related pages form a cluster. The pages of the site are clustered using local information about the hyperlinks between them. The modularity functional is used as the clustering quality metric. Modularity characterizes the difference between the fraction of edges that fall within clusters for a given partition and the fraction expected if the edges were generated at random. A random graph is chosen as the null model. The Louvain method is used to maximize the modularity functional. A greedy scheme of the algorithm is developed, which reduces the problem to a sequence of local optimization problems. It is proposed to select vertex-cluster pairs whose merging increases the value of the modularity functional. For an arbitrary vertex of the graph, the target cluster is found by analyzing the adjacency lists of the vertex. Application software implementing the algorithm is developed using the principles of functional programming. The software is used to analyze the structure of an online store website. The dependence of the value of the modularity functional on the number of clusters in the partition and on the parameters of the iterative process is investigated. An analysis of the content of the pages within each cluster revealed their thematic similarity. For most clusters, a semantic description can be formed.
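The modularity functional and the greedy selection of vertex-cluster pairs can be illustrated with a minimal Python sketch. It is not the authors' software. It assumes the common directed form of modularity, Q = sum over clusters c of (e_c / m - out_c * in_c / m^2), which the abstract does not spell out. It covers only the local-moving phase of the Louvain method and recomputes Q from scratch for clarity instead of using incremental gains. The function names, the `max_sweeps` parameter, and the six-page example graph are hypothetical.

```python
from collections import Counter


def modularity(edges, cluster_of):
    """Directed modularity Q of a partition; edges is a list of (source, target) pairs."""
    m = len(edges)
    inside = Counter()   # edges whose endpoints share a cluster
    out_deg = Counter()  # total out-degree of each cluster
    in_deg = Counter()   # total in-degree of each cluster
    for u, v in edges:
        cu, cv = cluster_of[u], cluster_of[v]
        out_deg[cu] += 1
        in_deg[cv] += 1
        if cu == cv:
            inside[cu] += 1
    clusters = set(cluster_of.values())
    return sum(inside[c] / m - out_deg[c] * in_deg[c] / m**2 for c in clusters)


def best_move(edges, cluster_of, vertex):
    """Pick the cluster (own or a neighbour's) that gives the highest Q for this vertex."""
    # Adjacency of the vertex: pages it links to and pages linking to it.
    neighbours = {v for u, v in edges if u == vertex} | {u for u, v in edges if v == vertex}
    candidates = {cluster_of[n] for n in neighbours} | {cluster_of[vertex]}
    scored = [(modularity(edges, {**cluster_of, vertex: c}), c) for c in sorted(candidates)]
    q, target = max(scored)
    return target, q


def local_moving(edges, cluster_of, max_sweeps=10):
    """Greedy sweeps: move a vertex only when the move strictly increases Q."""
    cluster_of = dict(cluster_of)  # work on a copy, leave the input untouched
    for _ in range(max_sweeps):
        moved = False
        for vertex in sorted(cluster_of):
            current_q = modularity(edges, cluster_of)
            target, q = best_move(edges, cluster_of, vertex)
            if q > current_q + 1e-12:
                cluster_of[vertex] = target
                moved = True
        if not moved:
            break
    return cluster_of


# Hypothetical site: two groups of three pages joined by a single hyperlink.
edges = [("a", "b"), ("b", "c"), ("c", "a"),
         ("d", "e"), ("e", "f"), ("f", "d"),
         ("c", "d")]
singletons = {page: page for page in "abcdef"}  # every page starts in its own cluster
result = local_moving(edges, singletons)
print(result, modularity(edges, result))
```

Because a vertex is moved only when Q strictly increases, the value of the functional never decreases during the sweeps. A full Louvain implementation would additionally collapse each cluster into a single vertex and repeat the procedure on the aggregated graph.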
The clustering results are compared with an expert partition, and the precision and recall of the division into clusters are calculated.
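One way such a comparison with an expert partition could be computed is sketched below. The abstract does not state which definition of precision and recall was used, so the pair-counting variant here is an assumption: a pair of pages counts as a positive when both pages are placed in the same cluster. The function names and the sample partitions are hypothetical.

```python
from itertools import combinations


def same_cluster_pairs(cluster_of):
    """Set of page pairs that the partition places in the same cluster."""
    return {
        (u, v)
        for u, v in combinations(sorted(cluster_of), 2)
        if cluster_of[u] == cluster_of[v]
    }


def pair_precision_recall(predicted, expert):
    """Pair-counting precision and recall of a clustering against an expert partition."""
    predicted_pairs = same_cluster_pairs(predicted)
    expert_pairs = same_cluster_pairs(expert)
    common = predicted_pairs & expert_pairs
    precision = len(common) / len(predicted_pairs) if predicted_pairs else 0.0
    recall = len(common) / len(expert_pairs) if expert_pairs else 0.0
    return precision, recall


# Hypothetical partitions of six pages (both must cover the same pages).
predicted = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 2}
expert = {"a": "x", "b": "x", "c": "y", "d": "y", "e": "y", "f": "y"}
precision, recall = pair_precision_recall(predicted, expert)
print(precision, recall)
```

Precision here is the share of page pairs grouped by the algorithm that the expert also grouped, and recall is the share of expert-grouped pairs that the algorithm recovered.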