Semantic Indexing and Cluster Analysis of Cybersecurity Documents
Date: | 2024 |
---|---|
Authors: | , |
Format: | Article |
Language: | Ukrainian |
Published: | Institute for Information Recording of the NAS of Ukraine, 2024 |
Topics: | |
Online access: | http://drsp.ipri.kiev.ua/article/view/316711 |
Journal title: | Data Recording, Storage & Processing |
Repositories
Data Recording, Storage & Processing
Abstract: | This study examines methods for extracting concepts from textual messages and constructing semantic networks for text data analysis, specifically in the context of cyber threats. Semantic networks help identify key concepts and the relationships between them, and help uncover critical data such as hacker group names, malicious programs, vulnerabilities, and other threats. Such an approach can be applied in cybersecurity, where textual information can contain data vital for preventing and responding to cyber threats.
The focus is on the use of large language models (LLMs) that enable automated extraction of entities and the construction of concept networks. Utilizing LLMs for information extraction from text data helps create networks of relationships that can be used to analyze causal links between events and objects, detect interdependencies, and structure information. These networks can be further employed for cluster analysis, allowing for the automatic grouping of nodes by similarity and the identification of new patterns in the data.
The research also addresses the construction of document proximity networks, which assess the degree of similarity between texts based on their semantic structures. This enables the identification of thematically related documents that may contain significant information for analysis, as well as the detection of informational chains and key trends within large textual datasets.
By applying the methods described in the article, it is possible to effectively structure and analyze large volumes of textual information in cybersecurity, facilitating quicker threat detection and the formulation of prevention strategies. This approach also streamlines many stages of analytical work, thereby enhancing the efficiency of big data analysis. Fig.: 3. Refs: 11 titles. |
---|
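The pipeline outlined in the abstract (concept extraction, a concept network, a document proximity network, and grouping by similarity) can be illustrated with a minimal sketch. This is not the article's implementation: the per-document concept sets are hypothetical placeholders standing in for LLM extraction output, Jaccard similarity is an assumed proximity measure, the 0.4 threshold is arbitrary, and connected components are used as a simple stand-in for cluster analysis.

```python
from collections import Counter
from itertools import combinations

# Hypothetical result of an entity-extraction step (e.g. an LLM prompt).
# Document ids and concept names are placeholders, not real data.
doc_concepts = {
    "doc1": {"group_a", "phishing", "vuln_1"},
    "doc2": {"group_a", "vuln_1", "mail_client"},
    "doc3": {"malware_x", "ransomware"},
    "doc4": {"malware_x", "ransomware", "phishing"},
}


def concept_network(doc_concepts):
    """Count how often each pair of concepts co-occurs in one document."""
    edges = Counter()
    for concepts in doc_concepts.values():
        for a, b in combinations(sorted(concepts), 2):
            edges[(a, b)] += 1
    return edges


def jaccard(a, b):
    """Share of concepts two documents have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def proximity_network(doc_concepts, threshold):
    """Link documents whose concept sets are at least `threshold` similar."""
    edges = {}
    for d1, d2 in combinations(sorted(doc_concepts), 2):
        sim = jaccard(doc_concepts[d1], doc_concepts[d2])
        if sim >= threshold:
            edges[(d1, d2)] = sim
    return edges


def connected_components(nodes, edges):
    """Group nodes that are reachable from each other through the edges."""
    neighbours = {n: set() for n in nodes}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, groups = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.add(node)
            stack.extend(neighbours[node] - seen)
        groups.append(group)
    return groups


concept_edges = concept_network(doc_concepts)
doc_edges = proximity_network(doc_concepts, threshold=0.4)
doc_groups = connected_components(doc_concepts, doc_edges)
```

With these placeholder inputs, documents sharing most of their concepts end up linked and grouped together; in practice a weighted community-detection method would likely replace the connected-components step.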