Integration of large language models with semantic processing tools as an instrument for knowledge digitization
| Date: | 2025 |
|---|---|
| Main authors: | , , |
| Format: | Article |
| Language: | Ukrainian |
| Published: | PROBLEMS IN PROGRAMMING, 2025 |
| Subjects: | |
| Online access: | https://pp.isofts.kiev.ua/index.php/ojs1/article/view/838 |
| Journal title: | Problems in programming |
| Institution: | Problems in programming |

Summary: The paper addresses the automation of analysis, generation, and management of complex natural language documents by integrating generative artificial intelligence with semantic technologies, in particular Semantic MediaWiki. It analyzes how ontological models of subject domains and semantic markup help counter two critical shortcomings of large language models: the tendency toward “hallucinations” (generation of false statements) and the lack of transparency in explaining decisions. The integration is explored using the example of the instrumental system “LINZA,” which is being developed for automated intelligent processing of heterogeneous documents with complex, weakly formalized structure. Its aim is to generate natural language reports according to specified requirements in domains such as public administration, jurisprudence, certification, and standardization. The system combines the flexibility and adaptability of large language models with formalized ontological knowledge and with semantic queries for pertinent facts in the Semantic MediaWiki environment or in external sources (Retrieval-Augmented Generation). The proposed approach is expected to significantly reduce the risk of typical errors of generative models and to ensure factual accuracy and transparency in the decision-making process. Special attention is paid to mechanisms for transparency, reliability, and human control that increase trust in automatically generated documents, which is especially important in areas with high information security requirements.

The multi-level architecture of the system defines the tasks of agents and services that perform specialized functions of data collection, analysis, transformation, and verification, and it ensures flexibility, scalability, and adaptability to changes in input data and requirements.

Problems in programming 2025; 2: 63-76
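
The summary describes grounding a language model in facts retrieved through semantic queries to Semantic MediaWiki. The record gives no implementation details for LINZA, so the following is only a minimal sketch of that general idea, not the authors' method. It assumes a wiki that exposes the standard Semantic MediaWiki `ask` API module; the endpoint URL, the category and property names, and all three helper functions are hypothetical. No network call or model call is made: the sketch only builds the query URL, flattens a response of the usual `query.results.<page>.printouts` shape, and assembles a prompt that restricts the model to the retrieved facts.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used for illustration only (not taken from the paper).
API_URL = "https://wiki.example.org/api.php"


def build_ask_url(conditions, printouts, limit=10):
    """Build a URL for the Semantic MediaWiki `ask` API module."""
    # Conditions, printout requests, and parameters are joined with "|".
    parts = [conditions] + [f"?{p}" for p in printouts] + [f"limit={limit}"]
    params = {"action": "ask", "query": "|".join(parts), "format": "json"}
    return f"{API_URL}?{urlencode(params)}"


def extract_facts(response):
    """Flatten an `ask` response into (page, property, value) triples."""
    results = response.get("query", {}).get("results", {})
    if not isinstance(results, dict):
        # Defensive: treat any non-mapping result set as empty.
        return []
    facts = []
    for page, entry in results.items():
        for prop, values in entry.get("printouts", {}).items():
            for value in values:
                # Page-typed values are objects carrying a "fulltext" title.
                if isinstance(value, dict):
                    value = value.get("fulltext", value)
                facts.append((page, prop, str(value)))
    return facts


def build_grounded_prompt(question, facts):
    """Compose a prompt that limits the model to numbered, citable facts."""
    lines = [
        f"[{i}] {page}: {prop} = {value}"
        for i, (page, prop, value) in enumerate(facts, start=1)
    ]
    context = "\n".join(lines) if lines else "(no facts retrieved)"
    return (
        "Answer using only the numbered facts below and cite them by number. "
        "If the facts are insufficient, say so.\n\n"
        f"Facts:\n{context}\n\nQuestion: {question}"
    )
```

Numbering the facts lets a human reviewer trace each statement in a generated report back to a wiki page and property, which is one simple way to support the transparency and human control that the summary emphasizes.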