Reverse synthesis of natural language phrases grounding on their ontological representation using a large language model
Date: | 2024 |
---|---|
Main Authors: | , , |
Format: | Article |
Language: | Ukrainian |
Published: | Інститут програмних систем НАН України, 2024 |
Online Access: | https://pp.isofts.kiev.ua/index.php/ojs1/article/view/657 |
Journal Title: | Problems in programming |
Institution: | Problems in programming |
Summary: | The article presents a novel solution that uses a specially developed structured prompt for a large language model (ChatGPT). A series of experiments was carried out on synthesising natural language phrases from their ontological representations. These representations were constructed automatically from sentences of scientific and technical texts using previously developed software tools. Each representation contains the entities found in the text and the typed semantic relationships between them that can be realised in the phrases of the analysed text. The system of relationships, specified by a set of concepts, is linked with the entity of the related part of the sentence, which in turn can be a simple sentence or part of a complex sentence. The structured prompt for the large language model includes explanations of the semantic relationships between concepts in the context of sentence synthesis from an ontological representation, as well as a set of pairs of concepts connected by semantic relationships, which serve as material for sentence creation. The synthesised natural language sentences were compared with the originals using the cosine similarity measure across different vectorisation methods. With the xx_ent_wiki_sm model, the similarity scores ranged from 0.8193 to 0.9722, although stylistic distortions of the generated sentences were observed in some cases. The research has practical significance for the development of dialogue information systems that combine the ontological approach with the use of large language models. Problems in programming 2024; 2-3: 359-368 |
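The two steps named in the summary, assembling a structured prompt from concept pairs and scoring a synthesised sentence against the original by cosine similarity, can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration: the relation names and their explanations, the prompt wording, and the example sentences are hypothetical and not taken from the article. The vectorisation is a plain bag-of-words count standing in for the embedding models the authors used (such as xx_ent_wiki_sm), so its scores are not comparable with the reported range.

```python
import math
import re
from collections import Counter

# Hypothetical relation inventory; the article's actual relation types
# and their explanations are not reproduced here.
RELATION_HELP = {
    "agent": "the first concept performs the action named by the second",
    "object": "the first concept is what the action named by the second applies to",
}


def build_prompt(triples):
    """Assemble a structured prompt from (concept, relation, concept) triples."""
    used = sorted({rel for _, rel, _ in triples})
    lines = ["Compose one sentence from the concept pairs below.", "", "Relation types:"]
    lines += [f"- {rel}: {RELATION_HELP[rel]}" for rel in used]
    lines += ["", "Concept pairs:"]
    lines += [f"- {a} --{rel}--> {b}" for a, rel, b in triples]
    return "\n".join(lines)


def bow(text):
    """Bag-of-words vector: lowercase word counts (a stand-in vectorisation)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical example data.
triples = [("system", "agent", "analyses"), ("text", "object", "analyses")]
prompt = build_prompt(triples)

original = "The system analyses the text."
generated = "The text is analysed by the system."  # imagined model output
score = cosine(bow(original), bow(generated))

print(prompt)
print(f"cosine similarity: {score:.4f}")
```

A count-based vector penalises rewording ("analyses" versus "analysed") more heavily than a dense embedding would, which is one reason a study of this kind would compare several vectorisation methods rather than rely on one.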