The problem of IC50 prediction for ligand-protein pairs using Transformer architecture under limited resources


Bibliographic details
Date: 2026
Authors: Krysenko, Pavlo; Bektimirov, Alim
Format: Article
Language: Ukrainian
Published: Kyiv National University of Construction and Architecture, 2026
Online access: https://es-journal.in.ua/article/view/365080
Journal title: Environmental safety and natural resources

Repositories

Environmental safety and natural resources
Description
Abstract: The article discusses approaches to optimizing model training for predicting the half-maximal inhibitory concentration (IC50) of ligand-protein pairs under limited computational resources. It proposes a method of smart bucketing of data by protein length, with the number of groups selected dynamically to improve randomization. To mitigate the quadratic complexity of the Transformer architecture in sequence length, a convolutional layer is used to compress the input data. Based on four experiments, the relationship between the degree of sequence compression and the resulting root mean square error (RMSE) for lgIC50 is analyzed.
DOI: 10.32347/2411-4049.2026.2.287-292
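
Illustrative sketch: The three ideas named in the abstract (length bucketing, convolutional compression of the sequence, RMSE on lgIC50) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The record does not say how the number of groups is chosen, so the rule used here (about `batches_per_bucket` batches per bucket) is hypothetical, as are all function and parameter names. lgIC50 is taken to be the base-10 logarithm of IC50, following the usual meaning of "lg".

```python
import math
import random


def bucket_batches(lengths, batch_size, batches_per_bucket=4, seed=0):
    """Group sample indices into batches of similar protein length.

    Assumption: the bucket count is derived from the dataset size so that
    each bucket holds roughly `batches_per_bucket` batches. The article's
    actual selection rule is not given in the abstract.
    """
    rng = random.Random(seed)
    # Sort indices by sequence length so neighbours need little padding.
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    n_buckets = max(1, len(order) // (batch_size * batches_per_bucket))
    bucket_size = math.ceil(len(order) / n_buckets)
    batches = []
    for start in range(0, len(order), bucket_size):
        bucket = order[start:start + bucket_size]
        rng.shuffle(bucket)  # randomize within a length bucket
        for j in range(0, len(bucket), batch_size):
            batches.append(bucket[j:j + batch_size])
    rng.shuffle(batches)  # randomize the order of batches across buckets
    return batches


def conv_output_length(seq_len, kernel_size, stride, padding=0):
    """Output length of a 1D convolution with dilation 1."""
    return (seq_len + 2 * padding - kernel_size) // stride + 1


def rmse(predicted, target):
    """Root mean square error between two equal-length sequences."""
    squared = sum((p - t) ** 2 for p, t in zip(predicted, target))
    return math.sqrt(squared / len(target))


if __name__ == "__main__":
    # A stride-4 convolution shortens a 1024-token protein to 256 positions,
    # so the number of attention pairs drops by a factor of 4**2 = 16.
    short = conv_output_length(1024, kernel_size=4, stride=4)
    print(short, (1024 ** 2) // (short ** 2))
```

Fewer, larger buckets give more mixing between samples of different length at the cost of more padding per batch; more, smaller buckets do the opposite. The compression factor trades attention cost against the resolution of the protein sequence, which is the trade-off the abstract says was measured through RMSE.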