AUTOMATED IDENTIFICATION OF ROOFTOP SOLAR POWER PLANTS USING LARGE MULTIMODAL MODELS UNDER LIMITED VISUAL INFORMATION

Bibliographic details
Date: 2026
Authors: Shapovalova, S., Matiakh, S., Holovakin, M.
Format: Article
Language: Ukrainian
Published: Institute of Renewable Energy National Academy of Sciences of Ukraine, 2026
Online access: https://ve.org.ua/index.php/journal/article/view/604
Journal title: Vidnovluvana energetika

Repositories

Vidnovluvana energetika
Description
Abstract: A method is proposed for detecting rooftop photovoltaic systems in satellite and aerial imagery under degraded spatial resolution. Detection is reformulated as an image classification task solved by a large language model, which reduces computational load and accelerates identification. The method is intended for rapid inventory of solar generation facilities to enhance the resilience of Ukraine’s energy infrastructure. Unlike semantic segmentation approaches, which require specialized model training and significant computational resources, the proposed method performs binary classification of local image fragments and therefore needs no additional training. Four prompt strategies for target object detection are developed and evaluated. An algorithm for simulating limited spatial resolution through controlled scaling and interpolation is introduced. Computational experiments with the GPT-4o large language model assessed alternative strategies for formulating identification criteria for solar generation assets at different levels of spatial resolution. Degrading the resolution to 1–2 m/pixel significantly reduces detection accuracy. At high detail levels (0.1–0.3 m/pixel), the best performance is achieved by the “binary classification based on image examples” strategy (F1-score = 0.9523). At lower resolutions (1–2 m/pixel), the more robust approaches are “classification based on step-by-step feature analysis” (F1-score = 0.6801) and “classification based on hypotheses” (F1-score = 0.6502). The results demonstrate that multimodal language models can support scalable automated inventory of distributed solar installations over large territories without task-specific training.
DOI: 10.36296/1819-8058.2026.1(84).166-180
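
Illustrative sketch (not part of the record): the abstract mentions simulating limited spatial resolution through controlled scaling and interpolation, classifying local image fragments, and scoring with F1. The Python below is one possible reading of those steps, not the authors' algorithm. The function names, the block-average downscaling, the nearest-neighbour upscaling, and the non-overlapping tiling are all assumptions made for illustration. The call to the multimodal model is deliberately left out.

```python
import numpy as np


def degrade_resolution(image: np.ndarray, src_gsd: float, dst_gsd: float) -> np.ndarray:
    """Simulate a coarser ground sampling distance (m/pixel).

    Assumption: block averaging for downscaling and pixel repetition for
    upscaling. The interpolation used in the article may differ.
    """
    factor = int(round(dst_gsd / src_gsd))
    if factor < 1:
        raise ValueError("dst_gsd must not be finer than src_gsd")
    if factor == 1:
        return image.copy()
    h, w = image.shape[:2]
    # Crop so both dimensions are divisible by the scale factor.
    h_c, w_c = h - h % factor, w - w % factor
    if h_c == 0 or w_c == 0:
        raise ValueError("image is smaller than the scale factor")
    cropped = image[:h_c, :w_c].astype(np.float64)
    # Downscale: average every factor x factor block of pixels.
    blocks = cropped.reshape(h_c // factor, factor, w_c // factor, factor, *image.shape[2:])
    coarse = blocks.mean(axis=(1, 3))
    # Upscale back to the cropped size by repeating each coarse pixel.
    restored = np.repeat(np.repeat(coarse, factor, axis=0), factor, axis=1)
    if np.issubdtype(image.dtype, np.integer):
        restored = np.rint(restored)
    return restored.astype(image.dtype)


def tile_patches(image: np.ndarray, patch: int):
    """Yield (row, col, fragment) for non-overlapping square fragments."""
    h, w = image.shape[:2]
    for r in range(0, h - patch + 1, patch):
        for c in range(0, w - patch + 1, patch):
            yield r, c, image[r:r + patch, c:c + patch]


def f1_score(y_true, y_pred) -> float:
    """F1 for binary labels, where a truthy value means 'panel present'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0
```

In such a setup, each fragment from `tile_patches` would be sent to the model with one of the prompt strategies, and the yes/no answers would be compared with ground-truth labels through `f1_score`.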