Site-specific sunflower yield forecasting based on spatial analysis and machine learning
The study focuses on the development of an intelligent yield forecasting system using satellite data, geospatial data and climate indicators. The introduction of modern information technologies, in particular machine learning and big data analysis methods, provides agricultural professionals with st...
| Published in: | Доповіді НАН України |
|---|---|
| Date: | 2025 |
| Main authors: | V.H. Hnatiienko, H.M. Hnatiienko, O.L. Zozulya, V.Ye. Snytyuk, V.V. Schwartau |
| Format: | Article |
| Language: | English |
| Published: | Видавничий дім "Академперіодика" НАН України, 2025 |
| Subjects: | |
| Online access: | https://nasplib.isofts.kiev.ua/handle/123456789/206609 |
| Journal title: | Digital Library of Periodicals of National Academy of Sciences of Ukraine |
| Cite: | Site-specific sunflower yield forecasting based on spatial analysis and machine learning / V.H. Hnatiienko, H.M. Hnatiienko, O.L. Zozulya, V.Ye. Snytyuk, V.V. Schwartau // Доповіді Національної академії наук України. 2025. No. 4. P. 17-26. Bibliography: 14 titles. In English. |
Reports of the National Academy of Sciences of Ukraine
ISSN 1025-6415. Dopov. Nac. akad. nauk Ukr. 2025. No. 4: 17-26
Citation: Hnatiienko V.H., Hnatiienko H.M., Zozulya O.L., Snytyuk V.Ye., Schwartau V.V. Site-specific sunflower yield forecasting based on spatial analysis and machine learning. Dopov. Nac. akad. nauk Ukr. 2025. No. 4. P. 17-26. https://doi.org/10.15407/dopovidi2025.04.017
© Publisher PH «Akademperiodyka» of the NAS of Ukraine, 2025. This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0/)
BIOLOGY
https://doi.org/10.15407/dopovidi2025.04.017
UDC 519.7+004.8
V.H. Hnatiienko1, https://orcid.org/0009-0000-2678-5158
H.M. Hnatiienko1, https://orcid.org/0000-0002-0465-5018
O.L. Zozulya2, https://orcid.org/0000-0003-3500-3423
V.Ye. Snytyuk1, https://orcid.org/0000-0002-9954-8767
V.V. Schwartau3, https://orcid.org/0000-0001-7402-5559
1 Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
2 Syngenta LLC, Kyiv, Ukraine
3 Institute of Plant Physiology and Genetics of the NAS of Ukraine, Kyiv, Ukraine
E-mail: g.gna5@ukr.net
Site-specific sunflower yield forecasting
based on spatial analysis and machine learning
Presented by Academician of the NAS of Ukraine V.V. Morgun
The study focuses on the development of an intelligent yield forecasting system using satellite data, geospatial data and climate indicators. The introduction of modern information technologies, in particular machine learning and big data analysis methods, provides agricultural professionals with strategic advantages, reducing the risks of excessive pesticide use and promoting sustainable agricultural development. This study aims to optimize desiccant application in sunflower cultivation by modeling potential yield losses based on data obtained during the growing season. The use of digital solutions is relevant for crop production, as it increases the accuracy of forecasts and the efficiency of management decisions, while reducing costs and increasing the productivity of agrophytocenoses.
Keywords: satellite data, climate indicators, machine learning, big data analysis, vegetation indices, FAO, loss
forecasting, desiccation.
Introduction. The active development of digital agronomy opens up broad prospects for intensifying the development of the agricultural sector, while at the same time giving rise to a number of complex tasks and challenges. Amid climate change, market price fluctuations and growing demands on the efficiency of natural resource use, the need to harmonize Ukrainian plant protection legislation with European standards, together with accurate budget planning and optimization of plant care methods, is becoming increasingly important. The modern development of digital technologies, artificial neural networks, artificial intelligence and new approaches to statistical data processing allows farmers to move to a new level of agricultural production. Therefore, finding ways to apply these new methods of obtaining and processing information is an urgent problem for the agricultural sector. After all, the successful application of such modern methods is an essential factor in ensuring food security [1].
The shortcomings and limitations of traditional statistical approaches, which provide only a rough estimate of yields, become particularly apparent in the face of the demands of modern agricultural production. While these methods can model potential performance, they often fail to meet the need for accurate and detailed planning. At the same time, artificial intelligence, with its capacity for deep analysis of large amounts of data and machine learning, opens up new horizons for the development of the agricultural sector of the economy. Digitalization of processes in agrophytocenoses has great potential for the development of crop production, facilitating the creation of innovative solutions to optimize agrotechnical measures and improve production efficiency. The transition to these advanced technologies requires not only the development of new tools and methods, but also a profound rethinking of approaches to the management of agricultural processes.
In [2], the authors developed and presented a mathematical framework for minimizing sunflower yield losses. The study was based on spatial analysis of satellite images. The scientific results were obtained by applying machine learning methods.
Analysis of the latest research and publications. Modern yield forecasting systems using artificial intelligence cover a wide range of technologies. Some important research directions are listed below.
1. Deep neural networks (DNNs) can predict yield by analyzing data on genotype, weather and soil characteristics, as well as the historically identified productivity zones in each field, demonstrating an average accuracy of 85-89 %. However, the main disadvantage of this approach is its limitation to field-level forecasts, which does not allow taking into account microclimatic and soil variations within a field that are important for detailed forecasting [3].
2. Machine learning using traditional algorithms provides high accuracy in predicting the overall field yield. However, despite the theoretical possibility of detailed analysis, forecasting for individual plots is, as a rule, not implemented [4].
3. Some studies have applied recurrent neural networks with reinforcement learning to predict yield [5], achieving an average accuracy of 93.7 %. This study also did not address distributed (spatially resolved) forecasting.
4. Study [6] considered stratified sampling for potato yield forecasting using empirical equations based on the NDVI and SAVI indices. The authors report forecasting errors of 3.8-7.5 %, but the test and training samples were formed from data of the same fields, so the generalizability and practical applicability of the method were not analyzed.
Modern approaches to yield forecasting, including the use of artificial intelligence technologies, have made it possible to achieve significant results in processing large amounts of data and providing accurate forecasts at the whole-field level. Despite their effectiveness in identifying general yield trends, the methods discussed above have significant limitations, especially when it comes to achieving resolution at the scale of individual field plots. Forecasting is complicated by the fact that many factors are completely unpredictable, such as rainfall, the number of days of potential vegetation, natural disasters, etc.
The main problem is that most existing methods are designed to predict total field yields and do not take into account internal variations that can be critical for effective management of agronomic measures. This limitation does not allow a detailed productivity map to be built, which, in turn, limits the potential of such systems in a number of key tasks. In particular, optimization of differentiated application of fertilizers and plant-protection products, maintenance of the water regime, and analysis of the impact of different combinations of parameter values on the productivity of individual plots remain beyond the capabilities of these technologies. Comparing individual technologies is also quite difficult because of the high variability of individual plots. Thus, the development of detailed analysis and forecasting methods at the individual plot level is becoming a priority for improving yield forecasting systems, opening up new prospects for precision agriculture.
Objective of the study. The main purpose of this study is to improve the accuracy of yield forecasting. This will make it possible to predict variations in crop productivity formation due to uneven maturation of sunflower and avoid yield losses. These losses can be significantly reduced by desiccation. However, in each specific case the question arises whether these losses are significant enough to invest in an additional agrotechnical measure, namely desiccation. The question also arises whether it makes sense to apply the product (pesticide) differentially to specific plots as an alternative to continuous spraying of the field. This is achieved by introducing the ability to accurately predict yields in individual plots of the field, to predict losses in each area, and to optimize sowing and plant care conditions, including sowing dates, sowing density, and the timing and intensity of herbicide and fungicide applications.
Materials and methods. To build the model, we identified predictors that can affect non-uniform ripening, namely: sowing date, weather conditions, soil moisture, FAO (or hybrid maturity group), predecessor crops, etc. Such a forecast allows the farmer to assess the economic feasibility of the planned agrotechnical protection measures and to apply selective treatment of individual plots, reducing the pesticide burden on the environment.
The task of forecasting yields is extremely complex, comparable to weather forecasting. It requires not only taking into account a large number of parameters, but also identifying the key factors that have the greatest impact on the result.
Yield depends on many parameters: chemical composition and structure of the soil, its moisture, pH, types of fertilizers and methods of their application, etc. Other important parameters include information on weather conditions: air temperature, precipitation, soil moisture and solar radiation intensity. The presence and activity of pests and diseases also have a significant impact on yields. Agrotechnical measures are equally important: tillage, crop rotation, sowing and harvesting methods. In addition, genetic characteristics of seeds, their resistance to diseases and adaptability to weather conditions should be taken into account.
It is necessary to develop methods for predicting the yield of each field area based on the analysis of detailed data on plant condition. Such data include maps of reflected solar radiation intensity obtained from satellite images in different spectra, which are converted into the vegetation indices NDVI, NDWI, CLg, CLr, GLI [7]. Meteorological data are also important: temperature, precipitation, wind speed and direction, cloud cover, solar radiation, and atmospheric pressure. These data are supplemented by information on agrotechnical measures, including herbicide and fungicide treatments, the ripeness group of the sunflower hybrid (five groups from early to late), and sowing density. All these data affect the maturity rate of sunflower. A dataset containing yield data in tons per hectare for each field plot is used to train and validate the model.
Results and discussion. Let us introduce some notation to further describe the data structure and methods.
Let $X_i$ be the matrix of field $i$, and $x_{ijkl} \in X_i$ be the elements of the matrix;
$i = 1, 2, \dots, g$ are the field numbers; $g$ is the number of fields in the training set;
$j \in J = \{j_1, j_2, \dots, j_n\}$ are the days of observations; $n$ is the number of days of observations conducted for the field;
$k = 1, 2, \dots, m$ are the row numbers of the matrix $X_i$; $m$ is the number of field plots; each plot corresponds to a row of the matrix;
$l \in L = \{\text{NDVI}, \text{NDWI}, \text{GLI}, \text{CLr}, \text{CLg}, \text{wind speed}, \text{seeding density}, \dots\}$ are the parameters of the input information.
The input data for yield forecasting are extremely voluminous due to the wide range of parameters and the length of the observation period. Their structure is shown in Fig. 1. The data form a multidimensional array of dimension $m \times |L| \times n$.
In order to reduce the dimensionality of the input data vector and increase the forecasting efficiency, preliminary analysis and selection of the most informative features is performed. One of the tools of this process is correlation analysis, which makes it possible to identify statistical relationships between parameters. Features that are highly correlated with each other are identified; among them, the most informative ones are selected by expert judgment, while the others are discarded to reduce the dimensionality of the data set. This increases the model efficiency by reducing the computational burden.
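The correlation screening described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the threshold of 0.9 and the sample values are assumptions, and the final choice among correlated features is left to the expert, as in the text.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def correlated_pairs(features, threshold=0.9):
    """List feature pairs with |r| above the threshold (the threshold is an assumption)."""
    pairs = []
    for (name_a, col_a), (name_b, col_b) in combinations(features.items(), 2):
        r = pearson(col_a, col_b)
        if abs(r) > threshold:
            pairs.append((name_a, name_b, round(r, 3)))
    return pairs


# Hypothetical per-plot values, for illustration only.
features = {
    "NDVI": [0.61, 0.65, 0.70, 0.72, 0.80],
    "CLg": [2.10, 2.30, 2.62, 2.70, 3.15],
    "wind_speed": [3.2, 2.9, 3.4, 3.0, 3.1],
}
flagged = correlated_pairs(features)  # candidate pairs passed on for expert review
```

Only the flagged pairs need expert attention; which member of each pair is kept is a domain decision that this sketch does not automate.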
The following steps are performed at the preprocessing stage.
Step 1. Removal of outliers for each day separately for each field using the z-score method [8, 9].
The z-score for each item is determined by the formula:
$$z_{ijkl} = \frac{x_{ijkl} - \mu_{ijl}}{\sigma_{ijl}},$$
where $x_{ijkl} \in X_i$; $\mu_{ijl} = \frac{1}{m}\sum_{k=1}^{m} x_{ijkl}$ is the average value of the parameter $l$ for day $j$ in the matrix $X_i$; $\sigma_{ijl} = \sqrt{\frac{1}{m}\sum_{k=1}^{m} (x_{ijkl} - \mu_{ijl})^2}$ is the standard deviation of the parameter $l$ for day $j$ in the matrix $X_i$; $z_{ijkl}$ is the z-score of the element $x_{ijkl}$.
Heuristic E1. We consider the element $x_{ijkl}$ to be an outlier if $|z_{ijkl}| > 2.9$. In this case, the entire row $k$ for the day $j$ of the matrix $X_i$ is removed from the training set.
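Step 1 and Heuristic E1 can be illustrated with a short sketch for one day of one field. The function name, the dictionary layout and the sample values are assumptions made for this example; only the threshold 2.9 and the division by m come from the text.

```python
from statistics import fmean, pstdev

Z_THRESHOLD = 2.9  # threshold from Heuristic E1


def remove_outlier_rows(day_rows, threshold=Z_THRESHOLD):
    """Drop plots (rows) holding an outlier in any parameter for one day of one field.

    day_rows maps plot number k to the list of parameter values for that day.
    """
    plots = list(day_rows)
    n_params = len(day_rows[plots[0]])
    outliers = set()
    for l in range(n_params):
        column = [day_rows[k][l] for k in plots]
        mu = fmean(column)
        sigma = pstdev(column, mu)  # population standard deviation (divides by m)
        if sigma == 0:
            continue  # constant column: z-score undefined, nothing to flag
        for k in plots:
            if abs((day_rows[k][l] - mu) / sigma) > threshold:
                outliers.add(k)
    return {k: v for k, v in day_rows.items() if k not in outliers}


# Hypothetical data: twelve plots with two parameters; plot 12 has an implausibly low first value.
day = {k: [0.70, 0.30 + 0.01 * (k % 3)] for k in range(1, 12)}
day[12] = [0.10, 0.31]
cleaned = remove_outlier_rows(day)
```

In this toy example plot 12 is the only row removed, and it is removed as a whole, including its unremarkable second parameter, which mirrors the row-wise rule of Heuristic E1.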
Step 2. Data aggregation.
For each matrix $X_i$, $i = 1, \dots, g$, and for each sequence of elements $x_{ijkl} \in X_i$, $j \in J$, the aggregates are calculated using the formulas:
$$x^{\min}_{ikl} = \min_{j \in J} x_{ijkl}, \quad x^{\mathrm{mean}}_{ikl} = \frac{1}{|J|}\sum_{j \in J} x_{ijkl}, \quad x^{\max}_{ikl} = \max_{j \in J} x_{ijkl},$$
forming a matrix of aggregated values $\tilde{X}_i$.
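A minimal sketch of the aggregation in Step 2 is given below. The dictionary layout and the naming of the aggregated columns (`NDVI_min` and so on) are assumptions for illustration; the three aggregates themselves follow the formulas above.

```python
def aggregate_over_days(observations):
    """Collapse the day axis into min, mean and max for every plot and parameter.

    observations maps (plot k, parameter l) to the daily values kept after Step 1.
    """
    aggregated = {}
    for (k, l), values in observations.items():
        aggregated[(k, f"{l}_min")] = min(values)
        aggregated[(k, f"{l}_mean")] = sum(values) / len(values)
        aggregated[(k, f"{l}_max")] = max(values)
    return aggregated


# Hypothetical NDVI series of one plot over three observation days.
obs = {(1, "NDVI"): [0.5, 0.75, 1.0]}
agg = aggregate_over_days(obs)
```

Each time series of length n is thus replaced by three numbers, so the per-plot feature vector no longer depends on how many observation days a field has.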
Step 3. Combining data into a common dataset $X$:
$$X = \bigcup_{i=1}^{g} \tilde{X}_i.$$
Step 4. Repeated removal of outliers on the merged dataset $X$:
$$z_{ql} = \frac{x_{ql} - \mu_l}{\sigma_l},$$
where $x_{ql} \in X$ is the matrix element corresponding to row $q$ and the parameter $l$; $\mu_l = \frac{1}{s}\sum_{q=1}^{s} x_{ql}$ is the average value of the parameter $l$ in the matrix $X$; $\sigma_l = \sqrt{\frac{1}{s}\sum_{q=1}^{s} (x_{ql} - \mu_l)^2}$ is the standard deviation of the parameter $l$ in the matrix $X$; $s$ is the number of rows of $X$; $z_{ql}$ is the z-score of the element $x_{ql}$.
Heuristic E2. We consider the element $x_{ql}$ to be an outlier if $|z_{ql}| > 2.9$. In this case, the entire row $q$ of the matrix $X$ is removed from the training set.
Repeated removal of outliers is an important step because removing outliers separately for each field does not guarantee the absence of erroneous observations in the aggregated dataset. When combining different fields, it is possible that data that were considered normal for one field become abnormal in the context of the overall set due to differences in scale, distributions, or other characteristics. Therefore, it is necessary to remove outliers again to ensure the consistency and homogeneity of all data.
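Steps 3 and 4 can be sketched together: the aggregated rows of all fields are concatenated, and the z-score filter is applied once more across the merged rows. The function name, the data layout and the sample values are assumptions for illustration only.

```python
from statistics import fmean, pstdev


def merge_and_filter(fields, threshold=2.9):
    """Concatenate per-field aggregated rows, then drop rows with |z| > threshold in any column."""
    rows = [(name, k, vals) for name, plots in fields.items() for k, vals in plots.items()]
    n_cols = len(rows[0][2])
    keep = [True] * len(rows)
    for l in range(n_cols):
        column = [vals[l] for _, _, vals in rows]
        mu = fmean(column)
        sigma = pstdev(column, mu)  # population standard deviation over all merged rows
        if sigma == 0:
            continue  # constant column: nothing to flag
        for q, value in enumerate(column):
            if abs((value - mu) / sigma) > threshold:
                keep[q] = False
    return [row for row, ok in zip(rows, keep) if ok]


# Hypothetical aggregated values: two fields with six plots each and one feature.
fields = {
    "field_a": {k: [0.70 + 0.01 * k] for k in range(1, 7)},
    "field_b": {**{k: [0.72 + 0.01 * k] for k in range(1, 6)}, 6: [0.15]},
}
kept = merge_and_filter(fields)
```

In this toy data the low value in plot 6 of `field_b` is removed only after merging. With six plots in a field, a population z-score cannot exceed the square root of 5 (about 2.24), so a per-field filter with threshold 2.9 could not have flagged it, which is one concrete way the second pass can catch what the first one misses.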
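[Figure: three-dimensional array with rows for field plots 1, 2, ..., m, columns for parameters (NDVI, NDWI, GLI, CLr, CLg, wind speed, seeding density, ...) and layers for observation days j1, j2, ..., jn]
Fig. 1. Schematic representation of the structure of the training dataset before preprocessing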
To record the dates of herbicide and fungicide treatments, which are part of the crop protection program, an algorithm similar to one-hot encoding was implemented. Instead of using the number of days from sowing to treatment directly, the dates were encoded as categorical variables representing predefined day-range buckets. The experts established typical time ranges for fungicide (34-82 days from the sowing date) and herbicide (24-62 days) applications. These ranges were discretized separately: for fungicides the categories are {34, 46, 58, 70, 82}, and for herbicides, {24, 33, 43, 52, 62}. Each application date is assigned to the closest category. For example, if fungicides were applied on the 48th day after sowing, the vector for this case would consist of the elements {0, 1, 0, 0, 0}, which corresponds to category 46.
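The encoding rule can be written down in a few lines. The bucket values are those given in the text; the function name and the tie-breaking rule (a day equally close to two buckets goes to the earlier one) are assumptions, since the text does not specify them.

```python
FUNGICIDE_BUCKETS = (34, 46, 58, 70, 82)  # days after sowing, from the text
HERBICIDE_BUCKETS = (24, 33, 43, 52, 62)


def encode_application_day(day, buckets):
    """One-hot vector marking the bucket closest to the application day.

    Ties resolve to the earlier bucket (an assumption of this sketch).
    """
    nearest = min(range(len(buckets)), key=lambda idx: abs(buckets[idx] - day))
    return [1 if idx == nearest else 0 for idx in range(len(buckets))]


example = encode_application_day(48, FUNGICIDE_BUCKETS)  # day 48 is closest to bucket 46
```

A consequence of the nearest-category rule is that applications outside the expert range fall into the first or last bucket rather than producing an all-zero vector.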
As a result of applying the described data processing methods, a feature vector describing crop development during the ripening period was constructed for each field plot. One of the models was built with the Light Gradient Boosting Machine (LightGBM) algorithm [10]. The model was used to predict yield for each field plot in isolation.
Table. Results of studying the efficiency of the yield forecasting system [2]

| Field | MAE | Accuracy, % | Predicted harvest, tons | Actual harvest, tons | Area, ha |
|---|---|---|---|---|---|
| Flora__Baba__22 | 0.360007 | 98.299489 | 189.184446 | 192.400052 | 101 |
| East-West__Serby_26__23 | 0.722826 | 96.284523 | 89.576487 | 87.459699 | 26.6 |
| East-West__Serby_37__23 | 0.585210 | 92.651682 | 98.978797 | 106.251481 | 37 |
| East-West__Serby_57__23 | 0.705239 | 93.220749 | 209.482780 | 195.217092 | 57.4 |
| East-West__Serby_69__23 | 0.724573 | 95.723041 | 253.647770 | 242.73062 | 69 |
| Zhuravske__Field_2__22 | 0.548212 | 86.796210 | 126.477451 | 109.757490 | 29.9 |
[Figure: three bar charts by field: average MAE per field; average accuracy per field; predicted and actual yield, t]
Fig. 2. Visualization of the obtained accuracy indicators [2]
In addition to the traditional approach based on the vector representation of monitoring parameters for a single site, it is proposed to consider the spatial context by analyzing an additional dimension that covers a set of neighboring sites. This allows taking into account spatial dependencies and obtaining comprehensive information on the state of agricultural areas [11, 12].
[Figure: paired yield maps of example fields on X and Y coordinate axes, with panels labeled "Predicted" and "Actual" and color legends giving value ranges]
Fig. 3. Maps of predicted yield (left) and actual yield (right)
The combination of these approaches provides a synergistic effect [13], increases the accuracy of forecasting [1, 14], and expands the possibilities of analyzing the studied objects. To implement spatial context analysis, a computer vision model based on the U-Net architecture was developed to effectively identify high-productivity zones and zones with potential yield reduction by analyzing spatial relationships between plots.
Since agricultural fields vary in size, this study split the images into smaller parts, known as patches, to process the data efficiently. This approach allows detailed segmentation of each part of the field separately, after which the resulting patches are stitched together to form a complete segmented image of the field. Where patches overlap, a weighting method was used, allowing for smoother merging of the image parts. Each pixel in the overlapping areas receives a weight depending on its distance to the center of the patch. This procedure contributes to a softer and more natural transition between segments.
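The weighted merging of overlapping patches can be illustrated in one dimension. This is a simplified sketch under stated assumptions: the text only says that the weight depends on the distance to the patch center, so the triangular weight profile, the function names and the sample values below are all hypothetical, and real patches would be two-dimensional.

```python
def center_weights(size):
    """Triangular weights: largest near the patch center, smallest at the edges (assumed profile)."""
    center = (size - 1) / 2
    return [1.0 - abs(i - center) / (center + 1.0) for i in range(size)]


def blend_patches(patches, starts, length):
    """Merge overlapping 1-D patches into one signal by weighted averaging.

    Every position of the output must be covered by at least one patch.
    """
    total = [0.0] * length
    weight_sum = [0.0] * length
    for patch, start in zip(patches, starts):
        weights = center_weights(len(patch))
        for offset, (value, w) in enumerate(zip(patch, weights)):
            total[start + offset] += w * value
            weight_sum[start + offset] += w
    return [t / w for t, w in zip(total, weight_sum)]


# Two hypothetical patches of length 4 that overlap on positions 2 and 3.
blended = blend_patches([[1.0] * 4, [3.0] * 4], starts=[0, 2], length=6)
```

In the overlap the result moves gradually from the first patch's value to the second's instead of jumping at a seam, because each patch contributes less the further a pixel lies from its center.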
The U-Net model developed for land plot segmentation allows for large-scale analysis of plant development conditions, taking into account the general features of the field and the individual characteristics of each plot. After the segmentation is completed, an additional stage of analysis is performed for each field area using the LightGBM model: forecasting is performed taking into account the defined field segment.
As part of Syngenta's experimental research, a dataset has been created for a number of fields to determine the accuracy of yield prediction. Each field is divided into separate plots for which the model is used to predict yields. The forecast for each plot is compared with the actual harvest data, and the root mean square error (RMSE) is calculated.
Next, the yield of each plot (tons per hectare) is converted to tons by multiplying by the plot area. This way, the total yield of the entire field is calculated from the predicted data and compared to the actual total yield. Total-field forecast accuracy is assessed as the percentage error between predicted and actual yields.
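The evaluation described above can be sketched with a small worked example. The function names and the plot values are hypothetical. The per-plot error is computed here as the mean absolute error, the metric reported in the table, and dividing the total error by the actual total is an assumption, since the text does not state the denominator.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute per-plot error, in the same units as the yields (t/ha)."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def total_yield(yields_t_per_ha, areas_ha):
    """Convert per-plot yields (t/ha) to tons and sum over the field."""
    return sum(y * a for y, a in zip(yields_t_per_ha, areas_ha))


def accuracy_percent(predicted_total, actual_total):
    """100 % minus the relative error; the actual total as denominator is an assumption."""
    return 100.0 * (1.0 - abs(predicted_total - actual_total) / actual_total)


# Hypothetical field with three plots.
predicted = [2.0, 2.5, 1.5]   # t/ha
actual = [2.2, 2.4, 1.8]      # t/ha
areas = [10.0, 20.0, 10.0]    # ha

plot_mae = mean_absolute_error(predicted, actual)
predicted_total = total_yield(predicted, areas)
actual_total = total_yield(actual, areas)
field_accuracy = accuracy_percent(predicted_total, actual_total)
```

Here the per-plot errors partly cancel when summed over the field, which is why total-field accuracy can look better than plot-level error alone would suggest.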
Although the key indicator is the accuracy of the total yield, the accuracy of the prediction for individual plots is also important for further research. This will allow the system to be scaled up and generalized, ensuring high accuracy in both general forecasting and individual plot application.
The results of the yield forecasting system efficiency analysis are shown in the Table and Fig. 2. The use of the developed models allowed us to obtain estimates of potential yields with high accuracy. At the same time, the forecasting accuracy is not stable enough: the minimum value is 87.62 %, the maximum is 97.88 %. The mean absolute error (MAE) across all field plots is 0.608. The average accuracy of total yield forecasting is 92.78 %.
During the study we were constrained by a very limited training and testing dataset. The training sample contains only 8 fields, which significantly limits the generalizability of the model. This contrasts with modern studies that use hundreds and sometimes thousands of fields for training. Expanding the training dataset will allow additional patterns to emerge, thereby improving model accuracy and enabling more reliable yield forecasts in different agroclimatic conditions and regions. This is important to improve the accuracy of forecasts and make the model more versatile and suitable for a wide range of applications.
Fig. 3 shows examples of forecast visualization. Each figure shows a map of predicted yields on the left and a map of actual yields on the right.
Conclusions. The introduction of a model capable of generating high-resolution forecasts localized to individual plots opens up new prospects for the application of digital approaches in agriculture. This approach allows for an in-depth analysis of the impact of local factors on plant development and yield, which helps to identify optimal conditions or negative factors for their growth. In addition, it opens up opportunities to optimize the variable-rate application of fertilizers and crop-protection chemicals, significantly increasing the efficiency of agricultural practices and minimizing the negative impact of human activity on the environment.
The high accuracy of forecasts provided by the model enables highly reliable budget planning for farming enterprises. This, in turn, allows agricultural producers to effectively plan and optimize costs, ensuring more efficient resource management and increased overall profitability. Thus, the implementation of the described model opens up significant opportunities for increasing productivity and ensuring sustainable development of the agricultural sector.
In further research, enriching the training set by adding data from a variety of geographical locations and growing conditions will be key to maximizing the model's versatility, allowing it to work effectively with a wider range of agroecosystems.
Optimizing the training set, that is, balancing its structure by removing overrepresented data and augmenting underrepresented categories, plays an important role in improving the accuracy of forecasts. This will allow the model to better adapt to the variability of conditions and characteristics of different types of crops. The presented digital solutions are promising for further development and integration with nutrient management and crop protection systems as part of agrophytocenosis management, as well as a way to ensure the country's food security.
REFERENCES
1. Schwartau, V. V. (2024). Biological factors affecting food security in Ukraine: According to the materials of scientific report at the meeting of the Presidium of NAS of Ukraine, February 7, 2024. Visn. Nac. Acad. Nauk Ukr., No. 4, pp. 15-24 (in Ukrainian). https://doi.org/10.15407/visn2024.04.015
2. Hnatiienko, V. H., Hnatiienko, H. M., Zozulya, O. L. & Snytyuk, V. Ye. (2024). Method of forecasting yield of agricultural crops using multifactor analysis and neural networks. Scientific Bulletin of Uzhhorod University. Series of Mathematics and Informatics, 44, No. 1, pp. 93-105 (in Ukrainian). https://doi.org/10.24144/2616-7700.2024.44(1).93-105
3. Khaki, S. & Wang, L. (2019). Crop yield prediction using deep neural networks. Front. Plant Sci., 10, 621. https://doi.org/10.3389/fpls.2019.00621
4. Paudel, D., Boogaard, H., de Wit, A., Janssen, S., Osinga, S., Pylianidis, C. & Athanasiadis, I. N. (2021). Machine learning for large-scale crop yield forecasting. Agric. Syst., 187, 103016. https://doi.org/10.1016/j.agsy.2020.103016
5. Elavarasan, D. & Vincent, P. M. D. (2020). Crop yield prediction using deep reinforcement learning model for sustainable agrarian applications. IEEE Access, 8, pp. 86886-86901. https://doi.org/10.1109/ACCESS.2020.2992480
6. Al-Gaadi, K. A., Hassaballa, A. A., Tola, E., Kayad, A. G., Madugundu, R., Alblewi, B. & Assiri, F. (2016). Prediction of potato crop yield using precision agriculture techniques. PLoS ONE, 11(9), e0162219. https://doi.org/10.1371/journal.pone.0162219
7. Zozulia, O. L., Schwartau, V. V., Mykhalska, L. M., Kovel, O. L., Hnatienko, H. M., Snitiuk, V. E., Domrachev, V. M. & Tmenova, N. P. (2023). Modern methods of digital monitoring in crop production. Kyiv, Vid A do Ya (in Ukrainian).
8. Anusha, P. V., Anuradha, Ch., Murty, P. S. R. C. & Kiran, Ch. S. (2019). Detecting outliers in high dimensional data sets using z-score methodology. Int. J. Innov. Technol. Explor. Eng., 9, No. 1, pp. 48-53. https://doi.org/10.35940/ijitee.A3910.119119
9. Jiao, L., Huo, L., Hu, C. & Tang, P. (2020). Refined UNet: UNet-based refinement network for cloud and shadow precise segmentation. Remote Sens., 12, No. 12, 2001. https://doi.org/10.3390/rs12122001
10. Ke, G., Meng, Q., Finley, T., Wang, T., Chen, W., Ma, W., Ye, Q. & Liu, T.-Y. (2017, December). LightGBM: A
highly efficient gradient boosting decision tree. Proceeding of the 31st Conference on Neural information pro-
cessing systems 30 (NIPS 2017), (pp. 3146-3154), Long Beach, CA, USA.
11. Bilan, S., Hnatiienko, V., Ilarionov, O. & Krasovska, H. (2023, September). The technology of selection and
recognition of information objects on images of the earth’s surface based on multi-projection analysis. Proceed-
ings of the 3th International Scientific Symposium “Intelligent Solutions” (IntSol-2023), (pp. 23-32), Kyiv —
Uzhhorod, Ukraine.
12. Hnatiienko, H., Domrachev, V. & Saiko, V. (2021). Monitoring the condition of agricultural crops based on the
use of clustering methods. Proceeding of the 15th International Conference Monitoring of geological processes
and ecological condition of the environment, Vol. 2021 (pp. 1-5), Kyiv, Ukraine. https://doi.org/10.3997/2214-
4609.20215K2049
13. Hnatiienko, V. & Snytyuk, V. (2024, October). Site-specific forecasting of agricultural crop yield as a technology
and service. Proceedings of the 8th International Scientific and Practical Conference Applied information
systems and technologies in the digital society (AISTDS 2024), (pp. 44-55). Kyiv, Ukraine.
14. Hnatiienko, V. & Hnatiienko, H. (2024). Integration of machine learning and deep learning methods for
sunflower yield prediction. Management of Development of Complex Systems, 59, pp. 225-234.
https://doi.org/10.32347/2412-9933.2024.59.225-234
Received 05.05.2025
V.H. Hnatiienko1, https://orcid.org/0009-0000-2678-5158
H.M. Hnatiienko1, https://orcid.org/0000-0002-0465-5018
O.L. Zozulya2, https://orcid.org/0000-0003-3500-3423
V.Ye. Snytyuk1, https://orcid.org/0000-0002-9954-8767
V.V. Schwartau3, https://orcid.org/0000-0001-7402-5559
1 Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
2 Syngenta LLC, Kyiv, Ukraine
3 Institute of Plant Physiology and Genetics of the NAS of Ukraine, Kyiv, Ukraine
E-mail: g.gna5@ukr.net
SITE-SPECIFIC SUNFLOWER YIELD FORECASTING
BASED ON SPATIAL ANALYSIS AND MACHINE LEARNING
The study focuses on the development of an intelligent yield forecasting system using satellite data,
geoinformation data and climate indicators. The introduction of modern information technologies, in
particular machine learning and big data analysis methods, gives agricultural professionals strategic
advantages, making it possible to reduce the risks of excessive pesticide use and to promote sustainable
agricultural development. This study aims to optimize desiccant use on sunflower by modeling the volume of
possible yield losses based on data obtained during the crop's growing season. The use of digital solutions is
relevant for crop production, as it increases the accuracy of forecasts and the efficiency of management
decisions, helping to reduce costs and increase the productivity of agrophytocenoses.
Keywords: satellite data, climate indicators, machine learning, big data analysis, vegetation indices,
FAO, loss forecasting, desiccation.
|
| id | nasplib_isofts_kiev_ua-123456789-206609 |
| institution | Digital Library of Periodicals of National Academy of Sciences of Ukraine |
| issn | 1025-6415 |
| language | English |
| last_indexed | 2025-12-07T18:43:23Z |
| publishDate | 2025 |
| publisher | Видавничий дім "Академперіодика" НАН України |
| record_format | dspace |
| spelling | 2025-09-16T15:33:38Z 519.7+004.8 https://doi.org/10.15407/dopovidi2025.04.017 Article published earlier |
| title | Site-specific sunflower yield forecasting based on spatial analysis and machine learning |
| title_alt | Розподілене прогнозування врожайності соняшника на основі просторового аналізу та машинного навчання |
| topic | Biology |
| topic_facet | Biology |
| url | https://nasplib.isofts.kiev.ua/handle/123456789/206609 |