Application of machine learning methods and remote sensing data for crop yield forecasting

DOI:

https://doi.org/10.17721/1728-2713.111.13

Keywords:

artificial intelligence (AI), machine learning (ML), remote sensing (RS), random forest (RF), gradient boosting (GB), normalized difference moisture index (NDMI), green normalized difference vegetation index (GNDVI), precision agriculture (PA), correlation analysis (CA)

Abstract

Background. Forecasting crop yields is a complex task, particularly under climate instability and growing pressure on resources. Because classical mathematical models struggle with the complexity of agricultural analytics, data-driven approaches based on machine learning are becoming increasingly important. Combining satellite imagery, agrochemical soil analysis, and artificial intelligence algorithms is particularly promising for building flexible and accurate forecasts.

Methods. This study analyzes two agricultural fields located in regions of Ukraine with differing natural conditions. A comprehensive dataset was compiled, comprising topographic features (elevation, slope, topographic wetness index), spectral indices derived from Sentinel-2A and Landsat 8 imagery (specifically NDMI and GNDVI), and soil chemical composition. Correlation analysis was used to identify the indicators most closely associated with yield. Yield prediction models were built with the Random Forest and Gradient Boosting algorithms and fitted at two subplot sizes, 5 ha and 1 ha.
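The two spectral indices and the correlation step can be illustrated with a minimal sketch. It assumes the standard definitions NDMI = (NIR − SWIR1) / (NIR + SWIR1) and GNDVI = (NIR − Green) / (NIR + Green) and the Pearson correlation coefficient; the abstract does not state which band combinations or which correlation measure the authors used. The reflectance and yield values below are hypothetical and serve only to make the example runnable.

```python
import math


def normalized_difference(a, b):
    """Return (a - b) / (a + b), or NaN when the denominator is zero."""
    denom = a + b
    return (a - b) / denom if denom != 0 else float("nan")


def ndmi(nir, swir1):
    """Normalized difference moisture index from NIR and SWIR1 reflectance."""
    return normalized_difference(nir, swir1)


def gndvi(nir, green):
    """Green normalized difference vegetation index from NIR and green reflectance."""
    return normalized_difference(nir, green)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-subplot surface reflectance and yield (t/ha); not data from the study.
subplots = [
    {"green": 0.09, "nir": 0.32, "swir1": 0.24, "yield_t_ha": 3.9},
    {"green": 0.08, "nir": 0.38, "swir1": 0.21, "yield_t_ha": 4.6},
    {"green": 0.07, "nir": 0.43, "swir1": 0.19, "yield_t_ha": 5.2},
    {"green": 0.07, "nir": 0.47, "swir1": 0.17, "yield_t_ha": 5.8},
]

yields = [s["yield_t_ha"] for s in subplots]
ndmi_values = [ndmi(s["nir"], s["swir1"]) for s in subplots]
gndvi_values = [gndvi(s["nir"], s["green"]) for s in subplots]

print("r(NDMI, yield)  =", round(pearson_r(ndmi_values, yields), 3))
print("r(GNDVI, yield) =", round(pearson_r(gndvi_values, yields), 3))
```

In a real workflow the reflectances would come from per-subplot averages of the corresponding satellite bands, and the same correlation would be computed for every candidate predictor (topographic and soil variables included) before model fitting.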

Results. The analysis showed that indicators of vegetation condition and crop water balance (NDMI, GNDVI) explain yield variability most effectively. Surface temperature, by contrast, had a clearly negative effect, suggesting possible heat stress during the grain-filling period. Gradient Boosting was particularly sensitive to spatial detail, reaching a coefficient of determination of R² = 0.801 on the 1 ha grid. Random Forest, in turn, proved robust, with lower sensitivity to data scale.
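A comparison of this kind could be set up roughly as follows. This is a sketch under stated assumptions, not the authors' pipeline: it uses scikit-learn's RandomForestRegressor and GradientBoostingRegressor with mostly default settings, a simple hold-out split, and synthetic data in which yield rises with NDMI and GNDVI and falls with surface temperature. The feature set, the coefficients, and the mapping of grid size to sample count (a hypothetical 200 ha field giving 40 cells at 5 ha and 200 cells at 1 ha) are all illustrative.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split


def make_synthetic_subplots(n_cells, seed=42):
    """Generate hypothetical per-cell predictors and yields (not study data)."""
    rng = np.random.default_rng(seed)
    ndmi = rng.uniform(0.0, 0.5, n_cells)
    gndvi = rng.uniform(0.3, 0.8, n_cells)
    surface_temp = rng.uniform(22.0, 36.0, n_cells)   # degrees Celsius
    elevation = rng.uniform(100.0, 200.0, n_cells)    # metres, no effect on yield here
    noise = rng.normal(0.0, 0.3, n_cells)
    # Assumed relationship: positive for both indices, negative for temperature.
    yield_t_ha = 2.0 + 4.0 * ndmi + 3.0 * gndvi - 0.08 * (surface_temp - 22.0) + noise
    X = np.column_stack([ndmi, gndvi, surface_temp, elevation])
    return X, yield_t_ha


def compare_models(X, y, seed=0):
    """Fit both ensembles on a training split and return hold-out R² per model."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    models = {
        "random_forest": RandomForestRegressor(n_estimators=200, random_state=seed),
        "gradient_boosting": GradientBoostingRegressor(random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = r2_score(y_test, model.predict(X_test))
    return scores


# A finer grid yields more (smaller) cells for the same field area.
for label, n_cells in [("5 ha grid", 40), ("1 ha grid", 200)]:
    X, y = make_synthetic_subplots(n_cells)
    print(label, compare_models(X, y))
```

With real data, a spatially aware validation scheme (for example, holding out whole blocks of neighbouring cells) would usually be preferable to a random split, because adjacent subplots tend to be correlated and can inflate R².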

Conclusions. The study demonstrates that combining satellite imagery, soil analysis results, and machine learning methods can significantly improve the accuracy of crop yield prediction. The developed models incorporate vegetation indices along with factors describing crop growing conditions. The Random Forest and Gradient Boosting algorithms were also compared at different levels of spatial detail. The results indicate that the proposed approach can be applied effectively in precision agriculture, particularly for agronomic planning and crop monitoring.

Published

2026-01-30

How to Cite

Zatserkovnyi, V., Vorokh, V., Hloba, O., Liashchenko, O., & Siuiva, I. (2026). Application of machine learning methods and remote sensing data for crop yield forecasting. Visnyk of Taras Shevchenko National University of Kyiv. Geology, 4(111), 114–121. https://doi.org/10.17721/1728-2713.111.13