Data-driven approach to rubber yield forecasting: A case study
| dc.contributor.author | Meththananda, H.R.N.B. | |
| dc.contributor.author | Senarathne, J. | |
| dc.contributor.author | Kodithuwakku, S. | |
| dc.date.accessioned | 2025-11-05T17:46:00Z | |
| dc.date.available | 2025-11-05T17:46:00Z | |
| dc.date.issued | 2025-11-07 | |
| dc.description.abstract | Rubber cultivation plays a vital role in Sri Lanka’s economy, with yield influenced by various climatic and agronomic factors. This study investigated rubber yield dynamics using five years of monthly data from 2019 to 2023, comprising 6,007 observations collected from a large commercial rubber processing industry. Key variables included tappable trees, stand per hectare, tapping days, number of trees per tapper, rainfall, and tapping techniques. To capture the complex relationships among these variables, four machine learning (ML) models were employed: Lasso regression, K-Nearest Neighbours (KNN), Random Forest (RF), and Extreme Gradient Boosting (XGBoost). An ensemble model using a voting regressor was constructed by combining the outputs of all four models to enhance predictive robustness. Additionally, a Long Short-Term Memory (LSTM) network was used for long-term forecasting to capture temporal dependencies and seasonal patterns. The dataset was shuffled and divided into 70% for training, 20% for testing, and 10% for validation. XGBoost achieved the best short-term performance, with a coefficient of determination (R²) of 0.93 and the lowest mean absolute error (MAE) of 489.17 on the test set, outperforming the other short-term forecasting models and capturing complex, non-linear relationships. For long-term forecasting, the LSTM model achieved an MAE of 0.18 on the test set, enabling yield predictions that support planning tapping activities and managing seasonal variations. The study found that tappable trees, stand per hectare, and tapping days are the most significant yield determinants. Cross-correlation analysis indicated that prior-month rainfall has a correlation with yield 0.002 higher than that of current-month rainfall, which is attributed to the disruptive impact of heavy rain on tapping activities. These findings highlight the importance of integrating agronomic practices and climatic factors into predictive models, offering insights to enhance productivity and sustainability in Sri Lanka’s rubber industry. | |
| dc.identifier.citation | Proceedings of the Postgraduate Institute of Science Research Congress (RESCON)-2025, University of Peradeniya, p. 95 | |
| dc.identifier.issn | 3051-4622 | |
| dc.identifier.uri | https://ir.lib.pdn.ac.lk/handle/20.500.14444/5993 | |
| dc.language.iso | en | |
| dc.publisher | Postgraduate Institute of Science (PGIS), University of Peradeniya, Sri Lanka | |
| dc.relation.ispartofseries | Volume 12 | |
| dc.subject | Climatic factors | |
| dc.subject | k-nearest neighbours | |
| dc.subject | Long short-term memory | |
| dc.subject | Random forest | |
| dc.subject | Voting regressor | |
| dc.title | Data-driven approach to rubber yield forecasting: A case study | |
| dc.type | Article |
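
The abstract compares how strongly yield correlates with current-month rainfall versus prior-month rainfall. As a rough illustration of that kind of lagged cross-correlation check, the sketch below computes a Pearson correlation at lag 0 and lag 1 using only the Python standard library. The function names (`pearson`, `lagged_correlation`) and the monthly values are hypothetical and made up for illustration; they are not the study's data, and the study's actual procedure may differ.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def lagged_correlation(driver, response, lag):
    """Correlate response[t] with driver[t - lag] for a non-negative lag."""
    if lag < 0 or lag >= len(driver):
        raise ValueError("lag must satisfy 0 <= lag < len(driver)")
    if lag == 0:
        return pearson(driver, response)
    # Drop the last `lag` driver values and the first `lag` response values
    # so that each response month is paired with the driver `lag` months earlier.
    return pearson(driver[:-lag], response[lag:])


# Illustrative (made-up) monthly rainfall in mm; NOT data from the study.
rainfall = [120.0, 310.0, 95.0, 240.0, 180.0, 60.0, 275.0, 150.0]

# Synthetic yield constructed to depend on the previous month's rainfall,
# so the lag-1 correlation is high by design. The first value is arbitrary.
yield_kg = [800.0] + [500.0 + 2.0 * r for r in rainfall[:-1]]

same_month = lagged_correlation(rainfall, yield_kg, lag=0)
prior_month = lagged_correlation(rainfall, yield_kg, lag=1)
print(f"lag 0: {same_month:.3f}, lag 1: {prior_month:.3f}")
```

Because the synthetic yield is built from the previous month's rainfall, the lag-1 correlation dominates here by construction. On real data the gap can be very small, as with the 0.002 difference reported in the abstract.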