PriMera Scientific Engineering (ISSN: 2834-2550)

Review Article

Volume 3 Issue 3

Big Data Analytics in Weather Forecasting using Gradient Boosting classifiers Algorithm

Kamel Maaloul* and Brahim Lejdel

August 18, 2023


Weather forecasting, a vital process in people's everyday lives, assesses changes in the current state of the atmosphere. Big data analytics is the practice of studying big data to uncover hidden patterns and useful information that can lead to better outcomes. Big data is currently of interest to many parts of society, and meteorological institutes are no exception. Big data analytics can therefore improve weather forecasting and help forecasters provide more accurate predictions. To this end, several big data techniques and technologies have been proposed to manage and analyze the enormous volume of weather data arriving from various sources. A smart city is a project that uses computers to process vast amounts of data gathered from sensors, cameras, and other devices in order to manage resources, provide services, and address problems that arise in daily life, such as the weather. This paper proposes a machine learning-based weather forecasting model implemented with five classifier algorithms: the Random Forest classifier, the Decision Tree algorithm, the Gaussian Naive Bayes model, the Gradient Boosting Classifier, and Artificial Neural Networks. The classifiers were trained on a publicly available dataset. When the models' performance was assessed, the Gradient Boosting Classifier came out on top, with a prediction accuracy above 98%.

Keywords: Weather forecasting; Big data; Machine Learning; smart city; Gradient Boosting Classifier
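
The comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration rather than the authors' implementation: it assumes scikit-learn, and because the abstract does not name the dataset, it substitutes synthetic data from `make_classification`. The helper name `compare_classifiers`, the 75/25 split, the hyperparameters, and the use of `MLPClassifier` as the neural network are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers(X, y, seed=0):
    """Train the five classifier families named in the abstract and
    return a dict mapping model name to held-out accuracy.
    (Hypothetical helper; settings are illustrative assumptions.)"""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed, stratify=y
    )
    models = {
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
        "Gaussian Naive Bayes": GaussianNB(),
        "Gradient Boosting": GradientBoostingClassifier(random_state=seed),
        # Feature scaling is added here because it helps the network converge.
        "Artificial Neural Network": make_pipeline(
            StandardScaler(),
            MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=seed),
        ),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


# Synthetic stand-in for a weather dataset (assumption): 4 classes, 8 features.
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=5, n_redundant=1,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)
scores = compare_classifiers(X, y)

# Print the models from most to least accurate on the held-out split.
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}")
```

The ranking on synthetic data says nothing about the ranking reported in the paper. To reproduce a comparison of this kind, replace `X` and `y` with features and labels from a real weather dataset.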

