Year 14, Issue 4 (Winter 2025)                   E.E.R. 2025, 14(4): 19-38





Rahimpour T, Rezaei Moghaddam M H. Modeling the Flood Hazard Potential in the Aji Chai basin using Data Mining Algorithms. E.E.R. 2025; 14 (4) :19-38
URL: http://magazine.hormozgan.ac.ir/article-1-862-en.html
Faculty of Planning and Environmental Sciences, University of Tabriz, Tabriz, Iran , t.rahimpour@tabrizu.ac.ir
Abstract:
1- Introduction
Because the Aji Chai basin is large and receives ample rainfall during the cold season and spring, the Aji Chai River and its tributaries flood at the start of spring. This study prepares a flood hazard potential map of the Aji Chai basin using data mining algorithms. Eighteen parameters that influence flood occurrence were considered: elevation, slope, aspect, topographic wetness index, sediment transport index, stream power index, terrain curvature, rainfall, Normalized Difference Vegetation Index, land use, distance to dam, distance to bridge, distance to river, river density, hydrological soil groups, drainage texture, geomorphology, and lithology. The information layers for all parameters were prepared in raster format in ArcGIS.
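The abstract does not state how the terrain indices were derived. As a minimal sketch, assuming the commonly used definitions TWI = ln(a / tan b) and SPI = a * tan b, where a is the specific catchment area and b is the local slope, two of the listed layers could be computed from raster arrays as follows. The function name, argument names, and the eps guard for flat cells are illustrative assumptions, not the authors' workflow.

```python
import numpy as np

def terrain_indices(specific_catchment_area, slope_deg, eps=1e-6):
    """Return (TWI, SPI) arrays from specific catchment area and slope in degrees.

    Assumed definitions (not taken from the article):
        TWI = ln(a / tan(b))
        SPI = a * tan(b)
    """
    a = np.asarray(specific_catchment_area, dtype=float)
    tan_b = np.tan(np.radians(np.asarray(slope_deg, dtype=float)))
    # Guard against division by zero on flat cells and log(0) on empty cells.
    tan_b = np.maximum(tan_b, eps)
    twi = np.log(np.maximum(a, eps) / tan_b)
    spi = a * tan_b
    return twi, spi

# Usage: a cell draining 100 m^2 per metre of contour on a 45-degree slope.
twi, spi = terrain_indices([100.0], [45.0])
```

Flat, strongly convergent cells get high TWI values, which is consistent with the later finding that most flood points lie on gentle slopes.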
2- Research Methodology
The study area is the Aji Chai basin, which lies in East Azerbaijan province, east of Lake Urmia, and covers about 10,985.9 km². Elevation in the basin ranges from 1,255 m at the outlet to 3,816 m on the slopes of Mount Sabalan. The main river draining the basin's surface water is the Aji Chai. Four data mining algorithms were used: Random Forest, Random Subspace, Rotation Forest, and Dagging. The models were built on the locations of 274 past flood points. The flood-point map was compiled from records of the Regional Water Company of East Azerbaijan province, field surveys, and Landsat 8 OLI-TIRS satellite imagery. All models were implemented in WEKA, a machine learning software package developed at the University of Waikato, New Zealand. The accuracy of the flood hazard potential maps was evaluated with the receiver operating characteristic (ROC) curve and the area under the curve (AUC). In the ROC curve, the X-axis shows 1 - specificity, where specificity is the percentage of non-flooded pixels correctly classified as non-flooded, and the Y-axis shows sensitivity (the percentage of flooded pixels correctly classified as flooded). The variance inflation factor (VIF) and tolerance (T) indices were used to check for multicollinearity among the independent variables, because collinearity among the selected parameters lowers the accuracy of the final maps.
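The collinearity check described above can be sketched as follows. For each predictor, the sketch regresses it on all the others and takes tolerance T = 1 - R² and VIF = 1 / T. This is a minimal NumPy illustration, not the authors' procedure: the function name and the synthetic data are assumptions, and the abstract does not give the cutoffs used to exclude a variable.

```python
import numpy as np

def vif_and_tolerance(X):
    """Return (vif, tolerance) arrays, one entry per column of X.

    X has shape (n_samples, n_predictors). Each column is regressed on the
    remaining columns (with an intercept) by ordinary least squares.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vif = np.empty(p)
    tol = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])  # intercept + other predictors
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        tol[j] = 1.0 - r2
        vif[j] = np.inf if tol[j] <= 0 else 1.0 / tol[j]
    return vif, tol

# Usage with synthetic data: the third column is almost the sum of the first two,
# so all three columns show strong collinearity.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)
c = a + b + 0.01 * rng.normal(size=500)
vif, tol = vif_and_tolerance(np.column_stack([a, b, c]))
```

A large VIF (equivalently, a tolerance near zero) means the predictor is nearly a linear combination of the others and is a candidate for removal.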
3- Results
The multicollinearity analysis showed that, except for drainage texture, the independent variables selected for flood hazard potential mapping have low collinearity. Therefore, 17 parameters were used in the flood hazard modeling with the data mining algorithms. The analysis of parameter importance showed that in the Random Subspace model, elevation, slope, distance to river, and lithology were the most important parameters, in that order. In the Dagging model, the most important factors were elevation, hydrological soil groups, topographic wetness index, distance to river, and Normalized Difference Vegetation Index. In the Rotation Forest model, lithology, slope, rainfall, elevation, and distance to river were the most important, in that order. In the Random Forest (RF) model, the most important factors were elevation, distance to river, slope, and rainfall, in that order.
4- Discussion & Conclusions
Flood hazard potential maps based on the data mining algorithms were prepared in ArcGIS in five classes: very low, low, moderate, high, and very high potential. The final maps show the same spatial distribution pattern of hazard zones: in all of them, high elevations and steep slopes have very low potential. The distribution of flood points across the slope classes shows that more than 80% of the floods occurred on slopes of 0-10%, which indicates the influence of this factor on flooding in the region. In all maps, more than 30% of the basin area falls in the high and very high classes. Evaluation of model accuracy with the ROC curve and the AUC showed that the Random Forest model outperformed the other models, with an AUC of 0.94.
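To make the reported AUC concrete, the sketch below shows one way an ROC curve and its area could be computed from flood / non-flood labels and model scores. It plots the false positive rate (1 - specificity) against sensitivity and integrates with the trapezoidal rule. The function names and the toy labels and scores are illustrative assumptions. The study itself ran its evaluation on WEKA model outputs, and nothing here reproduces its data.

```python
import numpy as np

def roc_curve_points(labels, scores):
    """Return (fpr, tpr) arrays; labels are 1 for flood, 0 for non-flood."""
    labels = np.asarray(labels, dtype=int)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")  # highest score first
    labels, scores = labels[order], scores[order]
    # One threshold per distinct score: keep the last index of each tied run.
    distinct = np.r_[np.nonzero(np.diff(scores))[0], labels.size - 1]
    tp = np.cumsum(labels)[distinct]       # flood points flagged as flood
    fp = np.cumsum(1 - labels)[distinct]   # non-flood points flagged as flood
    n_pos = labels.sum()
    n_neg = labels.size - n_pos
    tpr = np.r_[0.0, tp / n_pos]  # sensitivity (Y-axis)
    fpr = np.r_[0.0, fp / n_neg]  # 1 - specificity (X-axis)
    return fpr, tpr

def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Usage with toy data (hypothetical scores, not from the study).
fpr, tpr = roc_curve_points([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
area = auc(fpr, tpr)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of flood and non-flood locations.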

Received: 2024/10/5 | Published: 2024/12/21




Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

© 2025 CC BY-NC 4.0 | Environmental Erosion Research Journal
