
GEOGRAPHY, ENVIRONMENT, SUSTAINABILITY


Which Climate Model Evaluation Methods Can Consistently Select Skillful Models from the CMIP6 Ensemble?

https://doi.org/10.24057/2071-9388-2025-3694

Abstract

Anyone planning to use climate model data must first choose the most appropriate model. The literature offers many methods for evaluating and selecting climate models, but there is no established consensus on which method is the most robust for determining model skill. In this article, we tested seven widely used methods for evaluating climate models in the Arctic using CMIP6 surface air temperature data: methods based on a single statistical metric (root mean square error, spatial trends), methods based on a single skill score (Taylor skill score, probability density function), methods combining several statistical metrics (Taylor diagram, interannual variability skill score, comprehensive rating metric, etc.), and a method based on multiple statistical criteria (percentile-based approach). To evaluate their consistency, each method was applied to two periods: 1951-1980 and 1981-2010. For each method, the models were ranked and classified into three quality groups (very good, satisfactory, unsatisfactory). The methods were compared in terms of the differences in the average values of the normalized statistical measures, the differences in the model ranks, and the assignment of models to quality groups. For each method, an optimal set of models corresponding to the top 25% was selected. One of the main objectives of the study was to compare the ability of the methods to identify the best model for the selected ensemble regardless of the time period (i.e., without sensitivity to natural variability). The results suggest a preference for methods using root mean square error and a percentile-based approach.
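
As a rough illustration of the ranking-and-selection workflow described in the abstract, the following Python sketch ranks a set of models by root mean square error in each of the two periods, keeps the top 25%, and checks which models are selected in both periods. It is not the authors' implementation. The model names, the synthetic temperature fields, the per-model error levels, and the helper names (`rmse`, `rank_models`, `top_quarter`) are all assumptions made for illustration; a real analysis would use CMIP6 output and an observational reference, and would likely need area weighting.

```python
import math
import numpy as np

def rmse(model, obs):
    """Root mean square error between a model field and observations."""
    return float(np.sqrt(np.mean((model - obs) ** 2)))

def rank_models(scores):
    """Return model names ordered from lowest (best) to highest RMSE."""
    return sorted(scores, key=scores.get)

def top_quarter(ranking):
    """Keep the top 25% of a ranking, rounding up, with at least one model."""
    n = max(1, math.ceil(0.25 * len(ranking)))
    return ranking[:n]

# Synthetic stand-ins for observed surface air temperature (assumption:
# flat arrays of 500 values per period instead of gridded fields).
rng = np.random.default_rng(0)
obs = {
    "1951-1980": rng.normal(-10.0, 5.0, size=500),
    "1981-2010": rng.normal(-9.0, 5.0, size=500),
}

# Hypothetical models: each one is the observations plus random error
# with a fixed, model-specific standard deviation.
error_sd = {f"model_{i}": 0.5 * (i + 1) for i in range(8)}

selected = {}
for period, field in obs.items():
    scores = {
        name: rmse(field + rng.normal(0.0, sd, size=field.size), field)
        for name, sd in error_sd.items()
    }
    selected[period] = top_quarter(rank_models(scores))

# Models that land in the top 25% in both periods.
consistent = set(selected["1951-1980"]) & set(selected["1981-2010"])
print(sorted(consistent))
```

Under these assumptions, a method is period-consistent when the intersection stays close to the per-period selections; the same comparison could be repeated with any other metric by swapping out `rmse`.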

About the Authors

Natalia V. Gnatiuk
Nansen International Environmental and Remote Sensing Centre
Russian Federation

14-Liniya V.O. 7, St. Petersburg, 199034 



I. V. Radchenko
Nansen International Environmental and Remote Sensing Centre
Russian Federation

14-Liniya V.O. 7, St. Petersburg, 199034 



Richard Davy
Nansen Environmental and Remote Sensing Center
Norway

Jahnebakken 3, Bergen, 5006 



Jiechen Zhao
Qingdao Innovation and Development Base of Harbin Engineering University; First Institute of Oceanography, MNR & Decade Collaborative Center on Ocean-Climate Nexus and Coordination
China

Sansha Road 1777, Qingdao, 266000 

Xianxialing Road 6, Qingdao, 266000 



Leonid P. Bobylev
Nansen International Environmental and Remote Sensing Centre
Russian Federation

14-Liniya V.O. 7, St. Petersburg, 199034





For citations:


Gnatiuk N.V., Radchenko I.V., Davy R., Zhao J., Bobylev L.P. Which Climate Model Evaluation Methods Can Consistently Select Skillful Models from the CMIP6 Ensemble? GEOGRAPHY, ENVIRONMENT, SUSTAINABILITY. 2025;18(2):125-149. https://doi.org/10.24057/2071-9388-2025-3694



This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2071-9388 (Print)
ISSN 2542-1565 (Online)