

North American Climate in CMIP5 Experiments. Part I: Evaluation of 20th Century Continental and Regional Climatology

Justin Sheffield, Andrew Barrett, Brian Colle, Rong Fu, Kerrie L. Geil, Qi Hu, Jim Kinter, Sanjiv Kumar, Baird Langenbrunner, Kelly Lombardo, Lindsey N. Long, Eric Maloney, Annarita Mariotti, Joyce E. Meyerson, Kingtse C. Mo, J. David Neelin, Zaitao Pan, Alfredo Ruiz-Barradas, Yolande L. Serra, Anji Seth, Jeanne M. Thibeault, Julienne C. Stroeve

Justin Sheffield, Department of Civil and Environmental Engineering, Princeton University, Princeton, NJ
Brian Colle, Kelly Lombardo, School of Marine and Atmospheric Sciences, Stony Brook University SUNY
Rong Fu, Jackson School of Geosciences, University of Texas at Austin, TX
Kerrie L. Geil, Department of Atmospheric Sciences, University of Arizona, Tucson, AZ
Qi Hu, School of Natural Resources and Department of Earth and Atmospheric Sciences, University of Nebraska-Lincoln, Lincoln, NE
Sanjiv Kumar, Jim Kinter, Center for Ocean-Land-Atmosphere Studies, Calverton, MD
Baird Langenbrunner, Joyce E. Meyerson, J. David Neelin, Department of Atmospheric and Oceanic Sciences, University of California Los Angeles
Lindsey N. Long, Wyle Laboratories and Climate Prediction Center, NCEP/NWS/NOAA, Camp Springs, MD
Eric D. Maloney, Department of Atmospheric Science, Colorado State University, Fort Collins, CO
Annarita Mariotti, National Oceanic and Atmospheric Administration, Office of Oceanic and Atmospheric Research (NOAA/OAR), Silver Spring, MD
Kingtse C. Mo, Climate Prediction Center, NCEP/NWS/NOAA, Camp Springs, MD
Zaitao Pan, Saint Louis University, St. Louis, MO


Alfredo Ruiz-Barradas, Department of Atmospheric and Oceanic Science, University of Maryland, College Park, MD
Yolande L. Serra, Department of Atmospheric Sciences, University of Arizona, Tucson, AZ
Anji Seth and Jeanne M. Thibeault, Department of Geography, University of Connecticut, Storrs, CT
Andrew Barrett, Julienne C. Stroeve, National Snow and Ice Data Center, Cooperative Institute for Research in Environmental Sciences, University of Colorado, Boulder, CO

Journal of Climate Submitted on July 30, 2012

*Corresponding author address: Justin Sheffield, Department of Civil and Environmental Engineering, Princeton University, Princeton, NJ, 08540. Email: justin@princeton.edu


Abstract

This is the first part of a three-part paper on North American climate in CMIP5. It evaluates the 20th century simulations of continental and regional climatology. In general, the models capture the main features of North American climate, including seasonal precipitation, air temperature and sea surface temperature. The hydrological cycle is also reasonably well simulated in terms of the main characteristics of atmospheric moisture convergence and the seasonality of the surface water budget, but the latter is subject to the biases in precipitation. The spatial distributions of growing season length and number of frost days are generally well simulated, with the largest biases in western regions. The frequency of hydroclimate extreme events is not well represented by the models. Regionally, the skill of the models varies, and the variation can often be attributed to model resolution. The models capture the location of cool season western North Atlantic cyclone density but underpredict its magnitude. The models do reasonably well at simulating temperature and precipitation extremes in the southern US, with a tendency to underestimate extreme temperatures and heavy rainfall. The North American monsoon generally occurs too late and is too weak in the models. The main features of the summertime Great Plains low-level jet are simulated, with model fidelity dependent on resolution. Observed sea ice extent and its decline are generally underestimated. The skill of the multi-model ensemble in capturing the main features of North American climate has not improved significantly since CMIP3, although improvements in some individual models are noticeable.


1. Introduction

This is the first part of a three-part paper on the Coupled Model Intercomparison Project phase 5 (CMIP5; Taylor et al., 2012) model simulations for North America. The first two papers evaluate the ability of the CMIP5 models to replicate the observed features of North American continental and regional climate, and related climate processes, for the recent past. This first part evaluates the models in terms of continental and regional climatology, and the second part (Sheffield et al., 2012) evaluates intraseasonal to decadal variability. The third part (Maloney et al., 2012) describes the projected changes for the 21st century.

CMIP5 provides an unprecedented collection of climate model output data for the assessment of future climate projections, as well as for evaluations of climate models for contemporary climate, the attribution of observed climate change and improved understanding of climate processes and feedbacks. As such, these data will feed into the Intergovernmental Panel on Climate Change (IPCC) Fifth Assessment Report (AR5), and other global, regional and national assessments. The goal of this study is to provide a broad evaluation of CMIP5 models in their depiction of North American climate and associated processes. It synthesizes and draws from individual work by investigators within the CMIP5 Task Force of the US National Oceanic and Atmospheric Administration (NOAA) Modeling Analysis and Prediction Program (MAPP). This paper is part of a Journal of Climate special collection on North America in CMIP5 models, and we draw from individual papers within the special collection, which provide more detailed analysis than can be presented in this synthesis paper.


We begin in Section 2 by describing CMIP5, providing an overview of the models analyzed, the historical simulations and the general methodology for evaluating the models. Details of the observational datasets to which the climate models are compared are also given in this section. The next two sections focus on different aspects of North American climate and surface processes. Section 3 begins with an overview of climate model depictions of continental climate, including seasonal precipitation, air temperature, sea surface temperatures, surface hydrology and its extremes, and temperature-based biophysical indicators such as growing season length and temperature extremes. Section 4 focuses on regional climate features such as North Atlantic winter storms, the Great Plains low-level jet, and Arctic sea ice. The results are synthesized in Section 5.

2. CMIP5 Models and Simulations

2.1. CMIP5 Models

We use data from multiple model simulations of the historical scenario from the CMIP5 database. The scenarios are described in more detail below. The CMIP5 experiments were carried out by 20 modeling groups representing more than 50 climate models, with the aim of further understanding past and future climate change in key areas of uncertainty (Taylor et al., 2012). In particular, experiments focus on understanding model differences in clouds and carbon feedbacks, quantifying decadal climate predictability, and determining why models give different answers when driven by the same forcings. CMIP5 builds on the previous phase (CMIP3) experiments in several ways. First, a greater number of modeling centers and models have participated. Second, the models run at higher spatial resolution, and some are more comprehensive in terms of the processes that they represent, which is expected to improve skill in representing current climate conditions and to reduce uncertainty in future projections. Table 1 provides an overview of the models used. The specific models used vary for each individual analysis because of data availability at the time of this study, and so the model names are provided within the results section where appropriate.

2.2. Overview of Methods

Data from the historical CMIP5 scenarios are evaluated in this study. The historical simulations are run in coupled atmosphere-ocean mode, forced by historical estimates of changes in atmospheric composition from natural and anthropogenic sources (volcanoes, greenhouse gases and aerosols), as well as changes in solar output and land cover. For certain climate features we also analyze model simulations from CMIP3, which provided the underlying climate model data for the Fourth Assessment Report (AR4) of the IPCC. Several models have contributed to both the CMIP3 and CMIP5 experiments, either with the same version of the model or with a newer version, and this allows a direct evaluation of changes in skill in individual models as well as in the model ensemble.

Historical scenario simulations were carried out for the period from the start of the industrial revolution to near present: 1850-2005. Our evaluations are generally carried out for the most recent 30 years, depending on the type of analysis and the availability of observations. For some analyses the only, or best available, data are from satellite remote sensing, which restricts the analysis to the satellite period, generally from 1979 onwards. For other analyses, multiple observational datasets are used to capture the uncertainty in the observations. An overview of the observational datasets used in the evaluations is given in Table 2, categorized by variable. Further details of these datasets and any data processing are given in the relevant sub-sections and figure captions. Where the comparisons go beyond 2005 (e.g. 1979-2008), model data from the RCP8.5 future projection scenario simulation are appended to the model historical time series. About half the models have multiple ensemble members, and these are averaged where appropriate or used to assess the variability across ensemble members.
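The splicing and averaging steps described above can be illustrated with a minimal sketch. It assumes annual values stored in plain dictionaries keyed by year; the function names and all numbers are hypothetical and are not taken from the study's processing code.

```python
from statistics import mean

def splice_series(historical, rcp85, start_year, end_year):
    """Return values for start_year..end_year, using the historical run
    where it has data and the RCP8.5 run for later years."""
    spliced = []
    for year in range(start_year, end_year + 1):
        if year in historical:
            spliced.append(historical[year])
        elif year in rcp85:
            spliced.append(rcp85[year])
        else:
            raise KeyError(f"no model data for {year}")
    return spliced

def ensemble_mean(members):
    """Average equally long series from several ensemble members, year by year."""
    return [mean(values) for values in zip(*members)]

# Hypothetical annual values (arbitrary units) for two ensemble members.
hist_r1 = {2003: 1.0, 2004: 2.0, 2005: 3.0}
rcp_r1 = {2006: 4.0, 2007: 5.0, 2008: 6.0}
hist_r2 = {2003: 3.0, 2004: 4.0, 2005: 5.0}
rcp_r2 = {2006: 6.0, 2007: 7.0, 2008: 8.0}

member_1 = splice_series(hist_r1, rcp_r1, 2003, 2008)
member_2 = splice_series(hist_r2, rcp_r2, 2003, 2008)
ens = ensemble_mean([member_1, member_2])
print(ens)
```

Keeping the members separate until the final step leaves the option of examining the spread across members instead of their mean.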

3. Continental Climate

3.1. Seasonal Atmospheric Climate

A) Seasonal precipitation climatology

Figure 1 shows the model precipitation climatology and CMAP observations for December-February (DJF) and June-August (JJA) for 1979-2008. Most of the models do reasonably well in producing the essential large-scale precipitation features, but there are substantial differences among the models, and between models and observations at the regional scale. For the winter season (Fig. 1, left), the Pacific storm track is slightly too strong in terms of the amount of precipitation, especially north of about 37°N, but is reasonably placed in latitude as it approaches the coast. One important aspect of this, the angle of the storm track as it bends northward approaching the coast from roughly Hawaii to Central California, is well reproduced in the models. The model rainfall penetrates slightly too far inland, as might be expected for the typical model resolution, which does not fully resolve mountain ranges. The east coast storm tracks are well placed in DJF and the multi-model ensemble mean does a good job in replicating the Eastern Pacific Inter-Tropical Convergence Zone (ITCZ), although northern Mexico receives too much rainfall.

Figures 1d and 1e (left panels) provide a model-by-model view of these features, using the 3 mm/day contour for each model to provide an outline of the major precipitation features. If the models were perfect, all contours would lie exactly along the boundary of the shaded observations. Taking into account the high latitude precipitation excess in the Pacific storm track, individual models do quite well at reproducing each of the main features of the DJF climatology, including the arrival point at the North American West Coast of the southern edge of the Pacific storm track. Two models, HadGEM2 and MRI-CGCM3, are significant contributors to the northern Mexico precipitation excess.

For the summer season (JJA; Fig. 1, right), the ITCZ and the Mexican monsoon are reasonably well simulated in terms of position, although the precipitation magnitude in the ocean just off Central America is underestimated relative to CMAP on both the Pacific and Caribbean sides. The East Coast storm track in the multi-model ensemble mean is too spread out. This is due to substantial differences in the placement of these storm tracks in the individual models (Fig. 1d,e, right). Some models spread the storm tracks or even split them, with part lying too far south or too far out to sea, and part extending poleward or inland of the better organized storm tracks in observations. The majority of the models exhibit excessive precipitation in at least some part of the continental interior. While the bulk of the models do reasonably well at the poleward extension of the monsoon over Central America, Mexico and the Intra-Americas Sea region, a few models underestimate this extent, putting a split between the poleward extension of the monsoon feature and the start of the East Coast storm track.

B) Seasonal surface air temperature climatology

Figure 2 compares the model-simulated surface air temperature climatology to the NCEP-DOE observational estimate. The multi-model ensemble mean compares well to the observations in most respects. Differences from NCEP-DOE are less than 1°C over most of the continent except for high latitudes, where NCEP-DOE itself is slightly warmer than the North American Regional Reanalysis (not shown). Beyond the overall simulation of the north-south temperature gradient and seasonal evolution, certain regional features are well captured. In JJA, these include the regions of temperatures exceeding 30°C over Texas and near the Gulf of California, and the extent of temperatures above 10°C, including the northward extension of this region into the Canadian prairies. Individual models exhibit substantial regional scatter, including excessive northward extent of the region above 30°C through the Great Plains in three of the models (CanESM2, CSIRO-Mk3-6-0 and FGOALS-s2). In DJF, the multi-model ensemble mean does a good job of capturing the 0°C contour, while the 10°C contour extends slightly too far south, yielding slightly cool temperatures over Mexico, but mostly within 1°C of NCEP-DOE except in high latitudes, where the models are biased a few degrees cold. This modest wintertime cold bias in high latitudes is more pronounced in certain models such as HadGEM2.

C) Difference between CMIP3 and CMIP5

We also compared the precipitation and temperature climatologies for CMIP3 and CMIP5 models to see if there have been improvements in the representation of observed values. Winter and summer climatological precipitation over land is analyzed first (Fig. 3); winter is the main rainy season over the western US and northwestern Mexico, and summer is the main rainy season over the central US and most of Mexico. Models capture the winter maximum over the Pacific Northwest, but the maximum over the southeastern US is best captured by UKMO-HadCM3 and ECHAM5/MPI-OM. The low coastal precipitation over Mexico is not well simulated, except by the UKMO-HadCM3 model, as the models tend to produce more precipitation than observations indicate. In summer, the majority of the models, except for the MPI models, tend to put a precipitation maximum over the central US to the west of the 100°W meridian, which is not present in observations. The maximum along the Atlantic and Gulf Coastal Plains in summer is best captured by the CCSM3 model; the other models fail to reproduce the observed structure. The summer maximum over western Mexico is reasonably captured by the models in spite of their different resolutions, with lower values for the UKMO-HadCM3 model, probably due to its coarse resolution. The maximum over eastern Mexico along the Gulf of Mexico in summer is best captured by the UKMO-HadGEM2-ES model, with the other models failing to produce a maximum over the region.

The CCSM4 model has more realistic precipitation in the Pacific Northwest during winter and over northwestern Mexico in summer. The CMIP5 version of the GFDL model shows some improvement from its CMIP3 version over the southeastern US in winter but very limited improvement over the Midwest and eastern US during summer. The CMIP5 version of the UKMO model better constrains the winter maximum over the Pacific Northwest and better captures the summer maximum along the Mexican coasts. The high bias over Mexico during winter in the ECHAM5 model is slightly reduced in the CMIP5 version.


Surface air temperature is also compared, but for the summer-winter difference climatology (Fig. 4). Observed warm differences are largest over the northern continental interior, where seasonal differences in insolation are largest, away from the influence of the oceans; a mild warm region is evident over the Rocky Mountains of Wyoming and Colorado. The intense zonal gradient of air temperature along the Pacific coast of the US reflects the abrupt contrast between the terrain and the adjacent ocean; the weaker gradient along the eastern and southeastern US, as well as over Mexico, is due to the less severe contrast at the surface. Models in general capture these features; however, differences from observations are clear in the southward extension of the warming and the size of the warm region over the Rockies. The change in version of the CCSM and MPI models from CMIP3 to CMIP5 increases the seasonal warming difference and brings it closer to observations, reaching 33 K over the US-Canada border, although CCSM4 surpasses this value. The performance of the GFDL model declines, as the CMIP5 version decreases the seasonal warming difference and moves it northward, largely due to cooler summer temperatures (not shown). The performance of the UKMO models also declines, as the CMIP5 version increases the seasonal warming difference, largely due to warmer summer temperatures (not shown). The CMIP5 simulations of the HadCM3 model are not displayed here, but their results (from nine ensemble members) are very similar to those from the CMIP3 version (with only two ensemble members); differences between these simulations are mainly due to the different forcing.

Figure 5 shows Taylor diagrams of model performance for the CMIP3 and CMIP5 models. Winter precipitation has less spatial variability than the observed 1.8 mm/day for the majority of the models, ranging from 1.3 mm/day for CCSM3 to 1.9 mm/day for HadGEM2-ES, and correlations with observations range from just above 0.5 for CCSM3 to 0.8 for HadGEM2-ES. The largest improvements between CMIP3 and CMIP5 for winter precipitation are for the CCSM and Had models. The spatial statistics for summer precipitation have larger spread than for winter precipitation. Spatial variability in summer precipitation is 3.1 mm/day for observations and ranges from 1.8 mm/day for GFDL-CM3 to 3.3 mm/day for HadGEM2-ES, while correlations range from just under 0.5 for GFDL-CM2.1 to 0.8 for HadGEM2-ES. Improvements between the CMIP3 and CMIP5 versions of the models in summer precipitation are apparent for the CCSM and Had models, in which both correlation and variability increase, while the GFDL and MPI models deteriorate, through reduced variability in the former and reduced correlation in the latter.

Surface air temperature (Fig. 5, middle panels) is better simulated than precipitation. Observed spatial variability in winter is 15.6°C and, while CMIP3 models are close to this value, the CMIP5 models range from 14.3°C for GFDL-CM3 to 16.6°C for HadGEM2-ES. However, correlations with observations are all above 0.98. With this in mind, a slight deterioration is evident for the GFDL and Had models from decreases and increases, respectively, in the spatial variability with respect to the observations. Spatial statistics for summer surface air temperature have a spread similar to that for winter temperature but with lower values. Observed summer variability is 5.7°C and CMIP3 models group very closely around this value. However, CMIP5 models range from 5.3°C for CCSM4 to 6.5°C for MPI-ESM-LR, with correlations between 0.89 and 0.96. Again, a small deterioration is evident for the Had models and there is no substantial improvement for the others.
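The two statistics quoted from the Taylor diagrams, spatial variability and correlation with observations, can be sketched as follows. This is an illustrative example only: it assumes the fields have been flattened to lists of grid-point values on a common grid and that all grid points are weighted equally (an area-weighted version would be the usual refinement); the function names and numbers are hypothetical.

```python
import math

def spatial_std(field):
    """Standard deviation of a field across grid points (1/N normalization)."""
    mean_value = sum(field) / len(field)
    return math.sqrt(sum((x - mean_value) ** 2 for x in field) / len(field))

def pattern_correlation(model, obs):
    """Centered correlation between model and observed fields on the same grid."""
    if len(model) != len(obs):
        raise ValueError("fields must be on the same grid")
    mean_model = sum(model) / len(model)
    mean_obs = sum(obs) / len(obs)
    covariance = sum(
        (m - mean_model) * (o - mean_obs) for m, o in zip(model, obs)
    ) / len(model)
    return covariance / (spatial_std(model) * spatial_std(obs))

# Hypothetical climatological precipitation (mm/day) at four grid points.
obs_field = [1.0, 2.0, 3.0, 4.0]
model_field = [1.0, 3.0, 2.0, 4.0]

corr = pattern_correlation(model_field, obs_field)
std_ratio = spatial_std(model_field) / spatial_std(obs_field)
print(round(corr, 3), round(std_ratio, 3))
```

In this toy case the model has the same spatial variability as the observations but misplaces two values, so the standard deviations match while the correlation drops below 1.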

3.2. Seasonal Sea Surface Conditions

The annual cycle of sea surface temperature (SST) and precipitation in observations and model simulations is shown in Figure 6 as winter-to-spring (December-May) and summer-to-fall (June-November) means. The Western Hemisphere Warm Pool (WHWP), where temperatures are equal to or greater than 28.5°C, is usually absent from December to February and appears in the Pacific from March to May, while it is present in the Caribbean and Gulf of Mexico from June to November (Wang and Enfield, 2001). The cooler part of the year is characterized by a small extent of SST in excess of 27°C and a suggestion of a cold tongue in the eastern equatorial Pacific, while during the warmer part of the year the extent of SST in excess of 27°C is at its maximum and the cold tongue is well defined over the eastern Pacific. High precipitation along the Mexican coasts, Central America, the Caribbean Islands and the central-eastern US is associated with the warm tropical SSTs during the warm half of the year. A decrease in regional precipitation south of the equator is also evident in this warm half of the year.

Models show the observed change in SST from cold to warm around the WHWP region. However, except for the Had models, the models are biased cool around the Pacific side of the WHWP region in the cooler part of the year, and this bias is accompanied by spurious precipitation over Mexico. Except for the CCSM3 model, the models are biased warm over the same region in the warm part of the year, and this bias is also accompanied by increased precipitation over Mexico. The eastern Pacific in the models is slightly cooler than observed in the cold part of the year, not in the form of a weak cold tongue from the Peruvian coast but rather as a confined equatorial cooling away from the coast. The cold tongue along the eastern equatorial Pacific and along the coast of Peru during the warmer part of the year is reasonably captured by the models, although it extends farther to the west. The exception is the HadCM3 model, which is warmer along the coast of Peru than both observations and the other models. There is a cool bias over the Intra-Americas Sea part of the WHWP in all models during the cold part of the year, as well as in CCSM3, GFDL-CM3 and HadGEM2-ES in the warm part of the year.

There is little improvement from CMIP3 to CMIP5 models, particularly in the cooler seasons. The CCSM4 model has improved by reducing the cool bias over the Intra-Americas Sea through the year, while the GFDL-CM3 and MPI-ESM-LR models have deteriorated by cooling over the same region in the cold part of the year. The warm bias in HadCM3 over the WHWP region has been considerably reduced in CMIP5; however, this reduction has induced a cold bias over the Intra-Americas Sea. Results from the CMIP3 and CMIP5 Had models are very close (not shown). Observed spatial variability is 3.9°C in the cold part of the year (Fig. 5, lower panels) and CMIP5 models tend to have spatial variability closer to observations than CMIP3, with marginal differences in their spatial correlations with observations. Observed spatial variability in the warm part of the year is 2.7°C and in general the CMIP3 models CCSM, GFDL and MPI are closer to this value than CMIP5, but with smaller spatial correlations.

3.3. Atmospheric Moisture Fluxes

Vertically integrated moisture transport (vectors) and its divergence (contours) are shown in Fig. 7 for four CMIP5 models and reanalysis estimates for mean JJA and DJF for 1981-2000. In summer, the observational estimates from 20CR show southerly transport from the North Atlantic anticyclone splitting into two distinct branches: one flanking the Atlantic seaboard, with large-scale convergence off the east coast, and a second flowing into the interior central plains, which is associated with convergence over the Rocky Mountains. The western U.S. is dominated by divergence associated with the northerly component of the North Pacific anticyclone. The four models capture the two branches of moisture transport, with associated convergence off the east coast and divergence in the plains, and they also simulate the divergence in much of the west, but they do not simulate the strong convergence over the Rockies and Mexican Plateau seen in 20CR. In winter, 20CR shows a more zonal transport than during summer, with weaker flow around the subtropical anticyclones and moisture convergence across much of the continent. The models capture both the moisture transport and divergence patterns well, including the stronger convergence in the Pacific Northwest and northern California and divergence in southern California.
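For reference, the quantities plotted are conventionally defined as below. This is the standard textbook form, given here as an assumption, since the text does not state the exact formulation or integration limits used for Fig. 7. Here $g$ is gravitational acceleration, $q$ specific humidity, $\mathbf{v}$ the horizontal wind, $p_s$ surface pressure, $p_t$ the upper integration limit, $E$ evapotranspiration and $P$ precipitation.

```latex
\mathbf{Q} = \frac{1}{g}\int_{p_t}^{p_s} q\,\mathbf{v}\,\mathrm{d}p,
\qquad
\nabla\cdot\mathbf{Q} \approx E - P
```

The approximate balance holds for long-term means, when the change in column water vapor is negligible, so regions of moisture flux convergence ($\nabla\cdot\mathbf{Q} < 0$) are regions where precipitation exceeds evapotranspiration.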

3.4. Seasonal Terrestrial Hydroclimate

Evaluations of the terrestrial hydroclimate are shown in Figures 8-10 against offline, observation-forced land surface model (LSM) simulations. Figure 8 shows the regional mean seasonal cycles of the components of the land surface water budget (precipitation, evapotranspiration, runoff, change in water storage). Water storage includes soil moisture, surface water (lakes, reservoirs and wetlands), groundwater and snowpack. Generally the climate models only simulate soil moisture and snowpack. Figure 8 also separates out the snow component of the water budget in terms of the snow water equivalent (SWE). Most models have a reasonable seasonal cycle of precipitation and evapotranspiration but tend to underestimate precipitation in the western regions and overestimate evapotranspiration throughout the year, especially in the cooler months. Runoff is generally underestimated, particularly in the central and eastern North American regions, and peaks earlier in the spring in some models (which can be linked to a shortened snow season; see below), although the models generally capture the spatial variability in annual total runoff (Figure 9). The majority of models overestimate total runoff over dry regions and high latitudes, particularly for the Pacific Northwest and Newfoundland. SWE is generally overestimated by the multi-model ensemble for western North America, underestimated in the east and overestimated in the Alaskan/Western Canada region, which reflects the precipitation biases.

Figure 10 shows the runoff ratio for the US (monthly runoff divided by precipitation), which indicates the production of water at the land surface that is subsequently potentially available as water resources. The remaining precipitation is partitioned into evapotranspiration (assuming that storage does not change much over long time periods). The observational estimate is output from the VIC LSM run within NLDAS2 (Xia et al., 2012), a high-resolution regional dataset driven by a hybrid observation-reanalysis forcing dataset. Overall, the models replicate the high ratios in the western and eastern parts of the US and the minimum in the central US. There is an overall tendency to underestimate runoff relative to precipitation, especially in the eastern US, as seen for the seasonal cycles in Figure 8, even relative to the model precipitation. Over the western US the models tend to overestimate the ratio. The standard deviation of the model errors indicates the largest spread across models in the lower Mississippi region.
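The runoff ratio and the implied partitioning into evapotranspiration can be written out as a short sketch. The regional totals below are hypothetical and are not taken from NLDAS2 or any CMIP5 model; the function names are illustrative only.

```python
def runoff_ratio(runoff, precipitation):
    """Runoff divided by precipitation over the same period and area."""
    if precipitation <= 0:
        raise ValueError("precipitation must be positive")
    return runoff / precipitation

def implied_et_fraction(runoff, precipitation):
    """Fraction of precipitation left for evapotranspiration, assuming
    no long-term change in water storage."""
    return 1.0 - runoff_ratio(runoff, precipitation)

# Hypothetical long-term totals (mm/year) as (runoff, precipitation).
regions = {
    "west": (300.0, 600.0),
    "central": (80.0, 800.0),
    "east": (400.0, 1000.0),
}

ratios = {name: runoff_ratio(r, p) for name, (r, p) in regions.items()}
et_fractions = {name: implied_et_fraction(r, p) for name, (r, p) in regions.items()}
print(ratios)
```

The made-up numbers mimic the pattern described in the text, with higher ratios in the west and east and a minimum in the central region.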

3.5. Temperature Extremes and Biophysical Indicators

Temperature extremes have important consequences for many sectors, including human health, ecosystem function, and agricultural production. We evaluate the models' ability to replicate the observed spatial distribution of the frequency of extremes for the number of summer days with Tmax > 25°C (Figure 11) and the number of frost days with Tmin < 0°C (Figure 12) (Frich et al., 2002). Overall, the models tend to underestimate the number of summer days by over 50 days in the western US and Mexico, and parts of the eastern US, but otherwise are within 20 days of the observations. Two of the models (HadGEM2-ES and CCSM4) overestimate the number of summer days over the central provinces of Canada, with CCSM4 having biases of more than 40 days. The number of frost days is better simulated, but there is a high bias for all models across the Canadian Rockies, extending into the US Rockies for some models. Some of the models are biased low in the central US by up to 50 days.

We also calculated a set of biophysical indicators related to temperature: spring and fall freeze dates and growing season length. We define the growing season length following Schwartz et al. (2006) as the number of days between the last spring freeze of the year and the first hard freeze of the autumn in the same year. A hard freeze is defined as occurring when the daily minimum temperature drops below -2°C. Figure 13 shows the average growing season length for seven models in terms of the difference from the HadGHCND dataset. The models do a reasonable job of depicting the spatial distribution of growing season length. The largest biases, of about 40-50 days, are along coastal regions and may be related to differing resolutions in the underlying data. There is a tendency for the models to have too short a growing season in western Canada and too long a growing season in the central US. The former arises mainly because the last spring freeze is too late in western Canada, and the latter because of biases in both the last spring freeze (too early) and the first autumn freeze (too late).

3.6. Hydroclimate Extremes

We examine the ability of CMIP5 models to simulate persistent drought and wet spells in terms of precipitation and soil moisture (SM). Meteorological drought and wet spells are characterized by the 6-month Standardized Precipitation Index (SPI6; McKee et al., 1993). Agricultural drought and wet spells are evaluated in terms of soil moisture percentiles (Mo, 2008). The record length, Ntotal, is defined as the total number of months from all runs of a given model experiment or the total number of months of the observed data set. At each grid point, a dry (wet) extreme event is identified when the SPI6 index is below -0.8 (above 0.8) (Svoboda et al., 2002). For SM percentiles, the threshold is 20% (80%) for a dry (wet) event. These thresholds mean that, at each grid cell, the number of months in which extreme events occur (N) is 20% of the record length (N/Ntotal = 20%). Because a persistent drought event (wet event) usually means persistent dryness (wetness), a drought (wet) episode is selected when the index is below (above) this threshold for 3 consecutive seasons (9 months) or longer. The frequency of occurrence of persistent drought or wet spells (FOC) is defined as FOC = Np/N, where Np is the number of months in which an extreme event persists for 9 months or longer.
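The FOC definition can be made concrete with a small sketch. It rests on one interpretive assumption that should be checked against the original analysis: Np is counted as the number of months belonging to runs of at least nine consecutive extreme months. The function name and the toy SPI6 series are hypothetical.

```python
def persistent_event_frequency(index, threshold, wet=False, min_run=9):
    """Return FOC = Np / N for one grid point.

    N  : number of months beyond the threshold (extreme months).
    Np : number of those months that fall in runs of at least
         min_run consecutive extreme months (assumed interpretation).
    """
    if wet:
        extreme = [value > threshold for value in index]
    else:
        extreme = [value < threshold for value in index]

    n_extreme = sum(extreme)
    if n_extreme == 0:
        return 0.0

    n_persistent = 0
    run_length = 0
    # The trailing False acts as a sentinel that closes a run ending
    # at the last month of the record.
    for is_extreme in extreme + [False]:
        if is_extreme:
            run_length += 1
        else:
            if run_length >= min_run:
                n_persistent += run_length
            run_length = 0
    return n_persistent / n_extreme

# Toy SPI6 series: a 10-month dry spell, a break, then a 2-month dry spell.
spi6 = [-1.0] * 10 + [0.0] * 5 + [-1.0] * 2 + [0.0] * 3

foc_dry = persistent_event_frequency(spi6, threshold=-0.8)
foc_wet = persistent_event_frequency(spi6, threshold=0.8, wet=True)
print(foc_dry, foc_wet)
```

Here 12 months are extreme on the dry side and 10 of them belong to a persistent spell, so the dry FOC is 10/12; no month exceeds the wet threshold, so the wet FOC is zero.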


Figure 14a shows the FOC averaged for persistent wet and dry events for SPI6. The east-west contrast, driven by the gradient in precipitation amount and variability, is evident (Mo and Schemm, 2008). Persistent drought and wet spells are more likely to occur over the western interior region, while extreme events are less likely to persist over the eastern US and the west coast. The maxima of the FOC are located in two bands, one over the mountains and one extending from Oregon to Texas. Persistent events are also found over the Great Plains. The CanESM2 and CCSM4 models capture the east-west contrast, although the magnitudes of FOC are too weak for the CanESM2 model. The MPI-ESM-LR also captures the signal, with one maximum located over Utah and another over the Great Plains, but the second maximum is too strong. The MIROC4h is a high-resolution model, so its precipitation field is very noisy, but it does show larger values over the western interior region. The MIROC-ESM, MRI-CGCM3, and NorESM1-M models all show a band of maxima over the Southwest, but the FOC north of 35°N is too weak. Other models, such as CSIRO-Mk3.6.0, IPSL-CM5A-LR, CNRM-CM5, and GISS-E2-H, have the maxima located over the Gulf region, which is too far south. Finally, the HadGEM2 and HadCM3 models do not have enough persistent events (not shown).

For SM (Figure 15), the FOC from the NLDAS-UW shows that persistent anomalies are located west of 90°W over the western interior region. Many models, such as the IPSL-CM5A-LR and BCC-CSM1.1, do not have enough persistent events. The CanESM2, GISS-E2-H, and MRI-CGCM3 shift the maxima to the central US. The CCSM4, MIROC4h, and NorESM1-M fail to capture any east-west contrast because of their high FOC values throughout most of the US. The model that has the best FOC for
SM is the MPI-ESM-LR model as it captures the east-west contrast and also has realistic magnitudes.

4. Regional Climate

4.1. Cool Season Western Atlantic Extratropical Cyclones

Extratropical cyclones can have major impacts (heavy snow, storm surge, winds, flooding) along the east coast of North America given the proximity of the western Atlantic storm track. The Hodges (1994; 1995) cyclone tracking scheme was implemented to track cyclones in the CMIP5 models for the cool seasons (November to March) for 1979-2004. The CFSR reanalysis was used to estimate observed cyclone tracks. Six-hourly mean sea-level pressure (MSLP) data were used to track the cyclones, since it was found that including 850-hPa vorticity tracking yielded too many cyclones. Since MSLP is strongly influenced by large spatial scales and strong background flows, a spectral bandpass filter was used to preprocess the data. Wavelengths between 600 and 10,000 km were kept, and the MSLP anomaly had to persist for at least 24 hours and move at least 1000 km. Colle et al. (2012) describe the details of the tracking approach and some validation of the tracking procedure.

Figure 16 shows the cyclone density during the cool season for the CFSR, the mean and spread of 15 CMIP5 models (see Colle et al., 2012 for a complete listing), and selected CMIP5 models for eastern North America and the western and central North Atlantic. There are maxima in cyclone density in the CFSR over the Great Lakes, over the western Atlantic from just east of the Carolinas northeastward to just east of Canada, and just
east of southern Greenland (Figure 16a). The largest maximum over the western Atlantic (6-7 cyclones per cool season per 50,000 km²) is located along the northern boundary of the Gulf Stream. The multi-model ensemble mean realistically simulates the locations of the three separate maxima (Figure 16b), but under-predicts the amplitude by 10-20%. The cyclone density maximum over the western Atlantic does not conform to the boundary of the Gulf Stream as closely as observed. There is a large spread in cyclone density near the Gulf Stream, since some models, such as the CESM and HadGEM2-CC (Figures 16e,f), simulate the western Atlantic density amplitude better than others. However, the CESM maximum is shifted a few hundred kilometers to the north. Colle et al. (2012) show that the higher resolution models are able to better represent the cyclone density patterns, while lower resolution models, such as GFDL-ESM2M and MPI-ESM-LR (Figures 16c,d), under-predict the cyclone density.

The distribution of cyclone central pressures at their maximum intensity was also compared between the CFSR, the CMIP5 ensemble mean, and various CMIP5 models for the dashed box region in Figure 16b (Figure 17). There is a peak in cyclone intensity in both the CFSR and the CMIP5 mean around 900-1000 hPa, and there is a large spread in the CMIP5 intensity distribution, by almost a factor of two. The ensemble mean realistically predicts the number of average-strength to relatively weak cyclones; however, the intensity distribution is too narrow compared to the CFSR, especially for the deeper cyclones (< 980 hPa). Colle et al. (2012) illustrate that the higher resolution CMIP5 models can better predict weaker cyclones, with no under-prediction on average; however, most of the higher resolution models still under-predict the deeper cyclones. Nevertheless, there is no
under-prediction for these relatively deep cyclones closer to the US East Coast in the mean of the higher resolution models (Colle et al., 2012).

4.2. Northeast Precipitation

In a warmer climate, precipitation is expected to increase globally since the amount of water vapor will increase. However, there will be regional variations, so it is important to understand these smaller scale changes, especially in highly populated regions such as the Northeast US. The focus of this analysis is the cool season, since extratropical cyclones provide much of the heavy precipitation in the Northeast. The nine models listed on Figure 18d were evaluated for the cool seasons (November to March) of 1979-2004. The daily precipitation of each model was summed individually and averaged, and then compared with the CPC-Unified precipitation at 0.5 degree resolution and CMAP at 2.5 degree resolution.

Figures 18a-c show the seasonal average precipitation for the two CPC analyses and the model mean and spread. The heaviest precipitation (700-1000 mm) is over the Gulf Stream and is associated with the western Atlantic storm track. This maximum is well depicted in the multi-model mean, although it is under-predicted by 50-200 mm, and there is a large spread between members (300-450 mm). The precipitation over the Northeast US ranges from 375 mm over northern and western portions to around 500 mm at the coast. The CPC-Unified analysis has more variability downstream of the Great Lakes (lake effect snow) as well as some terrain enhancements. The models cannot resolve these smaller scale precipitation features, but the mean realistically represents the north to south variation. However, the mean over-predicts precipitation by 25-75 mm (5-20%) over northern parts of the Northeast. Much of this over-prediction is for thresholds greater than 5
mm/day over land (Fig. 18d). The seasonal precipitation spread over the Northeast is 100-150 mm (25-40%), and much of this spread is reflected in the higher (> 10 mm/day) thresholds, with the BCC-CSM1 predicting less than the CPC-Unified analysis, and a cluster of models, such as the INMCM4 and MIROC5, predicting many more heavy precipitation events than observed.

4.3. Extreme Temperature and Rainfall over the Southern US

The southern US is historically prone to extreme climate events such as extreme summer temperatures, floods and dry spells. Previous CMIP and US climate impact assessments (Karl et al. 2009) have projected a large increase of these extreme events over regions of the south (southwest (SW), south central (SC), southeast (SE)), especially for the SW and SC US. However, the extent to which climate models can adequately capture the statistical distributions of these extreme events over these regions is still unclear. Figure 19 illustrates the SW, SC and SE US domains we used in our analysis.

Figure 20 compares the probability density functions (PDFs) of surface daily maximum temperature (Tmax), daily minimum temperature (Tmin) and precipitation intensity over the SE, SC and SW US, respectively, between nine selected CMIP5 models and observations derived from GHCN daily Tmax and Tmin and CPC US-Mexico daily gridded rainfall. Over the SE (Fig. 20a, top), the CCSM4 and MIROC5 models best capture the distribution of Tmax, while other models tend to underestimate the most frequently occurring Tmax values. In winter (DJF), spring (MAM) and fall (SON), the biases are mainly caused by a shift of the PDF toward colder temperatures by 9 K, whereas during summer (JJA), the biases are mainly due to the unrealistically skewed shape of the PDF, i.e.,
underestimation of the occurrence of warmer than normal and extreme Tmax values, and overestimation of the occurrence of colder than normal and extreme Tmax values. By contrast, these models tend to overestimate Tmin (Fig. 20b, top). During spring and fall, this bias results mainly from a shift of the PDF toward warmer temperatures by about 9 K, except for GFDL-CM3. During winter and summer, the biases are mainly due to overestimation of the frequency of occurrence of warmer than normal and warm extreme Tmin values, and underestimation of the occurrence of cooler than normal Tmin values. HadCM3, HadGEM2 and MIROC5 have the smallest such biases. Except for HadGEM2 and MIROC5, the models generally underestimate extreme rain rates and overestimate moderate and heavy rain rates, especially during the summer (Fig. 20c, top).

Over the SC (Fig. 20, middle), most of the models have PDFs skewed toward underestimation of the occurrence of warmer Tmax and overestimation of the occurrence of cooler Tmax during winter, spring and fall (Fig. 20a, middle). During the summer, CCSM4 and HadGEM2 capture the PDF of Tmax well. The rest of the models underestimate warmer Tmax and overestimate cooler Tmax, especially the coolest Tmax values, except for MIROC5, which overestimates the occurrence of warmer Tmax and underestimates cooler Tmax, including extreme warm and cold Tmax. Most of the models overestimate the occurrence of warmer Tmin and underestimate cooler Tmin (Fig. 20b, middle). Overall, HadGEM2 and HadCM3 have the smallest biases in Tmin over this region. HadGEM2 generally captures the observed precipitation rate PDF (Fig. 20c, middle). All other models underestimate the heavy to extreme precipitation rates and overestimate moderate precipitation rates.


Over the SW (Fig. 20, bottom), the nine selected models generally underestimate Tmax during winter, spring and fall, mainly due to a shift of the PDFs toward cooler Tmax (Fig. 20a, bottom). This shift causes a substantial underestimation of warmer Tmax, including extreme warm Tmax. During the summer, CCSM4 and MIROC5 realistically capture the PDF of Tmax, whereas other models overestimate colder than normal Tmax and underestimate warmer than normal Tmax, except for HadCM3 and HadGEM2. These two models overestimate warmer Tmax and underestimate cooler Tmax. The models underestimate the spread of Tmin during winter and spring (Fig. 20b, bottom), and have warm biases during summer and fall, except for HadCM3, which has a cold bias in Tmin. All nine models substantially underestimate the occurrence of heavy and extreme rainfall. Such biases are strongest in summer, but also occur in spring and fall (Fig. 20c, bottom). In winter, the majority of models realistically capture the distribution of rain rates, except for MRI-CGCM3 and CCSM4, which overestimate the occurrence of extreme rainfall (> 50 mm/day).

Fig. 20 suggests that the models generally underestimate Tmax, overestimate Tmin, and underestimate extreme rain rates over all three southern US regions. They also overestimate moderate to heavy rain rates over the SE and SC US. MIROC5 best captures the distributions of all three surface climate variables over the SE US, whereas HadGEM2 overall best captures these distributions over the SC US. Over the SW US, no model adequately captures the distributions of all three surface climate variables.

How well the models capture the statistical distributions of severe to extreme droughts over the southern US is evaluated in terms of the PDFs of the six- and nine-month standardized precipitation index (SPI6, SPI9) in Figure 21. SPI6 and SPI9
represent multi-seasonal droughts that have great societal impacts. Most of the models closely capture the occurrence of severe to exceptional drought (SPI < -1.5), except for CCSM4, which overestimates the frequency of exceptional droughts (SPI < -2) and underestimates the frequency of moderate to extreme droughts (-2 < SPI < -1). CCSM4 also overestimates the occurrence of mild to moderate wet anomalies and underestimates the occurrence of severe wet anomalies. The modeled PDFs of SPI9 show biases similar to those of SPI6.

4.4. North American Monsoon

The North American Monsoon (NAM) brings rainfall to southern Mexico in May, expanding northward to the Southwest US by late June or early July. Monsoon rainfall accounts for roughly 50-70% of the annual totals in these regions (Douglas et al. 1993; Adams and Comrie 1997), with the annual percentages decreasing northward where winter rains become increasingly important. The annual cycle of precipitation in the NAM region is examined in Figure 22. The multi-model ensemble mean monthly precipitation from sixteen CMIP5 models (averaged over longitudes 102.5°-115°W for 1979-2005) captures the northward migration of precipitation in the NAM region during the warm season. The models' precipitation (Figure 22b) begins later and is weaker than the observed estimate from CMAP (Figure 22a).

The seasonal cycle of monthly precipitation in the core NAM region of northwest Mexico (23°-30°N, 105°-110°W) is also examined in Figures 23 and 24 for 21 CMIP5 models. Following the methodology of Liang et al. (2008) for analysis of CMIP3 data, we calculate a phase and RMS error of each model's seasonal cycle, where the phase
error is defined as the lag in months with the best correlation to the observations (Fig. 23a-b). We additionally calculate each model's annual mean bias (Fig. 23c). The seasonal cycles for models with small (lag=0), moderate (lag=1) and large (lag=2-4) phase errors are shown in Figure 24. The dashed gray line in Figure 24a shows the mean of all the models with a small phase error. Overall, the small phase error models tend to fall in between the two observational data sets (P-NOAA and CMAP) in summer but tend to overestimate rainfall compared to these observations in fall and winter in the core NAM region (Fig. 23a). An exception is the CanESM2 model, which tends to underestimate monsoon rainfall but shows good agreement in the winter. The overestimation of rainfall by the models beyond the end of the monsoon season extends from the southernmost to the northernmost latitudes of the NAM region (Fig. 22c) and was also apparent in the small and large phase error CMIP3 models (Liang et al. 2008). Four models with small phase errors for CMIP5 also have small phase errors for CMIP3 (Liang et al. 2008): CCCMA, Had, MIROC, and CSIRO. In addition, the CNRM model reduced its phase error and halved its RMS error when compared to the CMIP3 generation model, making it one of the better CMIP5 models for capturing seasonal variability in the core NAM region.

The moderate phase error models show a maximum in precipitation during the summer months; however, a subset of these models also shows the greatest overestimation of rainfall from July through January of all 21 models examined (Fig. 24b). The large phase error models all indicate a maximum in rainfall during late fall or winter (Fig. 24c). The CMIP3 versions of INM and the GISS/EH also had large phase errors similar to what is shown in Fig. 24c, while the MIROC Hires and Medres CMIP3
models were small and medium phase error models (Liang et al. 2008) and thus best compare to the MIROC4h and MIROC5 models for CMIP5.
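The three diagnostics used in this section (phase error, RMS error, and annual mean bias of a 12-month seasonal cycle) can be sketched as follows. This is a minimal illustration rather than the procedure of Liang et al. (2008): it assumes a circular shift of the monthly climatology, Pearson correlation as the measure of agreement, and a sign convention in which a positive lag means the model cycle is late. All function names and the example precipitation values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def phase_error(model, obs):
    """Lag (months) at which the shifted model cycle best correlates with obs.

    Positive lag: the model cycle is late relative to the observations.
    Lags are tried in order of increasing magnitude so that ties favor
    the smaller shift.
    """
    best_lag, best_r = 0, -2.0
    for lag in sorted(range(-5, 7), key=abs):
        # Shift the model cycle earlier by `lag` months (circularly).
        shifted = [model[(m + lag) % 12] for m in range(12)]
        r = pearson(shifted, obs)
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag, best_r


def rms_error(model, obs):
    """Root-mean-square difference between the two seasonal cycles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(model, obs)) / len(obs))


def annual_mean_bias(model, obs):
    """Difference of the annual means (model minus observations)."""
    return sum(model) / len(model) - sum(obs) / len(obs)


# Made-up monsoon-like climatology (mm/month, January to December).
obs_cycle = [5, 5, 5, 5, 10, 40, 120, 110, 70, 30, 10, 5]
# Hypothetical model whose cycle is identical but two months late.
late_model = [obs_cycle[(m - 2) % 12] for m in range(12)]

lag, corr = phase_error(late_model, obs_cycle)
```

In this example the two-month delay is recovered as a lag of 2, while the annual mean bias is zero because the shifted cycle has the same annual total; the RMS error is nonzero because it is computed without shifting.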

4.5. Great Plains Low Level Jet

An outstanding feature of the warm season (May-September) circulation in North America is the strong and channeled southerly low-level flow, or Great Plains low-level jet, from the Gulf of Mexico to the central U.S. and the Midwest (Bonner and Paegle 1970; Mitchell et al. 1995). The jet emerges in early May during the transition of the circulation from the cold to the warm season. It reaches its maximum strength in June and July. After August, the jet weakens, and it disappears in September when the cold season circulation starts to set in. While many studies have examined specific processes associated with the low-level jet (Blackadar, 1957; Wexler, 1961; Holton, 1967), such as its nocturnal peak in the diurnal wind speed oscillation, the jet is part of the seasonal circulation shaped primarily by the orographic configuration of North America, particularly the Rocky Mountain Plateau to the west and the North Atlantic Ocean and the Gulf of Mexico, with the subtropical high pressure system, to the east and the south (e.g., Wexler, 1961; Veres and Hu, 2012). An important climatic role of the low-level jet (LLJ) is to transport moisture from the Gulf of Mexico to the central and eastern US (Benton and Estoque, 1954; Rasmusson, 1967; Helfand and Schubert, 1995; Byerle and Paegle, 2003). Because the moisture is essential for the development of precipitation, correctly describing the LLJ and its seasonal cycle is critical for simulating and predicting warm season precipitation and climate in North America.


Outputs from five models (CCSM4, HadGEM2-CC, GFDL-CM3, GISS-E2-R, and MPI-ESM) were analyzed for the LLJ during boreal summer (June, July, and August). Figure 25 shows the 925 hPa winds simulated by the models and derived from the NCEP-NCAR reanalysis data, and suggests that the higher resolution models (CCSM4, HadGEM2-CC, and MPI-ESM) capture the major features of the LLJ in both its intensity and horizontal structure. The vertical structures of the jet simulated by these models (not shown) are also consistent with the reanalysis. When the simulated seasonal average winds are broken into monthly averages, the models' results remain consistent with those from the reanalysis. The jet simulated by the relatively coarse resolution models (GFDL-CM3 and GISS-E2-R), however, compares less favorably with the reanalysis. The jet is virtually absent in GISS-E2-R (Fig. 25d). These coarse resolution models also describe a weak or absent core of strong southerly flow in the vertical profile of the lower tropospheric winds. These results suggest that the accuracy in simulating the LLJ is affected by the model horizontal resolution, which directly affects the orographic role in LLJ development (Byerle and Paegle, 2003).

Comparison of the seasonal cycle of the LLJ is summarized in Figure 26. Again, the higher resolution models describe the transition and the establishment and collapse of the LLJ fairly consistently (Figs. 26b, e and f), and the coarser models have difficulty in capturing the meridional scale and the intensity of the LLJ during the summer months. In GISS-E2-R, the LLJ peaks in intensity in April and May at a more southerly latitude and collapses in June, nearly three months earlier than in the reanalysis. The consistent simulation of the transition and seasonal cycle of the LLJ by the higher resolution models suggests that
they are able to describe the circulation and climate of North America, but the ability of the GISS-E2-R model to do the same is questionable.

4.6. Arctic/Alaska Sea Ice

Since routine monitoring by satellites began in late October 1978, Arctic sea ice has declined in all calendar months (e.g. Serreze et al., 2007). Trends are largest at the end of the summer melt season in September, with a current rate of decline through 2011 of -12.9% per decade. Regionally, summer ice losses have been pronounced in the Beaufort, Chukchi and East Siberian seas since 2002, causing a lengthening of the ice-free season. The presence of sea ice helps to protect Alaskan coastal regions from wind-driven waves and warm ocean water that can weaken frozen ground. As the sea ice has retreated further from coastal regions, and ice-free summer conditions are lasting for longer periods of time (in some regions by more than 2 months during the satellite data record), wind-driven waves, combined with permafrost thaw and warmer ocean temperatures, have led to rapid coastal erosion (Mars and Houseknecht, 2007; Jones et al., 2009).

While the winter ice cover is not projected to disappear in the near future, all models that contributed to the IPCC 2007 report showed that as temperatures rise, the Arctic Ocean will eventually become ice-free in summer (e.g. Stroeve et al., 2007). However, estimates differed widely, with some models suggesting that a transition towards a seasonally ice-free Arctic may happen before 2050, and others suggesting sometime after 2100. To reduce the spread, some studies suggest using only models that are able to reproduce the historical sea ice extent (e.g. Overland et al., 2011; Wang and Overland, 2009).


Historical sea ice extent (1979-2005) from 18 climate models during September and March is presented as box and whisker plots (Figure 27), constructed from all ensemble members of all models. Three climate models (CanESM2, GISS-E2-R and MIROC4h) have mean September extents that fall below the minimum observed value, with GISS-E2-R and CanESM2 having more than 75% of their extents below the minimum observed value. Only one model (NorESM1) has more than 75% of its extents above the maximum observed value. Overall, 11 climate models have mean extents below the observed 1979-2005 mean September extent. During March, several models fall outside the observed range of extents, with 11 models having more than 75% of their extents outside the observed maximum and minimum values (6 above, 5 below). Four models essentially straddle the mean observed March sea ice extent.

Spatial maps of March and September CMIP5 sea ice thickness averaged from 1996 to 2005 are shown in Figure 28 together with the ICESat-derived thickness from 2003 to 2008. While we do not expect the models to be in phase with the observed natural climate variability, and therefore to accurately capture the magnitude of the ICESat thickness fields, it is important to assess whether or not the models are able to reproduce the observed spatial distribution of ice thickness. Data from ICESat, as well as earlier radar altimetry missions and submarine tracks, indicate that the thickest ice is located north of Greenland and the Canadian Archipelago (> 5 m thick). Only a few of the models show a similar spatial distribution of ice thickness (e.g. CCSM4, CESM1-CAM5, GFDL-CM3, HadGEM2-CC, HadGEM2-ES, MIROC5, NorESM1). Instead, several models show a ridge of thick ice that spans north of Greenland across the Lomonosov Ridge towards the East Siberian Shelf, with thinner ice in the Beaufort/Chukchi and the
Kara/Barents seas. Many models also significantly underestimate the observed mean ice thickness of the Arctic basin, with basin-wide averaged thickness less than 2 m in March and less than 1 m in September (e.g. BCC-CSM1, CanESM2, CNRM-CM5, GFDL-ESM2M, MIROC-ESM). This in part explains the low bias in September ice extent for some of the models, as thinner ice is more prone to melting out in summer. Models with extensive thick winter ice (e.g. NorESM1-M and MIROC5), on the other hand, tend to overestimate the observed September ice extent.

Figure 29 shows the trends in September sea ice extent from 1979-2005 based on 45 ensemble members. Trends are derived using linear least squares and are reported in units of 10^6 km² per decade. Corresponding error bars are derived after taking into account temporal autocorrelation following Stroeve et al. (2012). The observed rate of decline for September is -0.594 × 10^6 km² per decade. Results show that 29 (or 64%) of the ensemble members have mean trends smaller than the observed mean trend and 12 ensemble members (or 27%) do not have trends that are statistically different from zero. Twelve ensemble members have mean trends larger than observed, with 3 ensemble members having mean trends below the 2σ bound of the observations (from CNRM-CM5, GISS-E2-R, and MIROC5). Thirteen ensemble members have mean trends smaller than the 2σ bound of the observations. As in Stroeve et al. (2012), a d-statistic is calculated to test the null hypothesis that the trend from any given ensemble member is consistent with the observed trend. From 1979-2005, the null hypothesis is rejected for 78% of the ensemble members at the 90% confidence level, and for 11% at the 95% confidence level. The multi-model ensemble mean trend is -0.484 × 10^6 km² per decade, which is not statistically different from
zero. For comparison, the CMIP3 multi-model ensemble mean trend from 1979 to 2005 is -0.365 × 10^6 km² per decade.
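The trend calculation described above can be illustrated with a short Python sketch that fits a least-squares line to an annual September extent series and inflates the slope's standard error for temporal autocorrelation. This is a sketch under stated assumptions, not the procedure of Stroeve et al. (2012): the adjustment shown is the common lag-1 effective-sample-size correction, n_eff = n(1 - r1)/(1 + r1), applied only when r1 is positive, and the function name `trend_with_ar1` and the synthetic series are hypothetical.

```python
import math


def trend_with_ar1(years, extent):
    """Least-squares trend per decade with an AR(1)-adjusted standard error.

    Returns (slope_per_decade, std_error_per_decade, r1), where r1 is the
    lag-1 autocorrelation of the residuals, clipped at zero (assumption:
    only positive autocorrelation reduces the effective sample size).
    """
    n = len(years)
    mean_t = sum(years) / n
    mean_x = sum(extent) / n
    sxx = sum((t - mean_t) ** 2 for t in years)
    sxy = sum((t - mean_t) * (x - mean_x) for t, x in zip(years, extent))
    slope = sxy / sxx
    intercept = mean_x - slope * mean_t

    resid = [x - (intercept + slope * t) for t, x in zip(years, extent)]
    ss_res = sum(e * e for e in resid)
    if ss_res == 0.0:
        # Perfect fit: no residual variance and no autocorrelation to estimate.
        return 10.0 * slope, 0.0, 0.0

    r1 = sum(a * b for a, b in zip(resid[:-1], resid[1:])) / ss_res
    r1 = max(r1, 0.0)
    n_eff = n * (1.0 - r1) / (1.0 + r1)
    if n_eff <= 2.0:
        std_error = math.inf  # too few effective samples for an estimate
    else:
        std_error = math.sqrt(ss_res / (n_eff - 2.0) / sxx)
    return 10.0 * slope, 10.0 * std_error, r1


# Synthetic series (10^6 km^2): a decline of 0.06 per year plus alternating noise.
years = list(range(1979, 2006))
extent = [7.0 - 0.06 * (t - 1979) + (0.1 if (t - 1979) % 2 == 0 else -0.1)
          for t in years]

slope_dec, se_dec, r1 = trend_with_ar1(years, extent)
```

A trend would then be judged different from zero by comparing `slope_dec` with a multiple of `se_dec`; the exact test and the d-statistic used in the text are not reproduced here.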

5. Discussion and Conclusions

We have evaluated the CMIP5 multi-model ensemble for its depiction of North American continental and regional climatology. The multi-model ensemble does reasonably well in representing the main features of precipitation over North America and the adjoining seas. Regionally, however, there is large spread in individual model performance. For example, most models have too much precipitation in the continental interior, and the multi-model mean has too much spread in the location of East Coast storm tracks. For some regions, a few outlier models provide much of the bias in the multi-model mean. Similar to precipitation, the multi-model ensemble reproduces the spatial distribution of temperature reasonably well and is generally within 1°C of observations. In particular, the multi-model mean does well at representing regional features such as the southerly regions with mean summertime temperatures in excess of 30°C, the extension of temperatures above 10°C into the Canadian prairies, and the wintertime 0°C contour. The models show a reasonable depiction of the spatial distribution and seasonal changes in sea surface temperatures, but with certain regional features that are biased warm or cool, such as a warm bias over the WHWP, which is associated with too much precipitation over Mexico. The small subset of models analyzed is able to replicate the main features of atmospheric moisture transport into the North American continent, with associated convergence off the east coast and divergence in the central plains and most of the west. However, they do not simulate well
the strong convergence over the Rockies and Mexican Plateau. The models have a reasonable seasonal cycle of terrestrial hydrology, but the regional biases in precipitation filter down into biases in snow accumulation and runoff. In particular, the models tend to underestimate precipitation in western regions and overestimate evapotranspiration in the cooler months. The spatial distribution of runoff is reasonable, but runoff is generally underestimated relative to observations and relative to model precipitation.

The models do reasonably well at capturing the spatial distribution of seasonal temperature extremes, as characterized by the number of summer days and the number of frost days, and the timing of biophysical indicators such as the growing season length. The frequency of summer days is too low in western regions and the frequency of frost days is too high in the western mountains for most models. The models' growing season is generally too long overall, with the largest positive biases (up to 40-50 days) in the central US and coastal regions, but is too short in western Canada. The models show quite different skill in simulating the frequency of occurrence of persistent hydroclimate anomalies. Some models are able to capture the east-west contrast in precipitation events, and these models also have a realistic precipitation climatology. Interestingly, even if a model captures the frequency of occurrence for precipitation events, it may not capture that for soil moisture. As well as being able to simulate large-scale circulation anomalies, the model must also have a realistic land surface model, including soil properties and vegetation, that captures the feedbacks between precipitation and evaporation.

We also analyzed a set of regional climate features that potentially could provide a difficult test of coarse resolution models. The multi-model mean captures the cool season cyclone density over the western North Atlantic but under-predicts the magnitude
by 10-20%, which can be attributed partly to model resolution. All models tend to under predict the frequency of strong cyclones. The increase in precipitation towards the Northeast US coast associated with the storm track is realistically simulated by the models, although there is an over prediction over northern New England and southeast Canada and for higher thresholds in most models. Over the southern US, the models tend to underestimate the frequency of the hottest temperature maximums in summer and overestimate the frequency of cooler temperature maximums. Most models underestimate heavy rainfall over the southern US, although a few models, in particular HadGEM2, do reasonably well. The observed frequency of longer-term dry and wet spells, as represented by SPI6 and SPI9 indices, is well simulated by most models. The North American monsoon is generally later and underestimated in terms of precipitation for the multi-model mean. In the core monsoon region, models with no phase error in the seasonal cycle of precipitation tend to over estimate rainfall in the fall and winter but capture the peak rainfall during the monsoon months. Models with phase errors of one month tend to overestimate precipitation throughout most of the year, and there are a few models that peak later in the fall or early winter which is up to 4 months too late. Overall, onset is captured well by several models but almost all have difficulty ending the monsoon, a problem also observed with CMIP3 (e.g. Liang et al. 2008). The five models evaluated in terms of the Great Plains LLJ during summer time capture its main features, with the accuracy of the northerly extension and intensity related to the model resolution. There is a tendency for the models to under predict sea ice extent in September and for most models to fall outside of the range of observations in May. Trends in September sea ice extent was examined for 1979-2005 and showed that trends from most models 35


underestimate the observed rate of decline. The multi-model mean trend is not statistically different from zero. Overall, the performance of the CMIP5 models in representing observed climate features has not improved dramatically compared to CMIP3, at least for the set of models and climate features analyzed here. Some models have improved for certain features (e.g. the timing of the NAM, the maximum precipitation over the Pacific Northwest in winter, or the WHWP in the warm part of the year), but others have become worse (e.g. the summer minus winter difference in surface air temperature, or the cold tongue in equatorial Pacific SST in winter). For sea ice extent, the spread in when a seasonally ice-free Arctic may be realized is essentially the same as in CMIP3 (Stroeve et al., 2012), so the CMIP5 models have not reduced this uncertainty. Model performance for basic climate variables has not improved. An outstanding issue with the models is their tendency to place a spurious maximum of precipitation west of the 100°W meridian over the US in summer. In spite of the problems in simulating certain regional features of these basic climate variables, some CMIP5 models seem to have improved the spatial variability of precipitation over North America; however, the improvement for surface air temperature and SST over neighboring oceans is not as evident, given that the CMIP3 models already did well at reproducing their spatial statistics. The results of this paper have implications for the robustness of future projections of climate and its associated impacts. Part three of this paper (Maloney et al., 2012) evaluates the CMIP5 models for N. America in terms of the future projections for the same set of climate features as evaluated for the 20th century in this first part and the second part of the paper (Sheffield et al., 2012). Whilst model historical performance is


not sufficient for credible projections, the depiction of at least large scale climate features is necessary. Overall, the models do well in capturing the broad scale climate of N. America and some regional features, but biases in some aspects are of the same magnitude as the projected changes (Maloney et al., 2012). For example, the low bias in daily maximum temperature over the southern US in some models is similar to the future projected changes. Furthermore, the uncertainty in the future projections across models can also be of the same magnitude as the model spread for the historical period.

Acknowledgements. We acknowledge the World Climate Research Programme's Working Group on Coupled Modelling, which is responsible for CMIP, and we thank the climate modeling groups for producing and making available their model output. For CMIP the U.S. Department of Energy's Program for Climate Model Diagnosis and Intercomparison provides coordinating support and led development of software infrastructure in partnership with the Global Organization for Earth System Science Portals. The authors acknowledge the support of the NOAA Climate Program Office Modeling, Analysis, Predictions and Projections (MAPP) Program as part of the CMIP5 Task Force.


References
Adler, R. F., G. J. Huffman, A. Chang, R. Ferraro, P. Xie, J. Janowiak, B. Rudolf, U. Schneider, S. Curtis, D. Bolvin, A. Gruber, J. Susskind, and P. Arkin, 2003: The Version 2 Global Precipitation Climatology Project (GPCP) Monthly Precipitation Analysis (1979-Present). J. Hydrometeor., 4, 1147-1167.
Arora, V. K., J. F. Scinocca, G. J. Boer, J. R. Christian, K. L. Denman, G. M. Flato, V. V. Kharin, W. G. Lee, and W. J. Merryfield, 2011: Carbon emission limits required to satisfy future representative concentration pathways of greenhouse gases. Geophys. Res. Lett., 38, L05805, doi:10.1029/2010GL046270.
Bao, Q., and co-authors, 2012: The Flexible Global Ocean-Atmosphere-Land System model Version: FGOALS-s2. Adv. Atmos. Sci., submitted.
Bi, D., M. Dix, S. Marsland, T. Hirst, S. O'Farrell, and coauthors, 2012: ACCESS: The Australian Coupled Climate Model for IPCC AR5 and CMIP5. AMOS conference, 2012, Sydney, Australia (available online at https://wiki.csiro.au/confluence/display/ACCESS/ACCESS+Publications).
Blackadar, A. K., 1957: Boundary layer wind maxima and their significance for the growth of nocturnal inversions. Bull. Amer. Meteor. Soc., 38, 283-290.
Bonner, W. D., 1968: Climatology of the low level jet. Mon. Wea. Rev., 96, 833-850.
Byerle, L. A., and J. Paegle, 2003: Modulation of the Great Plains low-level jet and moisture transports by orography and large scale circulations. J. Geophys. Res., 108, 8611, doi:10.1029/2002JD003005.


Caesar, J., L. Alexander, and R. Vose, 2006: Large-scale changes in observed daily maximum and minimum temperatures: Creation and analysis of a new gridded data set. J. Geophys. Res., 111, D05101, doi:10.1029/2005JD006280.
Chylek, P., J. Li, M. K. Dubey, M. Wang, and G. Lesins, 2011: Observed and model simulated 20th Century Arctic temperature variability: Canadian Earth System Model CanESM2. Atmospheric Chemistry and Physics Discussions, 11 (8), 22,893-22,907, doi:10.5194/acpd-11-22893-2011.
Colle, B. A., Z. Zhang, K. Lombardo, P. Liu, E. Chang, M. Zhang, and S. Hameed, 2012: Historical and future predictions of eastern North America and western Atlantic extratropical cyclones in CMIP5 during the cool season. J. Climate, submitted.
Collins, M., S. F. B. Tett, and C. Cooper, 2001: The internal climate variability of HadCM3, a version of the Hadley Centre Coupled Model without flux adjustments. Climate Dynamics, 17 (1), 61-81.
Compo, G. P., J. S. Whitaker, P. D. Sardeshmukh, N. Matsui, R. J. Allan, X. Yin, B. E. Gleason, R. S. Vose, G. Rutledge, P. Bessemoulin, S. Brönnimann, M. Brunet, R. I. Crouthamel, A. N. Grant, P. Y. Groisman, P. D. Jones, M. Kruk, A. C. Kruger, G. J. Marshall, M. Maugeri, H. Y. Mok, Ø. Nordli, T. F. Ross, R. M. Trigo, X. L. Wang, S. D. Woodruff, and S. J. Worley, 2011: The Twentieth Century Reanalysis Project. Quarterly J. Roy. Meteorol. Soc., 137, 1-28, doi:10.1002/qj.776.
Donner, L. J., with 28 co-authors, 2011: The dynamical core, physical parameterizations, and basic simulation characteristics of the atmospheric component AM3 of the GFDL Global Coupled Model CM3. J. Climate, 24, doi:10.1175/2011JCLI3955.1.


Dufresne, J-L., and 58 co-authors, 2012: Climate change projections using the IPSL-CM5 Earth System Model: from CMIP3 to CMIP5. Clim. Dyn., submitted.
Fetterer, F., K. Knowles, W. Meier, and M. Savoie, 2002, updated 2009: Sea Ice Index. Boulder, Colorado USA: National Snow and Ice Data Center. Digital media.
Frich, P., L. V. Alexander, P. Della-Marta, B. Gleason, M. Haylock, A. M. G. Klein Tank, and T. Peterson, 2002: Observed coherent changes in climatic extremes during the second half of the twentieth century. Climate Res., 19, 193-212.
Gent, P. R., and Coauthors, 2011: The Community Climate System Model Version 4. J. Climate, 24, 4973-4991, doi:10.1175/2011JCLI4083.1.
Giorgi, F., and R. Francisco, 2000: Uncertainties in regional climate change prediction: a regional analysis of ensemble simulations with the HADCM2 coupled AOGCM. Climate Dynamics, 16 (2-3), 169-182, doi:10.1007/PL00013733.
Hazeleger, W., and 31 co-authors, 2010: EC-Earth: A seamless Earth system prediction approach in action. Bull. Amer. Meteor. Soc., 91, 1357-1363, doi:10.1175/2010BAMS2877.1.
Helfand, H. M., and S. D. Schubert, 1995: Climatology of the simulated Great Plains low-level jet and its contribution to the continental moisture budget of the United States. J. Climate, 8, 784-806.
Higgins, R. W., J. E. Janowiak, and Y.-P. Yao, 1996: A gridded hourly precipitation data base for the United States (1963-1993). NCEP/Climate Prediction Center Atlas No. 1, U. S. Department of Commerce, National Oceanic and Atmospheric Administration, National Weather Service.


Hodges, K. I., 1994: A general method for tracking analysis and its application to meteorological data. Mon. Wea. Rev., 122, 2573-2586.
Hodges, K. I., 1995: Feature tracking on the unit sphere. Mon. Wea. Rev., 123, 3458-3465.
Holton, J. R., 1967: The diurnal boundary layer wind oscillation above sloping terrain. Tellus, 19, 199-205.
Huffman, G. J., R. F. Adler, D. T. Bolvin, G. Gu, E. J. Nelkin, K. P. Bowman, Y. Hong, E. F. Stocker, and D. B. Wolff, 2007: The TRMM Multisatellite Precipitation Analysis: Quasi-Global, Multiyear, Combined-Sensor Precipitation Estimates at Fine Scale. J. Hydrometeor., 8 (1), 38-55.
Jones, B. M., C. D. Arp, M. T. Jorgenson, K. M. Hinkel, J. A. Schmutz, and P. L. Flint, 2009: Increase in the rate and uniformity of coastline erosion in Arctic Alaska. Geophys. Res. Lett., 36, L03503, doi:10.1029/2008GL036205.
Jones, C. D., and others, 2011: The HadGEM2-ES implementation of CMIP5 centennial simulations. Geosci. Model Dev., 4, 543-570, doi:10.5194/gmd-4-543-2011.
Kalnay, E., M. Kanamitsu, R. Kistler, W. Collins, D. Deaven, L. Gandin, M. Iredell, S. Saha, G. White, J. Woollen, Y. Zhu, M. Chelliah, W. Ebisuzaki, W. Higgins, J. Janowiak, K. C. Mo, C. Ropelewski, J. Wang, A. Leetmaa, R. Reynolds, R. Jenne, and D. Joseph, 1996: The NCEP/NCAR 40-year reanalysis project. Bull. Am. Meteorol. Soc., 77, 437-471.
Kim, D., A. H. Sobel, A. D. Del Genio, Y. Chen, S. Camargo, M.-S. Yao, M. Kelley, and L. Nazarenko, 2012: The tropical subseasonal variability simulated in the NASA GISS general circulation model. J. Clim., in press.


Kwok, R., and G. F. Cunningham, 2008: ICESat over Arctic sea ice: Estimation of snow depth and ice thickness. J. Geophys. Res., 113, C08010, doi:10.1029/2008JC004753.
Liang, X., D. P. Lettenmaier, E. F. Wood, and S. J. Burges, 1994: A simple hydrologically based model of land surface water and energy fluxes for GCMs. J. Geophys. Res., 99, 14,415-14,428.
Liang, X.-Z., J. Zhu, K. E. Kunkel, M. Ting, and J. X. L. Wang, 2008: Do CGCMs simulate the North American monsoon precipitation seasonal-interannual variations? J. Climate, 21, 3755-3775.
Maloney, E. D., S. J. Camargo, E. Chang, B. Colle, R. Fu, K. L. Geil, Q. Hu, X. Jiang, N. Johnson, K. B. Karnauskas, J. Kinter, B. Kirtman, S. Kumar, B. Langenbrunner, K. Lombardo, L. Long, A. Mariotti, J. E. Meyerson, K. Mo, J. D. Neelin, Z. Pan, R. Seager, Y. Serra, A. Seth, J. Sheffield, J. Thibeault, S.-P. Xie, C. Wang, B. Wyman, and M. Zhao, 2011: North American Climate in CMIP5 Experiments: Part III: Assessment of 21st Century Projections. J. Climate, submitted.
Mars, J. C., and D. W. Houseknecht, 2007: Quantitative remote sensing study indicates doubling of coastal erosion rate in past 50 yr along a segment of the Arctic coast of Alaska. Geology, 35 (7), 583-586, doi:10.1130/G23672A.1.
Mitchell, M. J., R. W. Arritt, and K. Labas, 1995: A climatology of the warm season Great Plains low-level jet using wind profiler observations. Wea. Forecasting, 10, 576-591.
Mitchell, T. D., and P. D. Jones, 2005: An improved method of constructing a database of monthly climate observations and associated high-resolution grids. Int. J. Climatol., 25, 693-712.


Overland, J. E., 2011: Potential Arctic change through climate amplification processes. Oceanography, 24 (3), 176-185, doi:10.5670/oceanog.2011.70.
Rasmusson, E. M., 1967: Atmospheric water vapor transport and the water balance of North America. Part I: Characteristics of the water vapor flux field. Mon. Wea. Rev., 95, 403-426.
Rayner, N. A., D. E. Parker, E. B. Horton, C. K. Folland, L. V. Alexander, D. P. Rowell, E. C. Kent, and A. Kaplan, 2003: Globally complete analyses of sea surface temperature, sea ice and night marine air temperature, 1871-2000. J. Geophys. Res., 108 (4407).
Saha, S., and Coauthors, 2010: The NCEP Climate Forecast System Reanalysis. Bull. Amer. Meteor. Soc., 91, 1015-1057, doi:10.1175/2010BAMS3001.1.
Sakamoto, T. T., Y. Komuro, T. Nishimura, M. Ishii, H. Tatebe, H. Shiogama, A. Hasegawa, T. Toyoda, M. Mori, T. Suzuki, Y. Imada, T. Nozawa, K. Takata, T. Mochizuki, K. Ogochi, S. Emori, H. Hasumi, and M. Kimoto, 2012: MIROC4h - a new high-resolution atmosphere-ocean coupled general circulation model. J. Meteor. Soc. Japan, 90 (3), 325-359.
Sheffield, J., G. Goteti, and E. F. Wood, 2006: Development of a 50-yr high-resolution global dataset of meteorological forcings for land surface modeling. J. Climate, 19 (13), 3088-3111.
Sheffield, J., and E. F. Wood, 2007: Characteristics of global and regional drought, 1950-2000: Analysis of soil moisture data from off-line simulation of the terrestrial hydrologic cycle. J. Geophys. Res., 112 (D17), doi:10.1029/2006JD008288.


Sheffield, J., S. J. Camargo, B. Colle, Q. Hu, X. Jiang, N. Johnson, S. Kumar, K. Lombardo, B. Langenbrunner, E. Maloney, J. E. Meyerson, J. D. Neelin, Y. L. Serra, D.-Z. Sun, C. Wang, S.-P. Xie, J.-Y. Yu, and T. Zhang, 2012: North American Climate in CMIP5 Experiments: Part II: Evaluation of 20th Century Intra-Seasonal to Decadal Variability. J. Climate, submitted.
Stroeve, J., M. M. Holland, W. Meier, T. Scambos, and M. Serreze, 2007: Arctic sea ice decline: Faster than forecast. Geophys. Res. Lett., 34, L09501, doi:10.1029/2007GL029703.
Stroeve, J. C., V. Kattsov, A. Barrett, M. Serreze, T. Pavlova, M. Holland, and W. N. Meier, 2012: Trends in Arctic sea ice extent from CMIP5, CMIP3 and observations. Geophys. Res. Lett., doi:10.1029/2012GL052676R, in press.
Taylor, K. E., R. J. Stouffer, and G. A. Meehl, 2012: An overview of CMIP5 and the experiment design. Bull. Am. Meteorol. Soc., 93, 485-498, doi:10.1175/BAMS-D-11-00094.1.
Universidad Nacional Autónoma de México (UNAM), 2007: Gridded precipitation and temperature analysis from the Centro de Ciencias de la Atmósfera, Mexico; available from the International Research Institute for Climate and Society (http://ingrid.ldeo.columbia.edu/SOURCES/UNAM/gridded/monthly/v0705/).
Veres, M. C., and Q. Hu, 2012: AMO-forced regional processes affecting summertime precipitation variations in the central United States. J. Climate, in press.
Voldoire, A., and others, 2012: The CNRM-CM5.1 global climate model: Description and basic evaluation. Clim. Dyn., doi:10.1007/s00382-011-1259-y, in press.


Volodin, E. M., N. A. Diansky, and A. V. Gusev, 2010: Simulating Present-Day Climate with the INMCM4.0 Coupled Model of the Atmospheric and Oceanic General Circulations. Izvestia, Atmospheric and Oceanic Physics, 46, 414-431.
Vose, R. S., R. L. Schmoyer, P. M. Steurer, T. C. Peterson, R. Heim, T. R. Karl, and J. Eischeid, 1992: The Global Historical Climatology Network: long-term monthly temperature, precipitation, sea level pressure, and station pressure data. ORNL/CDIAC-53, NDP-041. Carbon Dioxide Information Analysis Center, Oak Ridge National Laboratory, Oak Ridge, Tennessee.
Wang, C., and D. B. Enfield, 2001: The tropical Western Hemisphere warm pool. Geophys. Res. Lett., 28, 1635-1638.
Wang, M., and J. E. Overland, 2009: A sea ice free summer Arctic within 30 years? Geophys. Res. Lett., 36, L07502, doi:10.1029/2009GL037820.
Wang, A., T. J. Bohn, S. P. Mahanama, R. D. Koster, and D. P. Lettenmaier, 2009: Multimodel ensemble reconstruction of drought over the continental United States. J. Climate, 22, 2694-2712.
Watanabe, M., and Coauthors, 2010: Improved Climate Simulation by MIROC5: Mean States, Variability, and Climate Sensitivity. J. Climate, 23, 6312-6335.
Wexler, H., 1961: A boundary layer interpretation of the low level jet. Tellus, 13, 368-378.
Xie, P., and P. A. Arkin, 1997: Global precipitation: A 17-year monthly analysis based on gauge observations, satellite estimates, and numerical model outputs. Bull. Amer. Meteor. Soc., 78, 2539-2558.


Xie, P., M. Chen, and W. Shi, 2010: CPC unified gauge-based analysis of global daily precipitation. Preprints, 24th Conf. on Hydrology, Atlanta, GA, Amer. Meteor. Soc., 2.3A.
Xin, X., T. Wu, and J. Zhang, 2012: Introductions to the CMIP5 simulations conducted by the BCC climate system model (in Chinese). Advances in Climate Change Research, submitted.
Yukimoto, S., et al., 2012: A new global climate model of the Meteorological Research Institute: MRI-CGCM3 - Model description and basic performance. J. Meteorol. Soc. Jpn., 90A, 23-64.
Zanchettin, D., A. Rubino, D. Matei, O. Bothe, and J. H. Jungclaus, 2012: Multidecadal-to-centennial SST variability in the MPI-ESM simulation ensemble for the last millennium. Clim. Dyn., 39, 419-444, doi:10.1007/s00382-012-1361-9.
Zhang, Z. S., K. Nisancioglu, M. Bentsen, J. Tjiputra, I. Bethke, Q. Yan, B. Risebrobakken, C. Andersson, and E. Jansen, 2012: Pre-industrial and mid-Pliocene simulations with NorESM-L. Geosci. Model Dev., 5, 523-533, doi:10.5194/gmd-5-523-2012.


Figure Captions

Figure 1. Precipitation climatology for (left) December-February and (right) June-August (1979-2005). a) CMAP estimate of observed precipitation. b) Multi-model, multi-run ensemble mean over the 15 models; for models with multiple runs, all runs are averaged before inclusion in the multi-model ensemble. c) Comparison of individual models to observations using the 3 mm day⁻¹ contour as an index of the major precipitation features, color-coded for each model (half the models shown in this panel). Shading shows the regions where CMAP exceeds 3 mm day⁻¹; a model with no error would have its contour fall exactly along the edge of the shaded region. d) As in c) except for the other half of the models.

Figure 2. Surface air temperature climatology for (left) December-February and (right) June-August (1979-2005). a) Multi-model, multi-run ensemble mean (over the 15 models). b) NCEP-DOE Reanalysis estimate of observed surface air temperature. c) Difference between multi-model ensemble mean and reanalysis. d)-r) As in a) but for individual models denoted by their acronyms.

Figure 3. Winter and summer climatological precipitation from observations (CRUTS3.1), and historical simulations of the 20th century climate from CMIP3 and CMIP5 models for the common period 1971-1999. The number in parentheses denotes the number of ensembles used from each model. Values equal to or above 2 mm day⁻¹ are shaded green; contour interval is 1 mm day⁻¹.

Figure 4. Summer minus winter difference in climatological surface air temperature in observations (CRUTS3.1), and historical simulations of the 20th century climate from CMIP3 and CMIP5 models for the common period 1971-1999. The number in parentheses denotes the number of ensembles used from each model. Values equal to or above 21 K are shaded red; contour interval is 3 K.

Figure 5. Taylor diagrams of spatial statistics from observations and CMIP3 and CMIP5 historical simulations of climatological continental precipitation (upper row), surface air temperature (middle row), and sea surface temperature (lower row) for the period 1971-1999. The spatial standard deviations and correlations are calculated over the continental area displayed in Figures 3 and 4 (130°-60°W, 0°-60°N), while the domain for the statistics of SSTs is the oceanic domain displayed in Figure 6 (170°-35°W, 10°S-40°N). Precipitation and surface air temperature were regridded to a 1° x 1° grid, and SSTs to a 5° x 2.5° grid. CMIP3 models are represented by red symbols and CMIP5 models by blue ones. Displayed values correspond to the mean of the values from the different ensembles of each model as indicated in previous figures.

Figure 6. Climatological winter-to-spring (December to May) and summer-to-fall (June to November) sea surface temperature and precipitation in observations from HadISSTv1.1 and CRUTS3.1 data sets, and historical simulations of the 20th century climate from CMIP3 and CMIP5 models for the common period 1971-1999. The number in parentheses denotes the number of ensembles used from each model. Temperatures are shaded blue/red for values equal to or lower/larger than 23/24°C; the thick black line


highlights the 28.5°C isotherm as an indicator of the Western Hemisphere Warm Pool. Precipitation is shaded green for values equal to or larger than 2 mm day⁻¹. Contour intervals are 1°C and 1 mm day⁻¹.

Figure 7. Vertically integrated moisture transport (vectors) and its divergence (contours) for the 20CR reanalysis and four models for mean JJA and DJF for 1981-2000. Vertically integrated moisture transport (to 500 hPa) is computed using 6-hourly data from the 20CR and the CanESM2, CNRM-CM5, GFDL-ESM2M, and MIROC5 models. One realization is examined for each model.

Figure 8. Mean seasonal cycle (1971-2000) of North American regional land water budget components for 12 CMIP5 models (CanESM2, CSIRO-Mk3-6-0, GFDL-ESM2G, GISS-E2-H, GISS-E2-R, IPSL-CM5A-LR, IPSL-CM5A-MR, MIROC-ESM, MIROC-ESM-CHEM, MPI-ESM-LR, MRI-CGCM3, NorESM1-M) compared to the off-line VIC land surface model (forced by observed meteorology and calibrated to observed streamflow). Regions are Western North America (WNA), Central North America (CNA), Eastern North America (ENA), Alaska and Western Canada (ALA), and Northeast Canada (NEC) as modified from Giorgi and Francisco (2000).

Figure 9. Mean annual total runoff from observations (GLDAS2) and the multi-model average from 15 CMIP5 climate models (CanESM2, CCSM4, CNRM-CM5, GFDL-ESM2G, GFDL-ESM2M, GISS-E2-H, GISS-E2-R, HadCM3, INMCM4, MIROC5, MIROC4h, MIROC-ESM, MPI-ESM-LR, MRI-CGCM3, NorESM1-M). Numbers on the plot show North America average total runoff between 15°N and 70°N latitude and between 160°W and 60°W longitude (land only).

Figure 10. Mean annual (1971-2000) runoff ratio (runoff, Q, divided by precipitation, P) for (a) the off-line NLDAS2 VIC land surface model (forced by observed meteorology and calibrated to observed streamflow) and (b) 12 CMIP5 models shown as the multi-model ensemble mean. (c) Difference between the ensemble mean and the LSM. (d) Standard deviation of the difference for individual models. All model datasets are interpolated to 2.0-degree resolution for the comparisons. The CMIP5 models are listed in the caption for Figure 8.

Figure 11. Number of summer days from seven CMIP5 models for the historical simulation averaged over 1979-2005 shown as the difference from the HadGHCND observations. The bottom two panels show the CMIP5 multi-model ensemble mean and the difference from the observations. The frequencies are calculated on the model grid and then interpolated to 2.0-degree resolution for comparison with the observational estimates.

Figure 12. As Fig. 11 but for the number of frost days.

Figure 13. As Fig. 11 but for growing season length (days).

Figure 14. The frequency of occurrence of persistent extreme precipitation events defined by SPI6 averaged over positive and negative events for (a) observed precipitation based on the CPC and UW datasets, (b) CanESM2, (c) CSIRO-Mk3.6.0, (d) IPSL-CM5A-LR, (e) MPI-ESM-LR, (f) BCC-CSM1.1, (g) CCSM4, (h) CNRM-CM5, (i) GISS-E2-H, (j) MIROC4h, (k) MIROC-ESM, (l) MRI-CGCM3 and (m) NorESM1-M. Each data set is treated as one member of the ensemble.

Figure 15. Same as Figure 14 but for persistent soil moisture events. Estimates of observed soil moisture are taken from the multi-model NLDAS-UW dataset.

Figure 16. (a) Cyclone density for the CFSR analysis showing the number of cyclones per cool season (November to March) per 50,000 km² for 1979-2004. (b) Same as (a) except for the mean (shaded) and spread (contoured every 0.3) of 15 CMIP5 models ordered from higher to lower spatial resolution: CanESM2, EC-EARTH, MRI-CGCM3, CNRM-CM5, MIROC5, HadGEM2-ES, HadGEM2-CC, INMCM4, IPSL-CM5A-MR, MPI-ESM-LR, NorESM1-M, GFDL-ESM2M, IPSL-CM5A-LR, BCC-CSM1, MIROC-ESM-CHEM. Same as (a) except for the (c) MPI-ESM-LR, (d) GFDL-ESM2M, (e) HadGEM2-CC, and (f) CESM models.

Figure 17. Number of cyclone central pressures at their maximum intensity (minimum pressure) for the 1979-2004 cool seasons within the dashed box region in Fig. 16 for a 10 hPa range centered every 10 hPa showing the CFSR (bold blue), (b) CMIP5 mean (bold red), and all the CMIP5 models in Colle et al. (2012).

Figure 18. Cool-season average precipitation (shaded every 75 mm) for the 1979-2004 cool seasons (November-March) for (a) CMAP at 2.5-degree resolution, (b) same as (a) except for the CPC-Unified precipitation at 0.5-degree resolution, (c) same as (a) except for the mean of the CMIP5 members listed in (d) and spread (in mm). (d) Number of days that the daily average precipitation (in mm day⁻¹) for the land areas in the black box in (b) occurred within each amount bin for select CMIP5 members, CMIP5 mean, and the CPC Unified.

Figure 19. Map of the three regions Southwest (SW), South Central (SC) and Southeast (SE) used for evaluations of extreme temperature and precipitation over the southern US.

Figure 20. Comparison of (a) daily maximum and (b) minimum temperatures, and (c) precipitation rain rates between CMIP5 models (CCSM4, GFDL-CM3, GISS-E2-R, HadCM3, HadGEM2-LR, MPI-ESM-LR, IPSL-CM5A-LR, MIROC5 and MRI-CGCM3) and observations from the GHCN and CPC-US-Mexico datasets for the southeast (top), south central (middle) and southwest (bottom). The GHCN station data are mapped to a 2.5° grid. The CPC US-Mexico data set is obtained from the combined CPC US-Mexico real-time and retrospective data.

Figure 21. Probability density functions (PDFs) of the Standardized Precipitation Index (SPI) for 6 months (SPI6) and 9 months (SPI9) for the South Central region of the US for 11 CMIP5 models. The observations are from the CPC-unified dataset.

Figure 22. Multi-model average monthly precipitation in the North American monsoon region (longitudes 102.5° to 115°W) from sixteen CMIP5 models: CanESM2, CCSM4, CNRM-CM5, CSIRO-Mk3, GFDL-CM3, GFDL-ESM2G, GFDL-ESM2M, GISS-E2-R, HadGEM2-ES, INMCM4, IPSL-CM5A-LR, MIROC-ESM, MIROC5, MPI-ESM-LR,


MRI-CGCM3, NorESM1-M. The observations are the CMAP precipitation given in mm day⁻¹.

Figure 23. (a) RMS error, (b) phase lag and (c) mean bias of 21 CMIP5 models with respect to the P-NOAA observed precipitation within the core NAM region for 1979-2005.

Figure 24. Annual cycle in rainfall for the NAM region for the historical (1979-2005) period of 21 CMIP5 models compared to the P-NOAA and CMAP observational datasets for (a) small (phase error = 0), (b) moderate (phase error = 1) and (c) large (phase error = 2-4) phase errors.

Figure 25. Averaged summer 925 hPa wind during 1971-2000 for a) NCEP-NCAR reanalysis, b) CCSM4, c) GFDL-CM3, d) GISS-E2-R, e) HadGEM2-CC, and f) MPI-ESM-LR. Shadings indicate meridional wind stronger than 3.0 m s⁻¹.

Figure 26. Long-term mean (1971-2000) monthly meridional wind averaged over 95°-100°W for a) NCEP-NCAR reanalysis, b) CCSM4, c) GFDL-CM3, d) GISS-E2-R, e) HadGEM2-CC, and f) MPI-ESM-LR. Shadings indicate meridional wind larger than 3.0 m s⁻¹.

Figure 27. September and March sea ice extent from 18 CMIP5 models compared to observations from the NSIDC. For each model, the boxes represent inter-quartile ranges (25th to 75th percentiles). Median (50th percentile) extents are shown by the thick horizontal bar in each box. The width of each box corresponds to the number of ensemble members for that model. Whiskers (vertical lines and thin horizontal bars) represent the 10th and 90th percentiles. Mean monthly extents are shown as diamonds. Corresponding mean, minimum and maximum observed extents are shown as red and green lines, respectively.

Figure 28. March (top) and September (bottom) ice thickness (m) for 18 CMIP5 models averaged over 1996-2005 versus IceSat observations for 2003-2008.

Figure 29. Trends in September sea ice extent from 1979 to 2005 for all individual model ensembles as well as the multi-model ensemble mean with confidence intervals (vertical lines). Observations are from the NSIDC. The 1σ and 2σ observed trends are shown in dark gray shading (1σ) and light gray shading (2σ). The linear trends were estimated using the standard least-squares approach and are reported as 10⁶ km² decade⁻¹. An effective sample size was calculated to adjust the standard error of the modeled or observed trend for the effects of temporal autocorrelation (Santer et al., 2008).
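The trend calculation described in the Figure 29 caption can be illustrated with a short Python sketch. This is not the code used for the paper. The function name, the synthetic series, the AR(1) form of the effective sample size, n_eff = n (1 - r1) / (1 + r1) with r1 the lag-1 autocorrelation of the regression residuals, and the clipping of n_eff are all assumptions made for illustration; the caption states only that an effective sample size was used to adjust the standard error.

```python
import numpy as np


def trend_with_adjusted_se(years, values):
    """Least-squares trend (per decade) with a standard error adjusted
    for lag-1 autocorrelation of the residuals.

    Hypothetical helper: the AR(1) effective sample size below is an
    assumed, commonly used form, not taken from the paper itself.
    """
    t = np.asarray(years, dtype=float)
    y = np.asarray(values, dtype=float)
    n = t.size

    # Ordinary least squares on centered time for numerical stability.
    tc = t - t.mean()
    slope = np.sum(tc * (y - y.mean())) / np.sum(tc ** 2)
    resid = y - (y.mean() + slope * tc)

    # Lag-1 autocorrelation of the regression residuals.
    r1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]

    # Effective sample size; clipped to [3, n] (an illustrative choice)
    # so the degrees of freedom stay positive and never exceed n - 2.
    n_eff = n * (1.0 - r1) / (1.0 + r1)
    n_eff = min(max(n_eff, 3.0), float(n))

    # Residual variance uses n_eff - 2 degrees of freedom instead of n - 2.
    s_e2 = np.sum(resid ** 2) / (n_eff - 2.0)
    se_slope = np.sqrt(s_e2 / np.sum(tc ** 2))

    # Convert from per-year to per-decade units.
    return slope * 10.0, se_slope * 10.0, n_eff


if __name__ == "__main__":
    # Synthetic September extent series (10^6 km^2), for illustration only.
    rng = np.random.default_rng(0)
    yrs = np.arange(1979, 2006)
    extent = 7.0 - 0.05 * (yrs - 1979) + rng.normal(0.0, 0.3, yrs.size)
    trend, se, n_eff = trend_with_adjusted_se(yrs, extent)
    print(f"trend = {trend:.2f} +/- {se:.2f} per decade (n_eff = {n_eff:.1f})")
```

With positively autocorrelated residuals, n_eff falls below the number of years, which widens the standard error and makes it harder for a modeled or observed trend to be judged significantly different from zero.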


Tables Table 1. CMIP5 models evaluated and their attributes.

Model

Center

Atmospheric Horizontal Resolution (lon. x lat.)

Number of model levels

Reference

ACCESS1-0

Commonwealth Scientific and Industrial Research Organization/Bureau of Meteorology, Australia

1.875 x 1.25

38

Bi et al. (2012)

BCC-CSM1.1

Beijing Climate Center, China Meteorological Administration, China

2.8 x 2.8

26

Xin et al. (2012)

CanCM4

Canadian Centre for Climate Modelling and Analysis, Canada

2.8 x 2.8

35

Chylek et al. (2011)

CanESM2

Canadian Center for Climate Modeling and Analysis, Canada

2.8 x 2.8

35

Arora et al. (2011)

CCSM4

National Center for Atmospheric Research, USA

1.25 x 0.94

26

Gent et al. (2011)

CESM1CAM5-1-FV2

Community Earth System Model Contributors (NSF-DOENCAR)

1.4 x 1.4

26

Gent et al. (2011)

CNRM-CM5.1

National Centre for Meteorological Research, France

1.4 x 1.4

31

Voldoire et

51

al. (2011)

CSIRO-MK3.6

Commonwealth Scientific and Industrial Research Organization/Queensland Climate Change Centre of Excellence, AUS

1.8 x 1.8

18

Rotstayn et al. (2010)

EC-EARTH

EC-EARTH consortium

1.125 x 1.12

62

Hazeleger et al. (2010)

FGOALS-S2.0

LASG, Institute of Atmospheric Physics, Chinese Academy of Sciences

2.8 x 1.6

26

Bao et al. (2012)

GFDL-CM3

NOAA Geophysical Fluid Dynamics Laboratory, USA

2.5 x 2.0

48

Donner et al. (2011)

GFDLESM2G/M GISS-E2-H/R

NOAA Geophysical Fluid Dynamics Laboratory, USA NASA Goddard Institute for Space Studies, USA

2.5 x 2.0

48

Donner et al. (2011)

2.5 x 2.0

40

Kim et al. (2012)

HadCM3

Met Office Hadley Centre, UK

3.75 x 2.5

19

Collins et al. (2001)

HADGEM2CC (Chemistry coupled) HadGEM2-ES

Met Office Hadley Centre, UK

1.875 x 1.25

60

Jones et al. (2011)

Met Office Hadley Centre, UK

1.875 x 1.25

60

Jones et al. (2011)

INMCM4

Institute for Numerical Mathematics, Russia

2 x 1.5

21

Volodin et al. (2010)

IPSL-CM5A-LR

Institut Pierre Simon Laplace, France

3.75 x 1.8

39

Dufresne et al. (2012)

IPSL-CM5A-MR

Institut Pierre Simon Laplace, France

2.5 x 1.25

39

Dufresne et al. (2012)

MIROC4h

Atmosphere and Ocean Research Institute (The University of Tokyo), National Institute for Environmental Studies, and Japan Agency for Marine-Earth Science and Technology, Japan

0.56 x 0.56

56

Sakamoto et al. (2012)

MIROC5

Atmosphere and Ocean Research Institute (The University of Tokyo), National Institute for Environmental Studies, and Japan Agency for Marine-Earth Science and Technology, Japan

1.4 x 1.4

40

Watanabe et al. (2010)

MIROC-ESM

Japan Agency for Marine-Earth Science and Technology, Atmosphere and Ocean Research Institute (The University of Tokyo), and National Institute for Environmental Studies

2.8 x 2.8

80

Watanabe et al. (2010)

MIROC-ESM-CHEM

Japan Agency for Marine-Earth Science and Technology, Atmosphere and Ocean Research Institute (The University of Tokyo), and National Institute for Environmental Studies

2.8 x 2.8

80

Watanabe et al. (2010)

MPI-ESM-LR

Max Planck Institute for Meteorology, Germany

1.9 x 1.9

47

Zanchettin et al. (2012)

MRI-CGCM3

Meteorological Research Institute, Japan

1.1 x 1.1

48

Yukimoto et al. (2011)

NorESM1-M

Norwegian Climate Center, Norway

2.5 x 1.9

26

Zhang et al. (2012)


Table 2. Observational and reanalysis datasets used in the evaluations

Dataset | Type | Spatial Domain | Temporal Domain | Reference

Precipitation
CMAP v2 | Gauge/satellite | 2.5 deg, global | Monthly/pentad, 1979-present | Xie and Arkin (1997)
TMPA 3B43 | Satellite | 0.25 deg, 50S-50N | Monthly, 1998-2010 | Huffman et al. (2007)
GPCP v2.1 | Gauge/satellite | 1.0 deg, global | 1979-2009 | Adler et al. (2003)
CRU TS3.1 | Gauge | 0.5 deg, global land | Monthly, 1901-2008 | Mitchell et al. (2005)
UNAM v0705 | Gauge | 0.5 deg, Mexico and surroundings | 1901-2002 | UNAM (2007)
CPC unified | Gauge | 0.5 deg, US | Daily, 1948-2010 | Xie et al. (2010)
CPC-US-Mexico | Gauge | 1.0 deg, US/Mexico | Daily, 1948-present | Higgins et al. (1996)
UW | Gauge | 0.5 deg, US | Daily, 1916-2009 | Maurer et al. (2002)
P-NOAA | Gauge | 0.5 deg, North America | Monthly, 1895-2010 | Cook and Vose (2011)*

Temperature
CRU TS3.1 | Gauge | 0.5 deg, global land | Monthly, 1901-2008 | Mitchell et al. (2005)
GHCN | Gauge | 2.5 deg, global land | Daily, varies | Vose et al. (1992)
HadGHCND | Gauge | 2.5 x 3.75 deg, global land | Daily, 1950-2000 | Caesar et al. (2006)

Sea Surface Temperature and Sea Ice
HadISSTv1.1 | In-situ/satellite | 1.0 deg, global oceans | Monthly, 1870-present | Rayner et al. (2003)
NSIDC Sea Ice Index | Satellite | 25 km, Arctic Basin | Monthly, 1979-present | Fetterer et al. (2002)
IceSAT | Satellite | Arctic Basin | Monthly, 2003-2008 | Kwok and Cunningham (2008)

Land Surface Hydrology
NLDAS2 | Multiple LSMs | 0.125 deg, US | Hourly, 1979-present | Xia et al. (2012)
NLDAS-UW | Multiple LSMs | 0.5 deg, US | Daily, 1916-2009 | Wang et al. (2009)
VIC | VIC LSM | 1.0 deg, global land | 3-hourly, 1948-2008 | Sheffield et al. (2007)
GLDAS | Noah LSM | 1.0 deg, global land | 3-hourly, 1979-2008 | Rodell et al. (2004)

Reanalyses
NCEP-NCAR | Model reanalysis | ~1.9 deg, global | 6-hourly, 1948-present | Kalnay et al. (1996)
NCEP-DOE | Model reanalysis | ~1.9 deg, global | 6-hourly, 1979-present | Kanamitsu et al. (200X)
CFSR | Model reanalysis | ~0.3 deg, global | 6-hourly, 1979-2010 | Saha et al. (2010)
20CR | Model reanalysis | 2.0 deg, global | 6-hourly, 1871-present | Compo et al. (2011)

*P-NOAA dataset provided by Drs. Russ Vose and Ed Cook.

Figure 1. Precipitation climatology for (left) December-February and (right) June-August (1979-2005). a) CMAP estimate of observed precipitation. b) Multi-model, multi-run ensemble mean over the 15 models; for models with multiple runs, all runs are averaged before inclusion in the multi-model ensemble. c) Comparison of individual models to observations using the 3 mm day-1 contour as an index of the major precipitation features, color-coded for each model (half the models shown in this panel). Shading shows the regions where CMAP exceeds 3 mm day-1; a model with no error would have its contour fall exactly along the edge of the shaded region. d) as in c) except for the other half of the models.
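The averaging order described for panel b) of Figure 1 (runs averaged within each model first, then models averaged) can be sketched as follows. This is a minimal illustration only: the function name, the model names, and the two-point "fields" are hypothetical placeholders, not the data or code used for the figure.

```python
from statistics import fmean

def multi_model_mean(runs_by_model):
    """Average each model's runs first, then average across models.

    runs_by_model maps a model name to a list of runs; each run is a list
    of values on a common grid (hypothetical flattened fields).
    """
    model_means = []
    for runs in runs_by_model.values():
        # Per-model mean at each grid point, so a model with many runs
        # carries the same weight as a model with a single run.
        model_means.append([fmean(vals) for vals in zip(*runs)])
    # Equal-weight mean across the per-model means.
    return [fmean(vals) for vals in zip(*model_means)]

# Hypothetical example: model_a has two runs, model_b has one.
fields = {
    "model_a": [[1.0, 2.0], [3.0, 4.0]],
    "model_b": [[6.0, 8.0]],
}
ensemble = multi_model_mean(fields)
```

Averaging within each model first keeps a model with many realizations from dominating the ensemble mean.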

Figure 2. Surface air temperature climatology for (left) December-February and (right) June-August (1979-2005). a) Multi-model, multi-run ensemble mean (over the 15 models). b) NCEP-DOE Reanalysis estimate of observed surface air temperature. c) Difference between multi-model ensemble mean and reanalysis. d)-r) As in a) but for individual models denoted by their acronyms.

Figure 3. Winter and summer climatological precipitation from observations (CRU TS3.1), and historical simulations of the 20th century climate from CMIP3 and CMIP5 models for the common period 1971-1999. The number in parentheses denotes the number of ensemble members used from each model. Values equal to or above 2 mm day-1 are shaded green; contour interval is 1 mm day-1.

Figure 4. Summer minus winter difference in climatological surface air temperature in observations (CRU TS3.1), and historical simulations of the 20th century climate from CMIP3 and CMIP5 models for the common period 1971-1999. The number in parentheses denotes the number of ensemble members used from each model. Values equal to or above 21 K are shaded red; contour interval is 3 K.

Figure 5. Taylor diagrams of spatial statistics from observations and CMIP3 and CMIP5 historical simulations of climatological continental precipitation (upper row), surface air temperature (middle row), and sea surface temperature (lower row) for the period 1971-1999. The spatial standard deviations and correlations are calculated over the continental area displayed in Figures 3 and 4 (130-60W, 0-60N), while the domain for the statistics of SSTs is the oceanic domain displayed in Figure 6 (170-35W, 10S-40N). Precipitation and surface air temperature were regridded to a 1 x 1 degree grid, and SSTs to a 5 x 2.5 degree grid. CMIP3 models are represented by red symbols and CMIP5 models by blue ones. Displayed values correspond to the mean of the values from the different ensemble members of each model as indicated in previous figures.
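The spatial statistics summarized in a Taylor diagram (standard deviations, pattern correlation, and centered RMS difference) can be sketched as below. This is an unweighted illustration on hypothetical flattened fields; a real analysis over a latitude-longitude domain would normally apply area weights, which are omitted here, and the function name is a placeholder.

```python
import math

def taylor_stats(model, obs):
    """Return (sd_model, sd_obs, correlation, centered RMS difference)."""
    n = len(obs)
    m_mean = sum(model) / n
    o_mean = sum(obs) / n
    # Anomalies relative to each field's own spatial mean.
    m_anom = [m - m_mean for m in model]
    o_anom = [o - o_mean for o in obs]
    sd_m = math.sqrt(sum(a * a for a in m_anom) / n)
    sd_o = math.sqrt(sum(a * a for a in o_anom) / n)
    corr = sum(a * b for a, b in zip(m_anom, o_anom)) / (n * sd_m * sd_o)
    crmsd = math.sqrt(sum((a - b) ** 2 for a, b in zip(m_anom, o_anom)) / n)
    return sd_m, sd_o, corr, crmsd

# Hypothetical fields: the model pattern matches the observed one
# but with twice the spatial amplitude.
obs_field = [1.0, 2.0, 3.0, 4.0]
model_field = [2.0, 4.0, 6.0, 8.0]
sd_m, sd_o, corr, crmsd = taylor_stats(model_field, obs_field)
```

The three quantities are linked by crmsd^2 = sd_m^2 + sd_o^2 - 2 sd_m sd_o corr, which is what allows them to be shown on a single diagram.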

Figure 6. Climatological winter-to-spring (December to May) and summer-to-fall (June to November) sea surface temperature and precipitation in observations from HadISSTv1.1 and CRU TS3.1 data sets, and historical simulations of the 20th century climate from CMIP3 and CMIP5 models for the common period 1971-1999. The number in parentheses denotes the number of ensemble members used from each model. Temperatures are shaded blue for values equal to or lower than 23°C and red for values equal to or larger than 24°C; the thick black line highlights the 28.5°C isotherm as an indicator of the Western Hemisphere Warm Pool. Precipitation is shaded green for values equal to or larger than 2 mm day-1. Contour intervals are 1°C and 1 mm day-1.

Figure 7. Vertically integrated moisture transport (vectors) and its divergence (contours) for the 20CR reanalysis and four models for mean JJA and DJF for 1981-2000. Vertically integrated moisture transport (to 500 hPa) is computed using 6-hourly data from the 20CR and the CanESM2, CNRM-CM5, GFDL-ESM2M, and MIROC5 models. One realization is examined for each model.

Figure 8. Mean seasonal cycle (1971-2000) of North American regional land water budget components for 12 CMIP5 models (CanESM2, CSIRO-Mk3-6-0, GFDL-ESM2G, GISS-E2-H, GISS-E2-R, IPSL-CM5A-LR, IPSL-CM5A-MR, MIROC-ESM, MIROC-ESM-CHEM, MPI-ESM-LR, MRI-CGCM3, NorESM1-M) compared to the off-line VIC land surface model (forced by observed meteorology and calibrated to observed streamflow). Regions are Western North America (WNA), Central North America (CNA), Eastern North America (ENA), Alaska and Western Canada (ALA), and Northeast Canada (NEC) as modified from Giorgi and Francisco (2000).

Figure 9. Mean annual total runoff from observations (GLDAS2) and the multi-model average from 15 CMIP5 climate models (CanESM2, CCSM4, CNRM-CM5, GFDL-ESM2G, GFDL-ESM2M, GISS-E2-H, GISS-E2-R, HadCM3, INMCM4, MIROC5, MIROC4h, MIROC-ESM, MPI-ESM-LR, MRI-CGCM3, NorESM1-M). Numbers on the plot show North America average total runoff between 15N and 70N latitude and between 160W and 60W longitude (land only).

Figure 10. Mean annual (1971-2000) runoff ratio (runoff, Q, divided by precipitation, P) for (a) off-line NLDAS2 VIC land surface model (forced by observed meteorology and calibrated to observed streamflow) and (b) 12 CMIP5 models shown as the multi-model ensemble mean. (c) Difference between the ensemble mean and the LSM. (d) Standard deviation of the difference for individual models. All model datasets are interpolated to 2.0-degree resolution for the comparisons. The CMIP5 models are listed in the caption for Figure 8.
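The quantities in panels (a) to (d) of Figure 10 can be sketched for a handful of grid cells as follows. All numbers, names, and the two-model "ensemble" are hypothetical; the use of the population standard deviation for panel (d) is an assumption, since the caption does not say which estimator was used.

```python
from statistics import fmean, pstdev

def runoff_ratio(q, p):
    """Runoff ratio Q/P per grid cell (assumes P > 0 in every cell)."""
    return [qi / pi for qi, pi in zip(q, p)]

# Hypothetical mean annual runoff and precipitation (mm/yr) on two cells.
lsm_ratio = runoff_ratio([200.0, 50.0], [1000.0, 500.0])   # like panel (a)
model_ratios = [
    runoff_ratio([300.0, 100.0], [1000.0, 500.0]),
    runoff_ratio([100.0, 50.0], [1000.0, 500.0]),
]

# Like panel (b): multi-model ensemble mean of the runoff ratio.
ens_mean = [fmean(cell) for cell in zip(*model_ratios)]
# Like panel (c): ensemble mean minus the land surface model.
ens_diff = [e - l for e, l in zip(ens_mean, lsm_ratio)]
# Like panel (d): spread of the individual model differences (assumed
# here to be the population standard deviation across models).
diff_sd = [
    pstdev([r[i] - lsm_ratio[i] for r in model_ratios])
    for i in range(len(lsm_ratio))
]
```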

Figure 11. Number of summer days from seven CMIP5 models for the historical simulation averaged over 1979-2005 shown as the difference from the HadGHCND observations. The bottom two panels show the CMIP5 multi-model ensemble mean and the difference from the observations. The frequencies are calculated on the model grid and then interpolated to 2.0 degree resolution for comparison with the observational estimates.

Figure 12. As Fig. 11 but for the number of frost days.
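The counts behind Figures 11 and 12 can be sketched as below. The captions do not state the thresholds, so this sketch assumes the widely used index definitions (a summer day has daily maximum temperature above 25°C; a frost day has daily minimum temperature below 0°C). The function name and the tiny "years" of data are hypothetical.

```python
def mean_annual_count(daily_by_year, is_event):
    """Average number of days per year on which is_event(value) is true."""
    counts = [sum(1 for t in days if is_event(t)) for days in daily_by_year]
    return sum(counts) / len(counts)

# Hypothetical daily series (deg C), one inner list per year.
tmax = [[20.0, 26.0, 30.0], [25.0, 25.1, 10.0]]
tmin = [[-1.0, 0.0, 2.0], [-3.0, -0.5, 1.0]]

# Assumed thresholds: Tmax > 25 deg C and Tmin < 0 deg C.
summer_days = mean_annual_count(tmax, lambda t: t > 25.0)
frost_days = mean_annual_count(tmin, lambda t: t < 0.0)
```

Counting on each model's native grid before interpolating, as the Figure 11 caption describes, avoids smoothing the daily extremes before the threshold is applied.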

Figure 13. As Fig. 11 but for growing season length (days).

Figure 14. The frequency of occurrence of persistent extreme precipitation events defined by SPI6 averaged over positive and negative events for (a) observed precipitation based on the CPC and UW datasets, (b) CanESM2, (c) CSIRO-Mk3.6.0, (d) IPSL-CM5A-LR, (e) MPI-ESM-LR, (f) BCC-CSM1.1, (g) CCSM4, (h) CNRM-CM5, (i) GISS-E2-H, (j) MIROC4h, (k) MIROC-ESM, (l) MRI-CGCM3 and (m) NorESM1-M. Each data set is treated as one member of the ensemble.

Figure 15. Same as Figure 14 but for persistent soil moisture events. Estimates of observed soil moisture are taken from the multi-model NLDAS-UW dataset.

Figure 16. (a) Cyclone density for the CFSR analysis showing the number of cyclones per cool season (November to March) per 50,000 km2 for 1979-2004. (b) Same as (a) except for the mean (shaded) and spread (contoured every 0.3) of 15 CMIP5 models ordered from higher to lower spatial resolution: CanESM2, EC-EARTH, MRI-CGCM3, CNRM-CM5, MIROC5, HadGEM2-ES, HadGEM2-CC, INMCM4, IPSL-CM5A-MR, MPI-ESM-LR, NorESM1-M, GFDL-ESM2M, IPSL-CM5A-LR, BCC-CSM1, MIROC-ESM-CHEM. Same as (a) except for the (c) MPI-ESM-LR, (d) GFDL-ESM2M, (e) HadGEM2-CC, and (f) CESM models.

Figure 17. Number of cyclone central pressures at their maximum intensity (minimum pressure) for the 1979-2004 cool seasons within the dashed box region in Fig. 16 for a 10 hPa range centered every 10 hPa, showing the CFSR (bold blue), the CMIP5 mean (bold red), and all the CMIP5 models in Colle et al. (2012).

Figure 18. Cool-season average precipitation (shaded every 75 mm) for the 1979-2004 cool seasons (November-March) for (a) CMAP at 2.5 degree resolution, (b) same as (a) except for the CPC-Unified precipitation at 0.5 degree resolution, (c) same as (a) except for the mean of the CMIP5 members listed in (d) and spread (in mm). (d) Number of days that the daily average precipitation (in mm day-1) for the land areas in the black box in (b) occurred within each amount bin for select CMIP5 members, CMIP5 mean, and the CPC Unified.

Figure 19. Map of the three regions Southwest (SW), South Central (SC) and Southeast (SE) used for evaluations of extreme temperature and precipitation over the southern US.

Figure 20. Comparison of (a) daily maximum and (b) minimum temperatures, and (c) precipitation rates between CMIP5 models (CCSM4, GFDL-CM3, GISS-E2-R, HadCM3, HadGEM2-LR, MPI-ESM-LR, IPSL-CM5A-LR, MIROC5 and MRI-CGCM3) and observations from the GHCN and CPC-US-Mexico datasets for the southeast (top), south central (middle) and southwest (bottom). The GHCN station data are mapped to a 2.5 degree grid. The CPC-US-Mexico data set is obtained from the combined CPC US-Mexico real-time and retrospective data.

Figure 21. Probability density functions (PDFs) of the Standardized Precipitation Index (SPI) for 6 months (SPI6) and 9 months (SPI9) for the South Central region of the US for 11 CMIP5 models. The observations are from the CPC-unified dataset.

Figure 22. Multi-model average monthly precipitation in the North American monsoon region (longitudes 102.5 to 115W) from sixteen CMIP5 models: CanESM2, CCSM4, CNRM-CM5, CSIRO-Mk3, GFDL-CM3, GFDL-ESM2G, GFDL-ESM2M, GISS-E2-R, HadGEM2-ES, INMCM4, IPSL-CM5A-LR, MIROC-ESM, MIROC5, MPI-ESM-LR, MRI-CGCM3, NorESM1-M. The observations are from CMAP. Precipitation is given in mm day-1.

Figure 23. (a) RMS error, (b) phase lag and (c) mean bias of 21 CMIP5 models with respect to the P-NOAA observed precipitation within the core NAM region for 1979-2005.
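The three annual-cycle metrics named in the Figure 23 caption can be sketched for one model as follows. The caption does not define the phase lag, so this sketch assumes it is the shift in months that maximizes the correlation between the modeled and observed monthly climatologies; the function names and the monthly values are hypothetical.

```python
import math

def _corr(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def annual_cycle_metrics(model, obs):
    """Return (rmse, phase_lag_months, mean_bias) for 12 monthly means."""
    bias = sum(m - o for m, o in zip(model, obs)) / 12
    rmse = math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / 12)

    def shifted_corr(k):
        # Pull the model cycle back by k months before correlating.
        return _corr([model[(i + k) % 12] for i in range(12)], obs)

    # Positive lag: the model cycle occurs later than observed (assumed
    # sign convention).
    lag = max(range(-5, 7), key=shifted_corr)
    return rmse, lag, bias

# Hypothetical monthly precipitation (mm/day); the model is the observed
# cycle delayed by one month.
obs_cycle = [0.2, 0.2, 0.1, 0.1, 0.2, 1.0, 3.0, 3.2, 2.0, 0.8, 0.4, 0.3]
model_cycle = [obs_cycle[(i - 1) % 12] for i in range(12)]
rmse, lag, bias = annual_cycle_metrics(model_cycle, obs_cycle)
```

A model can have near-zero mean bias and still show a large RMS error if its seasonal cycle is shifted, which is why the three metrics are reported separately.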

Figure 24. Annual cycle in rainfall for the NAM region for the historical (1979-2005) period of 21 CMIP5 models compared to the P-NOAA and CMAP observational datasets for (a) small (phase error = 0), (b) moderate (phase error = 1) and (c) large (phase error = 2-4) phase errors.

Figure 25. Averaged summer 925 hPa wind during 1971-2000 for a) NCEP-NCAR reanalysis, b) CCSM4, c) GFDL-CM3, d) GISS-E2-R, e) HadGEM2-CC, and f) MPI-ESM-LR. Shading indicates meridional wind stronger than 3.0 m s-1.

Figure 26. Long-term mean (1971-2000) monthly meridional wind averaged over 95-100W for a) NCEP-NCAR reanalysis, b) CCSM4, c) GFDL-CM3, d) GISS-E2-R, e) HadGEM2-CC, and f) MPI-ESM-LR. Shading indicates meridional wind larger than 3.0 m s-1.

Figure 27. September and March sea ice extent from 18 CMIP5 models compared to observations from the NSIDC. For each model, the boxes represent inter-quartile ranges (25th to 75th percentiles). Median (50th percentile) extents are shown by the thick horizontal bar in each box. The width of each box corresponds to the number of ensemble members for that model. Whiskers (vertical lines and thin horizontal bars) represent the 10th and 90th percentiles. Mean monthly extents are shown as diamonds. The corresponding observed mean extent is shown as a red line, and the observed minimum and maximum extents are shown as green lines.

Figure 28. March (top) and September (bottom) ice thickness (m) for 18 CMIP5 models averaged over 1996-2005 versus IceSat observations for 2003-2008.

Figure 29. Trends in September sea ice extent from 1979 to 2005 for all individual model ensembles as well as the multi-model ensemble mean with confidence intervals (vertical lines). Observations are from the NSIDC. The 1σ and 2σ ranges of the observed trend are shown in dark gray shading (1σ) and light gray shading (2σ). The linear trends were estimated using the standard least-squares approach and are reported as 10^6 km2 decade-1. An effective sample size was calculated to adjust the standard error of the modeled or observed trend for the effects of temporal autocorrelation (Santer et al., 2008).
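The trend calculation described in the Figure 29 caption can be sketched as below: an ordinary least-squares slope whose standard error is inflated using an effective sample size based on the lag-1 autocorrelation of the regression residuals, n_eff = n (1 - r1) / (1 + r1). The function name, the cap of n_eff at n, and the synthetic extent series are assumptions made for illustration and are not taken from the paper.

```python
import math

def trend_with_adjusted_se(years, values):
    """Least-squares slope, autocorrelation-adjusted standard error, n_eff."""
    n = len(years)
    t_mean = sum(years) / n
    y_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in years)
    slope = sum((t - t_mean) * (y - y_mean)
                for t, y in zip(years, values)) / sxx
    intercept = y_mean - slope * t_mean
    resid = [y - (intercept + slope * t) for t, y in zip(years, values)]

    # Lag-1 autocorrelation of the regression residuals.
    r_mean = sum(resid) / n
    num = sum((resid[i] - r_mean) * (resid[i + 1] - r_mean)
              for i in range(n - 1))
    den = sum((e - r_mean) ** 2 for e in resid)
    r1 = num / den if den > 0 else 0.0

    # Effective sample size; capped at n (an assumption of this sketch).
    n_eff = min(float(n), n * (1.0 - r1) / (1.0 + r1))
    if n_eff <= 2:
        raise ValueError("effective sample size too small for a trend test")

    # Residual variance uses n_eff - 2 degrees of freedom instead of n - 2.
    s_e2 = sum(e * e for e in resid) / (n_eff - 2.0)
    return slope, math.sqrt(s_e2 / sxx), n_eff

# Synthetic September extents (10^6 km2): a linear decline plus a slowly
# varying, positively autocorrelated perturbation.
years = list(range(1979, 2006))
noise = [0.1 if (i % 6) < 3 else -0.1 for i in range(len(years))]
extent = [7.0 - 0.05 * (y - 1979) + e for y, e in zip(years, noise)]
slope, se, n_eff = trend_with_adjusted_se(years, extent)
trend_per_decade = 10.0 * slope
```

With positively autocorrelated residuals, n_eff falls below the number of years, so the adjusted standard error is larger than the unadjusted one and the trend is harder to distinguish from zero.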
