Dramatic environmental changes are leading to a growth and diversification of human activity in the Arctic region (Dawson et al., 2017). Numerous sectors, such as transportation, tourism, fishing and scientific operations, are increasingly dependent on accurate weather information. Numerical weather prediction (NWP) models provide direct and indirect guidance for safe operations in the Arctic.
Arctic NWP capabilities have advanced in recent years due to, among other improvements, increased model resolution (Bauer et al., 2016). Several operational weather centres have started to run convection-permitting kilometre-scale models for limited areas at high latitudes. For example, the ALADIN-HIRLAM (Aire Limitée Adaptation dynamique Développement InterNational-High Resolution Limited Area Model) NWP system is run at 2.5 km horizontal grid spacing both for Greenland (Yang et al., 2018) and the European Arctic (Müller et al., 2017), and the Canadian Arctic Prediction System is run with 3-km grid spacing for a pan-Arctic domain (Casati et al., in preparation). The added value of the kilometre-scale regional models over global models has been found to be most pronounced near complex coastlines and in mountainous terrain (e.g. Køltzow et al., 2019).
Complex coastlines and mountainous terrain at high latitudes strongly influence the atmospheric flow and generate phenomena with large spatial variability such as wind channelling (Nawri and Stewart, 2008), down-slope wind storms (Oltmanns et al., 2015) and cold-air pools in valleys and basins (Clements et al., 2003). Improved description of orography enables the models to capture the terrain-induced atmospheric features (e.g. Moore et al., 2015), but these fine-scale features still pose challenges for NWP even at kilometre-scale grid spacing.
A number of NWP studies have employed kilometre-scale grid spacing on Svalbard. Køltzow et al. (2019) show that large near-surface wind speed errors and a general cold bias, which changes to a warm bias in the presence of a stable boundary layer (SBL), are common for NWP systems. The latter warm bias is a well-known problem in NWP (e.g. Atlaskin and Vihma, 2012; Sandu et al., 2013; Esau et al., 2018) and has also been investigated on Svalbard using models with 1-km grid spacing (Kilpeläinen et al., 2011; Kilpeläinen et al., 2012; Mayer et al., 2012). The reduced grid spacing gives only a minor improvement in temperature forecasts, but a substantial improvement in wind forecasts. The same studies additionally indicate that a better description of the orography improves wind forecast quality, while temperature forecasts are more sensitive to surface type.
Extending the horizontal resolution to sub-kilometre grid spacing has been proposed as a possible solution for further improving the forecast skill. A sub-kilometre set-up of the Canadian NWP system in wintertime mountainous terrain improved wind speed and temperature forecasts at high altitudes (Vionnet et al., 2015). Yang (2019) found that using 750 m grid spacing in the ALADIN-HIRLAM NWP system gave a substantial improvement in forecasting the spatial variability in wind extremes in Greenland. However, in relatively flat terrain around a mid-latitude airport Hagelin et al. (2014) found only minor improvements. These studies had a limited set of observations available to investigate the weather development and the model performance in detail. Particularly, the vertical structure of the atmosphere remained mainly unstudied. Furthermore, Køltzow et al. (2019) argued that a substantial part of the difference between forecast and observation comes from subgrid variability not resolved by the kilometre-scale models. Contrary to this, Atlaskin and Vihma (2012) argued that the subgrid variability plays a less important role for errors in the SBL in relatively flat terrain.
The purpose of this study is to evaluate the benefits of a sub-kilometre NWP system in predicting near-surface atmospheric conditions over an Arctic fjord-valley system compared to the 2.5-km system currently operational at the Norwegian Meteorological Institute. The ALADIN-HIRLAM NWP system with 0.5 km horizontal grid spacing is run for 5 days in February 2018 over a domain covering the Svalbard archipelago in the Arctic Ocean. The results are compared to both the same model system with 2.5 km grid spacing and a wide set of observations collected during a measurement campaign in the Adventdalen valley. During the measurement campaign, several temporary weather stations were set up, components of the surface energy budget were measured, spatial variability of temperature was obtained from snowmobile transects and a tethersonde was utilised for measuring the vertical structure of the atmosphere. We use the campaign data, in addition to the data from permanent weather stations, to study the impact of the increased resolution on the ability of the forecasts to represent the boundary layer in an Arctic valley. Compared to previous studies in Svalbard, we utilise finer horizontal and vertical resolution close to the surface. In addition, compared to previous sub-kilometre studies, we compare NWP results with a wider set of observations. We also analyse the surface energy budget (SEB) to investigate the coupling between the surface and the atmosphere in the performed experiments. The studied period is a part of the Year of Polar Prediction (YOPP) Special Observing Period Northern Hemisphere 1 (SOP-NH1). Therefore, this study is also a contribution to YOPP.
Data and methods
Study site and period
This study focuses on the Adventdalen fjord-valley system located in the central part of Spitsbergen, the largest island of the Svalbard archipelago (Fig. 1). The valley is approximately 4 km wide and 30 km long and has several smaller side valleys, which are approximately 1 km wide (Fig. 2). The valley ends at the 8 km long Adventfjorden, which is connected to the larger Isfjorden. The surrounding mountains reach elevations up to about 1000 m. The investigated area is a typical fjord-valley system on Spitsbergen.
The study period was from 12 to 16 February, 2018, at the end of the polar night. The period was characterised by a high pressure situation with prevailing weak easterly flow and very little precipitation. The land areas were fully covered with snow. The snow thickness was approximately 20 cm on flat areas in Adventdalen, while it varied locally elsewhere due to redistribution of snow by wind. Sea ice was partly present in Adventfjorden. In the beginning of the study period, new ice started to form in the otherwise ice-free Adventfjorden. By the end of the period, a large part of the fjord was ice-covered. Isfjorden was free of sea ice during the whole period. The average 2-m temperature was −8.2 °C at the Svalbard Airport from 12 to 16 February, and the study period can be considered representative of typical winter conditions in the region in the current climate (Isaksen et al., 2016).
The model system utilised in this study is the HARMONIE-AROME (HIRLAM–ALADIN Research on Mesoscale Operational NWP in Euromed–Application of Research to Operations at Mesoscale) configuration (Bengtsson et al., 2017) of the ALADIN–HIRLAM NWP model system version 40h1.1. Two model experiments were set up with different model domains and resolution. The first one, AA25, was run with 2.5 km horizontal grid spacing and 65 vertical levels over a domain covering the European Arctic, including the Svalbard archipelago (Fig. 1a). The same model system for the same domain and model resolution is run operationally by the Norwegian Meteorological Institute under the name AROME-Arctic (Müller et al., 2017). The second experiment, AS05, was run with 0.5 km horizontal grid spacing and 90 vertical levels over a domain covering the Svalbard archipelago (Fig. 1b). While the lowest two model levels in AA25 are at 12 and 35 m above the ground level (AGL), the same levels in AS05 are 5 and 15 m AGL. Even with the much smaller domain, AS05 is about 14 times more computationally expensive than AA25 due to the higher number of grid points and the shorter integration timestep. Details of model experiments are listed in Table 1.
At the beginning of the simulation period, on 11 February at 21:00 UTC, atmospheric and surface forecasts from the operational AROME-Arctic were used as model first guess in AA25 in order to avoid spin-up problems and to ensure a realistic description of surface variables, such as sea ice and snow properties. AS05 was started from interpolated initial conditions from AA25 at the beginning of the simulation period. After this, data assimilation procedures were applied every 3 h for the atmosphere in AA25 and for the surface in both AA25 and AS05. In AA25, the use of the operational AROME-Arctic forecasts as model first guess continued 3-hourly throughout the whole simulation period. This made AA25 practically identical to the operational model. In AS05, the atmosphere was run without observation assimilation or initialisation from AA25, allowing high-resolution features to develop. The surface parameterisation package of HARMONIE-AROME was configured to use the optimal interpolation method to assimilate snow depth, and screen-level temperature and relative humidity in both AA25 and AS05. Starting from 00:00 UTC analysis times, forecasts up to 24 h were performed.
Lateral boundary conditions for AA25 were provided hourly by the European Centre for Medium-Range Weather Forecasting (ECMWF) high resolution global model (IFS-HRES). The sub-kilometre AS05 received its lateral boundary conditions from AA25 allowing us to investigate directly the added value of the sub-kilometre forecasts over the kilometre-scale system and reduce the spin up distance from the lateral boundaries.
HARMONIE-AROME uses the surface parameterisation package SURFEX (SURFace EXternalisée) (Masson et al., 2013) for representing surface processes. SURFEX in AA25 and AS05 was configured similarly to the operational AROME-Arctic system: land surfaces are parameterised by the three-layer version of the ISBA (Interactions between the Soil–Biosphere–Atmosphere) force-restore scheme (Boone et al., 1999) with the single-layer snow model (Douville et al., 1995); open ocean is defined by the prescribed sea surface temperature field, with turbulent fluxes calculated according to the ECUME (Exchange Coefficients from Unified Multi-campaigns Estimates) scheme (Belamari, 2005); and sea ice cover is parameterised by a one-dimensional thermodynamic scheme (Batrak et al., 2018). In AA25 and AS05 the 2-m and surface temperature and 2-m humidity are diagnostics computed by SURFEX. The 10-m wind speed in AA25 is computed by SURFEX. In AS05, where the lowest model level is below 10 m AGL, the 10-m wind speed is linearly interpolated between the two lowest model levels, which are at 5 m and 15 m height.
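As a minimal illustration of the AS05 10-m wind diagnostic described above, the following sketch interpolates linearly in height between the two lowest model levels. The function name, its arguments and the example wind speeds are illustrative assumptions and are not part of HARMONIE-AROME or SURFEX.

```python
def interpolate_wind_10m(u_lower, u_upper, z_lower=5.0, z_upper=15.0, z_target=10.0):
    """Linearly interpolate wind speed (m/s) in height between two model levels.

    The default heights follow the AS05 levels quoted in the text (5 m and 15 m AGL).
    This is a hypothetical helper, not code from the model system.
    """
    if not z_lower <= z_target <= z_upper:
        raise ValueError("z_target must lie between the two model levels")
    weight = (z_target - z_lower) / (z_upper - z_lower)
    return u_lower + weight * (u_upper - u_lower)


# Made-up wind speeds at 5 m and 15 m AGL.
print(interpolate_wind_10m(4.0, 6.0))
```

With levels at 5 m and 15 m, the interpolated 10-m value reduces to the arithmetic mean of the two level values.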
The topography in the model experiments was described by Global Multi-resolution Terrain Elevation Data 2010 (GMTED2010, Danielson and Gesch, 2011). We utilised the elevation data with a resolution of 7.5 arcseconds of latitude and longitude (approximately 250 m). In AS05 the altitude of the corresponding model-grid cell is more accurately represented at six, less accurately at two and similarly at one of the stations than in AA25 (Fig. 2 and Table 2). Averaged over all 9 stations, the mean absolute error of station altitude is considerably lower in AS05 (38 m) than in AA25 (95 m).
To pair the NWP model data with point observations we use the nearest neighbour interpolation for all NWP simulations. This choice preserves best the difference in resolution which we want to investigate in the study (e.g. Accadia et al., 2003). Forecast lead times included are from +1 h to +24 h starting from the forecasts initialised at 00:00 UTC each day.
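The nearest neighbour pairing can be sketched as follows. This is only an illustration under stated assumptions: the grid is represented as a plain list of (latitude, longitude, value) tuples, distances are great-circle distances on a sphere, and all coordinates and values are invented. The actual verification software used in the study is not described in the text.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_grid_point(station_lat, station_lon, grid_points):
    """Return the (lat, lon, value) tuple of the grid point closest to the station.

    No averaging or weighting of neighbouring points is applied, so the value
    keeps the full variability of the model grid.
    """
    return min(
        grid_points,
        key=lambda p: haversine_km(station_lat, station_lon, p[0], p[1]),
    )


# Hypothetical grid points (lat, lon, 2-m temperature in deg C) and station position.
grid = [(78.20, 15.80, -9.0), (78.22, 15.90, -12.0), (78.25, 15.60, -7.5)]
lat, lon, t2m = nearest_grid_point(78.21, 15.88, grid)
print(lat, lon, t2m)
```

Unlike bilinear interpolation, this choice does not smooth the forecast field, which is why it preserves the contrast between the 2.5 km and 0.5 km grids.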
Observational data for model comparison were obtained from (1) automatic weather stations, (2) a tethersonde operated at the Adventdalen weather station, and (3) snowmobile transects performed in the Adventdalen valley.
Both permanent and temporary weather stations provide data for the model comparison. Permanent weather stations are located in Adventdalen, Janssonhaugen, Gruvefjellet, Breinosa, Svalbard Airport and Platåberget (Fig. 1). The largest distance between the observation sites (Janssonhaugen and Platåberget) is approximately 25 km. During the campaign conducted by the University Centre in Svalbard (UNIS), temporary weather stations were set up in Endalen, at Vestpynten and on the Northern side of Adventdalen (Hobo). The latter provides observations only from 12 February 12:00 UTC to 16 February 03:00 UTC.
The station specifications are summarised in Table 2. At the Janssonhaugen and Gruvefjellet stations, the 2-m temperature is estimated by averaging the 1-m and 3-m temperatures. For stations with wind measurements from lower heights, the 10-m wind speed was estimated by assuming a log wind profile and a roughness length of 0.003 m, which is the same value as used in the model simulations. For temperature and humidity, the 1 min-average at the beginning of each hour is used for model comparison, while the 10 min-average is used for wind measurements. The measurement uncertainty of all instruments included in this study is considerably smaller than the difference between simulated and observed values and, therefore, has a negligible effect on the results.
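The height adjustment of the wind observations can be illustrated with a short sketch. It assumes the neutral form of the logarithmic wind profile, i.e. no stability correction; the text states only that a log wind profile was assumed, so the neutral form, the function name and the example values are assumptions made here.

```python
import math


def wind_at_10m(u_measured, z_measured, z0=0.003, z_target=10.0):
    """Scale a wind speed (m/s) measured at z_measured (m) to z_target (m).

    Uses u(z) proportional to ln(z / z0), the neutral logarithmic profile.
    z0 = 0.003 m is the roughness length quoted in the text.
    """
    if z_measured <= z0 or z_target <= z0:
        raise ValueError("heights must exceed the roughness length")
    return u_measured * math.log(z_target / z0) / math.log(z_measured / z0)


# Hypothetical observation: 5 m/s measured at 2 m height.
print(round(wind_at_10m(5.0, 2.0), 2))
```

Because the ratio of logarithms exceeds one for sensors below 10 m, the adjustment always increases the measured wind speed.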
Longwave up- and down-welling radiation were measured by a radiometer that is permanently installed at the Adventdalen weather station. The shortwave radiation was negligible during the observation campaign since the campaign was performed during the polar night.
A sonic anemometer was used to measure three-dimensional wind speed components as well as the sonic temperature at 20 Hz frequency at the Adventdalen weather station. The data processing to derive the surface sensible heat flux consisted of several steps, following Stigter et al. (2018). Only the correction method for the sensor tilt was chosen differently: double rotation was used, as this correction is more suitable for shorter time series consisting of only a few days. The derived sensible heat fluxes were quality checked according to Mauder and Foken (2004). Fluxes of low quality were excluded from the calculation. Finally, the 10-minute sensible heat fluxes were aggregated to hourly values to reduce flux sampling errors (Vickers and Mahrt, 1997). The latent heat fluxes were computed from station data using the bulk-aerodynamic method, as described in detail by Litt et al. (2015). The roughness lengths for momentum, humidity and sensible heat in the bulk-aerodynamic method were tuned with the observed sensible heat fluxes.
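The double rotation tilt correction mentioned above can be sketched in a few lines. This is a generic textbook form of the method, not the processing chain of Stigter et al. (2018): the first rotation aligns the coordinate system with the mean horizontal wind so that the mean lateral component vanishes, and the second rotation removes the mean vertical component. The function name and the sample values are hypothetical.

```python
import math


def double_rotation(u, v, w):
    """Rotate sonic wind components of one averaging block into the mean streamline.

    Returns (u_rot, v_rot, w_rot) with mean(v_rot) = 0 and mean(w_rot) = 0.
    """
    n = len(u)
    u_mean, v_mean, w_mean = sum(u) / n, sum(v) / n, sum(w) / n

    # First rotation (about the vertical axis): mean lateral wind becomes zero.
    theta = math.atan2(v_mean, u_mean)
    u1 = [ui * math.cos(theta) + vi * math.sin(theta) for ui, vi in zip(u, v)]
    v1 = [-ui * math.sin(theta) + vi * math.cos(theta) for ui, vi in zip(u, v)]

    # Second rotation (about the new lateral axis): mean vertical wind becomes zero.
    phi = math.atan2(w_mean, sum(u1) / n)
    u2 = [ui * math.cos(phi) + wi * math.sin(phi) for ui, wi in zip(u1, w)]
    w2 = [-ui * math.sin(phi) + wi * math.cos(phi) for ui, wi in zip(u1, w)]
    return u2, v1, w2


# Made-up samples from a slightly tilted sensor (m/s).
u_raw = [3.0, 3.2, 2.8, 3.1]
v_raw = [1.0, 1.1, 0.9, 1.2]
w_raw = [0.2, 0.1, 0.3, 0.2]
u_rot, v_rot, w_rot = double_rotation(u_raw, v_raw, w_raw)
print(sum(v_rot) / 4, sum(w_rot) / 4)
```

The rotation angles are recomputed for every averaging block, which is why the method needs no long record to fit a tilt plane and suits campaigns of only a few days.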
The low-level and total cloud cover are manually observed at the Svalbard Airport at three-hour intervals and reported in oktas, where one okta corresponds to one eighth of the sky dome being covered by clouds.
A tethersonde was operated near the Adventdalen weather station to observe the vertical atmospheric structure of temperature, humidity and the wind speed up to an altitude of 1 km. To measure spatial temperature variability in the Adventdalen area, temperature sensors were mounted on snowmobiles (at about 1 m height) and horizontal transects were made using this setup at semi-regular intervals.
Evaluation of the operational AROME-Arctic
Before the comparison of AS05 and AA25 is presented in the next sections, we discuss some of the strengths and weaknesses of the operational AROME-Arctic for a winter period (December 2017 to February 2018). As explained, AA25 is based on the same setup as the operational AROME-Arctic model, hence the results presented for an entire winter from AROME-Arctic are representative of the forecast capabilities of AA25. Here, AROME-Arctic is verified against the six permanent observation sites around Longyearbyen (Table 2).
The observed day-to-day variations in 2-m temperature are well captured by the model (Fig. 3a), with a temporal correlation of 0.95. AROME-Arctic has a cold bias (−1.3 °C) for the period, originating from a small but relatively consistent underestimation enhanced by episodic substantial drops in the forecasted temperature. On the other hand, AROME-Arctic fails to identify days with increased spatial variability in 2-m temperature (Fig. 3b). The spatial correlation between forecasted and observed temperatures is high most of the time (on average 0.67), but with substantial drops for shorter periods (Fig. 3c). A closer investigation of these drops (not shown) reveals that they occur most frequently under conditions with strong static stability, as estimated using the difference in observed temperature between Adventdalen (15 m) and Gruvefjellet (464 m). For example, the average spatial correlation is 0.82 when Adventdalen is more than 1 °C warmer than Gruvefjellet, while it is −0.01 when Gruvefjellet is more than 1 °C warmer than Adventdalen. Hence, even if AROME-Arctic captures some of the local spatial variability of temperature (i.e. which sites are warmer/colder in the neighbourhood), it is unable to accurately predict the local spatial variability in the presence of strong static stability. Therefore, neither the amplitude of these local variations nor the days that are prone to enhanced local variability are well forecasted (Fig. 3c).
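The spatial correlation used above can be illustrated with a small sketch. It assumes that the spatial correlation at a given time is the Pearson correlation between forecast and observed values across the stations; the text does not spell out the exact definition, and the station values below are invented.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        return float("nan")  # correlation undefined for a constant field
    return cov / math.sqrt(var_x * var_y)


# Hypothetical 2-m temperatures (deg C) at six stations at one valid time.
# The observations show an inversion: the valley station is coldest.
observed = [-15.0, -9.0, -8.0, -7.0, -10.0, -9.5]
forecast = [-10.0, -10.5, -11.0, -12.0, -10.0, -10.2]
print(pearson(forecast, observed))
```

In this invented case the forecast cools with height while the observations warm with height, so the spatial correlation is negative even though the station-mean error is small.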
The day-to-day variations in 10-m wind speed are also well forecasted by AROME-Arctic (Fig. 3d), with a temporal correlation of 0.81. However, AROME-Arctic has a positive mean error (∼1.0 m/s), which is especially pronounced during a few high wind speed events. While AROME-Arctic is better able to identify days with high spatial variability in wind speed (Fig. 3e) than in 2-m temperature, the actual spatial correlation with observations is lower (on average 0.14, Fig. 3f) than for 2-m temperature.
In summary, the operational AROME-Arctic is able to forecast the synoptic-scale forced temporal changes in the near-surface wind speed and temperature in the Adventdalen area. Earlier studies also showed that AROME-Arctic is competitive with other state-of-the-art NWP systems in the Arctic (e.g. Køltzow et al., 2019). However, weaknesses regarding the local spatial variability of temperature and wind speed are present. Bearing in mind that the width of Adventdalen (∼4 km) is not resolved by 2.5 km grid spacing, these weaknesses may be reduced by higher-resolution forecasts. The time period for the rest of this study (marked with grey shading in Fig. 3) is representative of the documented weaknesses in AROME-Arctic; the model performs relatively poorly with respect to reproducing the observed spatial variability in both temperature and wind.
Results from sub-kilometre simulations
In this section the sub-kilometre experiment, AS05, is validated against observations obtained during the field campaign between 12 and 16 February 2018, and compared to the model experiment with 2.5 km grid spacing, AA25. First, we show verification scores for all weather stations and the time series of near-surface meteorological variables at the Adventdalen (15 m) and Gruvefjellet (464 m) weather stations. Thereafter, two distinct weather situations are investigated in more detail. These are dominated by wind channelling through the Adventdalen valley on 14 February and a cold-air pool formation on 15–16 February.
Evaluation of surface parameters and surface energy budget
Both model experiments reproduced the synoptic-scale conditions well, and the relative humidity follows the temperature evolution (not shown). We therefore focus on the differences in near-surface temperature and wind speed, and on understanding their errors, in the two forecast experiments.
Table 3 summarises the error statistics of AA25 and AS05 for all included weather stations. Even though there are exceptions at individual stations, averaged over all stations, AS05 has significantly smaller mean error (ME) and standard deviation of error (SDE) than AA25, both for 2-m temperature and 10-m wind speed. Distinguishing between average ME and SDE for low-elevation stations (Vest, Adv, Air, Hobo, End) and high-elevation stations (Jans, Plat, Gruv, Brein) gives the same conclusions. An exception is the average ME for 2-m temperature at the low-elevation stations, which is better for AA25 than AS05, and this will be discussed later.
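For reference, the error statistics can be computed from paired forecast and observed values as in the sketch below. The sign convention (forecast minus observation) and the use of the population standard deviation are assumptions made here, since the text does not state them; the mean absolute error, which is also quoted in this study, is included for completeness. The numbers are invented.

```python
import math


def error_statistics(forecast, observed):
    """Return (ME, SDE, MAE) of forecast minus observation."""
    errors = [f - o for f, o in zip(forecast, observed)]
    n = len(errors)
    me = sum(errors) / n                                   # mean error (bias)
    sde = math.sqrt(sum((e - me) ** 2 for e in errors) / n)  # spread about the bias
    mae = sum(abs(e) for e in errors) / n                  # mean absolute error
    return me, sde, mae


# Hypothetical hourly 10-m wind speeds (m/s).
fc = [1.0, 2.0, 3.0]
ob = [0.0, 2.0, 1.0]
me, sde, mae = error_statistics(fc, ob)
print(me, sde, mae)
```

The ME isolates the systematic part of the error, whereas the SDE measures the random part that remains after the bias is removed, so a model can improve in one score without improving in the other.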
Figures 4 and 5 present time series of the measured and simulated near-surface temperature and wind speed at the Adventdalen (lower elevation) and Gruvefjellet (higher elevation) weather stations. For the Adventdalen weather station, both AA25 and AS05 overestimate 2-m temperature, but this is more pronounced in AS05 (Table 3). As illustrated in Fig. 4a, there is a particularly strong warm bias (on average 7 °C) in AS05 from the latter half of 15 February and onwards (cold-pool formation situation). Both AA25 and AS05 underestimate the 10-m wind speed (Table 3). The smallest underestimation is seen in AS05, which is largely due to a smaller (positive) bias in the period between about 13 and 15 February (wind channelling situation, Fig. 4c).
For the Gruvefjellet weather station, the underestimation of 2-m temperature is smaller for AS05 than for AA25, and the 2-m temperature ME at this location is the smallest of all stations (Table 3). Temperature errors at the Gruvefjellet station in AA25 are related to cold biases in the latter half of 12 February and of 15 February (Fig. 5a). Neither AS05 nor AA25 accurately reproduces the difference in 2-m temperature between the Gruvefjellet and Adventdalen stations under conditions with stable static stratification (negative difference), as seen in Fig. 5c. However, AS05 shows a moderate improvement compared to AA25 in resolving the temperature difference between the two stations, e.g. around midnight on 13 February and in the latter half of 15 February. In contrast, AA25 has the smaller ME of the two model experiments for the 10-m wind speed at Gruvefjellet (Table 3). The time series of observed and simulated 10-m wind speed reveal that this is related to a consistent (stronger) positive wind speed bias throughout most of the study period in AS05. However, this is an exception, as AS05 has on average a less pronounced positive wind bias than AA25 for the higher-elevation sites.
Surface-layer temperatures are mainly driven by the surface energy budget (SEB). We evaluate the simulated SEB against observations from the Adventdalen weather station. We first investigate the driving terms of the SEB following the ideas of Miller et al. (2018) and Day et al. (2020), i.e. those terms that are not directly influenced by the surface properties, namely the shortwave and longwave downward radiation. Shortwave radiation is negligible in mid-February in Svalbard and, therefore, longwave radiation unsurprisingly dominates the forcing (Fig. 6). Both AS05 and AA25 have relatively large negative biases in downwelling longwave radiation (Table 4). Downwelling radiation (Fig. 6a) is strongly affected by cloud cover, in particular low clouds. By considering the cloud fractions observed at the Svalbard Airport (Fig. 7), we see that the periods with the strongest downwelling radiation biases (around 12 and 14 February) coincide with the occurrence of a strong negative cloud fraction bias in both simulations.
The response terms (i.e. the terms depending on the state of the surface) in the SEB are dominated by the upwelling longwave radiation and sensible heat flux (Fig. 6). Considering upwelling longwave radiation (Fig. 6b), the model errors vary in sign throughout the period and display a correlation with surface temperature errors (0.72 in AA25, 0.51 in AS05). Sensible heat flux was observed to be positive (downwards) during most of the study period (Fig. 4d). Despite the underestimation of wind speed at this station, both model experiments overestimate this flux on average (Table 4). Following Tjernström et al. (2005), Fig. 8a shows sensible heat flux divided by 10-m wind speed against temperature difference between 2 m and the surface. In the bulk-flux formulation used by the model system, the sensible heat flux is proportional to both the wind speed and the near-surface temperature gradient, and thus the slope in Fig. 8a represents the heat transfer coefficient. In observations, the wind-scaled heat flux has a close to linear dependence on the temperature difference in weakly stable conditions, when the temperature difference is small. Figure 8a suggests that both model experiments have a heat transfer coefficient close to the observed coefficient of weakly stable conditions regardless of the temperature difference in the model experiments. This indicates that the positive turbulent flux bias is due to overestimation of the transfer coefficient in very stable conditions. The overestimation can be partly explained by the surface roughness length in the model being approximately one order of magnitude larger than determined from the sonic measurements, but it is unknown how representative the observation point is for a model grid box. In addition, the static stability in the surface layer is often overestimated. The temperature difference between 2 m and the surface is overestimated on average by 2.1 °C in AA25 and by 0.4 °C in AS05 (Fig. 4b).
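The diagnostic behind Fig. 8a can be sketched as follows, assuming the common bulk form H = ρ c_p C_H U (T_2m − T_s) with H positive downwards, as in the text. The air density, the minimum wind speed threshold, the fit through the origin and all sample values are assumptions made for illustration; they are not taken from the model system or from the observations.

```python
RHO_AIR = 1.3    # kg m-3, assumed density of cold near-surface air
CP_AIR = 1005.0  # J kg-1 K-1, specific heat of dry air at constant pressure


def bulk_sensible_heat_flux(c_h, wind_speed, t_2m, t_surface):
    """Downward-positive sensible heat flux (W m-2) from the bulk formula."""
    return RHO_AIR * CP_AIR * c_h * wind_speed * (t_2m - t_surface)


def fit_transfer_coefficient(heat_flux, wind_speed, delta_t, min_wind=0.5):
    """Estimate C_H as the slope of H/U against (T_2m - T_s), fitted through zero.

    Samples with wind speed below min_wind (m/s) are skipped to avoid
    dividing by very small numbers.
    """
    numerator = 0.0
    denominator = 0.0
    for h, u, dt in zip(heat_flux, wind_speed, delta_t):
        if u < min_wind:
            continue
        numerator += (h / u) * dt
        denominator += dt * dt
    if denominator == 0.0:
        raise ValueError("no usable samples with non-zero temperature difference")
    return numerator / denominator / (RHO_AIR * CP_AIR)


# Synthetic data generated with a known coefficient, then recovered by the fit.
winds = [2.0, 4.0, 6.0]
delta_ts = [0.5, 1.0, 2.0]
fluxes = [bulk_sensible_heat_flux(1.5e-3, u, dt, 0.0) for u, dt in zip(winds, delta_ts)]
print(fit_transfer_coefficient(fluxes, winds, delta_ts))
```

A single fitted slope corresponds to a constant transfer coefficient; the observed flattening at large temperature differences would appear in such a fit as systematic negative residuals in very stable conditions.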
The observed latent heat flux (Fig. 6e) is small compared to the sensible heat flux and varied in sign during the campaign, both in the observations and in the simulations.
To study how sensitively the surface responds to the net energy flux, we show surface temperature and 2-m temperature against the net energy flux in Fig. 8b and c. These temperature sensitivity diagnostics provide information about the thermal inertia of the surface and indirectly also about the coupling between the surface and atmosphere. In AA25, the overall surface temperature sensitivity to the net energy flux is too low (0.01 °C/(W m−2) compared to 0.24 °C/(W m−2) in the observations, based on the regression line slopes in Fig. 8b and c), whereas in AS05, the surface temperature sensitivity (0.19 °C/(W m−2)) is more comparable to the observations. At 2 m, the temperature sensitivity to the net energy flux is slightly lower because the 2-m temperature is diagnosed not only from the surface temperature but also from the lowest model level. AS05 has a 2-m temperature sensitivity (0.11 °C/(W m−2)) closer to the observed value (0.18 °C/(W m−2)) than AA25 (0.08 °C/(W m−2)). In AS05, the temperature sensitivities are higher at lower elevations than at higher elevations, while AA25 does not show this pattern (not shown). This behaviour can be interpreted as a more realistic physical response because the higher-elevation sites experience a more rapid exchange of air masses and are less dominated by local processes working over time.
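The temperature sensitivity quoted above is the slope of a regression line of temperature against net energy flux. A minimal sketch of that calculation is given below, assuming an ordinary least-squares fit; the text states only that regression line slopes were used, and the sample values are synthetic.

```python
def regression_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need two sequences of equal length with at least two points")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0.0:
        raise ValueError("x must not be constant")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Synthetic example: net energy flux (W m-2) and surface temperature (deg C)
# constructed with a sensitivity of 0.2 deg C per W m-2.
net_flux = [-60.0, -40.0, -20.0, 0.0, 20.0]
surface_temperature = [0.2 * q - 12.0 for q in net_flux]
print(regression_slope(net_flux, surface_temperature))
```

A slope near zero, as found for the surface temperature in AA25, means that the simulated surface barely reacts to changes in the net energy flux.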
In general, AS05 does not show clear improvements in simulating the surface energy budget components compared to AA25 (Table 4). The net surface energy budget (Fig. 6f), defined as a sum of surface net radiation, sensible heat flux and latent heat flux, has both negative and positive values in the observations, whereas both model experiments simulate negative values through virtually the whole study period. This is because both model experiments fail to capture the cloudy periods. The temperature sensitivity diagnostics, however, suggest that the sub-kilometre system improves the sensitivity of the near-surface temperatures to the net energy flux even though there are no changes in the surface description of the model.
The stronger underestimation of the surface net energy in AS05 than in AA25 for Adventdalen (Table 4) contrasts with the substantial warm bias seen in 2-m temperature in AS05 (Table 3). It is likely that part of this can be explained by the force-restore method employed, in which the surface temperature evolves not only due to the forcing by the surface energy budget, but also due to a restoring term towards the deep ground temperature (Boone et al., 1999). During the studied period this restoring term warms, on average, the surface layer more in AS05 (2.6 °C/h) than in AA25 (1.4 °C/h). This is partly due to a slightly warmer initial deep layer temperature in the beginning of the studied period, originating from warmer model forecasts during the model spin-up. In addition, since AS05 captures the relatively warm air temperatures in the middle of the period better, AS05 builds up an energy reservoir in the ground from which excess energy is released at the time of the cooling in the last part of the period. This shows the importance of describing the ground processes adequately and highlights the need for developing the surface model at the same time as moving towards higher resolution. In addition, the surface assimilation scheme behaves differently in the two experiments in terms of producing different surface temperature increments (not shown), which also affects the evolution of the surface temperature during the period studied.
In order to investigate the difference between the two model experiments in more detail, two distinct and rather persistent time periods are analysed: first, a period of wind channelling through the Adventdalen valley on 14 February, and second, the formation of a cold-air pool in the Adventdalen valley on 15 February.
Wind channelling through the Adventdalen valley
The highest temperatures and strongest winds during the campaign were measured on 14 February, at the Hobo (−1.0 °C) and Janssonhaugen (14.7 m/s) weather stations. The lower-elevation stations along the Adventdalen valley (Vest, Air, Adv, Hobo, Jans, see Fig. 1) measured on average 1 m/s stronger winds and 3 °C higher temperatures than the nearby sites at higher elevation (Plat, Gruv, End, Brein) on 14 February, as illustrated in Fig. 9. This indicates intensified wind channelling through the Adventdalen valley.
The spatial differences in temperature and wind between the two experiments are striking (Fig. 9). AS05 matches the indicated near-surface observations at most measurement stations far better than AA25. The mean absolute error of wind speed for all stations is 2.2 m/s for AS05 and 3.0 m/s for AA25 on 14 February. As in the observations, AS05 shows clear signs of wind channelling through the valley (Fig. 9b). For example, AS05 captures the mean wind speed difference between the valley bottom in Adventdalen and the highest located station Breinosa on 14 February (3.0 m/s compared to 3.2 m/s in observations). In contrast, the channelling is only recognised to a small extent in AA25, and the mean wind speed difference between Adventdalen and Breinosa is even wrong in sign in AA25 (−1.9 m/s). This reveals that AS05 effectively resolves the topography of the Adventdalen valley, which is not the case for AA25. Another feature simulated by AS05 is increased wind speed close to and on downstream mountain slopes. The occurrence of this phenomenon is supported by the intensified flow observed at the Gruvefjellet weather station, located close to a mountain slope. On the other hand, both AS05 and AA25 simulate easterly downslope wind intensification for the Hobo weather station, where in fact weak westerly wind was measured, which is an indication of recirculation in the Adventdalen valley. In addition, at the entrance of Adventdalen (Vest and Air) the simulated wind direction of AS05 differs by approximately 30° from the observations (not shown in figures). These examples imply that, despite the remarkable improvements in capturing the topographic impact, an NWP model with increased spatial resolution (0.5 km grid spacing) does not necessarily fully resolve all topographic wind effects in complex terrain.
The temperature field simulated by AS05 agrees closely with the observations from weather stations for this situation (Fig. 9a). The highest temperatures occur at the valley bottom, especially at the northern side of the valley, and lower temperatures occur at higher-elevation sites. AA25 simulates too low temperatures for the sites in the valley.
In order to measure the spatial variability of the temperature in the valley, a transect of the near-surface temperature was performed with a snowmobile on 14 February (Fig. 10a). Measurements show noticeable spatial variability with the temperature varying between −8 and −4 °C in the Adventdalen valley with increased temperatures around the Hobo station. AS05 is in this situation, compared to AA25, in much better agreement with the snowmobile measurements (Fig. 10c) and shows higher spatial correlation with observations (0.63 compared to 0.04) and smaller bias (−0.2 °C compared to −2.7 °C). The results suggest that topographic effects play a main role here also for the temperature field in a situation of wind channelling, possibly through enhanced vertical mixing of the air mass in the valley
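The verification scores quoted above (mean absolute error, bias and spatial correlation) can be sketched as follows. This is a minimal illustration only: it assumes that the forecast has already been interpolated to the measurement points and that "spatial correlation" means the Pearson correlation across those points, which the text does not state explicitly. The function names and the transect values are hypothetical and are not the campaign data.

```python
import math

def bias(forecast, observed):
    """Mean error, defined here as forecast minus observation."""
    return sum(f - o for f, o in zip(forecast, observed)) / len(observed)

def mean_absolute_error(forecast, observed):
    """Mean of the absolute forecast errors."""
    return sum(abs(f - o) for f, o in zip(forecast, observed)) / len(observed)

def pearson_correlation(forecast, observed):
    """Pearson correlation between forecast and observed values across points."""
    n = len(observed)
    mean_f = sum(forecast) / n
    mean_o = sum(observed) / n
    cov = sum((f - mean_f) * (o - mean_o) for f, o in zip(forecast, observed))
    var_f = sum((f - mean_f) ** 2 for f in forecast)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov / math.sqrt(var_f * var_o)

# Hypothetical near-surface temperatures along a transect (deg C).
# These numbers are made up for illustration; they are NOT the measurements.
observed = [-8.0, -7.0, -6.0, -4.0, -5.0]
forecast = [-8.5, -7.5, -6.0, -5.0, -5.0]

transect_bias = bias(forecast, observed)
transect_mae = mean_absolute_error(forecast, observed)
transect_corr = pearson_correlation(forecast, observed)

print(f"bias = {transect_bias:+.2f} degC")
print(f"MAE  = {transect_mae:.2f} degC")
print(f"r    = {transect_corr:.2f}")
```

A small bias combined with a low correlation would point to a forecast that is right on average but misplaces the spatial structure, which is why the two scores are reported together.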
Cold-air pool in the Adventdalen valley
On 15 February 2018 a cold-air pool developed in the Adventdalen valley under nearly clear sky conditions. The wind speed dropped from 8 m/s to below 3 m/s and the 2-m temperatures from −10 to −15 °C between 06:00 and 12:00 UTC at the Adventdalen weather station (Fig. 4a) while the high elevation stations measured higher temperatures (Fig. 11). The maximum temperature difference between Adventdalen station (at 15 m elevation) and Gruvefjellet (464 m), which are less than 5 km apart in horizontal distance, was 9 °C on this day (Fig. 5c). After the aforementioned rapid temperature drop in the Adventdalen valley, temperature decreased gradually both at the valley bottom and higher elevations until the end of the campaign, without any major warming.
The observed net radiation, mainly driven by the clear sky conditions, varied from −70 to −10 W/m² at the Adventdalen weather station on 15 February (Fig. 6c). As the observed sensible heat flux decreased from 50 W/m² to close to zero at the time of the wind speed decrease on 15 February (Fig. 6d), the net surface energy budget (Fig. 6f) remained negative the whole day, having its minimum at 06:00–12:00 UTC when the temperature drop was steepest. This led to cooling of the surface and the near-surface air, creating a stable surface layer with a strong temperature inversion.
Both model experiments capture the wind weakening on 15 February, though hours later than observed (Fig. 4c). AA25 simulates the steep temperature drop around 18:00 UTC, approximately 12 hours later than observed (Fig. 4a), and wrongly predicts a temperature drop at the high-elevation stations as well. Therefore, the temperature difference between Adventdalen and Gruvefjellet in AA25 does not resemble the observations (Fig. 5c). AS05, instead, creates a small temperature decrease in the valley around 12:00 UTC at the time of the wind speed decrease and correctly keeps the temperature nearly constant for the higher elevation stations (e.g. Gruvefjellet, Fig. 5a). The 2-m temperature decrease at the Adventdalen station (Fig. 4a) is, however, substantially underestimated in AS05, which results in a strong positive bias of the forecast.
Also on 15 February a transect of the near-surface temperature was performed with a snowmobile to measure the spatial variability (Fig. 10b and d). Unlike the day before, AS05 does not show improved spatial correlations compared to AA25. The spatial correlations between forecast and observations are 0.05 and −0.01 for AS05 and AA25, respectively. In addition, a considerable positive bias is present in both AS05 (+5.8 °C) and AA25 (+4.0 °C).
Both AS05 and AA25 overestimated the outgoing longwave radiation noticeably at the time of the cold-air pool generation (Fig. 6b). Simultaneously, the sensible heat flux was overestimated in both experiments and the corresponding error compensation led to a net surface energy budget which is comparable to observations on average. However, the experiments do not capture the temporal changes in the surface energy budget. The delay in forecasting the wind speed decrease in the model experiments explains some of the temporal difference. In addition, as mentioned earlier, the restoring term towards deep ground temperature in the surface model is particularly high in AS05 at the time of the cold-air pool generation which leads to only a small surface temperature drop in AS05.
Another factor influencing cold-air pool formation and persistence is known to be downvalley drainage of the cold air. At the Endalen weather station, located in one of the side valleys of Adventdalen, wind blew down the valley perpendicular to the wind direction in Adventdalen with a strength of 3 m/s during the cold pool event. The model simulations do not capture the drainage flow in Endalen. AA25 does not represent Endalen at all and AS05 contains only 3–4 grid cells across the valley, which is below the effective resolution of the model simulation, generally around 7 times the grid cell length (Skamarock, 2004).
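The resolution argument above reduces to simple arithmetic, sketched here. The factor of 7 is the approximate value quoted in the text from Skamarock (2004). The function names are hypothetical, and treating a feature as resolved once it spans at least that many grid cells is a simplification made for illustration.

```python
EFFECTIVE_RESOLUTION_FACTOR = 7  # approximate factor quoted in the text

def effective_resolution_km(grid_spacing_km, factor=EFFECTIVE_RESOLUTION_FACTOR):
    """Smallest scale the model can be expected to resolve, in km."""
    return factor * grid_spacing_km

def is_resolved(cells_across_feature, factor=EFFECTIVE_RESOLUTION_FACTOR):
    """Simplified criterion: the feature spans at least `factor` grid cells."""
    return cells_across_feature >= factor

# Grid spacings of the two experiments, taken from the text.
experiments = {"AS05": 0.5, "AA25": 2.5}

for name, spacing_km in experiments.items():
    print(f"{name}: effective resolution ~ {effective_resolution_km(spacing_km):.1f} km")

# Endalen spans only 3-4 grid cells in AS05, i.e. roughly 1.5-2.0 km.
for cells in (3, 4):
    width_km = cells * experiments["AS05"]
    print(f"{cells} cells = {width_km:.1f} km wide, resolved: {is_resolved(cells)}")
```

Under these assumptions the effective resolution is about 3.5 km for AS05 and 17.5 km for AA25, so a side valley 1.5 to 2 km wide falls below both.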
On 16 February both experiments were too warm and windy in Adventdalen from the beginning of the forecasts (Fig. 4a and b). The reason for the offset is related to a different development between the cycling of 3-h model forecasts during 15 February, which is used as a model first guess field for the analysis at 16 February 00:00, and the 24-h forecast initialised at 15 February 00:00 shown in Fig. 4. The temperature and wind forecasts on 16 February do not show any improvement over the duration of the forecast. This emphasises the importance of a good initial state for a successful forecast.
Vertical profiles were retrieved from a tethersonde at the Adventdalen weather station on 16 February between 17:34 and 19:12 UTC (Fig. 12). The temperature profile reveals a strong inversion near the surface with the temperature increasing from about −23 °C at the surface to −13 °C at 50 m height (Fig. 12a). Both model experiments create a shallow temperature inversion near the surface, but AA25 and AS05 overestimate temperatures by 5 and 9 °C at the surface and both by 4 °C at 50 m height compared to observations. AS05 partly reproduces the shape of the temperature inversion at the lowest model levels; however, the vertical temperature gradient is underestimated. A vertical cross section of potential temperature across the Adventdalen valley from AS05 (Fig. 13b) illustrates that the model simulates the cold pool at the valley bottom, but possibly displaced and with overestimated temperatures. Unlike AS05, AA25 lacks a clear indication of the surface-based temperature inversion at the model levels (Fig. 12a). Instead, the surface scheme computes an excessive temperature inversion between the surface and the lowest model level. Even though AA25 captures the surface and 2-m temperature closer to the observations than AS05 at this time and location, it is evident that the unrealistic temperature profile between the surface and the lowest model level improves the 2-m air temperature forecasts for the wrong reason.
10-m wind speeds of around 2 m/s were observed at the Adventdalen weather station (Fig. 12b). On top of the shallow inversion, at about 70 m height, a low-level jet was present with wind speeds of about 5 m/s. AS05 captures a realistic shape for the wind profile in the lowest 500 m, including the low-level jet, but underestimates the wind speed by up to 2 m/s above 500 m. AA25, in comparison, represents the vertical profile of wind poorly in the lowest 300 m and does not simulate any low-level jet. Above 300 m, however, AA25 matches the observed wind speed very well. The vertical cross section through Adventdalen (Fig. 13d) shows that the relatively strong, channelled flow, including the low-level jet, was banked up against the southern side of the Adventdalen valley. It was strongest between 50 and 150 m above the surface, with wind speeds of about 8 m/s. In AS05 the near-surface temperature is lowest at the northern side of Adventdalen, where the low-level jet is absent. This indicates that the simulated jet causes too much vertical mixing of temperature.
Discussion and conclusions
Terrain-induced atmospheric features pose challenges for prediction of near-surface atmospheric conditions, even in kilometre-scale NWP models. In this study, the ALADIN-HIRLAM NWP system with 0.5 km horizontal grid spacing and an increased number of vertical levels was compared to the 2.5-km model system similar to the current operational NWP system at MET Norway. The impact of the increased resolution on the model’s ability to represent and realistically simulate boundary-layer processes was investigated for the period from 12 to 16 February 2018 in an Arctic fjord-valley system in the Svalbard archipelago. Model simulations were compared to a wide range of observations conducted during a field campaign. The studied period is a part of the YOPP SOP-NH1 and this study is a contribution to YOPP.
The operational 2.5-km model system is able to forecast the synoptic-scale temporal changes in near-surface temperature and wind speed. It has earlier been shown that the operational model system performs well compared to other NWP systems with similar or coarser resolution for this area (e.g. Køltzow et al., 2019). However, results presented here show that it does not resolve local topographic effects in the narrow Adventdalen valley and therefore is not able to accurately predict the local variability in the weather.
We found that the sub-kilometre experiment improved both the spatial structure and overall verification scores of the near-surface temperature and wind forecasts compared to the 2.5-km experiment. The sub-kilometre experiment successfully captured the wind channelling through the Adventdalen valley and the temperature field associated with it. The added value of higher resolution on wind speed is in line with previous kilometre-scale studies made in Svalbard (Kilpeläinen et al., 2011, 2012; Kim et al., 2019) and on sub-kilometre Arctic systems (Vionnet et al., 2015; Yang, 2019).
Despite the overall added value, the sub-kilometre model system does not resolve certain aspects. The mean error of 2-m temperature forecasts at low elevation stations was particularly high in the sub-kilometre experiment. Like Vionnet et al. (2015) for sub-kilometre experiments in the Canadian Arctic, and Kilpeläinen et al. (2011, 2012) and Mayer et al. (2012) for kilometre-scale experiments at Svalbard, we found an inability of the sub-kilometre model to capture the intensity of the valley cold-air pool. However, the use of measurement campaign data provides some encouraging results. For example, the sub-kilometre system had a more realistic vertical temperature gradient and wind profile in the lowest part of the atmosphere, and the surface temperature sensitivity to the net surface energy was closer to the observations. It appears to be beneficial to have more model layers close to the surface to resolve the vertical profile of the atmosphere and the surface-atmosphere sensitivity properly. Mayer et al. (2012) also pointed out that increased vertical resolution potentially can lead to improved representation of inertial-gravity waves under stable conditions.
Deficiencies in different parts of the model system, not necessarily related to the resolution, hamper the full potential of the sub-kilometre system. The way towards operational sub-kilometre simulations should therefore also address these issues. Large model biases were found in both driving and response terms of the surface energy budget. The downwelling longwave radiation had a large negative bias and was strongly affected by cloud cover and possibly cloud properties. The simulated surface turbulent fluxes were overestimated and further investigations are ongoing for the turbulent flux formulations in stable conditions.
Another factor limiting the forecast quality is the simplistic treatment of the land surface and soil. The force-restore method for the surface temperature evolution provides a simplified representation of the surface and soil processes, including ground snow. The force-restore method might be a limitation for taking full advantage of sub-kilometre resolutions, especially for high-latitude locations without diurnal forcing. In our results, it contributed substantially to a larger warm bias during a cold-pool event. Day et al. (2020) demonstrated that usage of a multi-layer snow scheme improves the temperature forecasts in two high-Arctic sites compared to a single-layer snow model. A recent version of the SURFEX model with an adequate description of soil, surface and snow processes might improve the sub-kilometre system and should be investigated.
Accurate initial conditions are important in general, as also reported in a number of sub-kilometre and Svalbard kilometre-scale studies (Mayer et al., 2012; Hagelin et al., 2014; Vionnet et al., 2015; Kim et al., 2019). To avoid a long spin-up in the simulations, the small scales represented in the sub-kilometre simulations must be present from the start of the forecast. Similarly, an accurate initialization of the surface is a prerequisite for the NWP system to be able to forecast the highly local processes in the stable boundary layer that are important for the Arctic region. The cold-pool event discussed here is another example showing how initial surface properties potentially contribute to larger forecast errors.
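To illustrate how a restoring term can damp surface cooling, the sketch below integrates the commonly cited two-layer force-restore equations, dTs/dt = C_T G − (2π/τ)(Ts − T2) and dT2/dt = (Ts − T2)/τ, with a simple forward Euler step. It is assumed here to resemble, not reproduce, the surface scheme used in the experiments. The thermal coefficient, the net flux and the initial temperatures are hypothetical values chosen only for illustration.

```python
import math

TAU = 86400.0  # restoring time scale of one day, in seconds

def integrate_force_restore(ts0, t2_0, net_flux, c_t, hours, dt=60.0, restore=True):
    """Integrate surface (ts) and deep (t2) temperatures with forward Euler.

    net_flux: net surface energy budget in W/m2 (negative means energy loss)
    c_t:      thermal coefficient in K m2/J (hypothetical value in this sketch)
    restore:  if False, the restoring term towards t2 is switched off
    """
    ts, t2 = ts0, t2_0
    for _ in range(int(hours * 3600 / dt)):
        forcing = c_t * net_flux
        restoring = (2.0 * math.pi / TAU) * (ts - t2) if restore else 0.0
        dts_dt = forcing - restoring
        dt2_dt = (ts - t2) / TAU
        ts += dt * dts_dt
        t2 += dt * dt2_dt
    return ts, t2

# Hypothetical setup: constant energy loss of 40 W/m2 for 6 hours,
# starting from equal surface and deep temperatures of -10 degC.
C_T = 2.0e-5       # K m2/J, illustrative only
NET_FLUX = -40.0   # W/m2, illustrative only

ts_restored, _ = integrate_force_restore(-10.0, -10.0, NET_FLUX, C_T, hours=6)
ts_free, _ = integrate_force_restore(-10.0, -10.0, NET_FLUX, C_T, hours=6, restore=False)

print(f"surface temperature with restoring term:    {ts_restored:.1f} degC")
print(f"surface temperature without restoring term: {ts_free:.1f} degC")
```

With these made-up numbers the restoring term roughly halves the 6-hour cooling, which is the qualitative behaviour described above: the stronger the pull towards the deep temperature, the smaller the simulated surface temperature drop.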
A question for future studies is the impact of drainage flows on the development of the cold-air pools. Both model experiments were too warm in the lowest part of the atmosphere. This could be due to inadequate representation of vertical mixing or advection. Although the sub-kilometre experiment simulates down-sloping winds on some valley slopes, it does not capture the cold-air drainage in Endalen at the time of the cold-air pool, and the lack of drainage flows from side valleys might limit the model’s ability to simulate cold-air pools accurately in Adventdalen.
As suggested by Casati et al. (2017), additional observations and process-based diagnostics have been important to better understand the differences between kilometre-scale and sub-kilometre-scale simulations. Our use of independent campaign observations, such as tethersondes and snowmobile observations, drastically broadened the understanding of sub-kilometre model performance compared to the use of synoptic weather stations only.
Even though the studied period is short, our work demonstrates the potential of sub-kilometre NWP systems for forecasting weather in complex Arctic terrain. We show that the sub-kilometre system performs well compared to the operational 2.5-km system even when no specific adjustments or adaptations have been made for the model system besides the improved horizontal and vertical resolution. However, our results also suggest that the increase in resolution should be accompanied by further development of other parts of the model system. The sub-kilometre model system for Svalbard is currently being further investigated for longer periods and for an optimal setup, also considering the computational costs.