Data Exploration

For the initial exploration of the data, I first examined histograms of each variable to assess normality. Most variables across all three sites were not normally distributed, either as raw values or after log, square, or square-root transformation (Fig 7). Because not all variables are normally distributed, I will use a non-parametric MANOVA technique when comparing pond characteristics between different groups (see Results and Discussion).

Fig 7. Examples of histograms for variables sampled in the tundra ponds. Non-normal distributions were common across all three sampling locations and under all transformations. See the Methods section for explanations of variable names.
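As a numeric companion to the histograms, the short sketch below compares the skewness of one variable under the same four treatments (raw, log, square, square root). It is an illustration only: my assessment was visual, skewness is used here as a rough stand-in for "distance from normal", and the values in `water_depth_cm` are made up rather than taken from the pond data.

```python
import math
import statistics


def skewness(values):
    """Population skewness (third standardized moment); 0 for symmetric data."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


# The four treatments applied before plotting the histograms.
TRANSFORMS = {
    "raw": lambda v: v,
    "log": math.log,            # requires strictly positive values
    "square": lambda v: v ** 2,
    "sqrt": math.sqrt,          # requires non-negative values
}


def skew_by_transform(values):
    """Return the skewness of `values` under each transformation."""
    return {
        name: skewness([transform(v) for v in values])
        for name, transform in TRANSFORMS.items()
    }


# Hypothetical, made-up measurements (NOT the study data).
water_depth_cm = [5, 6, 7, 8, 9, 10, 12, 15, 22, 60]

result = skew_by_transform(water_depth_cm)
for name, g1 in result.items():
    print(f"{name:>6}: skewness = {g1:.2f}")
```

A skewness near zero is necessary but not sufficient for normality, so this kind of summary supplements the histograms rather than replacing them.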

To check for outliers within the data sets, I created scatter plots of each pond characteristic against dissolved gas concentrations. Points that were consistent outliers across the scatter plots were excluded from further analyses. Only one pond, from the tundra location, was omitted from the final statistical analyses (Fig 8).

Fig 8. An example of an outlier within the tundra pond data set (W20). Excluding the outlier, it appears that dissolved concentrations of methane decrease with elevation (going down the hillslope). Also represented in this figure is the thaw depth within each pond. However, thaw depth does not seem to be related to elevation or dissolved methane.

The literature suggests that water depth is one of the most important controls on greenhouse gas (GHG) concentrations in the water column. In shallower water, more sunlight can warm the sediments, leading to increased microbial activity and CO2 and CH4 production. I used scatter plots to examine the univariate relationship between water depth and dissolved gas concentrations in the sampling ponds at all three sites (Fig 9). In the larch ponds, water depth explained a large portion of the variation in dissolved methane but little to none of the variation in carbon dioxide (Fig 9a). In the tundra ponds, water depth explained some but not all of the variation in both dissolved gases (Fig 9b). Finally, in the mire ponds, water depth explained little to none of the variation in either gas (Fig 9c). Because water depth is cited as one of the best predictors of dissolved gas concentrations yet explains so little of the variation at some sites, other variables must also be influencing concentrations. To look at the influence of multiple pond variables at once, I will use principal component analysis (PCA).

Fig 9. Scatter plots of dissolved methane (blue circles) and carbon dioxide (red diamonds) as a function of water depth. Marker size represents thaw depth (TD)*. Water depth describes a large portion of the variation in methane in the larch ponds (r^2 = 0.41), but little to none of the variation in carbon dioxide (a). In the tundra ponds (b), water depth describes some, but not all, of the variation in both methane and carbon dioxide (r^2 = 0.2 and 0.31, respectively). In the mire ponds, however, water depth explains almost none of the variation in either gas (c). Note the differences in scale along the x axis as well as the log transformations for plots b and c. *Thaw depth was not recorded in the mire ponds since permafrost is not present below these ponds.
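For readers who want to see what the r^2 values in Fig 9 measure, here is a minimal sketch of the coefficient of determination for a simple linear fit of gas concentration on water depth. The function name `r_squared` and the numbers in `depth_cm` and `ch4` are hypothetical and chosen for illustration; they are not the measurements behind the figure.

```python
def r_squared(x, y):
    """Coefficient of determination for a simple linear regression of y on x.

    With one predictor this equals the squared Pearson correlation.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)   # spread of x
    syy = sum((yi - mean_y) ** 2 for yi in y)   # spread of y
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)


# Hypothetical, made-up values (NOT the study data).
depth_cm = [10, 15, 20, 25, 30, 40]
ch4 = [9.0, 7.5, 8.0, 5.0, 4.5, 3.0]

r2 = r_squared(depth_cm, ch4)
print(f"r^2 = {r2:.2f}")
```

An r^2 of 0.41 therefore means that about 41% of the variance in the gas concentration is accounted for by a straight-line relationship with water depth. For log-scaled panels, the same calculation would be applied to the log-transformed values.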

Lastly, I examined scree plots to identify inflection points that could guide where to cut off the most important principal components for explaining the variation in the data (Fig 10). In the larch ponds, most of the variation could be explained by the first component (Fig 10b). In the mire ponds, three components explained most of the variation (Fig 10a). In the tundra ponds, however, the cutoff for the most important components is less clear, indicating that no small set of key components explains the variation in the data (Fig 10c).

Fig 10. Scree plots for the principal component analyses of the three sampling locations. Arrows indicate inflection points and potential cutoffs for the most important components in explaining the data. a) Mire ponds, b) Larch ponds, c) Tundra ponds.
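The sketch below shows one way the information in a scree plot can be summarized numerically: the proportion of variance carried by each component, and a simple "elbow" heuristic that picks the component where the curve bends most sharply (the largest second difference between successive eigenvalues). This is an assumption-laden illustration, not the procedure I followed, since I chose the inflection points by eye. The eigenvalues in `eigenvalues` are made up, and `elbow_component` is a hypothetical helper name.

```python
def explained_variance(eigenvalues):
    """Proportion of total variance carried by each principal component."""
    total = sum(eigenvalues)
    return [ev / total for ev in eigenvalues]


def elbow_component(eigenvalues):
    """1-based index of the component where the scree curve bends most.

    Uses the largest second difference of the eigenvalues (sorted in
    decreasing order). Components before the elbow are the ones retained.
    """
    if len(eigenvalues) < 3:
        raise ValueError("need at least three eigenvalues to locate an elbow")
    second_diffs = [
        eigenvalues[i - 1] - 2 * eigenvalues[i] + eigenvalues[i + 1]
        for i in range(1, len(eigenvalues) - 1)
    ]
    # second_diffs[0] belongs to the 2nd component, hence the +2 offset.
    return second_diffs.index(max(second_diffs)) + 2


# Hypothetical, made-up eigenvalues in decreasing order (NOT the study data).
eigenvalues = [4.2, 1.1, 0.8, 0.5, 0.3, 0.1]

proportions = explained_variance(eigenvalues)
elbow = elbow_component(eigenvalues)

running_total = 0.0
for k, p in enumerate(proportions, start=1):
    running_total += p
    print(f"PC{k}: {p:.1%} of variance, {running_total:.1%} cumulative")
print(f"Elbow at PC{elbow}; retain the first {elbow - 1} component(s)")
```

When the eigenvalues decline gradually, as in the tundra ponds, the second differences are all similar in size and a heuristic like this gives no clear cutoff, which matches what the scree plot shows.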