Here we provide statistically significant observational evidence that a feedback control system moderating atmospheric temperature is presently operating coherently at global scale. Further, this control system is of a sophisticated type, involving the corrective feedback not only of a linear error term but also its derivative and its integral. This makes it of the same type as the most widely used control system developed by humans, the proportional-integral-derivative (PID) control system.

In Leggett and Ball (_{2} – showed a high level of statistical significance in correlation to each other when expressed as the level of SOI, the first-difference of temperature and the second difference of CO_{2}.

These series can be demonstrated to relate to each other in the same way as the three terms in the most widely used industrial control system – the PID (proportional–integral–derivative) control system (Åström and Murray,

We demonstrate this as follows. Figure 7 of Leggett and Ball (_{2 -}in first-difference, level, and second-difference form respectively. The equivalent of these terms in the terminology of calculus is derivative, level, and second derivative. Raising each of these terms by one level of integration, we obtain level, integral, and derivative. Expressed in control system terms, these terms become Proportional, Integral, and Derivative (PID).

Based on these findings, we develop a quantitative hypothesis. The hypothesis is that the global temperature time series can be formally characterised as one influenced by a PID control system.

In this paper, we firstly describe the main features of our hypothesis. We then show the results of our tests of its validity.

A control system of PID form influencing global surface temperature has previously been proposed theoretically. Nisbet et al. (

_{2}. Methane too will change, especially as temperature change affects water precipitation in rain and snow.

This paper, then, seeks evidence for the existence and operation of a control system of PID type affecting global-level atmospheric surface temperature. In describing our hypothesis, we first seek to conceptualise the formal control system terms from the perspective of a putative control system for global surface temperature.

Once this is achieved, the control system is characterised using observed time series, and subsequently tested both for the full PID type of control system and also, for comparison, for simpler versions using subsets of the P, I and D terms.

The remainder of this section reviews control systems in general, explains the concept of a global level atmospheric temperature control system, and then considers how to test hypotheses relating to such a global level control system.

Control systems are in widespread everyday use, from the cruise control in a motor vehicle to the thermostat used for building heating/air-conditioning systems, among many other applications. In a control system a

To describe the building thermostat in these terms, the thermostat is the controller, the actual room temperature is the process variable, the desired room temperature is the setpoint, the activation signal to the air conditioner or heater is the controller output, and random heat sources (such as radiant heat from sunshine and warm bodies, or radiant cooling through windows) constitute the disturbances to the process (VanDoren,

The most widely used general-purpose controller uses a control-loop feedback mechanism. In this, it has been found that feeding back the

This information can be translated to the question of a control system for the atmospheric temperature as follows.

A way to determine whether or not an air conditioning system was present in a room without observing any equipment, would be to observe the process variable time series (the temperature time series). This should show signs of the disturbance (an increasing outside temperature) starting to make the room temperature rise and then the various possible elements (P and/or I, and/or D) of the control system acting such that the room temperature did not follow the disturbance but stayed closer to the setpoint. The signature of the effect of these elements should be seen in the time series record of the temperature of the room.

Hence the observer could determine not only whether there was a temperature control system present, but also which type it was likely to be – full PID or a simpler model made up of a subset of P, I or D terms. In time series analysis terms, the time series model selection process can show which of the combinations of P, I and D terms fits best with the temperature time series observed (or that there are no fits).

If there is an air conditioning system influencing the room temperature, this approach can provide strong evidence for its existence, and its type, even though the air conditioning system itself has not been seen.

This approach is in line with Machamer et al. (

If there are features of the process variable that change in ways that can best be explained by the operation of a control mechanism, then this would provide sufficient evidence for the existence of a control system. In other words, we would be seeking evidence for the existence of a mechanism based on system behaviour as a whole.

With this background, and using statistical analysis of time series of global surface temperature observations, the hypothesis explored in this paper is to see if the global atmospheric surface temperature time series can be formally characterised as one influenced by a PID control system. This is carried out, as in the example above, by using time series analysis and its model selection process to determine which, if any, of the combinations of P, I and D control system terms fit best with the observed time series for global atmospheric surface temperature.

Statistical methods used are standard (Greene,

To make it easier to visually assess the relationship between the variables, the data were normalised using statistical

A regression using Z-scored variables provides standardised regression coefficients. These coefficients report how much change a one-standard-deviation change in the independent variable produces in the dependent variable. Although comparisons between these coefficients must be interpreted with care, a standardised coefficient for independent variable

In the time-series analyses, global atmospheric surface temperature is the dependent variable. We tested the relationship between this variable and independent variables derived from (1) the level of atmospheric CO_{2}; (2) its change; and (3) its cumulative sum. We express the change in CO_{2} as its first finite difference (we label this ‘first difference’ in the text). Variability is explored using intra-annual (monthly) data. The period covered in the figures is shorter than that used in the data preparation because of the loss of some data points due to calculations of differences and of moving averages.

Smoothing was used on series incorporating first-difference CO_{2} to the degree needed to reveal significant relationships if any were present. Smoothing is carried out initially by means of a 13-month moving average – this also minimises any remaining seasonal effects. In the following analyses, as for Leggett and Ball (_{2} (D_error) term fits the regression best when smoothed by a 13-month moving average.

Variables are led or lagged relative to one another to achieve best fit. These leads or lags were determined by means of time-lagged correlations (correlograms). The correlograms were calculated by shifting the series back and forth relative to each other, one month at a time.

With this background, the convention used in this paper for unambiguously labelling data series and their treatment after smoothing or leading or lagging is depicted in the following example:

The atmospheric CO_{2} series is transformed into its first difference and smoothed with a 13-month moving average. The resultant series is then _{2}_Z5876. (As in Leggett and Ball (

Note that to assist readability in text involving repeated references, atmospheric CO_{2} is sometimes referred to simply as CO_{2} and global surface temperature as temperature.

Time series models differ from ordinary regression models in that the results are in a sequence. Hence, the dependent variable is influenced not only by the independent variables, but also by prior values of the dependent variable itself. This is termed autocorrelation between measured values. This serial nature of the measurements must be addressed by careful examination of the lag structure of the model. This type of ordinary least squares regression is termed ‘time series analysis’ (Greene,

A further issue in time series analysis concerns what is termed the ‘order of integration’ of each of the series used. Greene (

The usual tests for stationarity (for example, the ADF test) can allow for the presence of a linear deterministic trend in the time-series in question, but they cannot allow for the possibility of a polynomial trend. It will be shown that this possibility needs to be tested for in this study.

There is evidence that if a unit root test that allows for only a linear trend is applied mistakenly to a series with a polynomial trend, then the test may have extremely low power. For example, see the results of Harvey et al. (

This problem may be resolved by using the Lagrange multiplier (LM) unit root test(s) proposed by Schmidt and Phillips (

The Schmidt-Phillips test is available in the urca package in the R statistical software (R Development Core Team,

Next, a model must be established in which any autocorrelation in the short-run relationship, if present, is fully accounted for by use of an optimal lag structure. In this study, this is done within the modelling process by reference to the adjusted coefficient of determination, R^{2}, and to the Akaike Information Criterion.

The adjusted R^{2} is used because there is a problem that the unadjusted R^{2} cannot fall when, in multiple regression, independent variables are added to a model so there is a built-in tendency to add more independent variables to the model (Greene, ^{2} is a fit measure that penalises the loss of degrees of freedom that result from adding variables to the model. There is, however, some question about whether the penalty is sufficiently large to ensure that the criterion will necessarily lead the analyst to the correct model as the sample size increases (Greene,

Pesaran et al. (

To avoid over-parameterisation, we seek the model with the fewest lags that produces a model with no autocorrelation (out to 36 lags).

The specific information sought from each well-specified ARDL model is the effect on the dependent variable of each of the potential driving variables, the relative percentage of the total driving task that each achieves, and the degree of statistical significance of each.

The ARDL econometric model used here was that implemented in the Eviews 9.5 package (IHIS Eviews, 2017). The time series statistical software package Gnu Regression, Econometrics and Time-series Library (GRETL,

Intrinsic to its design, a control system which involves some or all of proportional, integral and derivative feedback terms creates a problem for its analysis by means of ordinary least squares (OLS) econometric time-series methods. The problem is that one independent variable, the proportional (linear) error (P error) variable (albeit displaying by design the opposite sign) will be exactly collinear with another variable – the Disturbance variable.

On collinearity (interdependency), Farrar and Glauber (1964) write:

The single equation, least squares regression model is not well equipped to cope with interdependent explanatory variables. In its original and most simple form the problem is not even conceived. Values of X are presumed to be the pre-selected controlled elements of a classical, laboratory experiment. Least squares models are not limited, however, to simple, fixed variate – or fully controlled – experimental situations. (Other situations) may provide the data on which perfectly legitimate regression analyses are based.

Hastie et al. (

The Gauss–Markov Theorem: One of the most famous results in statistics asserts that the least squares estimates of the parameters

On a problem arising from OLS based on the Gauss–Markov theorem, Greene (

…the Gauss–Markov theorem states that among all linear unbiased estimators, the least squares estimator has the smallest variance. Although this result is useful, it does not assure us that the least squares estimator has a small variance in any absolute sense. Consider, for example, a model that contains two explanatory variables and a constant. … If the two variables are perfectly correlated, then the variance is infinite.

The problem is that variance being infinite means that standard OLS packages will not run.

Looking to solutions for this problem, Hastie et al. (

The Gauss-Markov theorem implies that the least squares estimator has the smallest mean squared error of all linear estimators with no bias. However, there may well exist a biased estimator with smaller mean squared error. Such an estimator would trade a little bias for a larger reduction in variance.

…most models are distortions of the truth, and hence are biased; picking the right model amounts to creating the right balance between bias and variance.

Biased estimates are commonly used. Any method that shrinks or sets to zero some of the least squares coefficients may result in a biased estimate. (These include) variable subset selection and ridge regression…

On variable subset selection, Greene (

Ridge regression takes another approach to reducing a variance that would otherwise be infinite, by introducing bias in a controlled way into the regression. This enables the variables in question to stop displaying infinite variance, and so the assessment of the performance of a model including the full set of independent variables is possible.

Ridge regression (Hoerl,

For global surface temperature, we used the Hadley Centre–Climate Research Unit combined Landsat and SST surface temperature series (HadCRUT) version 4.6.0.0 (Morice et al., _{2} data from 1958 to the present, the US Department of Commerce National Oceanic and Atmospheric Administration Earth System Research Laboratory Global Monitoring Division Mauna Loa, Hawaii, annual CO_{2} series is used (Keeling et al., _{2}’.

In the analyses, monthly data is used. The period covered in the figures is sometimes shorter than that used in the data preparation because of the loss of some data points due to calculations of differences and of moving averages.

We note that to assist readability in text involving repeated references, atmospheric CO_{2} is sometimes referred to simply as ‘CO_{2}‘ and global surface temperature as ‘temperature’. The period covered by the time series used in each table or figure is given in the title of the table or figure.

In the tables of results, statistical significance at the 10%, 5% and 1% levels is indicated by the symbols *, ** and ***, respectively.

In _{2} (IPCC, _{2} to represent the control system term

Terms from climate science matched to control system terms.

_{2}( CO

_{2})

_{2}

_{2}

_{2}reduces simply to minus Z_ CO

_{2}

_{2}

_{2}

_{2}

_{2}

After Åström and Murray (

As stated above, feedback control systems derive an error term or terms which are fed back as reactions to disturbances to the system, with the aim of keeping the system at its previously determined setpoint (Åström and Murray,

We therefore express this relationship in terms of an equation as:

With regard to controller terminology, this is equivalent to:

For simplicity in what follows we will use the expression in

If the temperature control system does not reach its setpoint temperature, this can be expressed as:

Specifically regarding

However it can be argued that the complexity of having I and D controller outputs in ‘real world’ control systems, such as industrial control systems, is entirely because real world control is not as simple as setpoint being equal to Disturbance minus P_controller_output.

What we therefore use as the outcome variable is not the setpoint temperature, but

Because in this instance the setpoint equals zero due to Z-scoring, the error term simply becomes zero minus disturbance, which is equal to ‘minus disturbance’.

The I and D error terms are derived from the P error term as the cumulative sum of the disturbance series and the first difference of the disturbance series respectively.

The black curve in _{2}. We do not know the actual setpoint of the putative control system. However, the level of CO_{2} (disturbance) curve is steadily rising (that is, the level of disturbance is increasing). Hence, any point towards the start of the data series must be closer to the level of the temperature setpoint. We therefore choose a value for the setpoint (i) earlier in the time period, and (ii) as the average of a period over which the temperature was roughly level. This is taken to be the period from the start of data in November 1958 to December 1976. All curves in the figure are therefore Z-scored using this period as the base period. This enables us to see any points of departure thereafter. If there is no change in relationships between the curves, their trajectories will remain common after 1976, just as they were before. The setpoint is shown as the purple line. The observed temperature is also plotted (red curve).

Monthly data, Z-score base period November1958-December 1976 (period to left of vertical black line). Putative control system model for global surface temperature, selected elements: Disturbance (level of atmospheric CO_{2}) (black curve); control system setpoint for temperature (purple line); observed temperature (red curve).

From

Monthly data, Z-score base period November 1958 - December 1976 (period to left of vertical black line). Putative control system model for global surface temperature, full set of elements: Disturbance (level of atmospheric CO_{2}) (black curve); control system setpoint for temperature (purple line); observed temperature (red curve); level of control system error (P_error) (green curve); integral (cumulative sum) of control system error (I_error) (orange curve); derivative (first difference) of control system curve (D_error) (brown curve).

As expected,

We now turn to the time series analysis of the relationship of global surface temperature to this putative control system and its proportional, integral and derivative feedback components.

First we assess the order of integration of the series used.

For each series,

Results of ADF tests for stationarity, allowing for both drift and trend.

^{th}-order

^{rd}-order

As all series are stationary, OLS can be used for regression analysis, with ARIMA dynamic regression analysis used for any autocorrelation detected.

Both ridge regression and ARDL dynamic regression results are provided.

Putative PID control system: ridge regression results.

Penalty = 0.57.

Resid SD = 0.44516.

Eviews ARDL estimation output for period November 1958 to February 2018 for temperature as a function of putative control system P_error, I_error and D_error: short-run model dynamic relationship.

Eviews ARDL estimation output for period November 1958 to February 2018 for temperature as a function of putative control system P_error, I_error and D_error: short-run model independent variables.

All three P, I and D terms display coefficients of the same order of magnitude in the ridge regression.

To gain information on statistical significance, we now turn to enabling standard OLS by means of the variable subset selection method. The collinear variable deleted to produce the subset is Disturbance. Dynamic OLS regression is then conducted using the ARDL package.

Although the significance of the I–error term is only at the 0.1 level, all three P, I and D terms are significant in the model.

It is still possible, however, that other control system models using only one of P, I, or D, or PI or PD terms might display even better fits. The same assessment process as above was carried out for these models and the results are shown in

Fit with observed temperature of global surface temperature predicted from dynamic regression models involving various combinations of control system error terms. The PID model shows the best fit (highest Coefficient of Determination (R-squared) and lowest Akaike Information Criterion).

Fit with observed temperature of global surface temperature predicted from dynamic regression models involving various combinations of control system error terms.

To see the effect of the control system over time, we now return to the series that was Z-scored over the earlier part of the model (1958 to 1976) as in

These results are depicted in

Monthly data, Z-score base period November 1958 - December 1976 (period to left of vertical black line). Predicted temperature from dynamic regression model of a control system for global surface temperature using P, I and D error terms (green curve); Disturbance to temperature (level of atmospheric CO_{2}) (black curve); observed temperature (red curve).

The figure shows that there is a close match between the temperature expected from the control system model (green curve) and the observed temperature (red curve). This is the case especially when compared to an alternative falsifying null hypothesis of ‘no control system’, which would lead to an expectation that the disturbance driver (black curve) would be closely followed.

As background to this discussion, we have made the case that, because we are investigating a putative control system, we cannot know (i) its purpose, and therefore (ii) its setpoint.

We did, however, have a putative dimension to be controlled (global surface temperature) and a disturbance (CO_{2} levels trending higher than their previous longer-term average).

Given that we did not know the setpoint, we proposed that the variable to use for the outcome of the putative control system was best couched in terms of the

Similarly, we proposed that it is reasonable to consider that the temperature observed early in the time series when the disturbance was of lower magnitude to be nearer to the setpoint. As a result we used that observed temperature to represent the setpoint.

We consider the results from the investigation, including these concepts and using observed data at global scale, to show precisely those signs of the operation of the P, I, and D dimensions, revealed as highly statistically significant relationships of PID form. We believe that this result is both comprehensive and suggestive (and perhaps even diagnostic) of the existence of a control system presently operating at global scale on global surface temperature.

The question arises as to a candidate for this planet-scale control system mechanism.

We do not intend to attempt to rule out any candidate, but simply to note some of the factors which may pertain to further research in order to narrow down such candidates.

Our initial point of reference is that all control systems use negative feedback (Åström and Murray,

To start with, one can distinguish two temporal domains in the global carbon cycle (IPCC,

A second, slow domain consists of the extensive carbon stores in rocks and sediments which exchange carbon with the fast domain through volcanic emissions of CO_{2}, chemical weathering, erosion and sediment formation on the sea floor. Turnover times of the (mainly geological) reservoirs of the slow domain are typically 10,000 years or longer.

The degree of match of negative feedback candidates to our control system is primarily based on the estimated capacity of each candidate to match the large interannually changing temporal dynamics of our model.

In the assessment, the negative feedbacks are also classified as abiotic or biotic.

Sources of negative feedback that lead to a reduction in global atmospheric surface temperature, listed in order of capacity to generate large interannually changing temporal dynamics, are as follows (Beerling and Berner,

Abiotic rock weathering. This is the removal of CO_{2} by the weathering of silicate and carbonate minerals (Berner et al.,

Biotic (plant-accelerated) rock weathering. Accelerated breakdown (by factors of 2 to 10) of nutrient-containing silicate minerals (Lenton et al.,

Ocean (via CO_{2} uptake from atmosphere and as sink of heat from atmosphere) (IPCC,

Increased emission of energy from Earth into space through long wave radiation as surface temperature increases (sometimes also referred to as blackbody radiation feedback) (IPCC,

Clouds: via (i) abiotic evaporation from the ocean and (ii) evapotranspiration (which is dominated by plants) from land (IPCC,

Land biosphere (displays a range of means of reducing temperature either by reducing atmospheric CO_{2} through CO_{2} uptake or by reducing temperature directly). According to Prentice et al. (_{2} uptake is a biological process involving the chemical transformation of CO_{2} into organic compounds by means of photosynthesis. Increasing CO_{2} concentration enhances primary production (an effect known as CO_{2} fertilisation). The build-up of decomposing soil organic matter also keeps CO_{2} from the atmosphere. Plants also use water more efficiently at higher CO_{2} concentrations, because their stomata tend to close, resulting in reduced water loss per unit leaf area and further photosynthesis enhancement.

Other, smaller, negative feedbacks involving the biosphere are: increasing emissions of biogenic volatile organic compounds (BVOCs) from plants (these would tend to increase the production of secondary organic aerosols (SOA), which have a net cooling effect); changes in marine dimethyl sulfide (DMS) emissions (DMS is a precursor of cloud condensation nuclei, with a net cooling effect that could act as a negative feedback in the climate system) (Prentice et al.,

We now consider the capacity of these negative feedbacks to match the temporal dynamics of our model. In considering these dynamics, we stress that these do not include seasonal effects: these have been filtered out in the preparation of the data (see Methods).

Because both biotic and abiotic rock weathering are slow domain, they would lack temporal dynamics rapid enough to correlate with our findings. Rock weathering would therefore seem to complement our findings, rather than be alternative explanations for them.

The remaining abiotic negative feedbacks are: heat and CO_{2} sink attributes of the ocean; increased emission of energy from Earth into space through long wave radiation; and cooling from increased cloud cover. Cloud cooling, in its abiotic form, can come about in direct response to temperature through an increase in surface temperature which increases evaporation, leading to an increase cloud cover, hence reflecting sunlight back to space, and thereby lessening the effect of global warming. The feedbacks of increased long wave radiation, ocean effects, and cloud cooling in its abiotic form are all fast domain – at the same temporal scale as our results. The six biotic negative feedback sources are also fast domain, and hence at the same temporal scale as our results.

That said, all of these negative feedbacks come about in direct response to temperature. Such negative feedbacks, while not being the P term of a control system, would resemble one. This situation, however, would leave unexplained patterns which look like the I and D terms of a control system such as that for which we present evidence.

So it would seem that our I and D findings suggest that a PID control system could exist even given the fast domain negative feedbacks listed above, and that such a PID control system could exist alongside these negative feedbacks.

We now turn from negative feedback processes to control systems. While little has been written on the subject, a global-level control system that was abiotic could exist.

With regard to potential biotic control systems, we note at the outset that the six biotic negative feedback sources could be part of separate control systems, or a single, joint system.

The most prominent global-level control system that has been proposed is that embodied in the Gaia hypothesis (Lovelock and Margulis,

A more recent version of the hypothesis cited favourably by Wilkinson (

As mentioned above, Machamer et al. (

That said, how do our results compare with the above description of the activities aspect of Gaia? The current level of atmospheric CO_{2} is consistently seen as a threat to the biosphere (Díaz et al. _{2} has been achieved by our control system. That said,

The evidence that we provide for a control system presently operating on global surface temperature, then, could be considered to be consistent with the definition of the control system activities aspect of Gaia above, modified as: ‘regulating climate among other factors at a

The provision of evidence for or against the biosphere being the source of the control system is beyond the scope of this study. Suffice it to say here that the control system for which we provide evidence carries out exactly the task that Gaia is hypothesised to carry out. Further research is required on this question.

In terms of such research, the processing of information required for a control system displaying marked proportional, integral and derivative dynamics at monthly level is quite within the capability of plants. Spitzer and Sejnowski (

We now turn to the point that there are now available two separate sets of results (from a previous study and this study) indicating complex and coherent mechanisms involving integrals and derivatives acting on temperature at planetary level. These results are (i) as mentioned in the Introduction the level of SOI, the first-difference of temperature and the second difference of CO_{2} all being highly statistically significantly correlated with each other (Leggett and Ball,

With this background, it is a plausible topic for further research that the individual capacities of plants scaled up coherently to global level could be producing the two separate coherent CO_{2}-related results that we report.

Evidence exists of complex adaptive behaviour at whole-of-plant-community level being mediated via inter-plant communication. This is done through mycorrhizal networks (for review, see Gorzelak et al.,

The hierarchical integration of this phenomenon with other biological networks at broader scales in forest ecosystems, and the consequences we have observed when it is interrupted, indicate that underground ‘tree talk’ is a foundational process in the complex adaptive nature of forest ecosystems.

Hence it is possible that the individual capacities of plants could be aggregated coherently to global level.

Within this research area, one question could be whether the integral and derivative effect is some sort of echo of the control system (that is, a separate manifestation of the same thing). Each effect would appear to be created either by one large entity, or the bringing together of the actions of multiple entities in a way which is itself coherent. That could be either from an overall organiser or by the phenomenon of self-organisation (Heylighen,

LMWL conceived the control system idea. LMWL and DB developed the theory and performed the computations. Both authors collaborated on interpreting the results and contributed to the final manuscript.

The authors would like to acknowledge with appreciation the support and advice of J. Gordon and C. Dawson. We thank two anonymous reviewers whose comments have greatly improved this paper.

No potential conflict of interest was reported by the authors.

_{2}

_{2}to global surface temperature and the El Niño–Southern Oscillation, and a candidate mechanism in global photosynthesis