8 Causation
Altman Abstract
Correlation implies association, but not causation. Conversely, causation implies association, but not correlation.
Associations can arise between variables in the presence (i.e., X causes Y) and absence (i.e., they have a common cause) of a causal relationship.
In everyday language, dependence, association and correlation are used interchangeably. Technically, however, association is synonymous with dependence and is different from correlation.
Altman (2015) Association Correlation Causation (pdf)
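The difference between dependence and correlation can be checked numerically. The following is a minimal sketch, assuming NumPy and a made-up example that is not taken from Altman: X is standard normal and Y = X², so Y is fully determined by X, yet the Pearson correlation between them is close to zero.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, an illustrative choice
x = rng.standard_normal(100_000)
y = x ** 2                       # y is a deterministic function of x

# Pearson correlation only measures linear association.
r_linear = np.corrcoef(x, y)[0, 1]

# Correlating y with |x| exposes the dependence that r_linear misses.
r_abs = np.corrcoef(np.abs(x), y)[0, 1]

print(f"corr(x, y)   = {r_linear:.3f}")
print(f"corr(|x|, y) = {r_abs:.3f}")
```

Here X causes Y by construction, so the example also shows how a causal relationship can leave no trace in the correlation coefficient.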
8.1 Liang Causality
Information Flow-Based Causality
Liang (2016) Abstract
Information flow, or information transfer, the widely applicable general physics notion, can be rigorously derived from first principles, rather than axiomatically proposed as an ansatz. Its logical association with causality is firmly rooted in the dynamical system that lies beneath. The principle of nil causality, which reads that an event is not causal to another if the evolution of the latter is independent of the former, and which transfer entropy analysis and the Granger causality test fail to verify in many situations, turns out to be a proven theorem here.
Established in this study are the information flows among the components of time-discrete mappings and time-continuous dynamical systems, both deterministic and stochastic. They have been obtained explicitly in closed form, and put to applications with the benchmark systems such as the Kaplan-Yorke map, Rössler system, baker transformation, Hénon map, and stochastic potential flow.
Besides unraveling the causal relations as expected from the respective systems, some of the applications show that the information flow structure underlying a complex trajectory pattern could be tractable.
For linear systems, the resulting remarkably concise formula asserts analytically that causation implies correlation, while correlation does not imply causation, providing a mathematical basis for the long-standing philosophical debate over causation versus correlation.
Liang (2016) Information flow and causality as rigorous notions ab initio (pdf)
Liang (2021) Abstract
Causality analysis is an important problem lying at the heart of science, and is of particular importance in data science and machine learning. An endeavor during the past 16 years viewing causality as a real physical notion so as to formulate it from first principles, however, seems to have gone unnoticed. This study introduces to the community this line of work, with a long-due generalization of the information flow-based bivariate time series causal inference to multivariate series, based on the recent advance in theoretical development. The resulting formula is transparent, and can be implemented as a computationally very efficient algorithm for application. It can be normalized and tested for statistical significance. Different from the previous work along this line where only information flows are estimated, here an algorithm is also implemented to quantify the influence of a unit to itself. While this forms a challenge in some causal inferences, here it comes naturally, and hence the identification of self-loops in a causal graph is fulfilled automatically as the causalities along edges are inferred. To demonstrate the power of the approach, presented here are two applications in extreme situations. The first is a network of multivariate processes buried in heavy noises (with the noise-to-signal ratio exceeding 100), and the second a network with nearly synchronized chaotic oscillators. In both graphs, confounding processes exist. While it seems to be a challenge to reconstruct from given series these causal graphs, an easy application of the algorithm immediately reveals the desideratum. Particularly, the confounding processes have been accurately differentiated. Considering the surge of interest in the community, this study is very timely.
Liang (2021) Normalized Multivariate Time Series Causality Analysis and Causal Graph Reconstruction (pdf)
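How a multivariate estimate can separate a confounder from a direct cause may be sketched as follows. This is a simplified illustration, not the algorithm of Liang (2021): it assumes that, for a linear model, the flow from series j to series i can be estimated as \(\hat a_{ij} C_{ij} / C_{ii}\), where \(\hat a_{ij}\) is the least-squares coefficient of \(x_j\) when the forward difference of \(x_i\) is regressed on all series and \(C\) is the sample covariance matrix. Normalization, significance testing and the self-influence term are left out, and the function name `information_flow_matrix` and the test system are hypothetical.

```python
import numpy as np

def information_flow_matrix(series, dt=1.0):
    """Return T where T[i, j] estimates the information flow from j to i.

    Assumed form: T[i, j] = a_ij * C_ij / C_ii (see the text above).
    """
    x = np.asarray(series, dtype=float)
    x_now = x[:-1]
    dx = (x[1:] - x[:-1]) / dt               # Euler forward differences
    xc = x_now - x_now.mean(axis=0)
    dxc = dx - dx.mean(axis=0)
    n = len(x_now) - 1
    cov = xc.T @ xc / n                      # C[i, j] = cov(x_i, x_j)
    cov_x_dx = xc.T @ dxc / n                # [k, i] = cov(x_k, dx_i)
    coef = np.linalg.solve(cov, cov_x_dx).T  # [i, j] = coefficient of x_j in dx_i
    flow = coef * cov / np.diag(cov)[:, None]
    np.fill_diagonal(flow, np.nan)           # self-influence is not handled here
    return flow

# Hypothetical system: series 2 drives both series 0 and series 1,
# and there is no direct link between series 0 and series 1.
rng = np.random.default_rng(1)
n_steps = 20_000
x = np.zeros((n_steps, 3))
noise = rng.standard_normal((n_steps, 3))
for t in range(n_steps - 1):
    x[t + 1, 2] = 0.7 * x[t, 2] + noise[t, 2]
    x[t + 1, 0] = 0.5 * x[t, 0] + 0.5 * x[t, 2] + noise[t, 0]
    x[t + 1, 1] = 0.5 * x[t, 1] + 0.5 * x[t, 2] + noise[t, 1]

flow_pair = information_flow_matrix(x[:, :2])  # confounder ignored
flow_all = information_flow_matrix(x)          # confounder included

print("1 -> 0, bivariate:   ", round(flow_pair[0, 1], 3))
print("1 -> 0, multivariate:", round(flow_all[0, 1], 3))
print("2 -> 0, multivariate:", round(flow_all[0, 2], 3))
```

When only series 0 and 1 are analysed, the common driver shows up as an apparent flow between them; once series 2 is included, that estimate shrinks towards zero while the flow from series 2 remains.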
Liang (2014) Abstract
Given two time series, can one faithfully tell, in a rigorous and quantitative way, the cause and effect between them? Based on a recently rigorized physical notion, namely, information flow, we solve an inverse problem and give this important and challenging question, which is of interest in a wide variety of disciplines, a positive answer. Here causality is measured by the time rate of information flowing from one series to the other. The resulting formula is tight in form, involving only commonly used statistics, namely, sample covariances; an immediate corollary is that causation implies correlation, but correlation does not imply causation. It has been validated with touchstone linear and nonlinear series, purportedly generated with one-way causality that evades the traditional approaches. It has also been applied successfully to the investigation of real-world problems; an example presented here is the cause-and-effect relation between the two climate modes, El Niño and the Indian Ocean Dipole (IOD), which have been linked to hazards in far-flung regions of the globe. In general, the two modes are mutually causal, but the causality is asymmetric: El Niño tends to stabilize IOD, while IOD functions to make El Niño more uncertain. To El Niño, the information flowing from IOD manifests itself as a propagation of uncertainty from the Indian Ocean.
Liang (2014) Unraveling the cause-effect relation between time series (pdf)
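The formula referred to in the abstract is usually quoted as \(T_{2\to1} = \frac{C_{11}C_{12}C_{2,d1} - C_{12}^2 C_{1,d1}}{C_{11}^2 C_{22} - C_{11}C_{12}^2}\), where \(C_{ij}\) is the sample covariance of \(X_i\) and \(X_j\), and \(C_{i,d1}\) is the sample covariance of \(X_i\) with the Euler forward difference of \(X_1\). The sketch below implements that expression under the assumption that it is transcribed correctly; the function name `liang_flow`, the unit time step and the test system are illustrative choices, not taken from the paper.

```python
import numpy as np

def liang_flow(target, source, dt=1.0):
    """Estimate the information flow from `source` to `target` (T_{2->1})."""
    x1 = np.asarray(target, dtype=float)
    x2 = np.asarray(source, dtype=float)
    dx1 = (x1[1:] - x1[:-1]) / dt            # Euler forward difference of the target
    c = np.cov(np.vstack([x1[:-1], x2[:-1], dx1]))
    c11, c12, c22 = c[0, 0], c[0, 1], c[1, 1]
    c1d1, c2d1 = c[0, 2], c[1, 2]
    return (c11 * c12 * c2d1 - c12**2 * c1d1) / (c11**2 * c22 - c11 * c12**2)

# Hypothetical one-way system: x2 evolves on its own and drives x1.
rng = np.random.default_rng(0)
n_steps = 20_000
x1 = np.zeros(n_steps)
x2 = np.zeros(n_steps)
e1 = rng.standard_normal(n_steps)
e2 = rng.standard_normal(n_steps)
for t in range(n_steps - 1):
    x2[t + 1] = 0.7 * x2[t] + e2[t]
    x1[t + 1] = 0.5 * x1[t] + 0.5 * x2[t] + e1[t]

flow_2_to_1 = liang_flow(x1, x2)
flow_1_to_2 = liang_flow(x2, x1)
print(f"T(2->1) = {flow_2_to_1:.3f}, T(1->2) = {flow_1_to_2:.3f}")
```

Because \(C_{12}\) is a factor of the numerator, a vanishing covariance forces a vanishing flow, which is the "causation implies correlation" corollary; the two series in the example are clearly correlated, yet the estimated flow from \(X_1\) to \(X_2\) stays near zero.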
Hagan (2019) Abstract
The interaction between the land surface and the atmosphere is of significant importance in the climate system because it is a key driver of the exchanges of energy and water. Several important relations to heat waves, floods, and droughts exist that are based on the interaction of soil moisture and, for instance, air temperature and humidity. Our ability to separate the elements of this coupling, identify the exact locations where they are strongest, and quantify their strengths is, therefore, of paramount importance to their predictability. A recent rigorous causality formalism based on the Liang–Kleeman (LK) information flow theory has been shown, both theoretically and in real-world applications, to have the necessary asymmetry to infer the directionality and magnitude within geophysical interactions. However, the formalism assumes stationarity in time, whereas the interactions within the land surface and atmosphere are generally nonstationary; furthermore, it requires a sufficiently long time series to ensure statistical sufficiency. In this study, we remedy this difficulty by using the square root Kalman filter to estimate the causality based on the LK formalism to derive a time-varying form. Results show that the new formalism has similar properties compared to its time-invariant form. It is shown that it is also able to capture the time-varying causality structure within soil moisture–air temperature coupling. An advantage is that it does not require very long time series to make an accurate estimation. Applying a wavelet transform to the results also reveals the full range of temporal scales of the interactions.
Hagan (2019) Causality Formalism (pdf)
(See also rclm/CO2-lag: Stips)
8.2 Causation in Chaotic Dynamic Systems
Palus Abstract
Using several methods for detection of causality in time series we show in a numerical study that coupled chaotic dynamical systems violate the first principle of Granger causality that the cause precedes the effect. While such a violation can be observed in formal applications of time series analysis methods, it cannot occur in nature, due to the relation between entropy production and temporal irreversibility. The obtained knowledge, however, can help to understand the type of causal relations observed in experimental data, namely can help to distinguish linear transfer of time-delayed signals from nonlinear interactions. We illustrate these findings in causality detected in experimental time series from the climate system and mammalian cardio-respiratory interactions.
Palus Memo
Chaotic dynamical systems are mathematical models reflecting very complicated behaviour. Recently, cooperative phenomena have been observed in coupled chaotic systems due to their ability to synchronize. On the way to synchronization, the question which system influences other systems emerges. To answer this question, researchers successfully applied the Granger causality methods. In this study we demonstrate that chaotic dynamical systems do not respect the principle of the effect following the cause. We explain, however, that such principle violation cannot occur in nature, only in mathematical models which, on the other hand, can help us to understand the mechanisms behind the experimentally observed causalities.
Probably the first approach to describe causality in measurable, mathematically expressible terms can be traced to the 1950s work of the father of cybernetics, Norbert Wiener [1], who wrote: "For two simultaneously measured signals, if we can predict the first signal better by using the past information from the second one than by using the information without it, then we call the second signal causal to the first one." Later, this concept was introduced into time series analysis by C. W. J. Granger, the 2003 Nobel prize winner in economics. In his Nobel lecture [2] he recalled the inspiration by Wiener's work and identified two components of the statement about causality:
The cause occurs before the effect; and
The cause contains information about the effect that is unique, and is in no other variable.
According to Granger, a consequence of these statements is that the causal variable can help to forecast the effect variable after other data has been first used [2]. This restricted sense of causality, referred to as Granger causality (GC hereafter), characterizes the extent to which a process \(X_t\) is leading another process \(Y_t\), and builds upon the notion of incremental predictability. It is said that the process \(X_t\) Granger causes process \(Y_t\) if future values of \(Y_t\) can be better predicted using the past values of \(X_t\) and \(Y_t\) rather than only past values of \(Y_t\).
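The incremental-predictability idea can be sketched with two nested linear autoregressions: one predicting \(Y_t\) from its own past only, and one that also uses the past of \(X_t\). The code below is a minimal linear illustration using NumPy least squares, not one of the methods used by Palus; the function name `granger_f_stat`, the lag of one step and the simulated system are assumptions made for the example.

```python
import numpy as np

def granger_f_stat(x, y, lag=1):
    """F statistic for the hypothesis that past x adds nothing to predicting y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    target = y[lag:]
    # Column k holds the series delayed by k steps, aligned with `target`.
    past_y = np.column_stack([y[lag - k:n - k] for k in range(1, lag + 1)])
    past_x = np.column_stack([x[lag - k:n - k] for k in range(1, lag + 1)])
    ones = np.ones((n - lag, 1))
    restricted = np.hstack([ones, past_y])        # past of y only
    full = np.hstack([ones, past_y, past_x])      # past of y and x

    def residual_sum_of_squares(design):
        beta, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ beta
        return float(resid @ resid)

    rss_restricted = residual_sum_of_squares(restricted)
    rss_full = residual_sum_of_squares(full)
    dof = len(target) - full.shape[1]
    return ((rss_restricted - rss_full) / lag) / (rss_full / dof)

# Hypothetical system in which x drives y with a one-step delay.
rng = np.random.default_rng(0)
n_steps = 2_000
x = np.zeros(n_steps)
y = np.zeros(n_steps)
for t in range(n_steps - 1):
    x[t + 1] = 0.7 * x[t] + rng.standard_normal()
    y[t + 1] = 0.5 * y[t] + 0.5 * x[t] + rng.standard_normal()

f_x_to_y = granger_f_stat(x, y)
f_y_to_x = granger_f_stat(y, x)
print(f"F(x -> y) = {f_x_to_y:.1f}, F(y -> x) = {f_y_to_x:.1f}")
```

A large F statistic means that adding the past of the candidate cause noticeably reduces the prediction error; in practice it would be compared with an F distribution with `lag` and `dof` degrees of freedom.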
Due to possible nonlinear dependence in time series from real-world processes, many authors have proposed various nonlinear generalizations of the GC principle.
In the following we will particularly discuss the generalization of GC based on probability functionals from information theory. The information-theoretic functionals, in their general formulation, are applicable to a broad range of nonlinear processes, however, we will focus on time series generated by nonlinear, possibly chaotic dynamical systems. The observation that the chaotic dynamical systems generate information had led to an interesting and fruitful symbiosis of ergodic theory of dynamical systems and information theory.
that chaotic systems are not reversible in time. Therefore the observed violation of the causality principle can occur only in a numerical study but not in real-world systems. The time reversal in causality analysis can help to distinguish between a linear transfer of a time-delayed signal and nonlinear interactions of dynamical systems. Any detection of causality, however, should be accompanied by a battery of time series analysis methods, namely tests for nonlinearity and synchronization should be performed, as well as standard spectral analysis enhanced by time-frequency analysis since causal links can occur in or between different time scales of multiscale processes.
Palus (2018) Causality, dynamical systems and the arrow of time (pdf)
8.3 Causal Inference
Cunningham
Causal inference encompasses the tools that allow social scientists to determine what causes what. In a messy world, causal inference is what helps establish the causes and effects of the actions being studied—for example, the impact (or lack thereof) of increases in the minimum wage on employment, the effects of early childhood education on incarceration later in life, or the influence on economic growth of introducing malaria nets in developing regions. Scott Cunningham introduces students and practitioners to the methods necessary to arrive at meaningful answers to the questions of causation, using a range of modeling techniques and coding instructions for both the R and the Stata programming languages.
A readable introductory book with programming examples, data, and detailed exposition.
Weirdly enough, sometimes there are causal relationships between two things and yet no observable correlation.
The potential outcomes model: A correlation, in order to be a measure of a causal effect, must be based on a choice that was made independent of the potential outcomes under consideration. Yet if the person is making some choice based on what she thinks is best, then it necessarily is based on potential outcomes, and the correlation does not remotely satisfy the conditions we need in order to say it is causal. To put it as bluntly as I can, economic theory says choices are endogenous, and therefore since they are, the correlations between those choices and outcomes in the aggregate will rarely, if ever, represent a causal effect.
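The argument can be illustrated with a small simulation in which both potential outcomes are known for every unit. This is a sketch under made-up assumptions (normally distributed outcomes, an average treatment effect of 1, and units that choose treatment exactly when it benefits them) and is not an example from the book.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
y0 = rng.standard_normal(n)              # potential outcome without treatment
effect = 1.0 + rng.standard_normal(n)    # unit-level treatment effect
y1 = y0 + effect                         # potential outcome with treatment
ate = effect.mean()                      # average treatment effect in this sample

def difference_in_means(treated):
    """Compare observed outcomes of treated and untreated units."""
    observed = np.where(treated, y1, y0)
    return observed[treated].mean() - observed[~treated].mean()

chosen = y1 > y0                  # endogenous choice based on potential outcomes
randomized = rng.random(n) < 0.5  # assignment independent of potential outcomes

naive_chosen = difference_in_means(chosen)
naive_randomized = difference_in_means(randomized)
print(f"ATE                 = {ate:.3f}")
print(f"self-selected units = {naive_chosen:.3f}")
print(f"randomized units    = {naive_randomized:.3f}")
```

Under self-selection the simple comparison overstates the average effect, because the units that take the treatment are exactly those with the largest gains; under random assignment the same comparison recovers it.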
A generically important contribution to our understanding of causal inference is the notion of comparative statics. Comparative statics are theoretical descriptions of causal effects contained within the model. These kinds of comparative statics are always based on the idea of ceteris paribus—or “all else constant.” When we are trying to describe the causal effect of some intervention, for instance, we are always assuming that the other relevant variables in the model are not changing. If they were changing, then they would be correlated with the variable of interest and it would confound our estimation.
Cunningham (2021) Causal Inference - The Mixtape
8.3.1 Causal Inference with Spatio-Temporal Data
Papadogeorgou Abstract
Many causal processes have spatial and temporal dimensions. Yet the classic causal inference framework is not directly applicable when the treatment and outcome variables are generated by spatio-temporal point processes. We extend the potential outcomes framework to these settings by formulating the treatment point process as a stochastic intervention. Our causal estimands include the expected number of outcome events in a specified area under a particular stochastic treatment assignment strategy. Our methodology allows for arbitrary patterns of spatial spillover and temporal carryover effects. Using martingale theory, we show that the proposed estimator is consistent and asymptotically normal as the number of time periods increases. We propose a sensitivity analysis for the possible existence of unmeasured confounders, and extend it to the Hájek estimator. Simulation studies are conducted to examine the estimators’ finite sample performance. Finally, we illustrate the proposed methods by estimating the effects of American airstrikes on insurgent violence in Iraq from February 2007 to July 2008. Our analysis suggests that increasing the average number of daily airstrikes for up to one month may result in more insurgent attacks. We also find some evidence that airstrikes can displace attacks from Baghdad to new locations up to 400 kilometers away.
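The idea of evaluating a stochastic intervention by reweighting observed data can be sketched in a much simpler setting than the paper's. The example below is a non-spatial toy with a binary treatment and a known propensity score, not the authors' point-process estimator: each unit is weighted by the ratio of the treatment probability under the hypothetical strategy to the probability under the observed assignment, and the Hájek version divides by the sum of the weights. All names and numbers are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x = rng.standard_normal(n)                     # observed confounder
propensity = 1.0 / (1.0 + np.exp(-x))          # P(treated | x), assumed known
treated = rng.random(n) < propensity
outcome = 1.0 + 2.0 * treated + x + rng.standard_normal(n)

# Hypothetical stochastic intervention: treat each unit with probability 0.3,
# regardless of x.
target_prob = 0.3
target_density = np.where(treated, target_prob, 1.0 - target_prob)
observed_density = np.where(treated, propensity, 1.0 - propensity)
weights = target_density / observed_density

ipw_estimate = np.mean(weights * outcome)                         # plain weighting
hajek_estimate = np.sum(weights * outcome) / np.sum(weights)      # normalized weights
true_value = 1.0 + 2.0 * target_prob   # expected outcome under the intervention

print(f"true value = {true_value:.3f}")
print(f"IPW        = {ipw_estimate:.3f}")
print(f"Hajek      = {hajek_estimate:.3f}")
```

The weights are only well behaved when the hypothetical strategy assigns treatments that also occur with reasonable probability in the observed data, which is the role played by the overlap assumption.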
Papadogeorgou Conclusion
In this paper, we provide a framework for causal inference with spatio-temporal point process treatments and outcomes. We illustrate the flexibility of this proposed methodology by applying it to the estimation of airstrike effects on insurgent violence in Iraq. Our central idea is to use a stochastic intervention that represents a distribution of treatments rather than the standard causal inference approach that estimates the average potential outcomes under some fixed treatment values. A key advantage of our approach is its flexibility: it permits unstructured patterns of both spatial spillover and temporal carryover effects. This flexibility is crucial since for many spatio-temporal causal inference problems, including our own application, little is known about how the treatments in one area affect the outcomes in other areas across different time periods. The estimands and methodology presented in this paper can be applied in a number of settings to estimate the effect of a particular stochastic intervention strategy. There are several considerations that may be useful when defining a stochastic intervention of interest. First, the choice of intervention should be guided by pressing policy questions or important academic debates where undetected spillover might frustrate traditional methods of causal inference. Second, stochastic interventions should satisfy the overlap assumption (Assumption 2). Researchers should not define a stochastic intervention that generates treatment patterns that appear to be far different from those of the observed treatment events. In our application, we achieve this by constructing the stochastic interventions based on the estimated density of point patterns obtained from the past data and the observed number of airstrikes per day. The proposed framework can also be applied to other high-dimensional, and possibly unstructured, treatments.
The standard approach to causal inference, which estimates the causal effects of fixed treatment values, does not perform well in such settings. Indeed, the sparsity of observed treatment patterns alone makes it difficult to satisfy the required overlap assumption (Imai and Jiang, 2019). We believe that the stochastic intervention approach proposed here offers an effective solution to a broad class of causal inference problems. Future research should further develop the methodology for stochastic interventions. In particular, it is important to consider an improved weighting method that explicitly targets covariate balance. This might be challenging in the spatiotemporal setting where the notion of covariate balance is not yet well understood. Finally, it is crucial to extend the stochastic intervention framework to adaptive strategies over multiple time periods that might be more reflective of realistic assignments.
Papadogeorgou (2022) Causal Inference with Spatio-Temporal Data (pdf)