Simpson's paradox is the phenomenon in which an association between two variables that holds in an entire population disappears or reverses direction when the population is divided into subpopulations. Using a measure of the strength of probabilistic association, we can formulate Simpson's paradox in three forms: association reversal, Yule's association reversal, and the amalgamation paradox. The paradox arises in various fields, and it can be visualized with plots, vectors, mosaic plots, and directed acyclic graphs. There are also tools that help identify occurrences of Simpson's paradox in data. Simpson's paradox can be connected to causal reasoning, which helps explain why the paradox occurs and how to decide, when analyzing data, whether to rely on the whole population or on its division into subpopulations. It is also linked to the sure-thing principle, for which Simpson's paradox serves as a counterexample to the principle's validity.
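To make the association-reversal form concrete, the following minimal sketch compares success rates of two treatments within each subpopulation and in the pooled population. The counts are illustrative numbers chosen to exhibit the reversal and are not taken from this work; the names `counts`, `rate`, and `pooled_rate` are hypothetical helpers introduced only for this example.

```python
from fractions import Fraction

# Illustrative (successes, trials) counts per treatment and subpopulation.
# These numbers are an assumption for demonstration, not data from this text.
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def rate(successes, trials):
    """Exact success rate within a single subpopulation."""
    return Fraction(successes, trials)


def pooled_rate(groups):
    """Exact success rate after amalgamating all subpopulations."""
    total_successes = sum(s for s, _ in groups.values())
    total_trials = sum(n for _, n in groups.values())
    return Fraction(total_successes, total_trials)


# Does treatment A beat treatment B inside every subpopulation?
a_wins_within = {
    group: rate(*counts["A"][group]) > rate(*counts["B"][group])
    for group in counts["A"]
}

# Does treatment A beat treatment B in the pooled population?
a_wins_overall = pooled_rate(counts["A"]) > pooled_rate(counts["B"])

# Association reversal: A is better in each subpopulation, yet not overall.
reversal = all(a_wins_within.values()) and not a_wins_overall

for group in counts["A"]:
    print(group, float(rate(*counts["A"][group])), float(rate(*counts["B"][group])))
print("pooled", float(pooled_rate(counts["A"])), float(pooled_rate(counts["B"])))
print("association reversal:", reversal)
```

Exact fractions are used instead of floating-point numbers so that the comparisons cannot be affected by rounding. In these illustrative counts the reversal arises because the two treatments are distributed very unevenly across the subpopulations, which is the kind of imbalance that the causal analysis of the paradox examines.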