# J Shaped distribution

A J shaped distribution is a probability distribution where the majority of the observations fall at one end of the distribution while very few are located in the middle, giving the distribution the rough shape of the letter J laid on its side. This distribution pattern indicates that the majority of outcomes fall towards one extreme.

The J-distribution gets its name from fonts such as IBM Plex Sans, in which the letter J has a half arm on top (the horizontal stroke at the top of the letter).

When describing data as J shaped, be aware that the term is an informal description of a particular class of bimodal distribution with skewed data. It’s much like the way we use “bell curve” as an informal term for the normal distribution.

## Real life examples

A J shaped distribution can appear in various situations, either natural or human-made. It can occur in biological, social, or economic phenomena, or any situation where there is a vast difference between the extremes of a variable or outcome.

Many real life situations follow J shaped distributions. For example:

• There is a J shaped relationship between body mass index and mortality in COVID-19 patients: both underweight and obese patients had a higher risk of death than patients with normal weight [1].
• Product reviews, such as those posted on Amazon, often show many 5-star ratings, a few 1-star ratings (or vice-versa), and a smattering in between [2].
• Income distribution since World War II has followed a J shaped distribution, where a few people earn a lot while many people earn very little. According to Levy, the top quintile of families received \$11.36 for every dollar of income received by the bottom quintile [3].
• The distribution of tree density and species along elevational gradients, where many species are at lower and higher elevations, while few are in between [4].
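
The quintile comparison in the income example can be reproduced on any list of incomes. The following is a minimal sketch, not the method used in [3]: the function name `quintile_ratio` and the income figures are hypothetical, chosen only for illustration.

```python
def quintile_ratio(incomes):
    """Income received by the top fifth for every unit received by the bottom fifth."""
    ordered = sorted(incomes)
    k = len(ordered) // 5          # number of observations in each quintile
    if k == 0:
        raise ValueError("need at least five incomes to form quintiles")
    bottom = sum(ordered[:k])      # total income of the poorest fifth
    top = sum(ordered[-k:])        # total income of the richest fifth
    if bottom == 0:
        raise ValueError("bottom quintile has zero income; ratio is undefined")
    return top / bottom

# Hypothetical household incomes (not real data).
incomes = [12_000, 15_000, 18_000, 22_000, 27_000,
           33_000, 41_000, 55_000, 90_000, 210_000]

ratio = quintile_ratio(incomes)
print(f"Top quintile receives ${ratio:.2f} per $1 received by the bottom quintile")
```

A ratio far above 1 signals the kind of gap between the extremes that the bullet describes.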

## Interpretation and use of the J shaped distribution

Interpreting a J shaped distribution requires understanding the implications of its skew. Skewness measures the lack of symmetry in a distribution. When most observations sit at the low end of a J shaped distribution, the skewness score is positive, indicating that the tail of the distribution is on the right side; when most observations sit at the high end, as with mostly 5-star reviews, the score is negative. Interpreting any dataset with a J shaped distribution also requires the analyst to be aware of the percentage of observations in each range of the distribution, the range of values along the distribution, the percentage of outliers, and the context of the distribution.
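
The sign of the skewness can be checked numerically. The sketch below uses the moment coefficient of skewness, g1 = m3 / m2^1.5, which is one common definition (an assumption, since the article does not name a formula). The helper `sample_skewness` and both samples are hypothetical.

```python
from statistics import fmean

def sample_skewness(values):
    """Moment coefficient of skewness: third central moment / (variance ** 1.5)."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n   # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n   # third central moment
    return m3 / m2 ** 1.5

# Hypothetical sample: most observations at the low end, tail to the right.
low_heavy = [1] * 60 + [2] * 20 + [3] * 10 + [4] * 6 + [5] * 4
# Mirror image: most observations at the high end, tail to the left.
high_heavy = [6 - x for x in low_heavy]

skew_low = sample_skewness(low_heavy)
skew_high = sample_skewness(high_heavy)
print(f"low-heavy skewness:  {skew_low:+.3f}")   # positive: tail on the right
print(f"high-heavy skewness: {skew_high:+.3f}")  # negative: tail on the left
```

Mirroring the data flips the sign but not the magnitude, so the sign tells you which end holds the bulk of the observations.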

Understanding the J shaped distribution can help data analysts make sense of their findings and inform their decision-making. For example, if income follows a J shaped distribution, it tells policymakers that they need to focus on the wealthiest people and the poorest people in crafting their policies, rather than the middle class. In finance, stock market analysts can use the J shaped distribution to analyze the distribution of stock returns over different periods and adjust their investment strategies accordingly.

However, the distribution does have one major pitfall. Namely, it “…creates some fundamental statistical problems” (Hu & Zhang [2]).

The mean, a common statistic used for forecasting future sales, product quality, consumer satisfaction and other metrics, is only a useful summary for unimodal distributions (distributions with one “hump”). When a mean is calculated from a bimodal distribution such as the J shaped distribution, the statistic becomes misleading. For example, an equal mix of 1 and 5 star ratings gives a mean of 3 stars, a rating that no reviewer actually gave and a poor reflection of product quality.
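
The problem with the mean is easy to see on a small example. The following sketch uses made-up review counts (not data from [2]) and a hypothetical helper, `summarize`, to compare the mean with the full count at each star level.

```python
from collections import Counter
from statistics import fmean

def summarize(ratings):
    """Return the mean rating and the number of reviews at each star level (1-5)."""
    counts = Counter(ratings)
    by_star = {star: counts.get(star, 0) for star in range(1, 6)}
    return fmean(ratings), by_star

# Hypothetical data: an even split between 1-star and 5-star reviews.
polarized = [1] * 50 + [5] * 50
polarized_mean, polarized_counts = summarize(polarized)

# Hypothetical J shaped data: many 5-star, some 1-star, few in between.
j_shaped = [1] * 20 + [2] * 3 + [3] * 4 + [4] * 8 + [5] * 65
j_mean, j_counts = summarize(j_shaped)

print(f"polarized mean: {polarized_mean}, 3-star reviews: {polarized_counts[3]}")
print(f"J shaped mean: {j_mean:.2f}")
for star, count in j_counts.items():
    # A text histogram makes the two humps visible where the mean hides them.
    print(f"{star} star | {'#' * count} ({count})")
```

Reporting the count or percentage at each rating level, rather than a single average, keeps both humps visible.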

In summary, understanding probability distributions is essential for data analysis, and the J shaped distribution is an interesting and informative distribution to analyze. It indicates that a large number of outcomes fall towards one extreme while there are very few in the middle. The distribution can appear in different contexts, and interpreting it requires an understanding of the percentage of observations in each range of the distribution, the range of values along the distribution, the percentage of outliers, and the context of the distribution. By using it in data analysis, we can make better-informed decisions and craft better-tailored strategies.

## References

[1] Huang HK, Bukhari K, Peng CC, Hung DP, Shih MC, Chang RH, Lin SM, Munir KM, Tu YK. The J-shaped relationship between body mass index and mortality in patients with COVID-19: A dose-response meta-analysis. Diabetes Obes Metab. 2021 Jul;23(7):1701-1709. doi: 10.1111/dom.14382. Epub 2021 Apr 14. PMID: 33764660; PMCID: PMC8250762.

[2] Hu & Zhang (2009). Overcoming the J-shaped distribution of product reviews. Communications of the ACM, Volume 52, Issue 10, October, pages 144-147. Retrieved December 8, 2017 from: https://dl.acm.org/citation.cfm?id=1562800

[3] Levy, F. Distribution of income.

[4] Sharma, C.M., Mishra, A.K., Tiwari, O.P. et al. Effect of altitudinal gradients on forest structure and composition on ridge tops in Garhwal Himalaya. Energ. Ecol. Environ. 2, 404–417 (2017). https://doi.org/10.1007/s40974-017-0067-6
