Open ended distribution


An open-ended distribution is one in which a class or bin has no boundary (e.g., the top answer on your survey is ‘5 or more’) [1]. The lower boundary, the upper boundary, or both may be left unspecified.

Open ended distribution examples

In the frequency distribution table below, a weight of 50 pounds or less (highlighted in yellow in the first row) signifies that this is an open-ended distribution.

open ended distribution example

A closed ended distribution, in contrast, refers to a sample range with boundaries that are clearly defined. In the following table, for example, the boundaries start at 6 and end at 24, with the researcher choosing not to consider numbers beyond these points.

This bar chart displays an open ended distribution of car prices in thousands of dollars. The lowest bin shows prices of “10 or less” and the highest bin shows prices of “20 and up”:

It isn’t necessary for both the top and bottom classes or bins to be open ended for a distribution to be classified as open-ended. If just one (either the top or the bottom) is open ended, it would still be classified as an open ended distribution.

When to use open ended distributions

Open ended distributions can arise when researchers choose to collect data in a particular way. For instance, suppose a researcher conducting a survey on house prices in a certain city selects “> $300,000” as the largest possible response; this may be because the researcher is mostly interested in house prices at the lower end of the market and is willing to lump all higher prices into one large category. Conversely, the smallest possible response could be open ended in a survey about how housing prices have skyrocketed, where the lower end of the market is of less interest.

It’s also common for researchers to include open ended classes to encourage participation. For example, in a survey about sensitive issues such as people’s net worth, researchers may want to include the open ended class of “more than $500,000” to accommodate people with high net worth who may be reluctant to give an exact figure.

Open-ended distributions are generally a matter of choice and depend on the research type and data objectives. For instance, suppose you’re building a frequency distribution table of family size by polling roughly 100 families and obtain the following data:

  • One child: 28 families.
  • Two children: 33 families.
  • Three children: 28 families.
  • Four children: 6 families.
  • Six children: 2 families.
  • Nine children: 1 family.
  • Ten children: 2 families.
  • Eleven children: 1 family.
  • Twelve children: 1 family.
  • Thirteen children: 1 family.
  • Twenty children: 1 family.
  • Twenty-six children: 1 family.

You could choose to summarize this information as an open ended distribution or a closed ended distribution:

Closed ended distribution (left) vs open ended distribution (right).
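As a rough illustration of how the open ended version could be built from the family size data listed above, here is a minimal Python sketch. The function name `open_ended_table` and the cutoff of 5 are assumptions chosen for the example; they are not taken from the original tables.

```python
# Family sizes from the list above: number of children -> number of families.
family_counts = {
    1: 28, 2: 33, 3: 28, 4: 6, 6: 2, 9: 1,
    10: 2, 11: 1, 12: 1, 13: 1, 20: 1, 26: 1,
}


def open_ended_table(counts, cutoff):
    """Collapse every value >= cutoff into one open ended 'cutoff or more' class.

    `cutoff` is a hypothetical parameter for this sketch; a real study would
    pick it based on the research question.
    """
    table = {}
    for value in sorted(counts):
        label = str(value) if value < cutoff else f"{cutoff} or more"
        table[label] = table.get(label, 0) + counts[value]
    return table


table = open_ended_table(family_counts, cutoff=5)
for label, n in table.items():
    print(f"{label:>10}: {n}")
```

With a cutoff of 5, the eight largest family sizes collapse into a single “5 or more” class, which is what makes the table more compact.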

Disadvantages of open ended distributions

In general, an open ended distribution will result in a more compact figure, such as the table on the above right. However, open ended classes can cause problems with calculations and interpretation. If you have just two classes, say 6 or fewer children / more than 6 children, your table might end up being quite uninformative. A reader might want to know how many families have more than four children; a table that only includes bulk information can’t give them that answer.

Open ended distributions censor data: you know how many families reported 6 or fewer or more than 6 children, but not the exact number of children in each family. Because censoring makes the raw data inaccessible, statistics such as the mean or standard deviation cannot be calculated exactly from the table.
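To make the problem concrete, the sketch below takes a hypothetical open ended table (exact classes for 1 to 4 children plus a top class of “5 or more”, with counts chosen to match the family size list earlier in the article) and shows that the mean depends entirely on what you assume about the top class. The helper `mean_assuming` is invented for this illustration.

```python
# Hypothetical open ended table: exact classes for 1-4 children,
# plus one open ended top class ("5 or more") holding 10 families.
closed_classes = {1: 28, 2: 33, 3: 28, 4: 6}
top_class_lower_bound = 5
top_class_count = 10


def mean_assuming(top_value):
    """Mean number of children if every top-class family had `top_value` children."""
    total_families = sum(closed_classes.values()) + top_class_count
    total_children = (
        sum(value * n for value, n in closed_classes.items())
        + top_value * top_class_count
    )
    return total_children / total_families


# The smallest the mean could possibly be: treat "5 or more" as exactly 5.
lower_bound = mean_assuming(top_class_lower_bound)

# Different guesses for the top class give different means.
for guess in (5, 8, 12):
    print(f"top class treated as {guess}: mean = {mean_assuming(guess):.2f}")
```

The table alone only gives a lower bound on the mean (here 2.4 children per family); any value above that is consistent with the censored data, so the true mean cannot be recovered without an extra assumption about the open ended class.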

References

[1] Cutting, J. (2008). Psychology 240 Lectures, Chapter 3, Statistics 1. Illinois State University.
