Empirical Cumulative Distribution Function

Separate the values with line breaks or commas. You may paste data from Excel. The tool ignores empty or non-numeric cells.

What is the Empirical Cumulative Distribution Function (ECDF)?

The ECDF is a function that estimates the probability P(X ≤ x) from sample data, without making any assumptions about the underlying distribution.

ECDF Formula

ECDF for one value
Fn(𝑥) = (number of sample values ≤ 𝑥) / n
n - the sample size, the number of values.
𝑥 - the value you are checking the probability for.
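
The formula above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's own code; the function name `ecdf_at` and the sample values are hypothetical.

```python
def ecdf_at(sample, x):
    """Return Fn(x): the fraction of sample values that are <= x."""
    n = len(sample)
    if n == 0:
        raise ValueError("sample must contain at least one value")
    # Count the sample values that are less than or equal to x.
    count = sum(1 for value in sample if value <= x)
    return count / n


data = [2.5, 1.0, 4.0, 3.0]  # hypothetical sample, n = 4
print(ecdf_at(data, 3.0))    # 3 of the 4 values are <= 3.0
```

The data do not need to be sorted for a single value, since the function only counts.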

ECDF for the entire sample
  • Sort the data from smallest to largest:
    X1 ≤ X2 ≤ ... ≤ Xn.
  • After sorting, the index i represents the number of values less than or equal to Xi. When several values are tied, use the largest index among the tied values.
  • Fn(Xi) = i / n
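
The steps above can be sketched as follows. This is an illustrative version under the stated rule for ties, not the calculator's implementation; `ecdf_points` is a hypothetical name.

```python
def ecdf_points(sample):
    """Return (value, Fn(value)) pairs, one pair per distinct sample value."""
    ordered = sorted(sample)
    n = len(ordered)
    points = []
    for i, value in enumerate(ordered, start=1):
        if points and points[-1][0] == value:
            # Tied value: keep the largest index among the ties.
            points[-1] = (value, i / n)
        else:
            points.append((value, i / n))
    return points


for value, probability in ecdf_points([3, 1, 2, 2]):
    print(value, probability)
```

Plotting these pairs as a step function gives the ECDF chart.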

How to calculate the ECDF?

  1. Enter delimited data
    • You can copy the data from Excel and paste it into the calculator.
    • You can choose the data delimiter under 'more options'.
  2. Enter Score (𝑥) if you want to calculate P(X ≤ 𝑥) for a specific value; otherwise leave it empty.
  3. Press the 'calculate' button.
  4. After the plot is generated, color customization buttons appear below it.
  5. Optionally, exclude outliers using Tukey's fences method to make the plot clearer.
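
Step 1 can be illustrated with a small parser that splits pasted text and skips empty or non-numeric cells. The function name `parse_numbers` and the default delimiter set (comma, line break, tab) are assumptions made for this sketch; the calculator's actual parsing rules may differ.

```python
import math


def parse_numbers(text, delimiters=",\n\t"):
    """Split text on the given delimiter characters and keep numeric cells only."""
    # Normalize every delimiter to a line break, then split once.
    for delimiter in delimiters:
        text = text.replace(delimiter, "\n")
    values = []
    for cell in text.split("\n"):
        cell = cell.strip()
        if not cell:
            continue  # empty cell
        try:
            number = float(cell)
        except ValueError:
            continue  # non-numeric cell
        if math.isfinite(number):  # also skip 'nan' and 'inf'
            values.append(number)
    return values


print(parse_numbers("1, 2,,abc\n3.5\t4"))
```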

When should you use a parametric distribution?

You should use a parametric distribution, such as the normal distribution, in one of the following cases:
  1. When you know the parametric distribution from prior knowledge, such as previous research.
  2. When you do not know the parametric distribution, but the sample size is large enough to identify an appropriate parametric model.
If the data follow a known parametric distribution, you should estimate its parameters and use the parametric distribution to calculate cumulative probabilities for the entire population.
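
As a rough illustration of the parametric approach, the sketch below assumes the data follow a normal distribution, estimates its mean and standard deviation from the sample, and compares the resulting cumulative probability with the ECDF value. The sample values and the function name `compare_cdfs` are hypothetical.

```python
from statistics import NormalDist


def compare_cdfs(sample, x):
    """Return (parametric, empirical) estimates of P(X <= x)."""
    # Parametric estimate: fit a normal distribution to the sample.
    fitted = NormalDist.from_samples(sample)
    parametric = fitted.cdf(x)
    # Empirical estimate: fraction of sample values <= x.
    empirical = sum(1 for value in sample if value <= x) / len(sample)
    return parametric, empirical


sample = [4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7, 5.0]  # hypothetical data
parametric, empirical = compare_cdfs(sample, 5.1)
print(parametric, empirical)
```

The two numbers generally differ for small samples; the parametric value is only as good as the assumption that the chosen distribution fits.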

When should you use the ECDF?

  1. When you want to present the sample data results, as the ECDF describes the sample exactly.
  2. When you cannot identify a parametric distribution due to a small sample size or because the data do not follow any known distribution.

Fields

  1. Sample Data: Enter the delimited data. You may change the delimiter under 'More options'.
  2. Score (𝑥): Enter the 𝑥 value you are checking the probability for.

Options

  1. Outliers: While you may exclude outliers, it is important to note that doing so is generally not recommended unless you understand the cause of each outlier.
  2. Tukey's Fence K: The K value to use for Tukey's fences; the default is 1.5.
  3. Chart Title: the title of the ECDF chart.
  4. Data delimiters: This option defines the character(s) separating values in your data. Common choices include comma, tab, space, or Enter key. You can also specify a custom delimiter if needed.
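
The Outliers and Tukey's Fence K options can be sketched as below. Tukey's fences keep values inside [Q1 - K·IQR, Q3 + K·IQR]. The quartile method used here (`method="inclusive"` in Python's `statistics.quantiles`) is an assumption; the calculator may compute quartiles differently, which can change which values fall outside the fences. The function name `tukey_filter` is hypothetical.

```python
from statistics import quantiles


def tukey_filter(sample, k=1.5):
    """Return the sample values that lie inside Tukey's fences."""
    # Quartiles; the interpolation method is an assumption of this sketch.
    q1, _median, q3 = quantiles(sample, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    return [value for value in sample if lower <= value <= upper]


data = [10, 11, 12, 13, 14, 15, 100]  # hypothetical sample with one extreme value
print(tukey_filter(data))             # default K = 1.5
print(tukey_filter(data, k=100))      # a very large K keeps every value
```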
'ECDF chart' more options:

After calculating the ECDF, you can customize the chart colors.

ECDF Example

Problem: calculate Fn(8) for the sample of 10 values listed in step 1.

Solution:
1. Sort the numbers in ascending order:
1,3,5,7,9,12,14,17,21,33

2. The largest value that is not larger than 8 is: 7.
i = index(7) = 4.
n = 10.
Fn(8) = i / n = 4 / 10 = 0.4
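
The example can be checked with a short script. It uses `bisect_right` from Python's standard library, which returns the number of values in a sorted list that are less than or equal to the given value; the variable names are illustrative only.

```python
from bisect import bisect_right

sorted_data = [1, 3, 5, 7, 9, 12, 14, 17, 21, 33]
x = 8

i = bisect_right(sorted_data, x)  # number of values <= x
n = len(sorted_data)
fn_x = i / n

print(i, n, fn_x)  # 4 10 0.4
```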