Empirical Cumulative Distribution Function
The data should be separated by line breaks or commas. You may paste from Excel. The tool ignores empty or non-numeric cells.
What is the Empirical Cumulative Distribution Function (ECDF)?
The ECDF is a function that estimates the probability P(X ≤ x) from sample data, without making any assumptions about the underlying distribution.
ECDF Formula
ECDF for one value
$$F_n(x) = \frac{\text{number of sample values} \le x}{n}$$
𝑥 - the value you are checking the probability for.
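The single-value formula can be sketched in a few lines of Python. This is only an illustration of the definition above, not the calculator's own code; the function name `ecdf_at` and the sample values are made up for the example.

```python
def ecdf_at(sample, x):
    """Return Fn(x): the fraction of sample values that are <= x."""
    if not sample:
        raise ValueError("sample must contain at least one value")
    # Count the sample values that do not exceed x, then divide by n.
    count = sum(1 for value in sample if value <= x)
    return count / len(sample)


data = [2.0, 4.0, 4.0, 7.0, 10.0]  # hypothetical illustration data
print(ecdf_at(data, 4.0))  # 3 of the 5 values are <= 4
print(ecdf_at(data, 1.0))  # none of the values are <= 1
```

Because the function only counts values, the data do not need to be sorted first.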
ECDF for the entire sample
- Sort the data from smallest to largest: X1 ≤ X2 ≤ ... ≤ Xn.
- After sorting, the index i represents the number of values less than or equal to Xi. If several values are tied, use the largest index among them.
$$F_n(X_i) = \frac{i}{n}$$
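The sort-and-index procedure might look like the following Python sketch. The function name `ecdf_points` and the input values are assumptions for illustration. For tied values the sketch keeps the largest index, so that each distinct value gets the share of the sample that is less than or equal to it.

```python
def ecdf_points(sample):
    """Return (value, Fn(value)) pairs for each distinct value, ascending."""
    ordered = sorted(sample)
    n = len(ordered)
    points = []
    for i, value in enumerate(ordered, start=1):
        if points and points[-1][0] == value:
            # Tied value: overwrite with the larger index i.
            points[-1] = (value, i / n)
        else:
            points.append((value, i / n))
    return points


data = [7, 2, 4, 10, 4]  # hypothetical illustration data
for value, prob in ecdf_points(data):
    print(value, prob)
```

The resulting pairs are the jump points of the step function that an ECDF chart draws.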
How to calculate the ECDF?
- Enter delimited data
- You can copy the data from Excel and paste it into the calculator.
- You can choose the data delimiter under 'more options'.
- Enter Score (𝑥) if you want to calculate P(X ≤ 𝑥) for a specific value, or leave it empty.
- Press the 'calculate' button.
- After the plot is generated, color customization buttons appear below it.
- To make the plot clearer, you can exclude outliers using Tukey's fences method.
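As a rough sketch of what excluding outliers with Tukey's fences involves: values outside [Q1 - K·IQR, Q3 + K·IQR] are dropped before the ECDF is computed. The page does not say how the calculator computes quartiles, so the use of Python's `statistics.quantiles` with `method="inclusive"` is an assumption, and the name `tukey_filter` is hypothetical. Other quartile conventions can give slightly different fences.

```python
import statistics


def tukey_filter(sample, k=1.5):
    """Keep only the values inside Tukey's fences [Q1 - k*IQR, Q3 + k*IQR]."""
    # Assumption: quartiles via the 'inclusive' interpolation method.
    q1, _median, q3 = statistics.quantiles(sample, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [value for value in sample if lower <= value <= upper]


data = [1, 3, 5, 7, 9, 12, 14, 17, 21, 33]
print(tukey_filter(data))         # with k = 1.5 the value 33 falls above the upper fence
print(tukey_filter(data, k=3.0))  # a wider fence keeps every value
```

A larger K widens the fences and therefore excludes fewer values.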
When should you use a parametric distribution?
You should use a parametric distribution, such as the normal distribution, in one of the following cases:
- When you know the parametric distribution from prior knowledge, such as previous research.
- When you do not know the parametric distribution, but the sample size is large enough to identify an appropriate parametric model.
When should you use the ECDF?
- When you want to present the sample data results, as the ECDF describes the sample exactly.
- When you cannot identify a parametric distribution due to a small sample size or because the data do not follow any known distribution.
Fields
- Sample Data: Enter the delimited data. You may change the delimiter under 'More options'.
- Score (𝑥): Enter the 𝑥 value you are checking the probability for.
Options
- Outliers: While you may exclude outliers, it is important to note that doing so is generally not recommended unless you understand the cause of each outlier.
- Tukey's Fence K: the K value to use for Tukey's fences; the default is 1.5.
- Chart Title: the title of the ECDF chart.
- Data delimiters: This option defines the character(s) separating values in your data. Common choices include comma, tab, space, or line break (Enter key). You can also specify a custom delimiter if needed.
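To illustrate how delimited input with empty or non-numeric cells might be handled, here is a minimal Python sketch. It is not the calculator's parser: the function name `parse_numbers`, the default delimiter set, and the choice to discard non-finite entries such as "nan" are all assumptions.

```python
import math
import re


def parse_numbers(text, delimiters=",\t\n"):
    """Split text on any of the delimiter characters and keep the numeric cells."""
    pattern = "[" + re.escape(delimiters) + "]"
    values = []
    for cell in re.split(pattern, text):
        cell = cell.strip()
        if not cell:
            continue  # skip empty cells
        try:
            number = float(cell)
        except ValueError:
            continue  # skip non-numeric cells
        if math.isfinite(number):  # assumption: 'nan' and 'inf' are also skipped
            values.append(number)
    return values


pasted = "1, 2\n\nabc,3.5\t4"  # hypothetical pasted input
print(parse_numbers(pasted))
```

Passing a different `delimiters` string, for example `";"`, corresponds to choosing a custom delimiter.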
'ECDF chart' more options:
After calculating the ECDF, you can customize the chart colors.
ECDF Example
Solution:
1. Sort the numbers in ascending order: 1, 3, 5, 7, 9, 12, 14, 17, 21, 33
2. The largest value that is not larger than 8 is: 7.
i = index(7) = 4.
n = 10.
$$F_n(8) = \frac{i}{n} = \frac{4}{10} = 0.4$$
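The same result can be checked with a short Python sketch. Using `bisect_right` to find the index is one possible approach, not necessarily what the calculator does; it returns how many sorted values are less than or equal to 𝑥.

```python
from bisect import bisect_right

sample = [1, 3, 5, 7, 9, 12, 14, 17, 21, 33]  # the example values
x = 8

ordered = sorted(sample)
i = bisect_right(ordered, x)  # number of values <= x
n = len(ordered)

print(i, n, i / n)  # prints: 4 10 0.4
```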