# Statistics Calculator

Statistics calculator for numerical and categorical data, offering many options.

**Header**: you may rename 'Name-1', 'Name-2', etc.

**Data**: use Enter as the delimiter; you may change the delimiters in 'More options'.

## How to use the statistics calculator

Statistics calculator with optional Excel input. Computes statistics such as minimum, maximum, sum, count, standard deviation, quartiles, IQR, and skewness. Handles both numerical and categorical data. Insert one or more columns, and the statistics calculator processes each separately.

### How to enter data?

**Enter raw data directly** - usually you have the raw data.

a. Enter the *name* of the group.

b. Enter the *raw data* separated by 'comma', 'space', or 'enter'. (Note: you may copy only the data from Excel.)

**Enter raw data from Excel**

Enter the header on the first row.

**Copy Paste**

a. Copy the *raw data* with the *header* from Excel or Google Sheets, or any tool that separates data with tab and line feed. Copy the entire block, including the header.

b. Paste the data in the input field.

**Import data from an Excel or CSV file.**

When you select an Excel file, the calculator will automatically load the first sheet and display it in the input field. You can choose either an Excel file (.xlsx or .xls) or a CSV file (.csv).

To upload your file, use one of the following methods:

- **Browse and select** - Click the 'Browse' button and choose the file from your computer.
- **Drag and drop** - Drag your file and drop it into the 'Drop your .xlsx, .xls, or .csv file here!' area.

Now, the '**Select sheet**' dropdown will be populated with the names of your sheets, and you can choose any sheet.
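A block copied from a spreadsheet, as described under Copy Paste, is plain text with tabs between cells and line feeds between rows. The following is a minimal sketch of how such a block could be split into one list per column. The function name `parse_pasted_block` and the sample data are hypothetical, and this is not the calculator's own code.

```python
import csv
import io


def parse_pasted_block(text):
    """Split a tab-separated block (header on the first row) into columns."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    header, body = rows[0], rows[1:]
    columns = {name: [] for name in header}
    for row in body:
        # zip stops at the shorter of header/row, so short rows are tolerated
        for name, cell in zip(header, row):
            columns[name].append(cell)
    return columns


# Hypothetical pasted block: two columns, header on the first row
pasted = "Height\tWeight\n170\t65\n182\t80\n165\t58\n"
columns = parse_pasted_block(pasted)
print(columns)
```

Each column is kept as text at this stage, so the values can be cleaned and converted to numbers in a later step.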

### Numerical data

Quantitative data, continuous variable or ordinal variable

### Categorical data

Qualitative data, categorical variable

### Choose the statistics that you want to present

#### Numerical data

- Number of observations - the count of valid values, excluding empty cells and, for quantitative data, non-numerical cells.
- Number of missing values - the count of empty cells and, for quantitative data, non-numerical cells.
- Minimum - the lowest value.
- Maximum - the highest value.
- Range - the distance between the minimum and the maximum.
- Mean (x̄) - the average.
- Sum - the cumulative total of all the values.
- Standard Deviation (S) - the sample standard deviation. Use it when you have only sample data.
- Variance (S²) - the sample variance.
- Standard Deviation (σ) - the population standard deviation. Use it when you have the entire population data.
- Variance (σ²) - the population variance. Use it when you have the entire population data.
- Sum of squares - the sum of the squared distances of all the values from the mean.
- Q1 - quartile 1, the 25th percentile.
- Median - quartile 2, the 50th percentile.
- Q3 - quartile 3, the 75th percentile.
- IQR - interquartile range, the difference between quartile 3 and quartile 1.
- Skewness - a measure of the asymmetry of the distribution.
- Skewness shape - a text description of the skewness and the p-value of the test for a difference from normal skewness.
- Excess kurtosis - the difference between the kurtosis and the kurtosis of the normal distribution. Kurtosis measures the heaviness of the tails.
- Tails shape - a text description of the kurtosis and the p-value of the test for a difference from normal kurtosis.
- Outliers - identified using Tukey's fences.
- SW p-value - Shapiro-Wilk test for normality. For more details, refer to the Shapiro-Wilk test.
- KS p-value - Kolmogorov-Smirnov test for normality based on the sample parameters (Lilliefors test). For other options, refer to the Kolmogorov-Smirnov test.
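To make the difference between the sample and population statistics concrete, here is a small sketch using Python's standard `statistics` module. The data set is made up for illustration. The quartile method the calculator uses is not stated here, so the `"inclusive"` method below is an assumption and other methods can give slightly different quartiles.

```python
import statistics

# Illustrative data, not taken from the calculator
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
mean = statistics.mean(data)
value_range = max(data) - min(data)

# Sum of squared distances from the mean
sum_of_squares = sum((x - mean) ** 2 for x in data)

# Sample statistics divide by n - 1; population statistics divide by n
sample_variance = sum_of_squares / (n - 1)
population_variance = sum_of_squares / n
sample_sd = statistics.stdev(data)
population_sd = statistics.pstdev(data)

# Quartiles; the "inclusive" method is an assumption (see the note above)
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

print(mean, sum_of_squares, sample_variance, population_variance)
print(q1, median, q3, iqr)
```

The sample variance is always larger than the population variance for the same data, because it divides the same sum of squares by n - 1 instead of n.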

#### Options

**Outliers** - extreme values. Relevant only if you entered raw data.

- **Included** - the calculator identifies the outliers but includes them in the calculation.
- **Excluded** - the calculator excludes the outliers before calculating the average and the standard deviation.

**Format the numbers** - show the thousand separator.

**Rounding** - how to round the results. When a resulting value is larger than one, the tool rounds it, but when a resulting value is less than one, the tool displays the significant figures.

**Histogram** - create a histogram for the data.

**Boxplot** - create a boxplot for the data.

**Standardized** - create a table with standardized data for each variable. The standardized transformation is a way to compare data with different units of measure; it translates each value into a z-score.

$z=\frac{x-\mu}{\sigma}$
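As an illustration of the z-score formula, the sketch below standardizes a small made-up data set. The function name `standardize` and its `population` flag are hypothetical. The formula above uses μ and σ, so the population standard deviation is the default here; whether the calculator uses σ or S is an assumption.

```python
import statistics


def standardize(values, population=True):
    """Return the z-score of each value: (x - mean) / standard deviation."""
    mean = statistics.fmean(values)
    if population:
        sd = statistics.pstdev(values)  # sigma, divides by n
    else:
        sd = statistics.stdev(values)   # S, divides by n - 1
    return [(x - mean) / sd for x in values]


# Illustrative data with mean 5 and population standard deviation 2
z_scores = standardize([2, 4, 4, 4, 5, 5, 7, 9])
print(z_scores)
```

After standardization each variable has a mean of 0 and a standard deviation of 1, which is what makes variables with different units comparable.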

#### Categorical data

##### Measurements

- Frequency per category - the number of occurrences of each value.
- Proportion per category - the ratio of each value's frequency to the total number of values.
- Percentage per category - the percentage of each value out of all the values (100 × proportion).

##### Statistics

- Number of observations - the number of valid values.
- Minimum - the lowest frequency (or proportion or percentage).
- Mode - the highest frequency, that is, the frequency of the most common value.
- Range - the distance between the minimum and the maximum (Mode).
- Mean (x̄) - the average frequency, including empty values, that is, values that appear only in other variables.
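The categorical measurements and statistics above can be sketched for a single variable as follows. The function name `category_table` and the sample values are hypothetical. Because this example has only one variable, it does not cover the case of empty categories that appear only in other variables.

```python
from collections import Counter


def category_table(values):
    """Frequency, proportion, and percentage for each distinct value."""
    counts = Counter(values)
    total = sum(counts.values())
    return {
        category: {
            "frequency": count,
            "proportion": count / total,
            "percentage": 100 * count / total,
        }
        for category, count in counts.items()
    }


# Illustrative categorical data
colors = ["red", "blue", "red", "green", "red", "blue"]
table = category_table(colors)

# Statistics computed over the per-category frequencies
frequencies = [row["frequency"] for row in table.values()]
minimum = min(frequencies)
mode = max(frequencies)  # the highest frequency
frequency_range = mode - minimum
mean_frequency = sum(frequencies) / len(frequencies)

print(table)
print(minimum, mode, frequency_range, mean_frequency)
```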

## Options

### Cleaning the data

The calculator cleans the data but not the header, as follows:

- Remove non-numerical characters - the calculator keeps only the following characters: numbers ('0'-'9'), minus ('-'), decimal point ('.'), and brackets ('(', ')').
- Convert numbers in brackets to negative numbers.
- Convert standalone minus signs to 0.

#### Examples

| Data | Converted to | Description |
|---|---|---|
| 1,234.56 | 1234.56 | Remove the comma. |
| $ (1,234.56) | -1234.56 | Remove the dollar sign and comma, and convert brackets to a minus sign. |
| $ -1,234.56 | -1234.56 | Remove the dollar sign and comma. |
| $ - | 0 | Remove the dollar sign and convert the minus sign to 0 (a negative zero is equivalent to zero). |
| ABCD | Missing value | This value is excluded from the calculation but will be counted as part of the 'Number of missing values'. |
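The cleaning rules and the examples above can be approximated with a short function. This is a sketch of the described behavior, not the calculator's actual implementation. The name `clean_value` is hypothetical, and returning `None` for a missing value is an assumption made for this example.

```python
import re


def clean_value(raw):
    """Clean one cell; return a float, or None for a missing value."""
    # Keep only digits, minus, decimal point, and brackets
    kept = re.sub(r"[^0-9.()\-]", "", raw)
    if kept == "":
        return None  # nothing numeric left: counted as a missing value
    if kept == "-":
        return 0.0   # a standalone minus sign becomes 0
    # A number wrapped in brackets becomes negative
    bracketed = re.fullmatch(r"\((.*)\)", kept)
    if bracketed:
        kept = "-" + bracketed.group(1)
    try:
        return float(kept)
    except ValueError:
        return None  # assumption: unparseable leftovers are treated as missing


for cell in ["1,234.56", "$ (1,234.56)", "$ -1,234.56", "$ -", "ABCD"]:
    print(repr(cell), "->", clean_value(cell))
```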

### Should you exclude outliers?

It is important to exercise caution before excluding outliers from any calculation, as they may contain valuable information. However, excluding outliers from a histogram may significantly improve its visualization, even if the outliers are valid observations. If you choose to **exclude outliers**, the histogram maker will generate the chart without them. This method can create a more practical histogram that better represents the distribution of the majority of the data points.
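For reference, outlier detection with Tukey's fences can be sketched as below. Values outside Q1 - k × IQR and Q3 + k × IQR are flagged, with k = 1.5 as the conventional multiplier. The function names, the sample data, and the `"inclusive"` quartile method are assumptions for illustration; the calculator's exact quartile method and multiplier are not stated here.

```python
import statistics


def tukey_fences(values, k=1.5):
    """Return the (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def split_outliers(values, k=1.5):
    """Separate values inside the fences from values outside them."""
    low, high = tukey_fences(values, k)
    kept = [x for x in values if low <= x <= high]
    outliers = [x for x in values if x < low or x > high]
    return kept, outliers


# Illustrative data with one extreme value
data = [2, 4, 4, 4, 5, 5, 7, 9, 30]
kept, outliers = split_outliers(data)
print(tukey_fences(data), kept, outliers)
```

Drawing the histogram from `kept` rather than from `data` keeps the single extreme value from stretching the axis and compressing the rest of the distribution into a few bins.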