Principal Component Analysis Calculator

The PCA calculator generates biplots in a variety of dimensions, including 3D, 2D, and 1D, as well as a scree plot, and provides calculation steps. For additional cluster analysis, please visit our cluster analysis calculator.

Enter data in columns
Enter data from Excel
* When using the tool, the 'Series' column can be left empty.
The second column can be used for IDs, which are displayed when hovering over a marker.
Separate values with either a new line (Enter) or a comma (,).
The tool ignores empty cells and cells containing non-numeric data.
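
The separator and filtering rules above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the calculator's actual parser; the function name `parse_column` is hypothetical.

```python
def parse_column(text):
    """Split one column's raw text on new lines or commas and keep numeric cells.

    Hypothetical helper: mirrors the rules stated above, not the tool's real code.
    """
    values = []
    for cell in text.replace(",", "\n").split("\n"):
        cell = cell.strip()
        if not cell:
            continue  # empty cell: ignored
        try:
            values.append(float(cell))
        except ValueError:
            continue  # non-numeric cell: ignored
    return values


parsed = parse_column("1.5, 2\n\nabc\n4")
print(parsed)
```

Note that Python's `float` also accepts strings such as "1e3" or "nan"; whether the calculator treats those as numeric is not stated.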

PCA online

Our PCA calculator takes in data with multiple dimensions, transforms it into principal components (scores), and then generates a biplot and scree plot.

PCA online input data

  • When all the rows belong to one group, leave the first column empty; all markers will then have the same color.
  • If the second column is labeled with any name except "ID" and contains numeric values, the calculator will treat this column as an additional dimension for the data points.
  • If the second column is labeled as "ID" or contains only non-numeric values (such as "France", "A1", "2C"), the calculator will treat the column as an identifier. This means that when hovering over a data point, the corresponding ID will be displayed.
    What is principal component analysis?

    Principal component analysis (PCA) is a technique that reduces the number of dimensions in data while minimizing the loss of information. The method rotates the axes so that the first new axes capture as much of the variance as possible, and then transforms the data into principal component values, also known as scores. The principal components serve as the new axes, and the PC scores are the projections of the original data points onto these axes.

    PCA prioritizes the principal components based on importance, with PC1 being the component that explains the most variation in the data, followed by PC2, and so on. By only considering the first few principal components, such as the first two, a significant percentage of the variance in the data can be explained. This enables the representation of high-dimensional data on a two-dimensional chart.

    When there are more samples than dimensions, the number of principal components is the same as the number of dimensions.
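
As a rough sketch of the procedure described above, the following Python code (using NumPy) centers the data, takes the eigenvectors of the covariance matrix as the new axes, and projects the data onto them to get the scores. It assumes covariance-based PCA without standardizing the columns; whether the calculator scales the data first is not stated here. The function name `pca` and the sample values are made up for illustration.

```python
import numpy as np


def pca(data):
    """Covariance-based PCA sketch (assumption: no column standardization)."""
    X = np.asarray(data, dtype=float)
    centered = X - X.mean(axis=0)            # move the origin to the data mean
    cov = np.cov(centered, rowvar=False)     # columns are the dimensions
    eigenvalues, eigenvectors = np.linalg.eigh(cov)  # eigh: symmetric matrices
    order = np.argsort(eigenvalues)[::-1]    # most variance first: PC1, PC2, ...
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    scores = centered @ eigenvectors         # project each sample onto the new axes
    return eigenvalues, eigenvectors, scores


# Made-up sample: 6 rows (samples), 3 columns (dimensions)
sample = [
    [2.5, 2.4, 1.0],
    [0.5, 0.7, 0.3],
    [2.2, 2.9, 1.1],
    [1.9, 2.2, 0.8],
    [3.1, 3.0, 1.4],
    [2.3, 2.7, 0.9],
]
eigenvalues, components, scores = pca(sample)
print(eigenvalues)     # variance along each principal component
print(scores[:, :2])   # PC1 and PC2 scores for each sample
```

With 6 samples and 3 dimensions, the sketch returns 3 principal components, and the variance of each score column equals the corresponding eigenvalue.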

    What is a scree plot?

    The scree plot is a graphical representation of the eigenvalues of the principal components, which indicate the amount of variation explained by each component. The plot is arranged so that the eigenvalues are listed in descending order, from the highest to the lowest.

    In our scree plot, the columns represent the eigenvalues, and a line shows the cumulative percentage of variation explained by the principal components. For example, if the line reaches 93% at PC2, the first two components together explain 93% of the variance in the data. If the data is then shown on a two-dimensional chart using these two components, only 7% of the variance is left out.
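
The cumulative line can be computed directly from the eigenvalues. The sketch below uses hypothetical eigenvalues chosen so that the first two components reach roughly 93%, matching the example above; the function name `cumulative_percent` is also made up.

```python
from itertools import accumulate


def cumulative_percent(eigenvalues):
    """Cumulative percentage of variance explained, largest eigenvalue first."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    return [100.0 * running / total for running in accumulate(ordered)]


eigenvalues = [2.79, 0.93, 0.28]  # hypothetical values, not from real data
cumulative = cumulative_percent(eigenvalues)
for i, pct in enumerate(cumulative, start=1):
    print(f"PC1..PC{i}: {pct:.1f}% of the variance")
```

Each eigenvalue divided by the total gives the share of a single component; the running sum of those shares is what the line in the scree plot shows.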

    What is a PCA biplot?

    A biplot is a graphical representation of multidimensional data that displays both the observations and the original variables in a single plot, usually two-dimensional. The principal component (PC) scores are drawn as dots, one per observation, and the loading vectors are drawn as lines, one per variable. Together, these elements show the underlying structure of the data: how the observations group and how each variable contributes to the components.
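
To make the two ingredients of a biplot concrete, the sketch below computes the dot coordinates (scores) and the line endpoints (loadings) for the first two components, using a singular value decomposition of the centered data with NumPy. The loadings here are plain unit-length directions; some tools rescale them (for example by the square root of the eigenvalue), and the scaling used by this calculator is not stated. The function name `biplot_coordinates` and the sample values are made up.

```python
import numpy as np


def biplot_coordinates(data, n_components=2):
    """Return (scores, loadings) for a biplot; a sketch with unscaled loadings."""
    X = np.asarray(data, dtype=float)
    centered = X - X.mean(axis=0)
    # centered = U * S * Vt; the rows of Vt are the principal directions
    U, S, Vt = np.linalg.svd(centered, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]  # one dot per observation
    loadings = Vt[:n_components].T                   # one line per variable
    return scores, loadings


# Made-up sample: 6 observations, 3 variables
sample = [
    [2.5, 2.4, 1.0],
    [0.5, 0.7, 0.3],
    [2.2, 2.9, 1.1],
    [1.9, 2.2, 0.8],
    [3.1, 3.0, 1.4],
    [2.3, 2.7, 0.9],
]
scores, loadings = biplot_coordinates(sample)
print(scores)    # (x, y) position of each dot
print(loadings)  # (x, y) endpoint of each variable's line, drawn from the origin
```

Plotting the rows of `scores` as points and the rows of `loadings` as lines from the origin gives the basic two-dimensional biplot.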