Principal Component Analysis (PCA)

 Principal Component Analysis is an unsupervised learning algorithm that is used for dimensionality reduction in machine learning. It is a statistical process that converts the observations of correlated features into a set of linearly uncorrelated features with the help of orthogonal transformation. These new transformed features are called the Principal Components. It is one of the popular tools that is used for exploratory data analysis and predictive modelling. It is a technique to draw strong patterns from the given dataset by reducing the variances.

PCA generally tries to find the lower-dimensional surface to project the high-dimensional data.

PCA works by considering the variance of each attribute because the high attribute shows the good split between the classes, and hence it reduces the dimensionality. Some real-world applications of PCA are image processing, movie recommendation system, optimizing the power allocation in various communication channels. It is a feature extraction technique, so it contains the important variables and drops the least important variable.

The PCA algorithm is based on some mathematical concepts such as:

  • Variance and Covariance
  • Eigenvalues and Eigen factors

Some common terms used in PCA algorithm:

  • Dimensionality: It is the number of features or variables present in the given dataset. More easily, it is the number of columns present in the dataset.
  • Correlation: It signifies that how strongly two variables are related to each other. Such as if one changes, the other variable also gets changed. The correlation value ranges from -1 to +1. Here, -1 occurs if variables are inversely proportional to each other, and +1 indicates that variables are directly proportional to each other.
  • Orthogonal: It defines that variables are not correlated to each other, and hence the correlation between the pair of variables is zero.
  • Eigenvectors: If there is a square matrix M, and a non-zero vector v is given. Then v will be eigenvector if Av is the scalar multiple of v.
  • Covariance Matrix: A matrix containing the covariance between the pair of variables is called the Covariance Matrix.

Comments

Popular posts from this blog

Linear Regression with single variable

Introduction Deep Learning

Decision Tree