Saturday, July 9, 2011

Why do we need Feature Selection (Pattern Recognition)?

Feature selection is a very important task in Statistical Pattern Recognition. The choice of features can significantly improve or degrade the performance of any classification algorithm applied afterwards. I've listed a few of the commonly cited reasons for doing it.

In Statistical Pattern Recognition, we need feature selection techniques for the following reasons:
  • All features: Using every available feature for pattern classification often leads to more error, because irrelevant or redundant features add noise. So, fewer well-chosen features are usually better.
  • Curse of dimensionality: For a data-set of fixed size, the more dimensions you add to the feature vector, the more error you are likely to encounter in the long run. Too few features, on the other hand, will give sub-optimal results. For a given data-set size there is a safe "number of dimensions" that gives the best classification.
  • Complexity, size: Using all dimensions increases both storage and complexity. It is basically computationally intensive. So, we don't want to be there either.
  • Dimension subset: A specific subset of features may give better accuracy. Our goal is to find the optimal subset of features for classifying a given data-set.
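
To make the last point concrete, here is a minimal sketch of one common way to search for a good subset: greedy forward selection, scored by leave-one-out accuracy of a 1-nearest-neighbour classifier. This is just one possible approach and not the only (or necessarily the best) one. The function names `loo_1nn_accuracy` and `forward_select` and the toy data are made up for illustration; the data is deliberately contrived so that only feature 0 separates the classes, while features 1 and 2 are large-scale noise.

```python
from typing import List, Sequence, Tuple

# Hand-made toy data (an assumption for illustration, not real measurements).
# Feature 0 separates the two classes; features 1 and 2 are noise.
X = [
    (0.0, 5.0, 1.0), (1.0, 90.0, 7.0), (0.5, 40.0, 3.0), (1.5, 70.0, 9.0),
    (10.0, 6.0, 8.0), (11.0, 88.0, 2.0), (10.5, 42.0, 6.0), (11.5, 71.0, 4.0),
]
y = [0, 0, 0, 0, 1, 1, 1, 1]


def loo_1nn_accuracy(X: Sequence[Sequence[float]], y: Sequence[int],
                     features: Sequence[int]) -> float:
    """Leave-one-out accuracy of 1-NN using only the given feature indices."""
    correct = 0
    for i in range(len(X)):
        best_j, best_d = -1, float("inf")
        for j in range(len(X)):
            if j == i:
                continue  # leave sample i out of its own neighbour search
            # Squared Euclidean distance restricted to the chosen features.
            d = sum((X[i][f] - X[j][f]) ** 2 for f in features)
            if d < best_d:
                best_j, best_d = j, d
        correct += int(y[best_j] == y[i])
    return correct / len(X)


def forward_select(X: Sequence[Sequence[float]],
                   y: Sequence[int]) -> Tuple[List[int], float]:
    """Greedily add the feature that improves accuracy most; stop when none does."""
    n_features = len(X[0])
    selected: List[int] = []
    best_score = -1.0  # below any real accuracy, so the first feature is always taken
    while len(selected) < n_features:
        candidates = [f for f in range(n_features) if f not in selected]
        scored = [(loo_1nn_accuracy(X, y, selected + [f]), f) for f in candidates]
        score, feature = max(scored, key=lambda t: t[0])
        if score <= best_score:
            break  # no candidate strictly improves the score
        selected.append(feature)
        best_score = score
    return selected, best_score


print("all features:", loo_1nn_accuracy(X, y, [0, 1, 2]))
print("forward selection:", forward_select(X, y))
```

On this toy data the search keeps only feature 0, and that single feature classifies better than all three together, because the noisy features dominate the distance computation. Greedy search does not guarantee the optimal subset in general; it is simply much cheaper than trying every possible combination.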
