Parzen windows is a nonparametric density estimation technique that can also be used for classification. Using a given kernel function, the technique approximates the distribution of a given training set via a linear combination of kernels centered on the observed points. In this work, we separately approximate the density of each of the two classes, and we assign a test point to the class with maximal posterior probability.
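As a concrete sketch of this two-class scheme (hypothetical function names and data; NumPy used for illustration), each class density is an average of Gaussian kernels centered on that class's training points, and the test point is assigned to the class with the larger prior-weighted density estimate:

```python
import numpy as np

def parzen_density(X, x, sigma):
    """Parzen window density estimate at x: average of Gaussian kernels
    centered on the points in X. The Gaussian normalization constant is
    omitted because it cancels when the two classes are compared."""
    sq_dists = np.sum((X - x) ** 2, axis=1)
    return np.mean(np.exp(-sq_dists / (2.0 * sigma ** 2)))

def classify(X_pos, X_neg, x, sigma=1.0):
    """Assign x to the class with maximal posterior: the class prior
    (here taken as the class fraction) times its Parzen density."""
    n = len(X_pos) + len(X_neg)
    post_pos = (len(X_pos) / n) * parzen_density(X_pos, x, sigma)
    post_neg = (len(X_neg) / n) * parzen_density(X_neg, x, sigma)
    return +1 if post_pos >= post_neg else -1
```

With the class fractions absorbed as priors, this comparison reduces to the sign of a kernel-weighted sum of ±1 labels, which is how the decision function is usually written.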

The resulting algorithm is extremely simple and closely related to
support vector machines. The decision function is

$$
f(\mathbf{x}) = \operatorname{sgn}\left( \sum_{i=1}^{\ell} y_i \, K(\mathbf{x}, \mathbf{x}_i) \right), \tag{8}
$$

where $\ell$ is the number of training points, $y_i \in \{-1, +1\}$ is the label of training point $\mathbf{x}_i$, and the kernel function is a Gaussian,

$$
K(\mathbf{x}, \mathbf{x}_i) = \exp\left( -\frac{\|\mathbf{x} - \mathbf{x}_i\|^2}{2\sigma^2} \right).
$$
The Parzen windows classification algorithm does not require any training phase; however, the lack of sparseness makes the test phase quite slow. Furthermore, although asymptotic convergence guarantees on the performance of Parzen windows classifiers exist [Duda and Hart, 1973], no such guarantees exist for finite sample sizes.
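A minimal sketch of the decision function (Gaussian kernel, labels in {−1, +1}; names hypothetical) makes the lack of sparseness evident: every training point contributes one kernel evaluation per test point:

```python
import numpy as np

def decision(X_train, y_train, x, sigma=1.0):
    """Parzen windows decision: sign of the kernel-weighted sum of
    training labels. There is no training phase, but each prediction
    costs one Gaussian kernel evaluation per training example."""
    weights = np.exp(-np.sum((X_train - x) ** 2, axis=1) / (2.0 * sigma ** 2))
    return np.sign(weights @ y_train)
```

By contrast, a trained support vector machine sums only over its support vectors, which is what makes its test phase faster here.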

Parzen windows can be regarded as a generalization of *k*-nearest
neighbor techniques. Rather than choosing the *k* nearest neighbors
of a test point and labelling the test point with the weighted
majority of its neighbors' votes, one can consider all points in the
voting scheme and assign their weight by means of the kernel function.
With Gaussian kernels, the weight decreases exponentially with the
square of the distance, so far away points are practically irrelevant.
The width $\sigma$ of the Gaussian determines the relative weighting
of near and far points. Tuning this parameter controls the predictive
power of the system. We have empirically optimized the value of $\sigma$.
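The empirical optimization of the kernel width can be sketched as a simple grid search over a held-out validation set (a hypothetical procedure for illustration; the text does not specify how the width was tuned):

```python
import numpy as np

def decision(X_train, y_train, x, sigma):
    # Sign of the Gaussian-kernel-weighted vote of the training labels.
    w = np.exp(-np.sum((X_train - x) ** 2, axis=1) / (2.0 * sigma ** 2))
    return np.sign(w @ y_train)

def tune_sigma(X_tr, y_tr, X_val, y_val, candidates):
    """Return the candidate width with the best validation accuracy."""
    best_sigma, best_acc = None, -1.0
    for sigma in candidates:
        preds = np.array([decision(X_tr, y_tr, x, sigma) for x in X_val])
        acc = float(np.mean(preds == y_val))
        if acc > best_acc:
            best_sigma, best_acc = sigma, acc
    return best_sigma, best_acc
```

Intuitively, too small a width makes the classifier behave like a nearest-neighbor rule, while too large a width washes out local structure.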