K-means algorithm

Input {$x_1$, $x_2$, ..., $x_m$}

  1. Initialize cluster centroids: $c_1$, $c_2$, ..., $c_k$

  2. Repeat

    2.1 assign each $x_i$ to its nearest centroid $c_j$

    2.2 update each $c_j$ to the mean of the samples assigned to it

    2.3 break when every centroid moves less than a threshold
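The steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function names (`kmeans`, `nearest`, `squared_distance`), the parameters `threshold`, `max_iter`, and `seed`, and the choice to leave an empty cluster's centroid unchanged are all assumptions made for the example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)),
               key=lambda j: squared_distance(point, centroids[j]))


def kmeans(points, k, threshold=1e-6, max_iter=100, seed=0):
    """Hypothetical helper: returns (centroids, labels, distortion)."""
    rng = random.Random(seed)
    # Step 1: use k distinct training samples as the initial centroids.
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(max_iter):
        # Step 2.1: assign each sample to its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Step 2.2: move each centroid to the mean of its assigned samples.
        new_centroids = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centroids.append(
                    [sum(col) / len(members) for col in zip(*members)])
            else:
                # Assumption: an empty cluster keeps its previous centroid.
                new_centroids.append(centroids[j])
        # Step 2.3: stop once the largest centroid shift is below the threshold.
        shift = max(squared_distance(old, new) ** 0.5
                    for old, new in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift < threshold:
            break
    # Final assignment and the sum of squared sample-to-centroid distances.
    labels = [nearest(p, centroids) for p in points]
    distortion = sum(squared_distance(p, centroids[label])
                     for p, label in zip(points, labels))
    return centroids, labels, distortion
```

Calling `kmeans(points, k, seed=s)` for several values of `s` and keeping the result with the smallest returned distortion is one way to try different initializations.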

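The algorithm always converges, because neither the assignment step nor the update step can increase the distortion function $J$. However, $J$ is non-convex, so the algorithm may stop at a local minimum. To reduce this risk, run it multiple times with different initializations and keep the result with the lowest $J$.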
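For reference, the distortion is usually written as follows, where the assignment notation $a(i) \in \{1, \dots, k\}$ (the index of the cluster that $x_i$ is assigned to) is introduced here for illustration and is not from the notes:

$$
J(a, c) = \sum_{i=1}^{m} \left\| x_i - c_{a(i)} \right\|^2
$$

With the centroids fixed, choosing $a(i) = \arg\min_j \| x_i - c_j \|^2$ minimizes each term of $J$. With the assignments fixed, $J$ is minimized over each $c_j$ by the mean of its group:

$$
c_j = \frac{1}{|\{ i : a(i) = j \}|} \sum_{i \,:\, a(i) = j} x_i
$$

So $J$ never increases from one iteration to the next and is bounded below by $0$, which is why the iteration converges. Which minimum it reaches still depends on the initial centroids.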

QA:

  1. How to initialize? Randomly choose $k$ training samples as the initial centroids.

  2. How to decide the number of clusters? Choose it manually.

Density estimation

Mixtures of Gaussians

QA: In the GMM algorithm, how are the parameters initialized? That is, how are the Gaussian parameters set for the first iteration?
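One common choice, offered here as an assumption rather than as the answer intended by the notes, is to start from a hard partition of the data (for example the output of k-means) and take each group's fraction, mean, and variance as the initial mixing weight, mean, and covariance. The sketch below does this for diagonal covariances; the helper name `init_gmm_from_labels`, the dictionary layout, and the `min_variance` floor are all hypothetical.

```python
def init_gmm_from_labels(points, labels, k, min_variance=1e-6):
    """Hypothetical helper: initial GMM parameters from a hard partition.

    Assumes diagonal covariances, so each component stores one variance
    per dimension instead of a full covariance matrix.
    """
    n = len(points)
    components = []
    for j in range(k):
        members = [p for p, label in zip(points, labels) if label == j]
        if not members:
            raise ValueError(f"cluster {j} is empty; choose another partition")
        # Mixing weight: fraction of the samples that fall in this group.
        weight = len(members) / n
        # Mean: per-dimension average of the group.
        mean = [sum(col) / len(members) for col in zip(*members)]
        # Variance: per-dimension spread, floored to avoid a zero variance.
        variance = [
            max(sum((x - mu) ** 2 for x in col) / len(members), min_variance)
            for col, mu in zip(zip(*members), mean)
        ]
        components.append(
            {"weight": weight, "mean": mean, "variance": variance})
    return components
```

The first iteration would then start from these values. Other starting points, such as random means with equal weights, are also possible.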