This text has motivated and situated two schools of thought: generative and discriminative learning. The two have deeply complementary advantages yet, in their traditional incarnations, have been incompatible. We started by reviewing several approaches in each school. In the generative school, these included Bayesian methods, maximum likelihood, exponential family models, maximum entropy, expectation-maximization and graphical models. In the discriminative school, we discussed conditional likelihood, logistic regression, support vector machines and kernel methods. The respective strengths and weaknesses of these methods suggested that a hybrid framework could be quite beneficial. This led us to a common mathematical framework, maximum entropy discrimination (MED), which unites the two schools and marries their strengths by connecting maximum entropy with discriminative margin-based constraints. The framework spans many important generative models, allowing us to learn their parameters discriminatively. Extensions beyond binary classification were feasible, and an important iterative formulation for latent variables also emerged. MED thus provides a principled fusion of discriminative and generative learning. We can now consider using the flexible space of generative models while maximizing their performance on the tasks at hand. Probabilistic modeling resources are thus harnessed optimally by a discriminative criterion, avoiding the intermediate sub-goal of learning a good generator. The end result is better performance with the same rich models.
Keywords: Support Vector Machine; Statistical Manifold; Discriminative Learning; Support Vector Regression Method; Kernel Design