Learning Sequence Models Discriminatively
In this chapter, I resolve two problems that you might not have noticed in the previous chapter. First, HMMs aren’t all that natural for many sequences, because a model that represents (say) ink conditioned on (say) a letter is odd. Generative models like this must often do much more work than is required to solve the problem, and modelling the letter conditioned on the ink is usually much easier (this is why classifiers work). Second, in many applications you want to learn a model that produces the right sequence of hidden states given the observations, rather than one that maximizes the likelihood of those observations.
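The first point can be made concrete with a small sketch. Below, hypothetical two-dimensional “ink” features for two letter classes are classified two ways: a generative route that models the ink given each letter (here, a Gaussian per class, used through Bayes’ rule) and a discriminative route that models the letter given the ink directly (logistic regression). The data, class means, and learning-rate settings are all illustrative assumptions, not anything from a real ink dataset; the point is only that the discriminative model never has to describe how ink is distributed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "ink" features: two well-separated letter classes.
n = 200
x0 = rng.normal([-1.0, -1.0], 1.0, size=(n, 2))  # letter class 0
x1 = rng.normal([+1.0, +1.0], 1.0, size=(n, 2))  # letter class 1
X = np.vstack([x0, x1])
y = np.concatenate([np.zeros(n), np.ones(n)])

# Generative route: model p(ink | letter) with one Gaussian per class,
# then classify via Bayes' rule.  This must describe the whole feature
# distribution, even though only the decision boundary matters.
mu = np.array([X[y == c].mean(axis=0) for c in (0, 1)])

def generative_predict(x):
    # Equal priors, shared identity covariance: pick the nearer class mean.
    d = ((x[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

# Discriminative route: model p(letter | ink) directly with logistic
# regression, ignoring how the ink itself is distributed.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

Xb = np.hstack([X, np.ones((2 * n, 1))])  # append a bias column
w = np.zeros(3)
for _ in range(500):
    p = sigmoid(Xb @ w)
    w -= 0.1 * Xb.T @ (p - y) / (2 * n)   # gradient step on the log-loss

def discriminative_predict(x):
    xb = np.hstack([x, np.ones((len(x), 1))])
    return (sigmoid(xb @ w) > 0.5).astype(int)

acc_gen = (generative_predict(X) == y).mean()
acc_dis = (discriminative_predict(X) == y).mean()
```

On data this clean, both routes classify well, but the discriminative model earns its accuracy while estimating nothing about the ink distribution, which is the economy the chapter exploits when it moves from HMMs to discriminatively trained sequence models.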