Hierarchical Subsampling Networks

  • Alex Graves
Part of the Studies in Computational Intelligence book series (SCI, volume 385)


So far we have focused on recurrent neural networks with a single hidden layer (or a set of disconnected hidden layers, in the case of bidirectional or multidirectional networks). As discussed in Section 3.2, this structure is in principle able to approximate any sequence-to-sequence function arbitrarily well, and should therefore be sufficient for any sequence labelling task. In practice, however, it tends to struggle with very long sequences. One problem is that, because the entire network is activated at every step of the sequence, the computational cost can be prohibitively high. Another is that information tends to be more spread out in longer sequences, and sequences with longer-range interdependencies are generally harder to learn from.



Copyright information

© Springer-Verlag GmbH Berlin Heidelberg 2012

Authors and Affiliations

  1. Department of Computer Science, University of Toronto, Toronto, Canada
