Path Kernels and Multiplicative Updates

  • Eiji Takimoto
  • Manfred K. Warmuth
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2375)


We consider a natural convolution kernel defined by a directed graph. Each edge contributes an input. The inputs along a path form a product, and the products for all paths are summed. We also have a set of probabilities on the edges so that the outflow from each node is one. We then discuss multiplicative updates on these graphs, where the prediction is essentially a kernel computation and the update contributes a factor to each edge. After the update, the total outflow from each node is no longer one. However, some clever algorithms re-normalize the weights on the paths so that the total outflow from each node is one again. Finally, we discuss the use of regular expressions for speeding up the kernel and re-normalization computations. In particular, we rewrite the multiplicative algorithms that predict as well as the best pruning of a series-parallel graph in terms of efficient kernel computations.
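The abstract gives no formulas, so the following is only a minimal sketch of the ideas it names, under stated assumptions. It assumes the graph is a DAG with one source and one sink, at most one edge per node pair, and that the kernel of two edge-input assignments is the sum over all source-to-sink paths of the product of the paired inputs along the path. The function names (`path_sum`, `path_kernel`, `renormalize`) and the four-edge diamond graph are hypothetical. The re-normalization step uses the standard weight-pushing technique (backward path sums), which may differ from the algorithms in the paper.

```python
from collections import defaultdict


def topological_order(edges):
    """Order the nodes of a DAG so every edge points forward (Kahn's algorithm)."""
    indegree = defaultdict(int)
    successors = defaultdict(list)
    nodes = set()
    for u, v in edges:
        nodes.update((u, v))
        indegree[v] += 1
        successors[u].append(v)
    ready = [n for n in nodes if indegree[n] == 0]
    order = []
    while ready:
        u = ready.pop()
        order.append(u)
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    if len(order) != len(nodes):
        raise ValueError("graph has a cycle")
    return order


def path_sum(values, source, sink):
    """Sum, over all source-to-sink paths, of the product of edge values."""
    forward = defaultdict(float)
    forward[source] = 1.0
    for u in topological_order(values):
        for (a, b), value in values.items():
            if a == u:
                forward[b] += forward[u] * value
    return forward[sink]


def path_kernel(x, x_prime, source, sink):
    """Assumed kernel: sum over paths of prod_e x[e] * x_prime[e]."""
    return path_sum({e: x[e] * x_prime[e] for e in x}, source, sink)


def renormalize(weights, sink):
    """Weight pushing: make each node's outflow one, keeping path weights
    proportional. Assumes the sink has no outgoing edges and every node
    reaches the sink with positive total weight."""
    backward = defaultdict(float)
    backward[sink] = 1.0
    for u in reversed(topological_order(weights)):
        for (a, b), w in weights.items():
            if a == u:
                backward[u] += w * backward[b]
    return {(a, b): w * backward[b] / backward[a]
            for (a, b), w in weights.items()}


# Hypothetical diamond graph: s -> a -> t and s -> b -> t.
x = {("s", "a"): 2.0, ("s", "b"): 3.0, ("a", "t"): 5.0, ("b", "t"): 7.0}
ones = {e: 1.0 for e in x}
kernel_value = path_kernel(x, ones, "s", "t")  # 2*5 + 3*7 = 31

# Edge probabilities with outflow one, then one multiplicative update.
weights = {("s", "a"): 0.5, ("s", "b"): 0.5, ("a", "t"): 1.0, ("b", "t"): 1.0}
factors = {("s", "a"): 0.5, ("s", "b"): 1.0, ("a", "t"): 1.0, ("b", "t"): 1.0}
updated = {e: weights[e] * factors[e] for e in weights}
renormalized = renormalize(updated, "t")
```

In this sketch the update leaves node `s` with outflow 0.75; after `renormalize` its two edges carry 1/3 and 2/3, so the outflow is one again and the two path weights keep their 1:2 ratio. A weighted prediction can be computed the same way, by calling `path_sum` on the edgewise products of weights and inputs.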


Keywords (machine-generated, not supplied by the authors): Edge Weight, Regular Expression, Kernel Computation, Syntax Tree, Path Weight





Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Eiji Takimoto (1)
  • Manfred K. Warmuth (2)
  1. Graduate School of Information Sciences, Tohoku University, Sendai, Japan
  2. Computer Science Department, University of California, Santa Cruz, USA
