Optimization in Computational Graphs



A computational graph is a network of connected nodes, in which each node is a unit of computation and stores a variable. Each edge joining two nodes indicates a relationship between the corresponding variables. The graph may be either directed or undirected. In a directed graph, a node computes its associated variable as a function of the variables in the nodes that have edges incoming to it. In an undirected graph, the functional relationship works in both directions. Most practical computational graphs (e.g., conventional neural networks) are directed acyclic graphs, although many undirected probabilistic models in machine learning can be implicitly considered computational graphs with cycles. Similarly, the variables at the nodes might be continuous, discrete, or probabilistic, although most real-world computational graphs work with continuous variables.
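As a minimal sketch of the directed acyclic case described above, the following Python fragment evaluates each node as a function of the variables on its incoming edges. The `evaluate` helper, the dictionary-based graph layout, and the node names (`x`, `y`, `s`, `p`, `out`) are illustrative assumptions and do not come from the chapter itself; only the standard-library `graphlib` and `math` modules are used.

```python
import math
from graphlib import TopologicalSorter

# Hypothetical toy representation: each non-input node maps to
# (function, list of parent node names). Input nodes carry no function;
# their values are supplied by the caller.
def evaluate(graph, inputs):
    """Evaluate every node of a directed acyclic computational graph."""
    # TopologicalSorter takes {node: predecessors}, so parents come first.
    predecessors = {name: parents for name, (_, parents) in graph.items()}
    values = dict(inputs)
    for name in TopologicalSorter(predecessors).static_order():
        if name in values:  # input node: value already known
            continue
        fn, parents = graph[name]
        # A node computes its variable from the variables on its incoming edges.
        values[name] = fn(*(values[p] for p in parents))
    return values

graph = {
    "s": (lambda x, y: x + y, ["x", "y"]),
    "p": (lambda x, y: x * y, ["x", "y"]),
    "out": (lambda s, p: math.tanh(s) * p, ["s", "p"]),
}
values = evaluate(graph, {"x": 1.0, "y": 2.0})
```

Here every variable is continuous and the edges are directed, which matches the most common setting mentioned above. A graph containing a cycle would not be handled by this sketch: `static_order()` raises `graphlib.CycleError` in that case.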



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. IBM T.J. Watson Research Center, Yorktown Heights, USA
