Abstract
We follow the idea of formulating vision as inverse graphics and propose a new type of element for this task, the neural-symbolic capsule. A capsule can de-render a scene into semantic information in the feed-forward direction and render that information back into a scene in the feed-backward direction. An initial set of capsules for graphical primitives is obtained from a generative grammar and connected into a full capsule network. Lifelong meta-learning continuously improves this network’s detection capabilities: using few-shot learning, it adds capsules for new and more complex objects that it detects in a scene. Preliminary results demonstrate the potential of this approach.
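To make the two directions of a capsule concrete, the following is a minimal toy sketch and not the authors' implementation. It assumes a single hypothetical primitive (an axis-aligned filled rectangle on a small binary grid), and a hand-written bounding-box computation stands in for whatever learned component would perform the de-rendering. The names `Rect`, `RectCapsule`, `render`, and `derender` are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional

Image = List[List[int]]  # binary image stored row by row


@dataclass(frozen=True)
class Rect:
    """Symbolic attributes of a filled rectangle (hypothetical primitive)."""
    x: int
    y: int
    width: int
    height: int


class RectCapsule:
    """Toy stand-in for a primitive capsule; an assumption, not the paper's code."""

    def __init__(self, size: int) -> None:
        self.size = size  # images are size x size

    def render(self, attrs: Rect) -> Image:
        """Feed-backward direction: symbolic attributes -> pixels."""
        return [
            [
                1 if (attrs.x <= col < attrs.x + attrs.width
                      and attrs.y <= row < attrs.y + attrs.height) else 0
                for col in range(self.size)
            ]
            for row in range(self.size)
        ]

    def derender(self, image: Image) -> Optional[Rect]:
        """Feed-forward direction: pixels -> symbolic attributes.

        A bounding box replaces a learned regressor here. The capsule only
        reports a detection if re-rendering its attributes reproduces the input.
        """
        filled = [(r, c) for r, row in enumerate(image)
                  for c, value in enumerate(row) if value]
        if not filled:
            return None
        rows = [r for r, _ in filled]
        cols = [c for _, c in filled]
        attrs = Rect(min(cols), min(rows),
                     max(cols) - min(cols) + 1,
                     max(rows) - min(rows) + 1)
        return attrs if self.render(attrs) == image else None
```

In this sketch, rendering a `Rect` and de-rendering the result returns the same attributes, while an image that no single rectangle explains yields `None`. In a lifelong-learning setting such an unexplained image is the kind of case that could prompt adding a new capsule, though how the paper does this is not described in the abstract.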
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Kissner, M., Mayer, H. (2019). A Neural-Symbolic Architecture for Inverse Graphics Improved by Lifelong Meta-learning. In: Fink, G., Frintrop, S., Jiang, X. (eds) Pattern Recognition. DAGM GCPR 2019. Lecture Notes in Computer Science, vol 11824. Springer, Cham. https://doi.org/10.1007/978-3-030-33676-9_33
DOI: https://doi.org/10.1007/978-3-030-33676-9_33
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-33675-2
Online ISBN: 978-3-030-33676-9
eBook Packages: Computer Science, Computer Science (R0)