Autonomous robotic exploration using a utility function based on Rényi’s general theory of entropy

Carrillo, Henry; Dames, Philip; Kumar, Vijay; Castellanos, José A.

doi:10.1007/s10514-017-9662-9

Published: 12 August 2017

Autonomous Robots, Volume 42, pages 235–256 (2018)

Abstract

In this paper we present a novel information-theoretic utility function for selecting actions in a robot-based autonomous exploration task. The robot’s goal in an autonomous exploration task is to create a complete, high-quality map of an unknown environment as quickly as possible. This implicitly requires the robot to maintain an accurate estimate of its pose as it explores both unknown and previously observed terrain in order to correctly incorporate new information into the map. Our utility function simultaneously considers uncertainty in both the robot pose and the map in a novel way and is computed as the difference between the Shannon and the Rényi entropy of the current distribution over maps. Rényi’s entropy is a family of functions parameterized by a scalar, with Shannon’s entropy being the limit as this scalar approaches unity. We link the value of this scalar parameter to the predicted future uncertainty in the robot’s pose after taking an exploratory action. This effectively decreases the expected information gain of the action, with higher uncertainty in the robot’s pose leading to a smaller expected information gain. Our objective function allows the robot to automatically trade off between exploration and exploitation in a way that does not require manually tuning parameter values, a significant advantage over many competing methods that only use Shannon’s definition of entropy. We use simulated experiments to compare the performance of our proposed utility function to these state-of-the-art utility functions. We show that robots that use our proposed utility function generate maps with less uncertainty and fewer visible artifacts and that the robots have less uncertainty in their pose during exploration. Finally, we demonstrate that a real-world robot using our proposed utility function is able to successfully create a high-quality map of an indoor office environment.

References

Aczél, J., & Daróczy, Z. (1975). On measures of information and their characterizations. In Mathematics in science and engineering (Vol. 115). New York, NY: Academic Press/Harcourt Brace Jovanovich Publishers.
Averbeck, B. B. (2015). Theory of choice in bandit, information sampling and foraging tasks. PLoS Computational Biology, 11, 1–28. doi:10.1371/journal.pcbi.1004164.
Blanco, J., Fernández-Madrigal, J., & Gonzalez, J. (2008). A novel measure of uncertainty for mobile robot SLAM with Rao-Blackwellized particle filters. The International Journal of Robotics Research (IJRR), 27(1), 73–89. doi:10.1177/0278364907082610.
Bourgault, F., Makarenko, A., Williams, S., Grocholsky, B., & Durrant-Whyte, H. (2002) Information based adaptive robotic exploration. In Proceedings of the IEEE/RSJ international conference on intelligent robots and systems (IROS) (pp. 540–545). doi:10.1109/IRDS.2002.1041446.
Boyd, S., & Vandenberghe, L. (2004). Convex optimization. New York, NY: Cambridge University Press.
Brodersen, K., Ong, C. S., Stephan, K., & Buhmann, J. (2010). The balanced accuracy and its posterior distribution. In Proceedings of the international conference on pattern recognition (ICPR) (pp. 3121–3124). doi:10.1109/ICPR.2010.764.
Brooks, R. A., & Mataric, M. J. (1993). Real robots, real learning problems. Berlin: Springer.
Burgard, W., Moors, M., Stachniss, C., & Schneider, F. (2005). Coordinated multi-robot exploration. IEEE Transactions on Robotics (TRO), 21(3), 376–386. doi:10.1109/TRO.2004.839232.
Cadena, C., Carlone, L., Carrillo, H., Latif, Y., Scaramuzza, D., Neira, J., et al. (2016). Past, present, and future of simultaneous localization and mapping: Toward the robust-perception age. IEEE Transactions on Robotics, 32(6), 1309–1332.
Calafiore, G., & Ghaoui, L. (2014). Optimization models. Cambridge: Cambridge University Press.
Carlone, L., Du, J., Kaouk, M., Bona, B., & Indri, M. (2010). An application of Kullback–Leibler divergence to active SLAM and exploration with particle filters. In Proceedings of the IEEE/RSJ international conference on intelligent robots and systems (IROS) (pp. 287–293). doi:10.1109/IROS.2010.5652164.
Carlone, L., Du, J., Kaouk, M., Bona, B., & Indri, M. (2014). Active SLAM and exploration with particle filters using Kullback-Leibler divergence. Journal of Intelligent & Robotic Systems, 75(2), 291–311. doi:10.1007/s10846-013-9981-9.
Carrillo, H., Latif, Y., Neira, J., & Castellanos, J. A. (2012a) Fast minimum uncertainty search on a graph map representation. In Proceedings of the IEEE/RSJ international conference on intelligent robots and systems (IROS), Vilamoura, Portugal (pp. 2504–2511). doi:10.1109/IROS.2012.6385927.
Carrillo, H., Reid, I., & Castellanos, J. A. (2012b) On the comparison of uncertainty criteria for active SLAM. In Proceedings of the IEEE international conference on robotics and automation (ICRA), St. Paul, MN, USA (pp. 2080–2087). doi:10.1109/ICRA.2012.6224890.
Carrillo, H., Birbach, O., Taubig, H., Bauml, B., Frese, U., & Castellanos, J. A. (2013) On task-oriented criteria for configurations selection in robot calibration. In Proceedings of the IEEE international conference on robotics and automation (ICRA), Karlsruhe, Germany (pp. 3653–3659). doi:10.1109/ICRA.2013.6631090.
Carrillo, H., Dames, P., Kumar, V., & Castellanos, J. A. (2015a). Autonomous robotic exploration using occupancy grid maps and graph SLAM based on Shannon and Rényi entropy. In Proceedings of the IEEE international conference on robotics and automation (ICRA), Seattle, WA, USA.
Carrillo, H., Latif, Y., Rodríguez, M. L., Neira, J., & Castellanos, J. A. (2015b) On the monotonicity of optimality criteria during exploration in active SLAM. In Proceedings of the IEEE International conference on robotics and automation (ICRA), Seattle, WA, USA.
Censi, A. (2007). An accurate closed-form estimate of ICP’s covariance. In Proceedings 2007 IEEE international conference on robotics and automation (pp. 3167–3172). IEEE.
Charrow, B., & Dames, P. (2016). ROS code for UPenn’s SCARAB robot. https://github.com/bcharrow/scarab.
Cover, T. M., & Thomas, J. A. (2012). Elements of information theory. Hoboken, NJ: Wiley.
Dames, P., & Kumar, V. (2013). Cooperative multi-target localization with noisy sensors. In Proceedings of the IEEE international conference on robotics and automation (ICRA), Karlsruhe, Germany.
Du, J., Carlone, L., Kaouk, M., Bona, B., & Indri, M. (2011). A comparative study on active SLAM and autonomous exploration with particle filters. In Proceedings of IEEE/ASME international conference on advanced intelligent mechatronics (pp. 916–923). doi:10.1109/AIM.2011.6027142.
Eustice, R. M., Singh, H., Leonard, J. J., & Walter, M. R. (2006). Visually mapping the RMS Titanic: Conservative covariance estimates for SLAM information filters. The International Journal of Robotics Research (IJRR), 25(12), 1223–1242.
Fairfield, N., & Wettergreen, D. (2010). Active SLAM and Loop prediction with the segmented map using simplified models. In A. Howard, K. Iagnemma, & A. Kelly (Eds.), Field and service robotics, Springer Tracts in Advanced Robotics (Vol. 62, pp. 173–182). New York: Springer. doi:10.1007/978-3-642-13408-11_6.
Feinstein, A. (1958). Foundations of information theory. New York City, NY: McGraw-Hill.
Fernández-Madrigal, J. A., & Blanco, J. L. (2012). Simultaneous localization and mapping for mobile robots: Introduction and methods (1st ed.). Hershey, PA: IGI Global.
Grisetti, G., Kuemmerle, R., Stachniss, C., & Burgard, W. (2010). A tutorial on graph-based SLAM. IEEE Intelligent Transportation Systems Magazine, 2(4), 31–43. doi:10.1109/MITS.2010.939925.
Guzzi, J., Giusti, A., Gambardella, L. M., Theraulaz, G., & Di Caro, G. A. (2013) Human-friendly robot navigation in dynamic environments. In Proceedings of the IEEE international conference on robotics and automation (ICRA) (pp. 423–430).
Hardy, G., Littlewood, J., & Pólya, G. (1952). Inequalities. Cambridge Mathematical Library. Cambridge: Cambridge University Press.
Hartley, R. V. L. (1928). Transmission of Information. Bell System Technical Journal, 7, 535–563.
Hollinger, G. A., Mitra, U., & Sukhatme, G. S. (2011). Autonomous data collection from underwater sensor networks using acoustic communication. In Proceedings of the IEEE/RSJ international conference on intelligent robots and systems (IROS) (pp. 3564–3570). IEEE.
Hornung, A., Wurm, K. M., Bennewitz, M., Stachniss, C., & Burgard, W. (2013). OctoMap: An efficient probabilistic 3D mapping framework based on octrees. Autonomous Robots (AR), 34(3), 189–206. doi:10.1007/s10514-012-9321-0.
Howard, A., & Roy, N. (2009). Radish: The robotics data set repository. http://radish.sourceforge.net/. Accessed October 15, 2014.
Indelman, V., Carlone, L., & Dellaert, F. (2015). Planning in the continuous domain: A generalized belief space approach for autonomous navigation in unknown environments. The International Journal of Robotics Research, 34(7), 849–882. doi:10.1177/0278364914561102.
Jaynes, E. T. (1957). Information theory and statistical mechanics. Physical Review, 106(4), 620.
Jumarie, G. (1990). Relative information: Theories and applications. New York, NY: Springer.
Kaess, M., & Dellaert, F. (2009). Covariance recovery from a square root information matrix for data association. Robotics and Autonomous Systems (RAS), 57, 1198–1210. doi:10.1016/j.robot.2009.06.008.
Kaess, M., Ranganathan, A., & Dellaert, F. (2008). iSAM: Incremental Smoothing and Mapping. IEEE Transactions on Robotics (TRO), 24(6), 1365–1378.
Kim, A., & Eustice, R. M. (2013) Perception-driven Navigation: Active visual SLAM for robotic area coverage. In Proceedings of the IEEE international conference on robotics and automation (ICRA) (pp. 3196–3203). doi:10.1109/ICRA.2013.6631022.
Kim, A., & Eustice, R. M. (2015). Active visual SLAM for robotic area coverage: Theory and experiment. The International Journal of Robotics Research (IJRR), 34(4–5), 457–475. doi:10.1177/0278364914547893.
Likhachev, M. (2015) Search-based planning library. https://github.com/sbpl/sbpl. Accessed October 15, 2015.
Lu, F., & Milios, E. (1997). Globally consistent range scan alignment for environment mapping. Autonomous Robots, 4(4), 333–349.
Makarenko, A., Williams, S. B., Bourgault, F., & Durrant-Whyte, H. F. (2002) An experiment in integrated exploration. In Proceedings of the IEEE/RSJ international conference on intelligent robots and systems (IROS) (pp. 534–539). doi:10.1109/IRDS.2002.1041445.
Martinez-Cantin, R., de Freitas, N., Brochu, E., Castellanos, J., & Doucet, A. (2009). A bayesian exploration-exploitation approach for optimal online sensing and planning with a visually guided mobile robot. Autonomous Robots (AR), 27(2), 93–103. doi:10.1007/s10514-009-9130-2.
Michael, N., Fink, J., & Kumar, V. (2008). Experimental testbed for large multirobot teams. IEEE Robotics & Automation Magazine, 15(1), 53–61.
MRPT. (2016). Mobile robot programming toolkit. http://www.mrpt.org/.
Olson, E. (2009) Real-time correlative scan matching. In Proceedings of the IEEE international conference on robotics and automation (ICRA) (pp. 1233–1239).
Pomerleau, F., Colas, F., Siegwart, R., & Magnenat, S. (2013). Comparing ICP variants on real-world data sets. Autonomous Robots (AR), 34(3), 133–148. doi:10.1007/s10514-013-9327-2.
Principe, J. (2010). Information theoretic learning: Rényi’s entropy and kernel perspectives. Information Science and Statistics. Berlin: Springer.
Pukelsheim, F. (2006). Optimal design of experiments. Classics in Applied Mathematics. Philadelphia, PA: Society for Industrial and Applied Mathematics (SIAM).
Rényi, A. (1960). On measures of entropy and information. In Proceedings of the 4th Berkeley symposium on mathematics, statistics and probability (pp. 547–561).
Rényi, A. (1970). Probability theory. North-Holland series in applied mathematics and mechanics. Amsterdam: Elsevier.
Roy, N., Burgard, W., Fox, D., & Thrun, S. (1999). Coastal navigation: Mobile robot navigation with uncertainty in dynamic environments. In Proceedings of the IEEE international conference on robotics and automation (ICRA).
Shannon, C., & Weaver, W. (1949). The mathematical theory of communication. Champaign, IL: Illinois Books, University of Illinois Press.
Shimazaki, H., & Shinomoto, S. (2007). A method for selecting the bin size of a time histogram. Neural Computation, 19(6), 1503–1527. doi:10.1162/neco.2007.19.6.1503.
Sim, R., Dudek, G., & Roy, N. (2004) Online control policy optimization for minimizing map uncertainty during exploration. In Proceedings of the IEEE international conference on robotics and automation (ICRA) (Vol. 2, pp. 1758–1763). doi:10.1109/ROBOT.2004.1308078.
Smith, R., Self, M., & Cheeseman, P. (1990). Estimating uncertain spatial relationships in robotics. In I. J. Cox & G. T. Wilfong (Eds.), Autonomous robot vehicles (pp. 167–193). New York, NY: Springer.
Stachniss, C. (2009). Robotic mapping and exploration (Vol. 55). Berlin: Springer.
Stachniss, C., Hahnel, D., & Burgard, W. (2004) Exploration with active loop-closing for FastSLAM. In Proceedings of the IEEE/RSJ international conference on intelligent robots and systems (IROS) (Vol. 2, pp. 1505–1510). doi:10.1109/IROS.2004.1389609.
Stachniss, C., Grisetti, G., & Burgard, W. (2005). Information gain-based exploration using Rao-Blackwellized particle filters. In Proceedings of robotics: Science and systems conference (RSS), Cambridge, MA, USA.
Thrun, S., Burgard, W., & Fox, D. (2005). Probabilistic robotics. Boston, MA: MIT Press.
van den Berg, J., Patil, S., Alterovitz, R., et al. (2012). Motion planning under uncertainty using iterative local optimization in belief space. The International Journal of Robotics Research (IJRR), 31(11), 1263–1278. doi:10.1177/0278364912456319.
Tipaldi, G. D., & Arras, K. O. (2010) FLIRT-interest regions for 2D range data. In Proceedings of the IEEE international conference on robotics and automation (ICRA) (pp. 3616–3622).
Valencia, R., Valls Miró, J., Dissanayake, G., & Andrade-Cetto, J. (2012). Active pose SLAM. In Proceedings of the IEEE/RSJ international conference on intelligent robots and systems (IROS) (pp. 1885–1891). doi:10.1109/IROS.2012.6385637.
Xu, D. (1998) Energy, entropy and information potential for neural computation. PhD thesis, Gainesville, FL, USA, aAI9935317.
Yamauchi, B. (1998) Frontier-based exploration using multiple robots. In Proceedings of the second international conference on autonomous agents, ACM, AGENTS ’98 (pp. 47–53). doi:10.1145/280765.280773.
Zhang, Q., Whitney, D., Shkurti, F., & Rekleitis, I. (2014) Ear-based exploration on hybrid metric/topological maps. In Proceedings of the IEEE/RSJ international conference on intelligent robots and systems (IROS) (pp. 3081–3088). doi:10.1109/IROS.2014.6942988.
Zhang, Q., Rekleitis, I., & Dudek, G. (2015) Uncertainty reduction via heuristic search planning on hybrid metric/topological map. In 12th conference on computer and robot vision (pp. 222–229). doi:10.1109/CRV.2015.36.

Author information

Authors and Affiliations

Escuela de Ciencias Exactas e Ingeniería, Universidad Sergio Arboleda, Bogotá, Colombia
Henry Carrillo
Department of Mechanical Engineering, Temple University, Philadelphia, PA, USA
Philip Dames
GRASP Laboratory, University of Pennsylvania, Philadelphia, PA, USA
Vijay Kumar
Instituto de Investigación en Ingeniería de Aragón (I3A), Universidad de Zaragoza, Zaragoza, Spain
José A. Castellanos

Corresponding author

Correspondence to Henry Carrillo.

Additional information

This is one of several papers published in Autonomous Robots comprising the Special Issue on Active Perception.

H. Carrillo and J. A. Castellanos gratefully acknowledge funding from MINECO-FEDER Project DPI2012-36070 and DPI2015-68905-P, research Grants BES-2010-033116 and EEBB-2011-44287, and DGA Grupo (T04). H. Carrillo also acknowledges funding from Universidad Sergio Arboleda Project IN.BG.086.17.003/OE4. P. Dames and V. Kumar gratefully acknowledge funding from AFOSR Grant FA9550-10-1-0567, ONR Grants N00014-14-1-0510, N00014-09-1-1051, and N00014-09-1-103, and NSF Grant IIS-1426840.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (mp4 29533 KB)

Appendices

Appendix

A.1 Properties of Rényi’s Entropy

Entropy is a measure of the uncertainty of a random variable (Shannon and Weaver 1949; Rényi 1960; Jumarie 1990). A proper definition of entropy should comply with a set of axioms that guarantee a coherent way of accounting for uncertainty. A widely accepted set of axioms was developed by Aczél and Daróczy (1975). Feinstein (1958) also developed an earlier and more succinct set.

The first attempt to define entropy mathematically was made by Hartley (1928). The second definition was developed by Shannon and Weaver (1949) and is the most widely known and commonly used. Finally, Rényi (1960, 1970) created a family of entropy functions, of which the entropies of Hartley and Shannon are special cases. This family of functions, parameterized by $\alpha $, is defined as:

$$\begin{aligned} {{\mathrm{\mathbb {H}_{\alpha }}}}[P(\mathbf {x})] = \frac{1}{1-\alpha }~{{\mathrm{\log _{2}}}}\left( \sum _{i=1}^{n} p_{i}^{\alpha } \right) \end{aligned}$$

(A.1)

where $p_{i} = P(\mathbf {x}= \mathbf {x}_i)$ is an element of the probability distribution of a discrete random variable $\mathbf {x}$, so that $p_{i} \ge 0, \quad \forall i$ and $\sum _{i=1}^{n} p_{i} = 1$. The variable $\alpha $ is a free parameter in the range $[0, 1) \cup (1, \infty )$.
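As a rough illustration of (A.1), the following Python sketch evaluates Rényi’s entropy for a discrete distribution given as a list of probabilities. The function name `renyi_entropy`, the example distribution, and the choice to skip zero-probability outcomes are assumptions made for this example and are not taken from the paper.

```python
import math


def renyi_entropy(p, alpha):
    """Renyi entropy (A.1), in bits, of a discrete distribution p.

    Assumes p is a sequence of non-negative numbers summing to one and
    that alpha lies in [0, 1) or (1, inf).
    """
    if alpha < 0 or alpha == 1:
        raise ValueError("alpha must lie in [0, 1) or (1, inf)")
    # Skip zero-probability outcomes so that 0 ** 0 is never evaluated.
    support = [pi for pi in p if pi > 0]
    return math.log2(sum(pi ** alpha for pi in support)) / (1.0 - alpha)


# Hypothetical example distribution over four outcomes.
p = [0.5, 0.25, 0.125, 0.125]
h2 = renyi_entropy(p, 2.0)  # alpha = 2
```

For a uniform distribution over four outcomes this sketch returns 2 bits for any admissible $\alpha $, since the sum inside the logarithm reduces to $4^{1-\alpha }$.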

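A.2 ${{\mathrm{\mathbb {H}_{\alpha }}}}~\text {at}~\alpha = 0$

Plugging $\alpha = 0$ into (A.1) yields $\mathbb {H}_0[P(\mathbf {x})] = {{\mathrm{\log _{2}}}}n$, which is the Hartley entropy. This assumes that every $p_{i}$ is strictly positive; otherwise $n$ is replaced by the number of outcomes with nonzero probability.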

A.3 ${{\mathrm{\mathbb {H}_{\alpha }}}}~\text {as}~\alpha \rightarrow 1$

Note from the definition of Rényi’s entropy in (A.1) that it is undefined at $\alpha = 1$. Thus to define $\mathbb {H}_{1}[P(\mathbf {x})]$ we must look at the limit as $\alpha \rightarrow 1$:

$$\begin{aligned} \mathbb {H}_{1}[P(\mathbf {x})] = \lim _{\alpha \rightarrow 1} \mathbb {H}_{\alpha }[P(\mathbf {x})]. \end{aligned}$$

(A.2)

Applying the limit directly, we obtain the indeterminate form $\frac{0}{0}$, because $\sum _{i=1}^{n} p_{i} = 1$ makes the logarithm vanish at $\alpha = 1$. Differentiating the numerator and the denominator with respect to $\alpha $ (l’Hôpital’s rule), we see that:

$$\begin{aligned} \mathbb {H}_{1}[P(\mathbf {x})]&= \lim _{\alpha \rightarrow 1} \frac{\left( \sum _{i=1}^{n} p_{i}^{\alpha } \right) ^{-1} \left( \sum _{i=1}^{n} p_{i}^{\alpha } \log _{2}(p_{i}) \right) }{-1}\nonumber \\&= -\sum _{i=1}^{n} p_{i} \log _{2}(p_{i}) \nonumber \\&= \mathbb {H}[P(\mathbf {x})]. \end{aligned}$$

(A.3)

In other words, in the limit as $\alpha \rightarrow 1$ Rényi’s entropy becomes equal to Shannon’s entropy.
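The limit in (A.3) can also be checked numerically. The sketch below is only an illustration under assumed names (`shannon_entropy`, `renyi_entropy`) and an arbitrary example distribution: it evaluates Rényi’s entropy at values of $\alpha $ slightly above one and records the gap to Shannon’s entropy.

```python
import math


def shannon_entropy(p):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)


def renyi_entropy(p, alpha):
    """Renyi entropy (A.1) in bits, for alpha >= 0 and alpha != 1."""
    return math.log2(sum(pi ** alpha for pi in p if pi > 0)) / (1.0 - alpha)


# Hypothetical example distribution; its Shannon entropy is 1.75 bits.
p = [0.5, 0.25, 0.125, 0.125]
h = shannon_entropy(p)

# Gap |H_alpha - H| for alpha = 1 + eps with shrinking eps.
gaps = [abs(renyi_entropy(p, 1.0 + eps) - h) for eps in (1e-1, 1e-2, 1e-3)]
```

For this distribution the entries of `gaps` shrink as `eps` shrinks, which is the behaviour (A.3) predicts.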

A.4 ${{\mathrm{\mathbb {H}_{\alpha }}}}~\text {as}~\alpha \rightarrow \infty $

Attempting to compute the limit of ${{\mathrm{\mathbb {H}_{\alpha }}}}[P(\mathbf {x})]$ as $\alpha \rightarrow \infty $ directly yields the indeterminate form $0 \cdot (-\infty )$ whenever $\max _{i} p_{i} < 1$, because $\frac{1}{1-\alpha } \rightarrow 0$ while ${{\mathrm{\log _{2}}}}\left( \sum _{i=1}^{n} p_{i}^{\alpha } \right) \rightarrow -\infty $. However, we can obtain the true value using the squeeze theorem. We start by defining $p_{i'} = \max _{i} p_{i}$ and recalling that $i \in \lbrace 1 \ldots n \rbrace $, $0 \le p_{i} \le 1$, and $\sum _{i=1}^{n} p_{i} = 1$. Hence, for $1< \alpha < \infty $, the following inequality holds:

$$\begin{aligned} p_{i'}^{\alpha } \le \sum _{i=1}^{n} p_{i}^{\alpha } \le n~p_{i'}^{\alpha } \end{aligned}$$

(A.4)

If we take the binary logarithm of (A.4) and then divide by $1-\alpha $, which is negative for $\alpha > 1$ and therefore reverses the inequalities, we obtain a more familiar inequality:

$$\begin{aligned} \log _{2}(p_{i'}^{\alpha })&\le \log _{2}\left( \sum _{i=1}^{n} p_{i}^{\alpha } \right) \le \log _{2}(n~p_{i'}^{\alpha }) \nonumber \\ \alpha \log _{2}(p_{i'})&\le \log _{2}\left( \sum _{i=1}^{n} p_{i}^{\alpha } \right) \le \log _{2}(n) + \alpha \log _{2}(p_{i'}) \nonumber \\ \frac{\alpha }{1-\alpha }~\log _{2}(p_{i'})&\ge {{\mathrm{\mathbb {H}_{\alpha }}}}[P(\mathbf {x})] \ge \frac{\log _{2}(n)}{1-\alpha } + \frac{\alpha }{1-\alpha }~\log _{2}(p_{i'}) \end{aligned}$$

(A.5)

Taking the limit as $\alpha \rightarrow \infty $, we have $\frac{\alpha }{1-\alpha } \rightarrow -1$ and $\frac{{{\mathrm{\log _{2}}}}(n)}{1-\alpha } \rightarrow 0$, so both bounds converge to the same value of $-{{\mathrm{\log _{2}}}}(p_{i'})$. Hence, according to the squeeze theorem, we can compute the desired limit:

$$\begin{aligned} \mathbb {H}_\infty [P(\mathbf {x})] = \lim _{\alpha \rightarrow \infty } {{\mathrm{\mathbb {H}_{\alpha }}}}[P(\mathbf {x})] = -{{\mathrm{\log _{2}}}}(\max _{i} p_{i}) \end{aligned}$$

(A.6)
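The squeeze argument can be illustrated numerically. The following sketch assumes an arbitrary example distribution and hypothetical helper names (`min_entropy`, `squeeze_bounds`); it evaluates the two bounds of (A.5) at a large but finite $\alpha $ and compares them with the limit in (A.6).

```python
import math


def renyi_entropy(p, alpha):
    """Renyi entropy (A.1) in bits, for alpha >= 0 and alpha != 1."""
    return math.log2(sum(pi ** alpha for pi in p if pi > 0)) / (1.0 - alpha)


def min_entropy(p):
    """Limit of the Renyi entropy as alpha -> infinity, as in (A.6)."""
    return -math.log2(max(p))


def squeeze_bounds(p, alpha):
    """Lower and upper bounds on H_alpha from (A.5); valid for alpha > 1."""
    upper = alpha / (1.0 - alpha) * math.log2(max(p))
    lower = math.log2(len(p)) / (1.0 - alpha) + upper
    return lower, upper


# Hypothetical example distribution; its largest probability is 0.5.
p = [0.5, 0.25, 0.125, 0.125]
alpha = 200.0
lower, upper = squeeze_bounds(p, alpha)
h_alpha = renyi_entropy(p, alpha)
h_inf = min_entropy(p)
```

With these values `h_alpha` lies between `lower` and `upper`, and all three are close to `h_inf`, which equals 1 bit for this distribution.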

A.5 Monotonicity with respect to $\alpha $

We seek to show that Rényi’s entropy is monotonically non-increasing in $\alpha $. To do this, we take the derivative with respect to $\alpha $ and show that it is non-positive. Let $q_{i} = p_{i}^{\alpha } / \sum _{j} p_{j}^{\alpha }$, and note that this defines a probability distribution $Q(\mathbf {x})$.

Taking the derivative of (A.1) with respect to $\alpha $ yields:

$$\begin{aligned} \frac{d}{d\alpha } {{\mathrm{\mathbb {H}_{\alpha }}}}[P(\mathbf {x})]&= \frac{d}{d\alpha } \left[ \frac{1}{1-\alpha }~\log _{2}\left( \sum _{i=1}^{n} p_{i}^{\alpha } \right) \right] \nonumber \\&= \frac{(1-\alpha ) \left( \sum _{j} p_{j}^{\alpha }\right) ^{-1} \left( \sum _{i} p_{i}^{\alpha } \log _{2} p_{i}\right) + \log _{2}\left( \sum _{j} p_{j}^{\alpha }\right) }{(1-\alpha )^2} \nonumber \\&= \frac{(1-\alpha ) \sum _{i} q_{i} \log _{2} p_{i} + \sum _{i} q_{i} \log _{2}\left( \sum _{j} p_{j}^{\alpha }\right) }{(1-\alpha )^2} \nonumber \\&= \frac{\sum _{i} \left[ q_{i} \log _{2} p_{i} - q_{i} \log _{2} p_{i}^{\alpha } + q_{i} \log _{2}\left( \sum _{j} p_{j}^{\alpha }\right) \right] }{(1-\alpha )^2} \nonumber \\&= \frac{\sum _{i} \left[ q_{i} \log _{2} p_{i} - q_{i} \log _{2} q_{i} \right] }{(1-\alpha )^2} \nonumber \\&= -\frac{\sum _{i} q_{i} \log _{2} \frac{q_{i}}{p_{i}}}{(1-\alpha )^2} \nonumber \\&= -\frac{\text {KL}[Q(\mathbf {x}) \, || \, P(\mathbf {x})]}{(1-\alpha )^2}. \end{aligned}$$

(A.7)

Since both $(1-\alpha )^2$ and the Kullback–Leibler divergence $\text {KL}[Q(\mathbf {x}) \, || \, P(\mathbf {x})]$ are non-negative (Cover and Thomas 2012), the derivative is non-positive. Thus we conclude that Rényi’s entropy is monotonically non-increasing in $\alpha $ for $\alpha \in (1, \infty )$, and strictly decreasing unless $P(\mathbf {x})$ is uniform over its support.
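As a sanity check on (A.7), the sketch below compares the closed-form derivative with a central finite difference. It is a minimal illustration that assumes every $p_{i}$ is strictly positive and measures the Kullback–Leibler divergence in bits; the helper names (`tilted`, `kl_bits`, `analytic_derivative`, `numeric_derivative`) and the step size are choices made for this example only.

```python
import math


def renyi_entropy(p, alpha):
    """Renyi entropy (A.1) in bits; assumes all p_i > 0 and alpha != 1."""
    return math.log2(sum(pi ** alpha for pi in p)) / (1.0 - alpha)


def tilted(p, alpha):
    """Distribution q_i = p_i^alpha / sum_j p_j^alpha used in (A.7)."""
    z = sum(pi ** alpha for pi in p)
    return [pi ** alpha / z for pi in p]


def kl_bits(q, p):
    """Kullback-Leibler divergence KL[Q || P] in bits."""
    return sum(qi * math.log2(qi / pi) for qi, pi in zip(q, p) if qi > 0)


def analytic_derivative(p, alpha):
    """Right-hand side of (A.7)."""
    return -kl_bits(tilted(p, alpha), p) / (1.0 - alpha) ** 2


def numeric_derivative(p, alpha, step=1e-6):
    """Central finite-difference estimate of dH_alpha / d alpha."""
    return (renyi_entropy(p, alpha + step) - renyi_entropy(p, alpha - step)) / (2.0 * step)


# Hypothetical example distribution with strictly positive entries.
p = [0.5, 0.25, 0.125, 0.125]
checks = [(a, analytic_derivative(p, a), numeric_derivative(p, a)) for a in (0.5, 2.0, 5.0)]
```

Each tuple in `checks` holds a value of $\alpha $ together with the two derivative estimates, which should agree up to finite-difference error and be non-positive.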

A.6 Useful inequalities of ${{\mathrm{\mathbb {H}_{\alpha }}}}$

Let us consider two values for the free parameter of the Rényi entropy, $\alpha $ and $\alpha '$, such that $1 \le \alpha \le \alpha '$. For these two values of the free parameter, we can show that:

1. $\mathbb {H}[P(\mathbf {x})] \ge \mathbb {H}_{\alpha }[P(\mathbf {x})], \; \forall \alpha \ge 1$
2. $\mathbb {H}[P(\mathbf {x})] \ge \mathbb {H}_{\alpha }[P(\mathbf {x})] \ge \mathbb {H}_{\alpha '}[P(\mathbf {x})], \; 1 \le \alpha \le \alpha '$.

A.6.1 $\mathbb {H}[P(\mathbf {x})] \ge \mathbb {H}_{\alpha }[P(\mathbf {x})]$

The function $-{{\mathrm{\log _{2}}}}(z)$ is convex, and a non-negative weighted sum preserves convexity (Boyd and Vandenberghe 2004, Ch. 3). Hence, using the $p_{i}$ as weights, the function $-\sum _{i=1}^{n} p_{i}{{\mathrm{\log _{2}}}}(z_{i})$ is also convex. Applying Jensen’s inequality we see that:

$$\begin{aligned} -{{\mathrm{\log _{2}}}}\left( \sum _{i=1}^{n} p_{i} z_{i} \right) \le -\sum _{i=1}^{n} p_{i}{{\mathrm{\log _{2}}}}(z_{i}) \end{aligned}$$

(A.8)

where it is understood that $\sum _{i=1}^{n} p_{i} = 1$. Letting $z_{i} = p_{i}^{\alpha - 1}$ with $\alpha > 1$, (A.8) becomes:

$$\begin{aligned} -\log _{2}\left( \sum _{i=1}^{n} p_{i}^{\alpha }\right) \le -(\alpha - 1)\sum _{i=1}^{n} p_{i}\log _{2}(p_{i}) \end{aligned}$$

(A.9)

Dividing both sides by $\alpha - 1 > 0$ preserves the direction of the inequality and gives:

$$\begin{aligned} \frac{1}{1-\alpha }\log _{2}\left( \sum _{i=1}^{n} p_{i}^{\alpha }\right) \le -\sum _{i=1}^{n} p_{i}\log _{2}(p_{i}) \end{aligned}$$

(A.10)

which, by (A.1) and the definition of Shannon’s entropy, is:

$$\begin{aligned} \mathbb {H}_{\alpha }[P(\mathbf {x})] \le \mathbb {H}[P(\mathbf {x})]. \end{aligned}$$

(A.11)
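The Jensen step can be made concrete with a small numerical example. The sketch below evaluates both sides of (A.8) for $z_{i} = p_{i}^{\alpha - 1}$; the function name `jensen_sides`, the example distribution, and the value $\alpha = 3$ are assumptions chosen for illustration.

```python
import math


def jensen_sides(p, z):
    """Return (lhs, rhs) of (A.8): -log2(sum p_i z_i) and -sum p_i log2(z_i)."""
    lhs = -math.log2(sum(pi * zi for pi, zi in zip(p, z)))
    rhs = -sum(pi * math.log2(zi) for pi, zi in zip(p, z))
    return lhs, rhs


# Hypothetical example distribution and parameter value.
p = [0.5, 0.25, 0.125, 0.125]
alpha = 3.0
z = [pi ** (alpha - 1.0) for pi in p]

lhs, rhs = jensen_sides(p, z)  # the two sides of (A.9)
renyi = lhs / (alpha - 1.0)    # left-hand side of (A.10), i.e. H_alpha
shannon = rhs / (alpha - 1.0)  # right-hand side of (A.10), i.e. H
```

Here `renyi` and `shannon` are the two sides of (A.10), so `renyi <= shannon` is the inequality (A.11) for this particular distribution.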

A.6.2 $\mathbb {H}[P(\mathbf {x})] \ge \mathbb {H}_{\alpha }[P(\mathbf {x})] \ge \mathbb {H}_{\alpha '}[P(\mathbf {x})]$

It is known (Hardy et al. 1952, Theorem 16) that

$$\begin{aligned} \left( \sum _{i=1}^{n} w_{i}x_{i}^{\beta } \right) ^{\frac{1}{\beta }} \end{aligned}$$

(A.12)

is a monotonically non-decreasing function of $\beta $ for positive $x_{i}$ and non-negative weights $w_{i}$ that sum to one. Rewriting (A.1) as

$$\begin{aligned} {{\mathrm{\mathbb {H}_{\alpha }}}}[P(\mathbf {x})] = \log _{2}\left[ \left( \sum _{i=1}^{n} p_{i}\left( \frac{1}{p_{i}}\right) ^{1-\alpha } \right) ^{\frac{1}{1-\alpha }} \right] \end{aligned}$$

(A.13)

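and letting $\beta = 1 - \alpha $, $w_{i} = p_{i}$, and $x_{i} = 1/p_{i}$, we may apply (A.12): $\beta $ decreases as $\alpha $ increases and the logarithm is monotonically increasing, so (A.1) is a monotonically non-increasing function of $\alpha $ and the inequality holds.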
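The chain of inequalities can be spot-checked for a particular distribution. The following sketch is an assumption-laden illustration (the names `entropy_profile` and `is_non_increasing`, the example distribution, and the grid of $\alpha $ values are all hypothetical): it lists Shannon’s entropy followed by Rényi’s entropy at increasing $\alpha > 1$ and tests that the sequence never increases.

```python
import math


def shannon_entropy(p):
    """Shannon entropy in bits."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)


def renyi_entropy(p, alpha):
    """Renyi entropy (A.1) in bits, for alpha >= 0 and alpha != 1."""
    return math.log2(sum(pi ** alpha for pi in p if pi > 0)) / (1.0 - alpha)


def entropy_profile(p, alphas):
    """Shannon entropy followed by H_alpha for each alpha (assumed sorted, all > 1)."""
    return [shannon_entropy(p)] + [renyi_entropy(p, a) for a in alphas]


def is_non_increasing(values, tol=1e-12):
    """True if each value is at least as large as the next, up to tol."""
    return all(a + tol >= b for a, b in zip(values, values[1:]))


# Hypothetical example distribution and grid of alpha values.
p = [0.5, 0.25, 0.125, 0.125]
alphas = [1.5, 2.0, 4.0, 16.0]
profile = entropy_profile(p, alphas)
ordered = is_non_increasing(profile)
```

A single distribution is of course no proof, but a `False` value of `ordered` would signal an implementation error in either entropy function.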

B Probability simplex

Let $P(\mathbf {x})$ be a probability distribution of a discrete random variable $\mathbf {x}$, where $p_{i} = P(\mathbf {x}= \mathbf {x}_i)$ is an element of the distribution, $p_{i} \ge 0, \forall i$, and $\sum _{i=1}^{N} p_{i} = 1$. The probability simplex $\Delta _N$ is an $(N-1)$-dimensional manifold in an $N$-dimensional space in which all possible probability distributions of a random variable $\mathbf {x}$ with $N$ outcomes live (Principe 2010). That is:

$$\begin{aligned} \Delta _N = \left\{ p = (p_1, \ldots , p_N) \in \mathcal {R}^{N} : p_{i} \ge 0 \;\; \forall i, \; \sum _{i=1}^{N} p_{i} = 1 \right\} \end{aligned}$$

(A.14)

Any point $p$ in the probability simplex has a natural interpretation as a discrete probability distribution (Calafiore and Ghaoui 2014).

As an example, consider the probability simplex in $\mathcal {R}^{3}$ for three variables $p_1, p_2, p_3$, where all possible distributions lie on the equilateral triangle with vertices at (1, 0, 0), (0, 1, 0), and (0, 0, 1). Figure 13 shows an illustration of the above.
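The definition in (A.14) translates directly into a membership test. The sketch below is a minimal illustration with hypothetical helper names (`in_simplex`, `normalize`) and an assumed numerical tolerance; it checks whether a vector lies in $\Delta _N$ and maps non-negative weights onto the simplex by normalization.

```python
def in_simplex(p, tol=1e-9):
    """True if p satisfies (A.14): p_i >= 0 for all i and sum_i p_i = 1, up to tol."""
    return all(pi >= -tol for pi in p) and abs(sum(p) - 1.0) <= tol


def normalize(weights):
    """Map non-negative weights with a positive sum to a point in the simplex."""
    total = sum(weights)
    if total <= 0 or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative with a positive sum")
    return [w / total for w in weights]


# Vertices of the simplex in three dimensions and its centre.
vertices = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
centre = normalize([1.0, 1.0, 1.0])
```

The three vertices correspond to distributions that place all probability on a single outcome, while `centre` is the uniform distribution.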

About this article

Cite this article

Carrillo, H., Dames, P., Kumar, V. et al. Autonomous robotic exploration using a utility function based on Rényi’s general theory of entropy. Auton Robot 42, 235–256 (2018). https://doi.org/10.1007/s10514-017-9662-9

Received: 17 February 2016
Accepted: 20 July 2017
Published: 12 August 2017
Issue Date: February 2018
DOI: https://doi.org/10.1007/s10514-017-9662-9
