Autonomous robotic exploration using a utility function based on Rényi’s general theory of entropy

Autonomous Robots

Abstract

In this paper we present a novel information-theoretic utility function for selecting actions in a robot-based autonomous exploration task. The robot’s goal in an autonomous exploration task is to create a complete, high-quality map of an unknown environment as quickly as possible. This implicitly requires the robot to maintain an accurate estimate of its pose as it explores both unknown and previously observed terrain in order to correctly incorporate new information into the map. Our utility function simultaneously considers uncertainty in both the robot pose and the map in a novel way and is computed as the difference between the Shannon and the Rényi entropy of the current distribution over maps. Rényi’s entropy is a family of functions parameterized by a scalar, with Shannon’s entropy being the limit as this scalar approaches unity. We link the value of this scalar parameter to the predicted future uncertainty in the robot’s pose after taking an exploratory action. This effectively decreases the expected information gain of the action, with higher uncertainty in the robot’s pose leading to a smaller expected information gain. Our objective function allows the robot to automatically trade off between exploration and exploitation in a way that does not require manually tuning parameter values, a significant advantage over many competing methods that only use Shannon’s definition of entropy. We use simulated experiments to compare the performance of our proposed utility function to these state-of-the-art utility functions. We show that robots that use our proposed utility function generate maps with less uncertainty and fewer visible artifacts and that the robots have less uncertainty in their pose during exploration. Finally, we demonstrate that a real-world robot using our proposed utility function is able to successfully create a high-quality map of an indoor office environment.

References

  • Aczél, J., & Daróczy, Z. (1975). On measures of information and their characterizations. In Mathematics in science and engineering (Vol. 115). New York, NY Academic Press/Harcourt Brace Jovanovich Publishers.

  • Averbeck, B. B. (2015). Theory of choice in bandit, information sampling and foraging tasks. PLoS Computational Biology, 11, 1–28. doi:10.1371/journal.pcbi.1004164.

  • Blanco, J., Fernández-Madrigal, J., & Gonzalez, J. (2008). A novel measure of uncertainty for mobile robot SLAM with Rao-Blackwellized particle filters. The International Journal of Robotics Research (IJRR), 27(1), 73–89. doi:10.1177/0278364907082610.

  • Bourgault, F., Makarenko, A., Williams, S., Grocholsky, B., & Durrant-Whyte, H. (2002) Information based adaptive robotic exploration. In Proceedings of the IEEE/RSJ international conference on intelligent robots and systems (IROS) (pp. 540–545). doi:10.1109/IRDS.2002.1041446.

  • Boyd, S., & Vandenberghe, L. (2004). Convex optimization. New York, NY: Cambridge University Press.

  • Brodersen, K., Ong, C. S., Stephan, K., & Buhmann, J. (2010). The balanced accuracy and its posterior distribution. In Proceedings of the international conference on pattern recognition (ICPR) (pp. 3121–3124). doi:10.1109/ICPR.2010.764.

  • Brooks, R. A., & Mataric, M. J. (1993). Real robots, real learning problems. Berlin: Springer.

  • Burgard, W., Moors, M., Stachniss, C., & Schneider, F. (2005). Coordinated multi-robot exploration. IEEE Transactions on Robotics (TRO), 21(3), 376–386. doi:10.1109/TRO.2004.839232.

  • Cadena, C., Carlone, L., Carrillo, H., Latif, Y., Scaramuzza, D., Neira, J., et al. (2016). Past, present, and future of simultaneous localization and mapping: Toward the robust-perception age. IEEE Transactions on Robotics, 32(6), 1309–1332.

  • Calafiore, G., & Ghaoui, L. (2014). Optimization models. Cambridge: Cambridge University Press.

  • Carlone, L., Du, J., Kaouk, M., Bona, B., & Indri, M. (2010) An application of Kullback–Leibler divergence to active slam and exploration with particle filters. In Proceedings of the IEEE/RSJ international conference on intelligent robots and systems (IROS) (pp. 287–293). doi:10.1109/IROS.2010.5652164.

  • Carlone, L., Du, J., Kaouk, M., Bona, B., & Indri, M. (2014). Active SLAM and exploration with particle filters using Kullback-Leibler divergence. Journal of Intelligent & Robotic Systems, 75(2), 291–311. doi:10.1007/s10846-013-9981-9.

  • Carrillo, H., Latif, Y., Neira, J., & Castellanos, J. A. (2012a) Fast minimum uncertainty search on a graph map representation. In Proceedings of the IEEE/RSJ international conference on intelligent robots and systems (IROS), Vilamoura, Portugal (pp. 2504–2511). doi:10.1109/IROS.2012.6385927.

  • Carrillo, H., Reid, I., & Castellanos, J. A. (2012b) On the comparison of uncertainty criteria for active SLAM. In Proceedings of the IEEE international conference on robotics and automation (ICRA), St. Paul, MN, USA (pp. 2080–2087). doi:10.1109/ICRA.2012.6224890.

  • Carrillo, H., Birbach, O., Taubig, H., Bauml, B., Frese, U., & Castellanos, J. A. (2013) On task-oriented criteria for configurations selection in robot calibration. In Proceedings of the IEEE international conference on robotics and automation (ICRA), Karlsruhe, Germany (pp. 3653–3659). doi:10.1109/ICRA.2013.6631090.

  • Carrillo, H., Dames, P., Kumar, K., & Castellanos, J. A. (2015a) Autonomous robotic exploration using occupancy grid maps and graph SLAM based on Shannon and Rényi entropy. In Proceedings of the IEEE international conference on robotics and automation (ICRA), Seattle, WA, USA.

  • Carrillo, H., Latif, Y., Rodríguez, M. L., Neira, J., & Castellanos, J. A. (2015b) On the monotonicity of optimality criteria during exploration in active SLAM. In Proceedings of the IEEE International conference on robotics and automation (ICRA), Seattle, WA, USA.

  • Censi, A. (2007). An accurate closed-form estimate of ICP’s covariance. In Proceedings 2007 IEEE international conference on robotics and automation (pp. 3167–3172). IEEE.

  • Charrow, B., & Dames, P. (2016). ROS code for UPenn’s SCARAB robot. https://github.com/bcharrow/scarab.

  • Cover, T. M., & Thomas, J. A. (2012). Elements of information theory. Hoboken, NJ: Wiley.

  • Dames, P., & Kumar, V. (2013). Cooperative multi-target localization with noisy sensors. In Proceedings of the IEEE international conference on robotics and automation (ICRA), Karlsruhe, Germany.

  • Du, J., Carlone, L., Kaouk, M., Bona, B., & Indri, M. (2011) A comparative study on active slam and autonomous exploration with particle filters. In Proceedings of IEEE/ASME international conference on advanced intelligent mechatronics (pp. 916–923). doi:10.1109/AIM.2011.6027142.

  • Eustice, R. M., Singh, H., Leonard, J. J., & Walter, M. R. (2006). Visually mapping the RMS titanic: Conservative covariance estimates for SLAM information filters. The International Journal of Robotics Research (IJRR), 25(12), 1223–1242.

  • Fairfield, N., & Wettergreen, D. (2010). Active SLAM and Loop prediction with the segmented map using simplified models. In A. Howard, K. Iagnemma, & A. Kelly (Eds.), Field and service robotics, Springer Tracts in Advanced Robotics (Vol. 62, pp. 173–182). New York: Springer. doi:10.1007/978-3-642-13408-11_6.

  • Feinstein, A. (1958). Foundations of information theory. New York City, NY: McGraw-Hill.

  • Fernández-Madrigal, J. A., & Blanco, J. L. (2012). Simultaneous localization and mapping for mobile robots: Introduction and methods (1st ed.). Hershey, PA: IGI Global.

  • Grisetti, G., Kuemmerle, R., Stachniss, C., & Burgard, W. (2010). A tutorial on graph-based SLAM. IEEE Intelligent Transportation Systems Magazine, 2(4), 31–43. doi:10.1109/MITS.2010.939925.

  • Guzzi, J., Giusti, A., Gambardella, L. M., Theraulaz, G., & Di Caro, G. A. (2013) Human-friendly robot navigation in dynamic environments. In Proceedings of the IEEE international conference on robotics and automation (ICRA) (pp. 423–430).

  • Hardy, G., Littlewood, J., & Pólya, G. (1952). Inequalities. Cambridge Mathematical Library. Cambridge: Cambridge University Press.

  • Hartley, R. V. L. (1928). Transmission of Information. Bell System Technical Journal, 7, 535–563.

  • Hollinger, G. A., Mitra, U., & Sukhatme, G. S. (2011) Autonomous data collection from underwater sensor networks using acoustic communication. In: Proceedings of the IEEE/RSJ international conference on intelligent robots and systems (IROS) (pp. 3564–3570). IEEE.

  • Hornung, A., Wurm, K. M., Bennewitz, M., Stachniss, C., & Burgard, W. (2013). OctoMap: An efficient probabilistic 3D mapping framework based on octrees. Autonomous Robots (AR), 34(3), 189–206. doi:10.1007/s10514-012-9321-0.

  • Howard, A., Roy, N. (2009) Radish: The robotics data set repository. http://radish.sourceforge.net/. Accessed October 15, 2014.

  • Indelman, V., Carlone, L., & Dellaert, F. (2015). Planning in the continuous domain: A generalized belief space approach for autonomous navigation in unknown environments. The International Journal of Robotics Research, 34(7), 849–882. doi:10.1177/0278364914561102.

  • Jaynes, E. T. (1957). Information theory and statistical mechanics. Physical Review, 106(4), 620.

  • Jumarie, G. (1990). Relative information: Theories and applications. New York, NY: Springer.

  • Kaess, M., & Dellaert, F. (2009). Covariance recovery from a square root information matrix for data association. Robotics and Autonomous Systems (RAS), 57, 1198–1210. doi:10.1016/j.robot.2009.06.008.

  • Kaess, M., Ranganathan, A., & Dellaert, F. (2008). iSAM: Incremental Smoothing and Mapping. IEEE Transactions on Robotics (TRO), 24(6), 1365–1378.

  • Kim, A., & Eustice, R. M. (2013) Perception-driven Navigation: Active visual SLAM for robotic area coverage. In Proceedings of the IEEE international conference on robotics and automation (ICRA) (pp. 3196–3203). doi:10.1109/ICRA.2013.6631022.

  • Kim, A., & Eustice, R. M. (2015). Active visual SLAM for robotic area coverage: Theory and experiment. The International Journal of Robotics Research (IJRR), 34(4–5), 457–475. doi:10.1177/0278364914547893.

  • Likhachev, M. (2015) Search-based planning library. https://github.com/sbpl/sbpl. Accessed October 15, 2015.

  • Lu, F., & Milios, E. (1997). Globally consistent range scan alignment for environment mapping. Autonomous Robots, 4(4), 333–349.

  • Makarenko, A., Williams, S. B., Bourgault, F., & Durrant-Whyte, H. F. (2002) An experiment in integrated exploration. In Proceedings of the IEEE/RSJ international conference on intelligent robots and systems (IROS) (pp. 534–539). doi:10.1109/IRDS.2002.1041445.

  • Martinez-Cantin, R., de Freitas, N., Brochu, E., Castellanos, J., & Doucet, A. (2009). A bayesian exploration-exploitation approach for optimal online sensing and planning with a visually guided mobile robot. Autonomous Robots (AR), 27(2), 93–103. doi:10.1007/s10514-009-9130-2.

  • Michael, N., Fink, J., & Kumar, V. (2008). Experimental testbed for large multirobot teams. Robotics and Autonomous Systems (RAS), 15(1), 53–61.

  • MRPT. (2016). Mobile robot programming toolkit. http://www.mrpt.org/.

  • Olson, E. (2009) Real-time correlative scan matching. In Proceedings of the IEEE international conference on robotics and automation (ICRA) (pp. 1233–1239).

  • Pomerleau, F., Colas, F., Siegwart, R., & Magnenat, S. (2013). Comparing ICP variants on real-world data sets. Autonomous Robots (AR), 34(3), 133–148. doi:10.1007/s10514-013-9327-2.

  • Principe, J. (2010). Information theoretic learning: Rényi’s entropy and kernel perspectives. Information Science and Statistics. Berlin: Springer.

  • Pukelsheim, F. (2006). Optimal design of experiments. Classics in Applied Mathematics. Philadelphia, PA: Society for Industrial and Applied Mathematics (SIAM).

  • Rényi, A. (1960) On measures of entropy and information. In: Proceedings of the 4th Berkeley symposium on mathematics, statistics and probability (pp. 547–561).

  • Rényi, A. (1970). Probability theory. North-Holland series in applied mathematics and mechanics. Amsterdam: Elsevier.

  • Roy, N., Burgard, W., Fox, D., Thrun, S. (1999) Coastal navigation—Mobile Robot navigation with uncertainty in dynamic environments. In Proceedings of the IEEE international conference on robotics and automation (ICRA).

  • Shannon, C., & Weaver, W. (1949). The mathematical theory of communication. Champaign, IL: Illinois Books, University of Illinois Press.

  • Shimazaki, H., & Shinomoto, S. (2007). A method for selecting the bin size of a time histogram. Neural Computation, 19(6), 1503–1527. doi:10.1162/neco.2007.19.6.1503.

  • Sim, R., Dudek, G., & Roy, N. (2004) Online control policy optimization for minimizing map uncertainty during exploration. In Proceedings of the IEEE international conference on robotics and automation (ICRA) (Vol. 2, pp. 1758–1763). doi:10.1109/ROBOT.2004.1308078.

  • Smith, R., Self, M., & Cheeseman, P. (1990). Estimating uncertain spatial relationships in robotics. In I. J. Cox & G. T. Wilfong (Eds.), Autonomous robot vehicles (pp. 167–193). New York, NY: Springer.

  • Stachniss, C. (2009). Robotic mapping and exploration (Vol. 55). Berlin: Springer.

  • Stachniss, C., Hahnel, D., & Burgard, W. (2004) Exploration with active loop-closing for FastSLAM. In Proceedings of the IEEE/RSJ international conference on intelligent robots and systems (IROS) (Vol. 2, pp. 1505–1510). doi:10.1109/IROS.2004.1389609.

  • Stachniss, C., Grisetti, G., & Burgard, W. (2005) Information gain-based exploration using Rao-Blackwellized particle filters. In Proceedings of robotics: Science and systems conference (RSS), Cambridge, MA, USA.

  • Thrun, S., Burgard, W., & Fox, D. (2005). Probabilistic robotics. Boston, MA: MIT Press.

  • van den Berg, J., Patil, S., Alterovitz, R., et al. (2012). Motion planning under uncertainty using iterative local optimization in belief space. The International Journal of Robotics Research (IJRR), 31(11), 1263–1278. doi:10.1177/0278364912456319.

  • Tipaldi, G. D., & Arras, K. O. (2010) FLIRT-interest regions for 2D range data. In Proceedings of the IEEE international conference on robotics and automation (ICRA) (pp. 3616–3622).

  • Valencia, R., Valls Miró, J., Dissanayake, G., & Andrade-Cetto, J. (2012) Active pose SLAM. In Proceedings of the IEEE/RSJ international conference on intelligent robots and systems (IROS) (pp. 1885–1891). doi:10.1109/IROS.2012.6385637.

  • Xu, D. (1998) Energy, entropy and information potential for neural computation. PhD thesis, Gainesville, FL, USA, aAI9935317.

  • Yamauchi, B. (1998) Frontier-based exploration using multiple robots. In Proceedings of the second international conference on autonomous agents, ACM, AGENTS ’98 (pp. 47–53). doi:10.1145/280765.280773.

  • Zhang, Q., Whitney, D., Shkurti, F., & Rekleitis, I. (2014) Ear-based exploration on hybrid metric/topological maps. In Proceedings of the IEEE/RSJ international conference on intelligent robots and systems (IROS) (pp. 3081–3088). doi:10.1109/IROS.2014.6942988.

  • Zhang, Q., Rekleitis, I., & Dudek, G. (2015) Uncertainty reduction via heuristic search planning on hybrid metric/topological map. In 12th conference on computer and robot vision (pp. 222–229). doi:10.1109/CRV.2015.36.

Author information

Corresponding author

Correspondence to Henry Carrillo.

Additional information

This is one of several papers published in Autonomous Robots comprising the Special Issue on Active Perception.

H. Carrillo and J. A. Castellanos gratefully acknowledge funding from MINECO-FEDER Project DPI2012-36070 and DPI2015-68905-P, research Grants BES-2010-033116 and EEBB-2011-44287, and DGA Grupo (T04). H. Carrillo also acknowledges funding from Universidad Sergio Arboleda Project IN.BG.086.17.003/OE4. P. Dames and V. Kumar gratefully acknowledge funding from AFOSR Grant FA9550-10-1-0567, ONR Grants N00014-14-1-0510, N00014-09-1-1051, and N00014-09-1-103, and NSF Grant IIS-1426840.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (mp4 29533 KB)

Appendix

A.1 Properties of Rényi’s Entropy

Entropy is a measure of the uncertainty of a random variable (Shannon and Weaver 1949; Rényi 1960; Jumarie 1990). A proper definition of entropy should comply with a set of axioms that guarantee a coherent way of accounting for uncertainty. A widely accepted set of axioms was developed by Aczél and Daróczy (1975). Feinstein (1958) also developed an earlier and more succinct set.

The first attempt to mathematically define entropy was from Hartley (1928). The second definition was developed by Shannon and Weaver (1949), and is the most widely known and commonly used definition. Finally, Rényi (1960, 1970) created a family of entropy functions, of which the entropies of Hartley and Shannon are special cases. This family of functions, parameterized by \(\alpha \), is defined as:

$$\begin{aligned} {{\mathrm{\mathbb {H}_{\alpha }}}}[P(\mathbf {x})] = \frac{1}{1-\alpha }~{{\mathrm{\log _{2}}}}\left( \sum _{i=1}^{n} p_{i}^{\alpha } \right) \end{aligned}$$
(A.1)

where \(p_{i} = P(\mathbf {x}= \mathbf {x}_i)\) is an element of the probability distribution of a discrete random variable \(\mathbf {x}\), so that \(p_{i} \ge 0, \quad \forall i\) and \(\sum _{i=1}^{n} p_{i} = 1\). The variable \(\alpha \) is a free parameter in the range \([0, 1) \cup (1, \infty )\).
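
As a minimal sketch of (A.1), the function below evaluates Rényi’s entropy in bits for a distribution given as a list of probabilities. The function name `renyi_entropy`, the example distribution, and the choice to skip zero-probability outcomes in the sum are assumptions made for illustration and are not taken from the paper.

```python
import math


def renyi_entropy(p, alpha):
    """Rényi entropy in bits of a discrete distribution p, following (A.1)."""
    if alpha < 0 or alpha == 1:
        raise ValueError("alpha must lie in [0, 1) or (1, inf)")
    if any(pi < 0 for pi in p) or not math.isclose(sum(p), 1.0, abs_tol=1e-9):
        raise ValueError("p must be non-negative and sum to one")
    # Assumption: zero-probability outcomes are left out of the sum.
    power_sum = sum(pi ** alpha for pi in p if pi > 0)
    return math.log2(power_sum) / (1.0 - alpha)


p = [0.5, 0.25, 0.125, 0.125]        # hypothetical example distribution
h_zero = renyi_entropy(p, 0.0)       # alpha = 0: log2 of the number of outcomes
h_two = renyi_entropy(p, 2.0)        # alpha = 2
print(h_zero, h_two)
```

For a uniform distribution the value is the same for every admissible \(\alpha \), which is a quick sanity check on any implementation.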

A.2 \(\mathbb {H}_{\alpha }\) at \(\alpha = 0\)

Plugging \(\alpha = 0\) into (A.1) yields \(\mathbb {H}_0[P(\mathbf {x})] = {{\mathrm{\log _{2}}}}n\), which is the Hartley entropy. Here we assume that every \(p_{i} > 0\), so that each term \(p_{i}^{0}\) equals one.

A.3 \(\mathbb {H}_{\alpha }\) as \(\alpha \rightarrow 1\)

Note from the definition of Rényi’s entropy in (A.1) that it is undefined at \(\alpha = 1\). Thus to define \(\mathbb {H}_{1}[P(\mathbf {x})]\) we must look at the limit as \(\alpha \rightarrow 1\):

$$\begin{aligned} \mathbb {H}_{1}[P(\mathbf {x})] = \lim _{\alpha \rightarrow \ 1} \mathbb {H}_{\alpha }[P(\mathbf {x})]. \end{aligned}$$
(A.2)

Applying the limit directly, we obtain a canonical indeterminate form \(\frac{0}{0}\). Applying l’Hôpital’s rule we see that:

$$\begin{aligned} \mathbb {H}_{1}[P(\mathbf {x})]&= \lim _{\alpha \rightarrow \ 1} \frac{\left( \sum _{i=1}^{n} p_{i}^{\alpha } \right) ^{-1} \left( \sum _{i=1}^{n} p_{i}^{\alpha } {{\mathrm{\log _{2}}}}(p_{i}) \right) }{-1}\nonumber \\&= \quad -\,\sum _{i=1}^{n} p_{i} {{\mathrm{\log _{2}}}}(p_{i}) \nonumber \\&= \mathbb {H}[P(\mathbf {x})]. \end{aligned}$$
(A.3)

In other words, in the limit as \(\alpha \rightarrow 1\) Rényi’s entropy becomes equal to Shannon’s entropy.
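
The limit in (A.3) can be checked numerically. The sketch below, which uses a hypothetical example distribution and helper names of our own choosing, evaluates Rényi’s entropy at values of \(\alpha \) slightly above one and compares them with Shannon’s entropy.

```python
import math


def renyi_entropy(p, alpha):
    """Rényi entropy in bits, as in (A.1); alpha must differ from 1."""
    power_sum = sum(pi ** alpha for pi in p if pi > 0)
    return math.log2(power_sum) / (1.0 - alpha)


def shannon_entropy(p):
    """Shannon entropy in bits, the last line of (A.3)."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)


p = [0.5, 0.25, 0.125, 0.125]   # hypothetical example distribution
shannon = shannon_entropy(p)

# Gap between Shannon and Rényi entropy as alpha approaches 1 from above.
gaps = [shannon - renyi_entropy(p, 1.0 + eps) for eps in (1e-1, 1e-2, 1e-3)]
print(shannon, gaps)
```

The gaps are positive and shrink roughly in proportion to \(\alpha - 1\), which is consistent with the limit above.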

A.4 \(\mathbb {H}_{\alpha }\) as \(\alpha \rightarrow \infty \)

Attempting to compute the limit at infinity of \({{\mathrm{\mathbb {H}_{\alpha }}}}[P(\mathbf {x})]\) directly yields an indeterminate form, since \(\frac{1}{1-\alpha }\) tends to zero while the logarithm of the sum diverges whenever \(\max _{i} p_{i} < 1\). However, we can obtain the true value using the squeeze theorem. We start by defining \(p_{i'} = \max _{i} p_{i}\) and recalling that \(i \in \lbrace 1 \ldots n \rbrace \), \(0 \le p_{i} \le 1\), and \(\sum _{i=1}^{n} p_{i} = 1\). Hence, for \(1< \alpha < \infty \), the following inequality holds:

$$\begin{aligned} p_{i'}^{\alpha } \le \sum _{i=1}^{n} p_{i}^{\alpha } \le n~p_{i'}^{\alpha } \end{aligned}$$
(A.4)

If we take the binary logarithm of (A.4), divide by \(1-\alpha \) (which is negative for \(\alpha > 1\) and therefore reverses the inequalities), and rearrange terms, we obtain a more familiar inequality:

$$\begin{aligned} {{\mathrm{\log _{2}}}}(p_{i'}^{\alpha })&\le {{\mathrm{\log _{2}}}}\left( \sum _{i=1}^{n} p_{i}^{\alpha } \right) \le {{\mathrm{\log _{2}}}}(n~p_{i'}^{\alpha }) \nonumber \\ \alpha {{\mathrm{\log _{2}}}}(p_{i'})&\le {{\mathrm{\log _{2}}}}\left( \sum _{i=1}^{n} p_{i}^{\alpha } \right) \le {{\mathrm{\log _{2}}}}(n~p_{i'}^{\alpha }) \nonumber \\ \frac{\alpha }{1-\alpha }~{{\mathrm{\log _{2}}}}(p_{i'})&\!\ge \! {{\mathrm{\mathbb {H}_{\alpha }}}}[P(\mathbf {x})] \ge ~\frac{{{\mathrm{\log _{2}}}}(n)}{1-\alpha } \!+\! \frac{\alpha }{1-\alpha }~{{\mathrm{\log _{2}}}}(p_{i'}) \end{aligned}$$
(A.5)

Computing the limit as \(\alpha \rightarrow \infty \) with l’Hôpital’s rule, we see that both sides yield the same value of \(-{{\mathrm{\log _{2}}}}(p_{i'})\). Hence, according to the squeeze theorem, we can compute the desired limit:

$$\begin{aligned} \mathbb {H}_\infty [P(\mathbf {x})] = \lim _{\alpha \rightarrow \infty } {{\mathrm{\mathbb {H}_{\alpha }}}}[P(\mathbf {x})] = -{{\mathrm{\log _{2}}}}(\max _{i} p_{i}) \end{aligned}$$
(A.6)
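
The squeeze argument can be illustrated numerically. The sketch below, with a hypothetical example distribution and helper names that are not from the paper, evaluates the two bounds in the last line of (A.5) next to \(\mathbb {H}_{\alpha }\) for growing \(\alpha \).

```python
import math


def renyi_entropy(p, alpha):
    """Rényi entropy in bits, as in (A.1); alpha must differ from 1."""
    power_sum = sum(pi ** alpha for pi in p if pi > 0)
    return math.log2(power_sum) / (1.0 - alpha)


def squeeze_bounds(p, alpha):
    """Lower and upper bounds on H_alpha from (A.5), valid for alpha > 1."""
    n, p_max = len(p), max(p)
    upper = alpha / (1.0 - alpha) * math.log2(p_max)
    lower = math.log2(n) / (1.0 - alpha) + upper
    return lower, upper


p = [0.5, 0.25, 0.125, 0.125]          # hypothetical example distribution
min_entropy = -math.log2(max(p))       # the limit value in (A.6)

rows = [(a, *squeeze_bounds(p, a), renyi_entropy(p, a)) for a in (2.0, 10.0, 100.0)]
for alpha, lower, upper, h in rows:
    print(alpha, lower, h, upper)
```

Both bounds close in on \(-{{\mathrm{\log _{2}}}}(\max _{i} p_{i})\) as \(\alpha \) grows, and \(\mathbb {H}_{\alpha }\) stays between them.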

A.5 Monotonicity with respect to \(\alpha \)

We seek to show that Rényi’s entropy monotonically decreases with increasing \(\alpha \). To do this, we take the derivative with respect to \(\alpha \) and show that it is non-positive. Let \(q_{i} = p_{i}^{\alpha } / \sum _{j} p_{j}^{\alpha }\), and note that this defines a probability distribution \(Q(\mathbf {x})\).

Taking the derivative of (A.1) with respect to \(\alpha \) yields:

$$\begin{aligned} \frac{d}{d\alpha } \mathbb {H}_{\alpha }[P(\mathbf {x})]&= \frac{d}{d\alpha } \frac{1}{1-\alpha }~\log _{2}\left( \sum _{i=1}^{n} p_{i}^{\alpha } \right) \nonumber \\&= \frac{(1-\alpha ) \left( \sum _{j} p_{j}^{\alpha }\right) ^{-1} \left( \sum _{i} p_{i}^{\alpha } \log _{2} p_{i}\right) + \log _{2} \left( \sum _{j} p_{j}^{\alpha }\right) }{(1-\alpha )^2} \nonumber \\&= \frac{(1-\alpha ) \sum _{i} q_{i} \log _{2} p_{i} + \sum _{i} q_{i} \log _{2} \left( \sum _{j} p_{j}^{\alpha }\right) }{(1-\alpha )^2} \nonumber \\&= \frac{\sum _{i} \left( q_{i} \log _{2} p_{i} - q_{i} \log _{2} p_{i}^{\alpha } + q_{i} \log _{2} \left( \sum _{j} p_{j}^{\alpha }\right) \right) }{(1-\alpha )^2} \nonumber \\&= \frac{\sum _{i} \left( q_{i} \log _{2} p_{i} - q_{i} \log _{2} q_{i} \right) }{(1-\alpha )^2} \nonumber \\&= \quad -\, \frac{\sum _{i} q_{i} \log _{2} \frac{q_{i}}{p_{i}}}{(1-\alpha )^2} \nonumber \\&= \quad -\, \frac{\text {KL}[Q(\mathbf {x}) \, || \, P(\mathbf {x})]}{(1-\alpha )^2}. \end{aligned}$$
(A.7)

Since both \((1-\alpha )^2\) and the Kullback–Leibler divergence \(\text {KL}[Q(\mathbf {x}) \, || \, P(\mathbf {x})]\) are non-negative (Cover and Thomas 2012), the derivative is non-positive. Thus we conclude that Rényi’s entropy is monotonically non-increasing in \(\alpha \). The argument holds for every \(\alpha \ne 1\), and in particular for \(\alpha \in (1, \infty )\).
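
The closed form in (A.7) can be compared against a finite-difference estimate of the derivative. The sketch below assumes all logarithms are base 2 and all \(p_{i} > 0\). The helper names (`escort`, `kl_bits`, and so on), the step size, and the example distribution are illustrative choices, not part of the paper.

```python
import math


def renyi_entropy(p, alpha):
    """Rényi entropy in bits, as in (A.1); alpha must differ from 1."""
    power_sum = sum(pi ** alpha for pi in p if pi > 0)
    return math.log2(power_sum) / (1.0 - alpha)


def escort(p, alpha):
    """The distribution Q with q_i = p_i^alpha / sum_j p_j^alpha."""
    weights = [pi ** alpha for pi in p]
    total = sum(weights)
    return [w / total for w in weights]


def kl_bits(q, p):
    """Kullback-Leibler divergence KL[Q || P] in bits (requires p_i > 0)."""
    return sum(qi * math.log2(qi / pi) for qi, pi in zip(q, p) if qi > 0)


def analytic_derivative(p, alpha):
    """Right-hand side of (A.7)."""
    return -kl_bits(escort(p, alpha), p) / (1.0 - alpha) ** 2


def numeric_derivative(p, alpha, step=1e-5):
    """Central finite difference of H_alpha with respect to alpha."""
    return (renyi_entropy(p, alpha + step) - renyi_entropy(p, alpha - step)) / (2 * step)


p = [0.5, 0.25, 0.125, 0.125]   # hypothetical example distribution
for alpha in (0.5, 2.0, 5.0):
    print(alpha, analytic_derivative(p, alpha), numeric_derivative(p, alpha))
```

The two estimates should agree to several decimal places and be non-positive on both sides of \(\alpha = 1\).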

A.6 Useful inequalities of \(\mathbb {H}_{\alpha }\)

Let us consider two values for the free parameter of the Rényi entropy, \(\alpha \) and \(\alpha '\), such that \(1 \le \alpha \le \alpha '\). For these two values of the free parameter, we can show that:

  1. \(\mathbb {H}[P(\mathbf {x})] \ge \mathbb {H}_{\alpha }[P(\mathbf {x})], \; \forall \alpha \ge 1\)

  2. \(\mathbb {H}[P(\mathbf {x})] \ge \mathbb {H}_{\alpha }[P(\mathbf {x})] \ge \mathbb {H}_{\alpha '}[P(\mathbf {x})], \; 1 \le \alpha \le \alpha '\).

A.6.1 \(\mathbb {H}[P(\mathbf {x})] \ge \mathbb {H}_{\alpha }[P(\mathbf {x})]\)

The function \(-{{\mathrm{\log _{2}}}}(z)\) is convex, and a non-negative weighted sum preserves convexity (Boyd and Vandenberghe 2004, Ch. 3). Hence, using the \(p_{i}\) as weights, the function \(-\sum _{i=1}^{n} p_{i}{{\mathrm{\log _{2}}}}(z_{i})\) is still convex. Applying Jensen’s inequality we see that:

$$\begin{aligned} -{{\mathrm{\log _{2}}}}\left( \sum _{i=1}^{n} p_{i} z_{i} \right) \le -\sum _{i=1}^{n} p_{i}{{\mathrm{\log _{2}}}}(z_{i}) \end{aligned}$$
(A.8)

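where it is understood that \(\sum _{i=1}^{n} p_{i} = 1\). Letting \(z_{i} = p_{i}^{\alpha - 1}\), (A.8) becomes (A.9). Dividing both sides by \(\alpha - 1 > 0\) (for \(\alpha > 1\)) then gives (A.10), which is (A.11) by the definitions of the two entropies: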

$$\begin{aligned} -{{\mathrm{\log _{2}}}}\left( \sum _{i=1}^{n} p_{i}^{\alpha }\right)\le & {} -(\alpha - 1)\sum _{i=1}^{n} p_{i}{{\mathrm{\log _{2}}}}(p_{i}) \end{aligned}$$
(A.9)
$$\begin{aligned} \frac{1}{1-\alpha }{{\mathrm{\log _{2}}}}\left( \sum _{i=1}^{n} p_{i}^{\alpha }\right)\le & {} -\sum _{i=1}^{n} p_{i}{{\mathrm{\log _{2}}}}(p_{i}) \end{aligned}$$
(A.10)
$$\begin{aligned} \mathbb {H}_{\alpha }[P(\mathbf {x})]\le & {} \mathbb {H}[P(\mathbf {x})]. \end{aligned}$$
(A.11)

A.6.2 \(\mathbb {H}[P(\mathbf {x})] \ge \mathbb {H}_{\alpha }[P(\mathbf {x})] \ge \mathbb {H}_{\alpha '}[P(\mathbf {x})]\)

It is known (Hardy et al. 1952, Theorem 16) that the weighted power mean

$$\begin{aligned} \left( \sum _{i=1}^{n} w_{i}x_{i}^{\beta } \right) ^{\frac{1}{\beta }} \end{aligned}$$
(A.12)

is a monotone increasing function of \(\beta \) for positive \(x_{i}\) and non-negative weights \(w_{i}\) that sum to one. Rewriting (A.1) as

$$\begin{aligned} {{\mathrm{\mathbb {H}_{\alpha }}}}[P(\mathbf {x})] = {{\mathrm{\log _{2}}}}\left( \sum _{i=1}^{n} p_{i}\left( \frac{1}{p_{i}}\right) ^{1-\alpha } \right) ^{\frac{1}{1-\alpha }} \end{aligned}$$
(A.13)

and letting \(\beta = 1 - \alpha \), \(w_{i} = p_{i}\), and \(x_{i} = 1/p_{i}\), we may apply (A.12) to conclude that (A.1) is a monotone decreasing function of \(\alpha \) and the inequality holds.
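
The link between (A.12) and (A.13) can be made concrete with a short numerical sketch. Using weights \(p_{i}\) and values \(1/p_{i}\), the base-2 logarithm of the weighted power mean with exponent \(\beta \) should match Rényi’s entropy at \(\alpha = 1 - \beta \). The function names and the example distribution are assumptions for illustration.

```python
import math


def renyi_entropy(p, alpha):
    """Rényi entropy in bits, as in (A.1); alpha must differ from 1."""
    power_sum = sum(pi ** alpha for pi in p if pi > 0)
    return math.log2(power_sum) / (1.0 - alpha)


def weighted_power_mean(w, x, beta):
    """The quantity in (A.12) for beta != 0, positive x and weights summing to one."""
    return sum(wi * xi ** beta for wi, xi in zip(w, x)) ** (1.0 / beta)


p = [0.5, 0.25, 0.125, 0.125]       # hypothetical example distribution
inverse = [1.0 / pi for pi in p]    # x_i = 1 / p_i as in (A.13)

betas = [-4.0, -1.0, -0.5, 0.5]     # increasing beta, i.e. decreasing alpha
means = [weighted_power_mean(p, inverse, b) for b in betas]
entropies = [renyi_entropy(p, 1.0 - b) for b in betas]
for b, m, h in zip(betas, means, entropies):
    print(b, math.log2(m), h)
```

The power means increase with \(\beta \), so the entropies decrease as \(\alpha = 1 - \beta \) increases.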

B Probability simplex

Let \(P(\mathbf {x})\) be a probability distribution of a discrete random variable \(\mathbf {x}\), where \(p_{i} = P(\mathbf {x}= \mathbf {x}_i)\) is an element of the distribution, \(p_{i} \ge 0, \forall i\), and \(\sum _{i=1}^{N} p_{i} = 1\). The probability simplex \(\Delta _N\) is an \((N-1)\)-dimensional manifold in an N-dimensional space that contains all possible probability distributions of a discrete random variable \(\mathbf {x}\) with N possible values (Principe 2010). That is:

$$\begin{aligned} \Delta _N = \left\{ p = (p_1, \ldots , p_N) \in \mathcal {R}^{N} : p_{i} \ge 0 \;\; \forall i, \; \sum _{i=1}^{N} p_{i} = 1 \right\} \end{aligned}$$
(A.14)

Any point p in the probability simplex has a natural interpretation as a discrete probability distribution (Calafiore and Ghaoui 2014).

As an example, consider the probability simplex in \(\mathcal {R}^{3}\) for three variables \(p_1, p_2, p_3\), where all possible distributions lie on the equilateral triangle with vertices at (1, 0, 0), (0, 1, 0) and (0, 0, 1). Figure 13 shows an illustration of the above.

Fig. 13 The probability simplex in \(\mathcal {R}^{3}\) is shown as the blue triangle with vertices at (1, 0, 0), (0, 1, 0) and (0, 0, 1). Any point on the simplex represents a probability distribution over three variables \(p_1, p_2, p_3\) (Color figure online)
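
The definition in (A.14) translates directly into a membership test. The sketch below is illustrative only: the function names and the tolerance are our own, and the sampler relies on the standard fact that normalizing independent exponential draws gives a point distributed uniformly on the simplex.

```python
import random


def in_simplex(p, tol=1e-9):
    """Check the two conditions of (A.14): p_i >= 0 and sum_i p_i = 1."""
    return all(pi >= -tol for pi in p) and abs(sum(p) - 1.0) <= tol


def sample_simplex(n, rng):
    """Draw a point from the simplex by normalizing n exponential samples."""
    draws = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]


vertices = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]    # corners of the triangle in Fig. 13
rng = random.Random(0)                          # fixed seed for repeatability
samples = [sample_simplex(3, rng) for _ in range(5)]

print([in_simplex(v) for v in vertices])
print([in_simplex(s) for s in samples])
print(in_simplex((0.5, 0.5, 0.5)), in_simplex((1.5, -0.5, 0.0)))
```

The vertices and the sampled points satisfy both conditions, while a vector whose entries sum to more than one, or one with a negative entry, does not.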

Cite this article

Carrillo, H., Dames, P., Kumar, V. et al. Autonomous robotic exploration using a utility function based on Rényi’s general theory of entropy. Auton Robot 42, 235–256 (2018). https://doi.org/10.1007/s10514-017-9662-9

  • DOI: https://doi.org/10.1007/s10514-017-9662-9
