Correction to: Efficient feature selection using shrinkage estimators

Sechidis, Konstantinos; Azzimonti, Laura; Pocock, Adam; Corani, Giorgio; Weatherall, James; Brown, Gavin

doi:10.1007/s10994-020-05884-6

Correction
Published: 04 June 2020

Volume 109, pages 1565–1567, (2020)
Machine Learning

Konstantinos Sechidis¹,
Laura Azzimonti²,
Adam Pocock³,
Giorgio Corani²,
James Weatherall⁴ &
…
Gavin Brown¹

The Original Article was published on 09 May 2019

1 Correction to: Machine Learning (2019) 108:1261–1286 https://doi.org/10.1007/s10994-019-05795-1

There was a mistake in the proof of the optimal shrinkage intensity for our estimator presented in Section 3.1. The main theorem still holds, and the shrinkage intensity presented in the corrected version is optimal in the sense of minimizing the mean squared error (MSE). In this document, in addition to correcting the proof, we verify the corrected intensity empirically via simulations. The third term of Theorem 1 needs to be corrected as follows:

$$ \begin{aligned} \widehat{\mathbb {E}}\left[ (\hat{p}^{\mathrm{Ind}}(xy))^2\right]&= \frac{1}{N^3}\bigg ( (N-1)(N-2)(N-3)\big ( \hat{p}^{\mathrm{ML}}(x) \hat{p}^{\mathrm{ML}}(y) \big )^2 \\&\qquad \qquad + (N-1) (N-2)\hat{p}^{\mathrm{ML}}(x) \hat{p}^{\mathrm{ML}}(y) \big (\hat{p}^{\mathrm{ML}}(x)+\hat{p}^{\mathrm{ML}}(y)+4\hat{p}^{\mathrm{ML}}(xy)\big ) \\&\qquad \qquad +(N-1)\big (2\hat{p}^{\mathrm{ML}}(xy)(\hat{p}^{\mathrm{ML}}(x)+\hat{p}^{\mathrm{ML}}(y))+2(\hat{p}^{\mathrm{ML}}(xy))^2 \\&\qquad \qquad +\hat{p}^{\mathrm{ML}}(x) \hat{p}^{\mathrm{ML}}(y)\big ) + \hat{p}^{\mathrm{ML}}(xy) \bigg ). \end{aligned} \tag{1} $$
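
As an illustration, Eq. (1) can be evaluated directly from a contingency table of counts. The following is a minimal sketch, not the authors' implementation: the function name `expected_sq_p_ind` and the list-of-rows table layout are assumptions made for this example, and exact rational arithmetic is used only to make the result easy to check by hand.

```python
from fractions import Fraction


def expected_sq_p_ind(counts, x, y):
    """Plug-in estimate of E[(p_ind(xy))^2] from Eq. (1) for one cell (x, y).

    `counts` is a contingency table given as a list of rows, so that
    counts[x][y] is the number of samples with X = x and Y = y.
    """
    n = sum(sum(row) for row in counts)
    # Maximum-likelihood estimates of the two marginals and of the joint.
    px = Fraction(sum(counts[x]), n)
    py = Fraction(sum(row[y] for row in counts), n)
    pxy = Fraction(counts[x][y], n)

    # The four summands inside the outer bracket of Eq. (1).
    total = (
        (n - 1) * (n - 2) * (n - 3) * (px * py) ** 2
        + (n - 1) * (n - 2) * px * py * (px + py + 4 * pxy)
        + (n - 1) * (2 * pxy * (px + py) + 2 * pxy ** 2 + px * py)
        + pxy
    )
    return total / n ** 3
```

For a single observation (N = 1) every term except the last vanishes, so the estimate reduces to the maximum-likelihood joint probability of the cell.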

Parts of pages 4–6 of the supplementary material, where the above term is derived, need the following corrections. On page 4, the term A(xy) needs to be corrected as follows:

$$ \begin{aligned} {A(xy)} ={\sum _{\begin{array}{c} x',x'' \in \mathcal {X}\\ x'\ne x'' \ne x \end{array}}\sum _{\begin{array}{c} y', y'' \in \mathcal {Y}\\ y'\ne y'' \end{array}}{\mathbb {E}} \left[ {N_{xy'}N_{xy''} N_{x'y}N_{x''y}}\right] +2\sum _{\begin{array}{c} x' \in \mathcal {X}\\ x'\ne x \end{array}}\sum _{\begin{array}{c} y', y'' \in \mathcal {Y}\\ y'\ne y'' \ne y \end{array}}{\mathbb {E}} \left[ {N_{xy'}N_{xy''} N_{x'y}N_{xy}}\right] }. \end{aligned} $$

As a consequence, the same term on page 5 needs correction:

$$ \begin{aligned} {A(xy)}=&{N^{(4)} \Bigg [\bigg (p(x)^2-\sum _{y' \in \mathcal {Y}}p(xy')^2\bigg )\bigg (p(y)^2-\sum _{x' \in \mathcal {X}}p(x'y)^2\bigg )}\\&{-4 \big (p(x)-p(xy)\big )p(xy)^2\big (p(y)-p(xy)\big )\Bigg ]}. \end{aligned} $$

Finally, the first equation on page 6 needs the following correction:

$$ \begin{aligned}&{\sum _{x',x'' \in \mathcal {X}}\sum _{y', y'' \in \mathcal {Y}}{\mathbb {E}} \left[ {N_{xy'}N_{xy''} N_{x'y}N_{x''y}}\right] =}{N^{(4)}p(x)^2p(y)^2}\\&\quad {+N^{(3)}p(x)p(y)(p(x)+p(y)+4p(xy))}\\&\quad {+N^{(2)}\big [2p(xy)(p(x)+p(y))+2p(xy)^2+p(x)p(y)\big ]}\\&\quad {+Np(xy),} \end{aligned} $$

which will result in the estimate for $ \widehat{\mathbb {E}}\left[ (\hat{p}^{\mathrm{Ind}}(xy))^2\right] $ presented in Eq. (1).
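
The page 6 identity can be checked exactly for small N by brute force. The sketch below rests on two assumptions that this correction does not spell out: that $N^{(k)}$ denotes the falling factorial $N(N-1)\cdots (N-k+1)$, and that the N samples are drawn i.i.d. from the joint distribution. Under those assumptions the left-hand side equals $\mathbb{E}[N_x^2 N_y^2]$, where $N_x$ and $N_y$ are the marginal counts. All function names and the example table are hypothetical.

```python
from fractions import Fraction
from itertools import product


def falling(n, k):
    """Falling factorial n (n-1) ... (n-k+1), assumed meaning of N^{(k)}."""
    out = 1
    for i in range(k):
        out *= n - i
    return out


def closed_form(p, x, y, n):
    """Right-hand side of the corrected page 6 identity for cell (x, y)."""
    px = sum(p[x])
    py = sum(row[y] for row in p)
    pxy = p[x][y]
    return (
        falling(n, 4) * px ** 2 * py ** 2
        + falling(n, 3) * px * py * (px + py + 4 * pxy)
        + falling(n, 2) * (2 * pxy * (px + py) + 2 * pxy ** 2 + px * py)
        + n * pxy
    )


def brute_force(p, x, y, n):
    """E[N_x^2 N_y^2] by enumerating every sequence of n i.i.d. draws."""
    cells = [(i, j) for i in range(len(p)) for j in range(len(p[0]))]
    total = Fraction(0)
    for seq in product(cells, repeat=n):
        prob = Fraction(1)
        for i, j in seq:
            prob *= p[i][j]
        n_x = sum(1 for i, _ in seq if i == x)  # marginal count of X = x
        n_y = sum(1 for _, j in seq if j == y)  # marginal count of Y = y
        total += prob * n_x ** 2 * n_y ** 2
    return total


if __name__ == "__main__":
    # Hypothetical 2x2 joint distribution with exact rational entries.
    p = [[Fraction(1, 2), Fraction(1, 8)], [Fraction(1, 8), Fraction(1, 4)]]
    for n in range(1, 6):
        print(n, closed_form(p, 0, 0, n) == brute_force(p, 0, 0, n))
```

The enumeration grows as (number of cells)^N, so this is only practical for very small tables and sample sizes; it is meant as a sanity check, not a proof.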

Apart from correcting the proof, we also provide simulation results that validate the optimal shrinkage intensity. To this end, we followed the procedure described in Section 3.2 of the main paper to generate probabilities that lead to different effect sizes, i.e. different population values of the mutual information I(X; Y). The squared error of our shrinkage estimator for the probabilities is defined as $ \sum _{x \in \mathcal {X}}\sum _{y \in \mathcal {Y}} \left( p(xy) - \hat{p}^{\mathrm{Ind-JS}}(xy) \right) ^{2} $. We estimated the MSE by averaging over 1000 simulation runs. In Fig. 1 we present the results for three different effect sizes: I(X;Y) = 0.01, 0.05 and 0.15. In each graph we plot the MSE over the full range [0, 1] of the shrinkage intensity, and we mark both the optimal intensity given by the corrected value $\lambda ^{*}$ and the value $ \lambda _{e}^{*} $ that we erroneously used in the previous version of the paper. As the plots show, the corrected value attains the minimum MSE.
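
A rough sketch of this kind of simulation is given below. It is not the authors' code and makes several assumptions: the shrinkage estimator is taken to be the convex combination $\lambda \hat{p}^{\mathrm{Ind}}(xy) + (1-\lambda )\hat{p}^{\mathrm{ML}}(xy)$, which this correction does not restate; the joint table `p`, the sample size and the grid of intensities are arbitrary choices; and the Section 3.2 procedure for generating probabilities with a given effect size is not reproduced. It only locates the empirical minimizer of the MSE on a grid and does not compute $\lambda ^{*}$ analytically.

```python
import random


def sample_counts(p, n, rng):
    """Draw n i.i.d. samples from the joint table p; return the count table."""
    rows, cols = len(p), len(p[0])
    cells = [(i, j) for i in range(rows) for j in range(cols)]
    weights = [p[i][j] for i, j in cells]
    counts = [[0] * cols for _ in range(rows)]
    for i, j in rng.choices(cells, weights=weights, k=n):
        counts[i][j] += 1
    return counts


def shrinkage_estimate(counts, lam):
    """Assumed estimator form: lam * p_ind + (1 - lam) * p_ml for every cell."""
    n = sum(map(sum, counts))
    px = [sum(row) / n for row in counts]        # ML marginal of X
    py = [sum(col) / n for col in zip(*counts)]  # ML marginal of Y
    return [
        [lam * px[i] * py[j] + (1 - lam) * c / n for j, c in enumerate(row)]
        for i, row in enumerate(counts)
    ]


def squared_error(p, p_hat):
    """Sum over all cells of (p(xy) - p_hat(xy))^2."""
    return sum((a - b) ** 2 for ra, rb in zip(p, p_hat) for a, b in zip(ra, rb))


def mse_curve(p, n, lambdas, runs=1000, seed=0):
    """Monte Carlo estimate of the MSE for each intensity in `lambdas`."""
    rng = random.Random(seed)
    totals = [0.0] * len(lambdas)
    for _ in range(runs):
        counts = sample_counts(p, n, rng)
        for k, lam in enumerate(lambdas):
            totals[k] += squared_error(p, shrinkage_estimate(counts, lam))
    return [t / runs for t in totals]


if __name__ == "__main__":
    p = [[0.30, 0.10], [0.15, 0.45]]  # hypothetical joint distribution
    lambdas = [k / 20 for k in range(21)]
    curve = mse_curve(p, n=50, lambdas=lambdas)
    best = min(range(len(lambdas)), key=curve.__getitem__)
    print(f"empirical argmin of MSE: lambda = {lambdas[best]:.2f}")
```

Comparing the grid minimizer with an analytically computed intensity would then mirror the comparison described above, with the caveat that the Monte Carlo curve is noisy near its minimum.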

Acknowledgements

We would like to thank Prof. Jan Mielniczuk and Małgorzata Łazęcka for bringing this issue to our attention and for their detailed and insightful comments.

Author information

Authors and Affiliations

School of Computer Science, University of Manchester, Manchester, UK
Konstantinos Sechidis & Gavin Brown
Istituto Dalle Molle di studi sull’Intelligenza Artificiale (IDSIA), Manno, Switzerland
Laura Azzimonti & Giorgio Corani
Oracle Labs, Burlington, MA, USA
Adam Pocock
Advanced Analytics Centre, Global Medicines Development, AstraZeneca, Cambridge, UK
James Weatherall

Corresponding author

Correspondence to Konstantinos Sechidis.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Sechidis, K., Azzimonti, L., Pocock, A. et al. Correction to: Efficient feature selection using shrinkage estimators. Mach Learn 109, 1565–1567 (2020). https://doi.org/10.1007/s10994-020-05884-6

Issue Date: August 2020
DOI: https://doi.org/10.1007/s10994-020-05884-6
