Modifications of BIC for data mining under sparsity

  • Conference paper
Operations Research Proceedings 2011

Part of the book series: Operations Research Proceedings (ORP)

Abstract

In many research areas today, the number of features for which data are collected is much larger than the sample size on which inference is based. This is especially true for applications in bioinformatics, but the theory presented here is of general interest in any data mining context where the number of “interesting” features is expected to be small. In particular, mBIC, mBIC1 and mBIC2 are discussed: three modifications of the Bayesian information criterion (BIC) which, in the case of an orthogonal design, control the family-wise error rate (mBIC) and the false discovery rate (mBIC1, mBIC2), respectively. In a brief simulation study, the performance of these criteria is illustrated for orthogonal and non-orthogonal regression matrices.

Author information

Corresponding author

Correspondence to Florian Frommlet.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Frommlet, F. (2012). Modifications of BIC for data mining under sparsity. In: Klatte, D., Lüthi, HJ., Schmedders, K. (eds) Operations Research Proceedings 2011. Operations Research Proceedings. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29210-1_39