Towards a High Productivity Automatic Analysis Framework for Classification: An Initial Study

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7987)

Abstract

Due to the recent explosion of research data produced by novel scientific instruments and the corresponding experiments, automatic features, in particular in data analysis, have become more essential than ever. In this paper we present a new Automatic Analysis Framework (AAF) that is able to increase the productivity of data analysis. The AAF can be used for classification, prediction, and clustering. It is built upon the workflow engine Taverna, which is widely used in different domains and for which a large number of activities exist for various kinds of analytical methods. The AAF enables scientists to modify our predefined Taverna workflow and to extend it with other available activities. For the execution of the analytical methods, in particular for the computation of the results, we use our own cloud-based Code Execution Framework (CEF). It provides web services to execute problem solving environment code, such as MATLAB, Octave, and R scripts, in parallel in the cloud. This combination of the AAF and CEF enables scientists to easily conduct time-consuming calculations without having to manually enumerate the possible combinations of independent variables. It furthermore automatically evaluates all identified models and provides a service for the scientists conducting the analysis. The framework has been tested and evaluated with real breath gas data.
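
The abstract does not say how the AAF enumerates variable combinations or evaluates the resulting models, so the following Python sketch only illustrates the general idea: build every subset of the independent variables, score each candidate model by cross-validation, and run the evaluations in parallel. It is not the authors' Taverna/CEF implementation. The nearest-centroid classifier, the local thread pool standing in for cloud execution, and all function names, variable names, and toy values are assumptions made for illustration.

```python
from itertools import combinations
from concurrent.futures import ThreadPoolExecutor


def variable_subsets(names, max_size):
    """Yield every non-empty subset of variable names up to max_size."""
    for k in range(1, max_size + 1):
        yield from combinations(names, k)


def nearest_centroid_predict(train_rows, train_labels, row):
    """Classify row by the closest class centroid (squared Euclidean distance).

    Stand-in classifier, chosen only to keep the sketch dependency-free.
    """
    sums, counts = {}, {}
    for x, y in zip(train_rows, train_labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1

    def dist(label):
        return sum((s / counts[label] - v) ** 2
                   for s, v in zip(sums[label], row))

    # sorted() makes tie-breaking between classes deterministic
    return min(sorted(sums), key=dist)


def cv_accuracy(data, labels, subset, folds=3):
    """k-fold cross-validated accuracy using only the columns in subset."""
    rows = [[record[name] for name in subset] for record in data]
    correct = 0
    for f in range(folds):
        train_idx = [i for i in range(len(rows)) if i % folds != f]
        test_idx = [i for i in range(len(rows)) if i % folds == f]
        train_rows = [rows[i] for i in train_idx]
        train_labels = [labels[i] for i in train_idx]
        for i in test_idx:
            predicted = nearest_centroid_predict(train_rows, train_labels, rows[i])
            if predicted == labels[i]:
                correct += 1
    return correct / len(rows)


def evaluate_all(data, labels, names, max_size=2, workers=4):
    """Score every variable subset in parallel; best models come first."""
    subsets = list(variable_subsets(names, max_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(lambda s: cv_accuracy(data, labels, s), subsets))
    # Rank by accuracy (descending), then prefer smaller subsets
    return sorted(zip(subsets, scores), key=lambda p: (-p[1], len(p[0]), p[0]))


# Synthetic toy data (hypothetical values, not breath gas measurements):
# variable "a" separates the two classes, variable "b" is noise.
toy_data = [
    {"a": 1.0, "b": 5.0}, {"a": 9.0, "b": 5.1}, {"a": 1.2, "b": 4.9},
    {"a": 8.8, "b": 5.0}, {"a": 0.9, "b": 5.2}, {"a": 9.1, "b": 4.8},
]
toy_labels = ["x", "y", "x", "y", "x", "y"]

ranked = evaluate_all(toy_data, toy_labels, ["a", "b"])
for subset, score in ranked:
    print(subset, round(score, 2))
```

The number of subsets grows combinatorially with the number of variables, which is why a sketch like this caps the subset size and evaluates candidates concurrently. A framework of the kind described would presumably dispatch each evaluation to a remote execution service rather than a local thread pool.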

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ludescher, T., Feilhauer, T., Amann, A., Brezany, P. (2013). Towards a High Productivity Automatic Analysis Framework for Classification: An Initial Study. In: Perner, P. (ed.) Advances in Data Mining. Applications and Theoretical Aspects. ICDM 2013. Lecture Notes in Computer Science, vol 7987. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39736-3_3

  • DOI: https://doi.org/10.1007/978-3-642-39736-3_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-39735-6

  • Online ISBN: 978-3-642-39736-3

  • eBook Packages: Computer Science, Computer Science (R0)
