Abstract
Protein secondary structure prediction conventionally uses three states: alpha helices (H), beta strands (E), and coils (C). It is a fundamental step in determining the final structure and function of a protein. In this chapter we investigate benchmark amino acid data sets and observe that the data fall into two distinct groups of almost 50% each. In the conventional scheme, researchers classify any state that is not a helix or a strand as a coil. Hence, in this work we adopt a new way of looking at the data set, treating it as a binary problem of coil versus non-coil states. For this type of data, Receiver Operating Characteristic (ROC) analysis is used to analyse and interpret the results of our protein secondary structure classifier. The ROC analysis gave results similar to those obtained using other, non-ROC classification methods. The ROC curves discriminated coil states from non-coil states with 72% prediction accuracy and a very small standard error.
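The coil versus non-coil ROC analysis described in the abstract can be illustrated with a minimal sketch. This is not the chapter's implementation: the labels and scores below are invented toy values, the helper names (`to_binary`, `roc_points`, `auc_trapezoid`, `auc_standard_error`) are hypothetical, and the standard error uses the Hanley-McNeil (1982) approximation, which is an assumption on our part since the abstract does not say which estimator was used.

```python
import math

def to_binary(labels):
    """Collapse three-state labels (H, E, C) to 1 = coil, 0 = non-coil."""
    return [1 if s == "C" else 0 for s in labels]

def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs found by sweeping a threshold over the scores."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        # A residue is predicted "coil" when its score reaches the threshold.
        tp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))
    return points

def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

def auc_standard_error(auc, n_pos, n_neg):
    """Hanley-McNeil approximation to the standard error of the AUC
    (an assumed choice; the chapter may use a different estimator)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    var = (auc * (1.0 - auc)
           + (n_pos - 1) * (q1 - auc ** 2)
           + (n_neg - 1) * (q2 - auc ** 2)) / (n_pos * n_neg)
    return math.sqrt(var)

# Toy data (invented for illustration): true states and classifier coil scores.
labels = list("CCHHECCEHC")
scores = [0.9, 0.8, 0.3, 0.6, 0.2, 0.7, 0.4, 0.1, 0.5, 0.65]

y = to_binary(labels)
points = roc_points(y, scores)
auc = auc_trapezoid(points)
se = auc_standard_error(auc, sum(y), len(y) - sum(y))
print(f"AUC = {auc:.3f}, SE = {se:.3f}")
```

With a real benchmark set the number of residues is far larger than in this toy example, which is why a small standard error of the kind reported in the abstract is plausible: the variance term is divided by the product of the coil and non-coil counts.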
Copyright information
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Subair, S., Thron, C. (2020). Receiver Operating Characteristic Curves in Binary Classification of Protein Secondary Structure Data. In: Subair, S., Thron, C. (eds) Implementations and Applications of Machine Learning. Studies in Computational Intelligence, vol 782. Springer, Cham. https://doi.org/10.1007/978-3-030-37830-1_11
DOI: https://doi.org/10.1007/978-3-030-37830-1_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-37829-5
Online ISBN: 978-3-030-37830-1
eBook Packages: Engineering, Engineering (R0)