A Prediction Model Based Approach to Open Space Steganography Detection in HTML Webpages

  • Iman Sedeeq
  • Frans Coenen
  • Alexei Lisitsa
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10431)


A mechanism for detecting Open Space Steganography (OSS) is described, founded on the observation that the length of white space segments increases in the presence of OSS. The frequency of white space segments of different lengths is conceptualized in terms of an n-dimensional feature space. This feature space is used to encode webpages (labelled as OSS or not OSS) so that each page is represented as a feature vector. This representation is used to train a classifier, which can subsequently be used to detect the presence, or otherwise, of OSS in unseen webpages. The proposed approach is evaluated using a number of different classifiers, both with and without feature selection. Its operation is also compared with two existing OSS detection approaches. The evaluation yielded a best accuracy of \(96.7\%\) and demonstrated that the proposed method outperforms the two alternative techniques by a significant margin.
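The encoding step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `whitespace_feature_vector`, the definition of a white space segment as a maximal run of spaces and tabs, the choice to fold segments longer than n into the last bin, and the use of relative rather than absolute frequencies are all assumptions made for illustration.

```python
import re
from collections import Counter
from typing import List

# Assumption: a "white space segment" is a maximal run of spaces and/or tabs.
_SEGMENT = re.compile(r"[ \t]+")


def whitespace_feature_vector(html: str, n: int = 10) -> List[float]:
    """Return relative frequencies of white space segment lengths 1..n.

    Segments longer than n are counted in the last bin (an assumption,
    not something stated in the abstract).
    """
    counts = Counter(min(len(m.group()), n) for m in _SEGMENT.finditer(html))
    total = sum(counts.values())
    if total == 0:
        # No white space segments at all: return an all-zero vector.
        return [0.0] * n
    return [counts.get(length, 0) / total for length in range(1, n + 1)]


# Hypothetical toy pages: the second has extra trailing spaces and tabs,
# which is the kind of change open space embedding tends to introduce.
clean = "<p>Hello world</p>\n<p>Plain page</p>\n"
stego = "<p>Hello world</p>   \t  \n<p>Plain page</p> \t\t   \n"

print(whitespace_feature_vector(clean, n=5))
print(whitespace_feature_vector(stego, n=5))
```

Vectors of this kind, one per labelled page, could then be passed to any standard classifier. In the toy pages, the longer segments shift mass toward the higher bins, which is the signal the abstract describes.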


Open space Steganography Classification 



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Liverpool University, Liverpool, UK
