IStego100K: Large-Scale Image Steganalysis Dataset

Yang, Zhongliang; Wang, Ke; Ma, Sai; Huang, Yongfeng; Kang, Xiangui; Zhao, Xianfeng

doi:10.1007/978-3-030-43575-2_29

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 12022)

Included in the following conference series:

International Workshop on Digital Watermarking

Abstract

To promote the rapid development of image steganalysis technology, we construct and release a multivariable large-scale image steganalysis dataset called IStego100K. It contains 208,104 images, all of size 1024×1024. Of these, 200,000 images (100,000 cover-stego image pairs) form the training set and the remaining 8,104 form the testing set. Because we hope that IStego100K can help researchers explore universal image steganalysis algorithms, we try to reduce the constraints on its images. For each image in IStego100K, the quality factor is randomly set in the range 75–95, the steganographic algorithm is randomly selected from three well-known algorithms (J-uniward, nsF5, and UERD), and the embedding rate is randomly set to a value in the range 0.1–0.4. In addition, considering the possible mismatch between training samples and test samples in real environments, we add a test set (DS-Test) whose samples come from a different source than the training set. We hope that this test set can help to evaluate the robustness of steganalysis algorithms. We tested the performance of several recent steganalysis algorithms on IStego100K; the specific results and analysis are given in the experimental part. We hope that the IStego100K dataset will further promote the development of universal image steganalysis technology. The description of IStego100K and instructions for use can be found at https://github.com/YangzlTHU/IStego100K.
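
The per-image randomization described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' generation code. The ranges and algorithm names come from the abstract, while the class StegoConfig, the function sample_config, the use of uniform distributions, and the restriction to integer quality factors are assumptions made only for illustration; the abstract says no more than that the values are set "randomly".

```python
import random
from dataclasses import dataclass

# Ranges and algorithm names as stated in the abstract.
QUALITY_FACTOR_RANGE = (75, 95)
EMBEDDING_RATE_RANGE = (0.1, 0.4)
ALGORITHMS = ("J-uniward", "nsF5", "UERD")


@dataclass(frozen=True)
class StegoConfig:
    """Hypothetical container for one image's embedding settings."""
    quality_factor: int
    algorithm: str
    embedding_rate: float


def sample_config(rng: random.Random) -> StegoConfig:
    """Draw one per-image configuration.

    Assumption: uniform sampling and integer quality factors.
    The source does not specify the distributions.
    """
    return StegoConfig(
        quality_factor=rng.randint(*QUALITY_FACTOR_RANGE),  # inclusive bounds
        algorithm=rng.choice(ALGORITHMS),
        embedding_rate=rng.uniform(*EMBEDDING_RATE_RANGE),
    )


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the draw is repeatable
    for _ in range(3):
        print(sample_config(rng))
```

Passing an explicit random.Random instance keeps the draws reproducible, which may be useful when a stego image has to be regenerated with the same settings as its recorded label.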

Notes

1. https://unsplash.com/
2. https://unsplash.com/license

Acknowledgment

This work was supported in part by the National Key Research and Development Program of China under Grant 2018YFB0804103 and the National Natural Science Foundation of China (Nos. U1536207, U1705261, and U1636113).

Author information

Authors and Affiliations

Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing, 100084, China
Zhongliang Yang, Ke Wang & Yongfeng Huang
State Key Laboratory of Information Security, Institute of Information Engineering, Chinese Academy of Sciences, Beijing, 100093, China
Sai Ma & Xianfeng Zhao
Guangdong Key Lab of Information Security, Sun Yat-sen University, Guangzhou, China
Xiangui Kang

Corresponding author

Correspondence to Zhongliang Yang.

Editor information

Editors and Affiliations

College of Cybersecurity, Sichuan University, Chengdu, China
Hongxia Wang
Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
Xianfeng Zhao
Department of ECE, New Jersey Institute of Technology, Newark, NJ, USA
Yunqing Shi
Graduate School of Information Study, Korea University, Seoul, Korea (Republic of)
Hyoung Joong Kim
Department of Information Engineering, University of Florence, Florence, Italy
Alessandro Piva

About this paper

Cite this paper

Yang, Z., Wang, K., Ma, S., Huang, Y., Kang, X., Zhao, X. (2020). IStego100K: Large-Scale Image Steganalysis Dataset. In: Wang, H., Zhao, X., Shi, Y., Kim, H., Piva, A. (eds) Digital Forensics and Watermarking. IWDW 2019. Lecture Notes in Computer Science, vol 12022. Springer, Cham. https://doi.org/10.1007/978-3-030-43575-2_29

DOI: https://doi.org/10.1007/978-3-030-43575-2_29
Published: 25 March 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-43574-5
Online ISBN: 978-3-030-43575-2
eBook Packages: Computer Science, Computer Science (R0)
