Steganalysis of JSteg algorithm using hypothesis testing theory
Abstract
This paper investigates the statistical detection of JSteg steganography. The approach is based on a statistical model of discrete cosine transform (DCT) coefficients that challenges the usual assumption that, within a subband, all the coefficients are independent and identically distributed (i.i.d.). The hidden information detection problem is cast in the framework of hypothesis testing theory. In an ideal context where all model parameters are perfectly known, the likelihood ratio test (LRT) is presented, and its performance is theoretically established. The statistical performance of the LRT serves as an upper bound for the detection power. For practical use, where the distribution parameters are unknown, a detector based on estimation of those parameters is designed and further improved by exploring a DCT channel selection. The loss of power of the proposed detector compared with the optimal LRT is small, which shows the relevance of the proposed approach.
Keywords
Hypothesis testing theory; JSteg steganalysis; DCT distribution model; Hidden information detection
1 Introduction
Steganography and steganalysis have received increasing attention over the past two decades, since research in this field concerns law enforcement and national strategic defence. Steganography is the art and science of hiding secret messages in cover media. Conversely, steganalysis is about detecting secret information hidden in cover media; a medium that carries a hidden message is called a stego medium. If a steganalysis algorithm identifies the inspected medium as a stego one, even without knowing anything about the secret message, the steganographic approach fails.
1.1 State of the art
In today’s digital world, many steganographic tools are available on the Internet. Because some of them are readily available and very simple to use, it is necessary to design the most reliable steganalysis methodology to counter steganography. In general, due to its simplicity, most steganographic schemes insert the secret message into the least significant bit (LSB) plane of the cover media, using one of two kinds of steganography: LSB replacement and LSB matching. The former replaces the LSBs of the cover media, in the spatial domain or in the frequency domain, with the bits of the secret message. The latter, also known as ±1 embedding (see [13]), randomly increments or decrements a pixel or discrete cosine transform (DCT) coefficient value, when necessary, to match the secret bit to be embedded. Since LSB replacement is easier to implement, it remains more popular; as of December 2011, WetStone reported that about 70% of the available steganographic software is based on the LSB-replacement algorithm [4]. Therefore, research on LSB-replacement steganalysis remains an active topic.
Although LSB-replacement steganalysis (see [5–10]) has been studied for many years, most of the prior-art detectors are designed to detect data hidden in the spatial domain. In addition, the statistical properties have been studied and established for only a few detectors, referred to as the optimal detectors. As detailed in [11], a wide range of problems, theoretical as well as practical, remain uncovered, and some prevent the moving of ‘steganography and steganalysis from the laboratory into the real world’. This is especially the case in the field of optimal detection, see ([11], sec. 3.1), in which this paper lies. Roughly speaking, the goal of optimal detection in steganalysis is to exploit an accurate statistical model of the cover source, usually digital images, to design a statistical test whose properties can be established, typically in order to guarantee a false alarm rate (FAR) and to calculate the optimal detection performance one can expect from the most powerful detector.
In 2004, the weighted stego-image (WS) method [12] and the test proposed in [13] for LSB-replacement steganalysis changed the situation by opening the way to optimal detectors. Driven by these pioneering works, the enhanced WS algorithm proposed in [14] improved the detection rate by enhancing the pixel predictor, adjusting the weighting factor and introducing the concept of bias correction. Nevertheless, the drawback of the original WS method is that it can only be applied in the spatial domain. Due to the prevalence of images compressed in the Joint Photographic Experts Group (JPEG) format, dealing with this kind of image has become mandatory. Inspired by the prior studies [12,14], the WS steganalyser for JPEG covers was proposed in [15]. However, the WS steganalyser does not achieve a high detection performance for a low FAR, see [16], and its statistical properties remain unknown, which prevents the guarantee of a prescribed FAR. In practical forensic cases, since a large database of images needs to be processed, achieving a very low FAR is crucial.
1.2 Contributions of the paper
For the detection of data hidden within the DCT coefficients of JPEG images, the application of hypothesis testing theory for designing optimal detectors that are efficient in practice is facing the problem of accurately modelling statistical distribution of DCT coefficients. It can be noted that several models have been proposed in the literature to model statistically the DCT coefficients. Among those models, the Laplacian distribution is probably the most widely used due to its simplicity and its fairly good accuracy [17]. More accurate models such as the generalized Gaussian [18] and, more recently, the generalized gamma model [19] have been shown to provide much more accuracy at the cost of higher complexity. Some of those models have been exploited in the field of steganalysis, see [20,21] for instance. In the framework of optimal detection, a first attempt has been made to design a statistical test modelling the DCT coefficient with the quantized Laplacian distribution, see [22].
It should be noted that other approaches have been proposed for the detection of data hidden within DCT coefficients of JPEG images: to cite a few, the structural detection [23], the category attack [24], the WS detector [15] and the universal or blind detectors [25,26]. However, establishing the statistical properties of those detectors remains a difficult task which has not been studied yet. In addition, the most accurate detectors based on statistical learning are sensitive to the so-called cover-source mismatch [27]: the training phase must be performed with caution.
The main contributions of this paper are the following:
 1. First, a novel model of DCT coefficients is proposed; its major originality is that this model does not assume that all the coefficients of the same subband are i.i.d.
 2. Second, assuming that all the parameters are known, this statistical model of DCT coefficients is used to design the optimal test to detect data hidden within JPEG images with the JSteg algorithm. This statistical test takes into account the distribution parameters of each DCT coefficient as nuisance parameters.
 3. Further, assuming that all the parameters are unknown, a simple approach is proposed to estimate the expectation (or location) parameter of each coefficient by using the linear properties of the DCT as well as an estimation of the pixel expectation in the spatial domain; the variance (or scale) parameter is also estimated locally.
 4. The designed detector is improved by exploring a DCT channel selection, proposed very recently [28,29], which selects only a subset of pixels or DCT coefficients in which embedding is most likely and hence detection is easier.
 5. Numerical results show the sharpness of the theoretically established results and the good performance of the proposed statistical test. A comparison with the statistical test based on the Laplacian distribution and on the assumption of i.i.d. coefficients, see [22], shows the relevance of the proposed methodology. In addition, compared with the prior-art WS detector [15], experimental results show the efficiency of the proposed detector.
1.3 Organisation of the paper
This paper is organised as follows. Section 2 formalises the statistical problem of detection of information hidden within DCT coefficients of JPEG images. Then, Section 3 presents the optimal LRT for detecting the JSteg algorithm based on the Laplacian distribution model. Section 4 presents the proposed approach for estimating the nuisance parameters in practice and compares our proposed detector with the WS detector [15] theoretically. Finally, Section 5 presents numerical results of the proposed steganalyser on simulated and real images, and Section 6 concludes this paper. This paper is an extended version of [30] that also includes the findings of [31] on channel selection [28,29].
2 Problem statement
In this paper, a grayscale digital image is represented, in the spatial domain, by a single matrix \(\mathbf{Z} = \{z_{i,j}\},\ i \in \{1,\ldots,I\},\ j \in \{1,\ldots,J\}\). The present work can be extended to a colour image by analysing each colour channel separately. Most digital images are stored using the JPEG compression standard. This standard exploits the linear DCT, over blocks of 8×8 pixels, to represent an image in the so-called DCT domain. In the present paper, we do not describe the imaging pipeline of a digital still camera; the reader can refer to [32] for a description of the whole imaging pipeline and to [33] for a detailed description of the JPEG compression standard.
Let us denote the DCT coefficients by the matrix \(\mathbf{V} = \{v_{i,j}\}\). An alternative representation of those coefficients is usually adopted by gathering the DCT coefficients that correspond to the same frequency subband. In this paper, this alternative representation is denoted by the matrix \(\mathbf{U} = \{u_{k,l}\},\ k \in \{1,\ldots,K\},\ l \in \{1,\ldots,64\}\) with \(K \approx I \times J / 64\)^{a}.
The coefficients from the first subband \(u_{k,1}\), often referred to as direct current component (DC) coefficients, represent the mean pixel value over the kth block of 8×8 pixels. The modification of those coefficients may be obvious and creates artifacts that can be detected easily; hence, they are usually not used for data hiding. Similarly, the JSteg algorithm does not use the coefficients from the other subbands, referred to as alternating current component (AC) coefficients, if they equal 0 or 1. In fact, it is known that using the coefficients equal to 0 or 1 significantly modifies the statistical properties of AC coefficients; this creates a flaw that can be detected.
The JSteg algorithm embeds data within the DCT coefficients of JPEG images using the well-known LSB-replacement method, see details in [34]. In brief, this method consists of substituting the LSB of each DCT coefficient with a bit of the message to be hidden. The number of bits hidden per coefficient, usually referred to as the payload, is denoted \(R \in (0,1]\). Since the JSteg algorithm does not use every DCT coefficient, the payload is in fact measured in this paper as the number of bits hidden per usable coefficient (that is, the number of bits divided by the number of AC coefficients that differ from 0 and 1).
In what follows, \(\bar{u} = u + (-1)^{u}\) represents the integer \(u\) with flipped LSB. For the sake of clarity, let us denote by \(\theta_{k,l}\) the distribution parameter of the kth DCT coefficient from the lth subband, and let \(\boldsymbol{\theta} = \{\theta_{k,l}\},\ k \in \{1,\ldots,K\},\ l \in \{2,\ldots,64\}\) represent the distribution parameters of all the AC coefficients.
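The embedding rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `flip_lsb` and `jsteg_embed` are hypothetical, the message bits are drawn at random, and the embedding positions are chosen at random among the usable coefficients, whereas an actual JSteg implementation may visit the coefficients in a fixed order.

```python
import random


def flip_lsb(u):
    """Return the integer u with its LSB flipped, i.e. u + (-1)**u."""
    # XOR with 1 toggles the last bit; Python integers behave like
    # two's complement numbers, so this also holds for negative u.
    return u ^ 1


def jsteg_embed(ac_coeffs, payload, seed=0):
    """Simulate LSB replacement on the usable AC coefficients.

    ac_coeffs: iterable of quantized AC coefficients (integers).
    payload:   bits per usable coefficient, in (0, 1].
    The random message and random positions are assumptions of this sketch.
    """
    rng = random.Random(seed)
    stego = list(ac_coeffs)
    # Coefficients equal to 0 or 1 are never used for embedding.
    usable = [i for i, u in enumerate(stego) if u not in (0, 1)]
    n_bits = round(payload * len(usable))
    for i in rng.sample(usable, n_bits):
        bit = rng.getrandbits(1)
        # Replace the LSB of the coefficient with the message bit.
        stego[i] = (stego[i] & ~1) | bit
    return stego
```

Because the pairs (2, 3), (−2, −1), (4, 5) and so on are closed under LSB flipping, a usable coefficient never becomes 0 or 1 after embedding, so the set of usable positions is the same in the cover and in the stego image.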
Maximising the power \(\beta_{\delta}\) is equivalent to minimising the missed-detection probability \(\alpha_{1}(\delta) = {\mathbb{P}}_{\mathcal{H}_{1}}\left[\delta(\mathbf{U}) = \mathcal{H}_{0}\right] = 1 - \beta_{\delta}\).
In order to design a practical optimal detector, as discussed in [11], for steganalysis in the spatial domain, the main difficulty is to estimate the distribution parameters (expectation and variance of each pixel). In contrast, in the case of the DCT coefficients, the application of hypothesis testing theory to design an optimal detector has previously been attempted with the assumption that the distribution parameter remains the same for all the coefficients from the same subband. With this assumption, the estimation of the distribution parameters is not an issue because thousands of DCT coefficients are available. However, which distribution model to choose remains an open problem.
Hypothesis testing theory has been applied to the steganalysis of the JSteg algorithm in [22] using a Laplacian distribution model and the assumption that the DCT coefficients of each subband are i.i.d. However, this pioneering work does not lead to an efficient test, because a very large loss of performance has been observed when comparing results on real images with the theoretically established ones. Such a result can be explained by the two following reasons: 1) the Laplacian model might not be accurate enough to detect steganography and 2) the assumption that the DCT coefficients of each frequency subband are i.i.d. may be wrong. Recently, it has been shown that the use of the generalised gamma model or an even more accurate model [36,37] allows the design of a test with very good detection performance. In contrast, this paper proposes to challenge the assumption that all the DCT coefficients of a subband are i.i.d.
In the following section, we detail the statistical test that takes into account both the expectation and the variance as nuisance parameters, and we study the optimal detection when those parameters are known. A discussion on nuisance parameters is also provided in Section 4.
3 LRT for two simple hypotheses
3.1 Optimal detection framework
In practice, when the rate R is not known, one can try to design a test which is locally optimal around a given payload rate, named Locally Asymptotically Uniformly Most Powerful (LAUMP) test, as proposed in [6,8], but this lies outside the scope of this paper.
where, as previously defined, \(\bar{u}_{k,l} = u_{k,l} + (-1)^{u_{k,l}}\) represents the DCT coefficient \(u_{k,l}\) with flipped LSB.
3.2 Statistical performance of LRT
Accepting, for a moment, that one is in this most favourable scenario, in which all the parameters are perfectly known, we can deduce some interesting results. Because the observations are considered to be independent, the LR \(\Lambda^{\mathrm{lr}}(\mathbf{U})\) is a sum of random variables, and asymptotic theorems allow its distribution to be established when the number of coefficients becomes ‘sufficiently large’. This asymptotic approach is usually justified in the case of digital images due to the very large number of pixels or DCT coefficients.
where \(\xrightarrow {\;d\;}\) represents the convergence in distribution, and \(\mathcal {N}(0,1)\) is the standard normal distribution, i.e. with zero mean and unit variance.
Equations (11) and (12) emphasise the main advantages of normalising the LR as described in relation (9). First, it allows setting a threshold that guarantees a prescribed false alarm probability independently of any distribution parameters; this is particularly crucial because digital images are heterogeneous and their properties vary from one image to another. Second, the normalisation makes it easy to establish the detection power, which again holds for any distribution parameters and hence for any inspected image.
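The practical consequence of this normalisation can be sketched as follows. The code assumes that the LR is normalised by its mean and standard deviation under \(\mathcal{H}_0\) and that it is approximately Gaussian under both hypotheses; the names `m0`, `s0`, `m1` and `s1` (mean and standard deviation of the LR under each hypothesis) are notation introduced for this sketch only and are taken as given.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution N(0, 1)


def normalised_lr(lr_value, m0, s0):
    """Centre and scale the LR with its moments under H0."""
    return (lr_value - m0) / s0


def threshold(alpha0):
    """Threshold on the normalised LR giving false alarm probability alpha0."""
    return _STD_NORMAL.inv_cdf(1.0 - alpha0)


def power(alpha0, m0, s0, m1, s1):
    """Approximate detection power when the LR is Gaussian under H1."""
    tau = threshold(alpha0)
    # P[(LR - m0) / s0 > tau] with LR ~ N(m1, s1^2) under H1.
    return 1.0 - _STD_NORMAL.cdf((s0 * tau + m0 - m1) / s1)
```

Note that `threshold` depends only on the prescribed false alarm probability, not on the image, which is the property stressed above; the image-dependent moments enter only through the normalisation and the power.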
3.3 Application with Laplacian distribution
In the case of the Laplacian distribution, the framework of hypothesis testing theory has been applied to the steganalysis of JSteg in [22], in which the moments of the LR are calculated under the two following assumptions: 1) all the DCT coefficients from the same subband are i.i.d. and 2) the expectation of each DCT coefficient is zero.
where Δ is the quantization step.
where the observed DCT coefficient, referred to as \(u_{k,l}\) in Equation (7), is denoted as \(k\). It can be noted that this expression (15) of the LR is almost the same as the one obtained in [22]: the only difference is that the sign term \(\operatorname{sign}(\Delta k - \mu)\) becomes \(\operatorname{sign}(k)\) when all DCT coefficients are assumed to have a zero mean. It should also be noted that the log-LR equals 0 for every DCT coefficient whose value is 0 or 1 because the JSteg algorithm does not embed hidden data in those coefficients. In the present paper, the moments of the LR (15) are not analytically established; the interested reader can refer to [22].
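As a numerical illustration, the per-coefficient log-LR can be evaluated directly from a quantized Laplacian pmf. The sketch below makes two assumptions that are not spelled out in the surviving text: the pmf of a coefficient is the Laplacian probability of the bin \([\Delta(k-1/2), \Delta(k+1/2))\), and the stego pmf is the usual LSB-replacement mixture \((1 - R/2)\,P[k] + (R/2)\,P[\bar{k}]\). The function names are hypothetical.

```python
import math


def laplace_cdf(x, mu, b):
    """CDF of the Laplacian distribution with location mu and scale b."""
    if x < mu:
        return 0.5 * math.exp((x - mu) / b)
    return 1.0 - 0.5 * math.exp(-(x - mu) / b)


def quantized_laplace_pmf(k, mu, b, delta):
    """Probability that the quantized coefficient equals k (rounding bins)."""
    upper = laplace_cdf(delta * (k + 0.5), mu, b)
    lower = laplace_cdf(delta * (k - 0.5), mu, b)
    return upper - lower


def log_lr(k, payload, mu, b, delta):
    """Log-likelihood ratio of one coefficient for LSB replacement at rate R."""
    if k in (0, 1):
        return 0.0  # JSteg never embeds in coefficients equal to 0 or 1
    k_bar = k ^ 1  # coefficient with flipped LSB
    p_k = quantized_laplace_pmf(k, mu, b, delta)
    p_k_bar = quantized_laplace_pmf(k_bar, mu, b, delta)
    return math.log(1.0 - payload / 2.0 + (payload / 2.0) * p_k_bar / p_k)
```

When both bins lie on the same side of \(\mu\), the ratio `p_k_bar / p_k` reduces to \(\exp(\pm\Delta/b)\), with a sign that depends on \(\operatorname{sign}(\Delta k - \mu)\) and on the LSB of \(k\); this is where the sign term discussed above comes from.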
4 Proposed approach for estimating the nuisance parameters in practice
4.1 Estimation of expectation of each DCT coefficient
where DCT represents the DCT transform and \(\mathbf{D}\) is the change of basis matrix from the spatial to the DCT basis, often referred to as the DCT matrix.
It makes sense to assume that the noise component \(n\) has a zero mean in the spatial and in the DCT domain. In contrast, it is difficult to justify that the DCT of the pixels’ expectation \(x\) should necessarily be around zero. Actually, this assumption holds true if and only if the expectation is the same for all the pixels of a block: \(\forall i \in \{1,\ldots,8\}, \forall j \in \{1,\ldots,8\},\ x_{i,j} = x\); see [36,37,39] for details.
In contrast, this paper aims at estimating the expectation of each DCT coefficient. To this end, it is proposed to decompress a JPEG image \(\mathbf{V}\) into the spatial domain to obtain \(\mathbf{Z}\), then to estimate the expectation of each pixel in the spatial domain, \(\widehat{\mathbf{Z}}\), by using a denoising filter. Then, this denoised image \(\widehat{\mathbf{Z}}\) is transformed back into the DCT domain to finally obtain the estimated expectation of all DCT coefficients, denoted \(\widehat{\mathbf{V}} = \{\hat{v}_{i,j}\},\ i \in \{1,\ldots,I\},\ j \in \{1,\ldots,J\}\). Several methods have been tested to estimate the expectation of pixels in the spatial domain \(\widehat{\mathbf{Z}}\), namely, the BM3D collaborative filtering [40], the K-SVD sparse dictionary learning [41], the non-local weighted averaging method from non-local (NL) means [42] and the wavelet denoising filter [43]. The codes used for the methods [40–42] have been downloaded from the Image Processing On Line website^{d}. The codes used for the method [43] have been downloaded from DDE^{e}.
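The estimation pipeline above can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions: the image is already decompressed into a floating-point array whose sides are multiples of 8, the JPEG level shift and quantization are ignored, and a plain 3×3 averaging filter (`box_denoise`, a hypothetical stand-in) replaces the BM3D, K-SVD, NL-means or wavelet filters tested in the paper.

```python
import numpy as np


def dct_matrix(n=8):
    """Orthonormal DCT-II matrix D, so that D @ block @ D.T is the 2-D DCT."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    D = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    D[0, :] = np.sqrt(1.0 / n)
    return D


def blockwise_dct(img, D):
    """Apply the 2-D DCT independently to each 8x8 block of img."""
    h, w = img.shape
    blocks = img.reshape(h // 8, 8, w // 8, 8).transpose(0, 2, 1, 3)
    coeffs = D @ blocks @ D.T  # matmul broadcasts over the block indices
    return coeffs.transpose(0, 2, 1, 3).reshape(h, w)


def box_denoise(img):
    """3x3 mean filter with edge replication (stand-in denoiser)."""
    h, w = img.shape
    p = np.pad(img, 1, mode="edge")
    return sum(p[a:a + h, c:c + w] for a in range(3) for c in range(3)) / 9.0


def estimate_dct_expectation(Z, denoise=box_denoise):
    """Estimate the expectation of every DCT coefficient from the pixels Z."""
    Z_hat = denoise(Z)                       # expectation of each pixel
    return blockwise_dct(Z_hat, dct_matrix(8))  # back to the DCT domain
```

For a perfectly flat image, all estimated AC expectations are zero, in agreement with the remark that a zero-mean assumption only holds when the pixel expectation is constant over a block. To express the result in units of quantized coefficients, each value would additionally be divided by the corresponding quantization step.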
4.2 A local estimation of b
where \(\hat{v}_{i+8s,j+8t}\) is the estimated expectation of each DCT coefficient obtained with the denoising filter previously defined. As in the WS JPEG algorithm, this approach raises the problem of estimating the scale parameter for blocks located on the sides of the image. In the present paper, as in the WS JPEG method, it is proposed not to use those blocks in the test.
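A local scale estimate of this kind can be sketched as follows. Since the exact estimator is not reproduced here, the code assumes one plausible form: the scale \(b\) of a coefficient is the mean absolute deviation \(|v - \hat{v}|\) of the coefficients of the same frequency in the eight neighbouring blocks (offsets \(8s\), \(8t\) with \(s, t \in \{-1, 0, 1\}\), centre excluded). The function name `local_scale`, the neighbourhood size and the exclusion of the centre are all assumptions of this sketch.

```python
import numpy as np


def local_scale(V, V_hat, radius=1):
    """Local Laplacian scale estimate from same-frequency neighbours.

    V:      DCT coefficients laid out block by block (shape I x J).
    V_hat:  estimated expectations, same layout as V.
    Returns an array of the same shape; blocks on the sides of the
    image are left as NaN because their neighbourhood is incomplete.
    """
    A = np.abs(V - V_hat)
    h, w = A.shape
    r = 8 * radius
    b = np.full((h, w), np.nan)
    acc = np.zeros((h - 2 * r, w - 2 * r))
    count = 0
    for s in range(-radius, radius + 1):
        for t in range(-radius, radius + 1):
            if s == 0 and t == 0:
                continue  # the inspected coefficient itself is not used
            # Shifting by multiples of 8 keeps the same DCT frequency.
            acc += A[r + 8 * s:h - r + 8 * s, r + 8 * t:w - r + 8 * t]
            count += 1
    b[r:h - r, r:w - r] = acc / count
    return b
```

The NaN border makes explicit which coefficients are excluded from the test, mirroring the choice of not using the blocks located on the sides of the image.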
4.3 A channel selection to improve the method
 1. By decompressing the JPEG image, we obtain its intensity values in the spatial domain.
 2. By using a denoising filter, we extract the raw ‘residual noise’ in the spatial domain.
 3. By using the DCT, we transform the raw ‘residual noise’ from the spatial to the frequency domain.
 4. By using the quantization table, we obtain the quantized ‘residual noise’.
 5. By rounding the quantized ‘residual noise’ in the frequency domain, we obtain the quantized and rounded ‘residual noise’.
 6. If a quantized and rounded ‘residual noise’ value equals zero, the weighting factor (WF) equals 0; otherwise, the WF equals 1.
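The six steps above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the image is assumed to be already decompressed (step 1), the residual is taken as the difference between the image and its denoised version, a 3×3 averaging filter stands in for the denoising filters used in the paper, and all function names are hypothetical.

```python
import numpy as np


def dct_matrix(n=8):
    """Orthonormal DCT-II matrix."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    D = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    D[0, :] = np.sqrt(1.0 / n)
    return D


def blockwise_dct(img, D):
    """2-D DCT of each 8x8 block."""
    h, w = img.shape
    blocks = img.reshape(h // 8, 8, w // 8, 8).transpose(0, 2, 1, 3)
    return (D @ blocks @ D.T).transpose(0, 2, 1, 3).reshape(h, w)


def box_denoise(img):
    """3x3 mean filter with edge replication (stand-in denoiser)."""
    h, w = img.shape
    p = np.pad(img, 1, mode="edge")
    return sum(p[a:a + h, c:c + w] for a in range(3) for c in range(3)) / 9.0


def channel_selection_weights(Z, quant_table, denoise=box_denoise):
    """Return the 0/1 weighting factor of every DCT coefficient.

    Z:           decompressed image (step 1), sides multiples of 8.
    quant_table: 8x8 array of quantization steps.
    """
    residual = Z - denoise(Z)                         # step 2
    R = blockwise_dct(residual, dct_matrix(8))        # step 3
    h, w = Z.shape
    Q = np.tile(quant_table, (h // 8, w // 8))
    quantized = R / Q                                 # step 4
    rounded = np.rint(quantized)                      # step 5
    return (rounded != 0).astype(np.uint8)            # step 6
```

The returned array has the same block layout as the DCT coefficients, so it can be multiplied element-wise with per-coefficient statistics to discard the ‘zero’ subset.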
The quantities reported in Table 1 are defined as follows:
- Cover channel selection ratio: denotes the ratio of the ‘nonzero’ subset to the ‘residual noise’ set of a cover image.
- Stego channel selection ratio: denotes the ratio of the ‘nonzero’ subset to the ‘residual noise’ set of a stego image.
- Cover DCT coefs. std: denotes the standard deviation of the ‘residual noise’ set from a cover image.
- Stego DCT coefs. std: denotes the standard deviation of the ‘residual noise’ set from a stego image.
- Cover JSteg selection ratio: denotes the ratio of the DCT coefficients used by JSteg in the ‘nonzero’ subset to the DCT coefficients used by JSteg in the ‘residual noise’ set from a cover image.
- Stego JSteg selection ratio: denotes the ratio of the DCT coefficients used by JSteg in the ‘nonzero’ subset to the DCT coefficients used by JSteg in the ‘residual noise’ set from a stego image.
- Cover and stego selection similarity: denotes the ratio of the same position in the ‘nonzero’ subset before and after embedding.

Table 1 Ratio (%) comparison before and after embedding

| Inspected images index | No.1 | No.2 | No.3 | No.4 | No.5 | No.6 | No.7 | No.8 | No.9 | No.10 | On average |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Cover channel selection ratio | 0.23 | 0.17 | 0.56 | 0.61 | 0.21 | 0.03 | 0.87 | 0.41 | 1.23 | 0.33 | 0.63 |
| Stego channel selection ratio | 0.23 | 0.17 | 0.56 | 0.62 | 0.21 | 0.04 | 0.88 | 0.42 | 1.22 | 0.34 | 0.64 |
| Cover DCT coefs. std | 0.98 | 1.01 | 1.06 | 1.03 | 0.90 | 1.07 | 1.02 | 0.93 | 1.26 | 1.03 | 7.45 |
| Stego DCT coefs. std | 0.98 | 1.00 | 1.06 | 1.03 | 0.89 | 1.07 | 1.03 | 0.90 | 1.27 | 1.03 | 7.52 |
| Cover JSteg selection ratio | 1.12 | 0.81 | 2.46 | 2.49 | 0.80 | 0.08 | 5.07 | 2.42 | 7.56 | 0.34 | 1.44 |
| Stego JSteg selection ratio | 4.48 | 2.85 | 7.63 | 7.60 | 2.27 | 1.07 | 17.1 | 7.92 | 20.5 | 1.04 | 4.95 |
| Cover and stego selection similarity | 89.5 | 91.5 | 94.2 | 93.0 | 80.7 | 80.7 | 93.9 | 93.7 | 93.3 | 93.9 | 92.8 |
In our proposed statistical test, the number of coefficients selected for the detection should remain nearly the same before and after embedding. As Table 1 illustrates, the cover channel selection ratio and the stego channel selection ratio are essentially identical, which shows that the proportion of coefficients used for the test is nearly the same before and after embedding. Similarly, the comparison of the cover DCT coefs. std and the stego DCT coefs. std allows us to verify our assumption that the embedding does not change the statistical properties of the ‘residual noise’ much. In addition, those numbers also show that, after rejection of the content, the ‘residual noise’ standard deviation is very small compared to that of the original DCT coefficients (see also Figures 2 and 3), which thus permits a better detection of the modifications due to JSteg embedding. The high value of the cover and stego selection similarity indicates that most of the selected ‘residual noise’ coefficients are located at the same positions. The only notable difference lies between the cover JSteg selection ratio and the stego JSteg selection ratio. It should be noted that if all DCT coefficients used by JSteg were included in the ‘nonzero’ subset, the ratio would equal 100%. It is observed that only a few of the DCT coefficients used by the JSteg algorithm are included in the ‘nonzero’ subset. Nevertheless, after embedding, the stego JSteg selection ratio is much larger than the cover JSteg selection ratio. It can be assumed that, by using a WF, more ‘residual noise’ coefficients from the embedding positions are counted. Besides, since the embedding positions are unknown prior to embedding, the very low cover JSteg selection ratio is reasonable.
By investigating the ‘nonzero’ and ‘zero’ subsets, although we cannot capture all the embedding positions in the DCT domain, the selected coefficients are sufficient to detect JSteg steganography. Besides, none of the coefficients in the ‘zero’ subset are counted in our proposed test. On average, for a cover image of size 512×512, 0.63% of the coefficients are kept to compute the test; 0.64% of the coefficients from a stego image are used. With the embedding rate R=0.05, most of the DCT coefficients remain the same before and after embedding. Thus, it is not necessary to compute the LR values for these coefficients. Furthermore, the LR values of the DCT coefficients that carry no embedded information are likely to mask or disturb the LR values of the DCT coefficients modified by JSteg embedding.
4.4 Design of proposed test
where the channel-selection decision statistic \(\widehat{\Lambda}_{\mathrm{cs}}(u_{k,l}) = \widehat{\Lambda}(u_{k,l}) \cdot w_{k,l}\) for a single DCT coefficient is given, and the weighting factor \(w_{k,l}\) selects the DCT channel. Next, let us study \(\widehat{\Lambda}(u_{k,l})\) to verify the effectiveness of our proposed test.
4.5 Comparison with prior art
see details in Appendix C. This expression highlights the well-known fact that the WS consists of three terms: 1) the term \(w_{\sigma}\), which is a weight so that pixels or DCT coefficients with the highest variance have the smallest importance, 2) the term \((k - \bar{k}) = \pm 1\) according to the LSB of \(k\) and 3) the term \((\Delta k - \mu)\).
which is also made of three terms; the first two are roughly similar to the first two terms of the WS: 1) the term \(w_{b}\) is a weight so that DCT coefficients with the highest ‘scale’ \(b\) have the smallest importance (note that the variance is proportional to \(b^{2}\)); 2) the term \((k - \bar{k}) = \pm 1\) according to the LSB of \(k\). However, in the expression of the LR based on the Laplacian model, the term \((\Delta k - \mu)\) of the WS is replaced with its sign. This shows that the statistical tests based on the Laplacian model and on the Gaussian model are essentially similar.
5 Numerical simulations
5.1 Results on simulated images
One of the main contributions of this paper is to show that the hypothesis testing theory can be applied in practice to design a statistical test with known statistical properties for JSteg steganalysis.
5.2 Results on real images
Another contribution of this paper is to design the optimal test with estimated parameters to break JSteg algorithm in a practical case.
To verify the relevance of the proposed methodology, the proposed statistical test is compared with two other detectors. The first chosen competitor is the statistical test proposed in [22], as it is also based on a Laplacian model but does not take into account the distribution parameters as nuisance parameters; it considers that DCT coefficients are i.i.d., following a Laplacian distribution with zero mean. The comparison with this test is meaningful as it allows us to measure how much the detection performance is improved by removing the assumption that the DCT coefficients of each subband are i.i.d. The second chosen competitor is the WS [15] due to its similarity with the proposed statistical test, see details in Section 4.5.
For a large-scale verification, it is proposed to use the ‘break our steganographic system’ (BOSS) database, made of 10,000 grayscale images of size 512×512 pixels, with payload R=0.05. Prior to our experiments, the images were compressed in JPEG using the Linux command convert, which uses the standard quantization table. Note also that all the JSteg embedding was performed using Matlab source code we developed based on Phil Sallee’s Jpeg Toolbox^{f}. Four denoising methods have been tested to estimate the expectation of each DCT coefficient, namely the K-SVD, the BM3D, the NL means and the wavelet denoising algorithms.
Among the four denoising algorithms that have been tested, the BM3D achieves the best performance, but it can be observed in Figure 10 that the performance obtained using the K-SVD and the wavelet denoising methods is also very good. The performance of the NL means method is comparable with that of the WS detector [15].
6 Conclusions
This paper aims at improving the optimal detection of data hidden within the DCT coefficients of JPEG images. Its main originality is that the usual Laplacian model is used as a statistical model of DCT coefficients but, contrary to what is usually proposed, it is not assumed that all DCT coefficients from a subband are i.i.d. This leads us to consider the Laplacian distribution parameters, namely the expectation \(\mu\) and the scale parameter \(b\), as nuisance parameters: they are of no interest for the detection of hidden data, but they must be carefully taken into account to design an efficient statistical test. Numerical results show that, by estimating those nuisance parameters, the Laplacian model allows the design of an accurate statistical test which outperforms the WS detector. The comparison with the optimal detector based on the Laplacian model and on the assumption that all DCT coefficients of a subband are i.i.d. shows the relevance of the proposed approach.
A possible future work would be to apply this approach with a state-of-the-art statistical model of DCT coefficients, such as the generalized Gaussian or the generalized gamma model. This could provide improvements in the detection performance at the cost of a higher complexity.
7 Endnotes
^{a} In this paper, we assume, without loss of generality, that both width and height of the inspected image are multiples of 8.
^{b} In practice, DCT coefficients belong to set [−1024,…,1023], see [22].
^{c} Note that we refer to Lindeberg’s CLT, whose conditions are easily verified in our case, because the random variables are independent but are not i.i.d.
^{d} Image Processing OnLine journal is available at: http://www.ipol.im
^{e} Source codes are available at: http://dde.binghamton.edu
^{f} Phil Sallee’s Jpeg Toolbox is available at: http://dde.binghamton.edu/download/jpeg_toolbox.zip
8 Appendix
A Quantized Laplacian pmf
Now consider the random variable Y=⌊X/Δ⌋ resulting from the quantization of X; its pmf is straightforward to establish. Let us first consider the case Δ(k+1/2)<μ (due to the symmetry of the Laplacian pdf, the case Δ(k−1/2)>μ is treated similarly).
which corresponds to the pmf given in Equation (14). The case Δ(k−1/2)<μ<Δ(k+1/2) is treated similarly.
C Log-likelihood ratio calculation
B LR based on the Gaussian model (WS)
Acknowledgements
The Matlab codes will be published upon paper acceptance. The work of FR, RC and CZ is funded by Troyes University of Technology (UTT) strategic program COLUMBO. The PhD thesis of TQ is funded by the China Scholarship Council (CSC) program.
References
 1.R Böhme, Advanced Statistical Steganalysis (Springer, New York, 2010).CrossRefMATHGoogle Scholar
 2.J Fridrich, Steganography in Digital Media: Principles, Algorithms, and Applications (Cambridge University Press, Cambridge, 2009).CrossRefGoogle Scholar
 3.I Cox, M Miller, J Bloom, J Fridrich, T Kalker, Digital Watermarking and Steganography (Morgan Kaufmann, Burlington, 2007).Google Scholar
 4.J Fridrich, J Kodovskỳ, in Information Hiding. Steganalysis of LSB replacement using parityaware features (Springer,New York2013), pp. 31–45.CrossRefGoogle Scholar
 5. T Zhang, X Ping, in Proceedings of the 2003 ACM Symposium on Applied Computing. A fast and effective steganalytic technique against Jsteg-like algorithms (ACM, New York, 2003), pp. 307–311.
 6. R Cogranne, C Zitzmann, L Fillatre, F Retraint, I Nikiforov, P Cornu, in Information Theory Proceedings (ISIT), 2011 IEEE International Symposium On. Statistical decision by using quantized observations (IEEE, New York, 2011), pp. 1210–1214.
 7. R Cogranne, C Zitzmann, L Fillatre, F Retraint, I Nikiforov, P Cornu, in Information Hiding. A cover image model for reliable steganalysis (Springer, New York, 2011), pp. 178–192.
 8. C Zitzmann, R Cogranne, F Retraint, I Nikiforov, L Fillatre, P Cornu, in Information Hiding. Statistical decision methods in hidden information detection (Springer, New York, 2011), pp. 163–177.
 9. J Fridrich, M Goljan, R Du, in Proceedings of the 2001 Workshop on Multimedia and Security: New Challenges. Reliable detection of LSB steganography in color and grayscale images (ACM, New York, 2001), pp. 27–30.
 10. S Dumitrescu, X Wu, Z Wang, Detection of LSB steganography via sample pair analysis. Signal Process. IEEE Trans. 51(7), 1995–2007 (2003).
 11. AD Ker, P Bas, R Böhme, R Cogranne, S Craver, T Filler, J Fridrich, T Pevný, in Proceedings of the First ACM Workshop on Information Hiding and Multimedia Security. Moving steganography and steganalysis from the laboratory into the real world (ACM, New York, 2013), pp. 45–58.
 12. J Fridrich, M Goljan, in Electronic Imaging 2004. On estimation of secret message length in LSB steganography in spatial domain (International Society for Optics and Photonics, Washington, 2004), pp. 23–34.
 13. O Dabeer, K Sullivan, U Madhow, S Chandrasekaran, B Manjunath, Detection of hiding in the least significant bit. Signal Process. IEEE Trans. 52(10), 3046–3058 (2004).
 14. AD Ker, R Böhme, in Electronic Imaging 2008. Revisiting weighted stego-image steganalysis (International Society for Optics and Photonics, Washington, 2008), pp. 681905–681905.
 15. R Böhme, in Information Hiding. Weighted stego-image steganalysis for JPEG covers (Springer, New York, 2008), pp. 178–194.
 16. R Cogranne, C Zitzmann, F Retraint, IV Nikiforov, P Cornu, L Fillatre, A local adaptive model of natural images for almost optimal detection of hidden data. Signal Process. 100, 169–185 (2014).
 17. EY Lam, JW Goodman, A mathematical analysis of the DCT coefficient distributions for images. Image Process. IEEE Trans. 9(10), 1661–1666 (2000).
 18. F Muller, Distribution shape of two-dimensional DCT coefficients of natural images. Electron. Lett. 29(22), 1935–1936 (1993).
 19. JH Chang, JW Shin, NS Kim, SK Mitra, Image probability distribution based on generalized gamma function. Signal Process. Lett. IEEE. 12(4), 325–328 (2005).
 20. P Sallee, Model-based methods for steganography and steganalysis. Int. J. Image Graph. 5(01), 167–189 (2005).
 21. R Böhme, A Westfeld, Breaking Cauchy model-based JPEG steganography with first order statistics. Computer Security–ESORICS, 125–140 (2004).
 22. C Zitzmann, R Cogranne, L Fillatre, I Nikiforov, F Retraint, P Cornu, in Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference On. Hidden information detection based on quantized Laplacian distribution (IEEE, New York, 2012), pp. 1793–1796.
 23. J Kodovsky, J Fridrich, Quantitative structural steganalysis of Jsteg. Inform. Forensics Secur. IEEE Trans. 5(4), 681–693 (2010).
 24. K Lee, A Westfeld, S Lee, in Digital Watermarking. Category attack for LSB steganalysis of JPEG images (Springer, New York, 2006), pp. 35–48.
 25. S Lyu, H Farid, Steganalysis using higher-order image statistics. Inform. Forensics Secur. IEEE Trans. 1(1), 111–119 (2006).
 26. T Pevny, J Fridrich, Multiclass detector of current steganographic methods for JPEG format. Inform. Forensics Secur. IEEE Trans. 3(4), 635–650 (2008).
 27. P Bas, T Filler, T Pevný, in Information Hiding, 13th International Workshop, ed. by Filler T. Break our steganographic system: the ins and outs of organizing BOSS (IEEE, New York, 2011).
 28. T Denemark, V Sedighi, V Holub, R Cogranne, J Fridrich, in IEEE Workshop on Information Forensic and Security, Atlanta, GA. Selection-channel-aware rich model for steganalysis of digital images (IEEE, New York, 2014).
 29. W Tang, H Li, W Luo, J Huang, in Proceedings of the 2nd ACM Workshop on Information Hiding and Multimedia Security. Adaptive steganalysis against WOW embedding algorithm (ACM, New York, 2014), pp. 91–96.
 30. T Qiao, C Zitzmann, R Cogranne, F Retraint, in Proceedings of the 2nd ACM Workshop on Information Hiding and Multimedia Security. Detection of Jsteg algorithm using hypothesis testing theory and a statistical model with nuisance parameters (ACM, New York, 2014), pp. 3–13.
 31. T Qiao, C Zitzmann, R Cogranne, F Retraint, in IEEE International Conference on Image Processing (ICIP). Statistical detection of Jsteg steganography using hypothesis testing theory (IEEE, New York, 2014), pp. 5517–5521.
 32. J Nakamura, Image Sensors and Signal Processing for Digital Still Cameras (CRC Press, Boca Raton, 2005).
 33. WB Pennebaker, JL Mitchell, JPEG: Still Image Data Compression Standard (Springer, Germany, 1993).
 34. D Upham, Jsteg steganographic algorithm (1999). Available on the internet: http://www.filewatcher.com/m/jpegjstegv4.diff.gz.88780.html.
 35. EL Lehmann, JP Romano, Testing Statistical Hypotheses (Springer, Germany, 2006).
 36. TH Thai, R Cogranne, F Retraint, in ICIP. Steganalysis of Jsteg algorithm based on a novel statistical model of quantized DCT coefficients (IEEE, New York, 2013), pp. 4427–4431.
 37. T Thai, R Cogranne, F Retraint, Statistical model of quantized DCT coefficients: Application in the steganalysis of Jsteg algorithm. IEEE Trans. Image Process. Publ. IEEE Signal Process. Soc. 23(5), 1980–1993 (2014).
 38. R Cogranne, F Retraint, An asymptotically uniformly most powerful test for LSB matching detection. IEEE Trans. Information Forensics and Security. Publ. IEEE Signal Process. Soc. 8(3), 464–476 (2013).
 39. TH Thai, F Retraint, R Cogranne, in Image Processing (ICIP) 2012 19th IEEE International Conference On. Statistical model of natural images (IEEE, New York, 2012), pp. 2525–2528.
 40. M Lebrun, An analysis and implementation of the BM3D image denoising method. Image Processing On Line. 2, 175–213 (2012).
 41. M Lebrun, A Leclaire, An implementation and detailed analysis of the K-SVD image denoising algorithm. Image Processing On Line. 2, 96–133 (2012).
 42. A Buades, B Coll, JM Morel, in Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference On, 2. A non-local algorithm for image denoising (IEEE, New York, 2005), pp. 60–65.
 43. J Lukas, J Fridrich, M Goljan, Digital camera identification from sensor pattern noise. Inform. Forensics Secur. IEEE Trans. 1(2), 205–214 (2006).
 44. R Cogranne, F Retraint, C Zitzmann, I Nikiforov, L Fillatre, P Cornu, Hidden information detection using decision theory and quantized samples: Methodology, difficulties and results. Digital Signal Process. 24, 144–161 (2014).
Copyright information
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.