1 Introduction

The rapid development of information and communication technologies has helped facilitate people’s social interactions. Online social media platforms like Twitter provide people with a new way to build social relationships, share their daily lives, and express their emotions. However, many online users frequently (and often unintentionally) share personal information online, which can lead to unwanted disclosures of private information about themselves or other people in their social networks. Figure 1 shows several imaginary but realistic online posts with such unintended privacy disclosures on Twitter, generated based on examples in a research dataset of privacy-disclosing tweets constructed by Song et al. (2018). Although people can check their online posts manually to avoid privacy disclosures, many online users are not sufficiently aware of such privacy issues and do not necessarily know when and what to check. Therefore, automated solutions that help online users identify such issues and take proper actions are important, and such solutions are the focus of our work.

Fig. 1 Several illustrative examples of possible privacy disclosures on online social media platforms

Past studies on privacy disclosure detection attempted to solve this problem with different machine learning methods. Traditional methods try to detect privacy disclosures in user profiles or user settings, but not in user-generated content (UGC), leading to incomplete detection. More recently, many researchers have started studying privacy disclosure detection in UGC by analysing the pictures and/or texts it contains, thereby extending the scope of detection.

Recently, some researchers have used the multi-label text classification (MLTC) framework to model the privacy disclosure problem (Song et al. 2018; Chen et al. 2020). MLTC is an important task in the field of natural language processing (NLP). Different from multi-class text classification (MCTC), which classifies a given piece of text into exactly one of multiple class labels, MLTC aims to tag a given piece of text with multiple (i.e., one or more) content-specific labels. In Song et al. (2018) and Chen et al. (2020), private information is first divided into eight main categories, which are then subdivided into 32 label categories that reflect the types of privacy possibly disclosed. Their methods aim to improve the prediction results by considering the co-occurrence relation between labels. For example, the label “Health condition” usually appears with the label “Treatment”, and the label “Occupation” usually appears with the label “Salary”. However, those two methods do not consider label-text correlations, i.e., they ignore the fact that some key words or phrases in the input text can help indicate the relevant privacy labels. For example, a location name in the input text may indicate that the text discloses the “Current location” or the “Place planning to go”. We follow their approach and model privacy disclosure detection as an MLTC problem. Our proposed framework takes an online post as the input, and outputs a number of privacy-relevant labels that indicate potential disclosure of different types of personal information in the input online post.

Considering that privacy disclosure is a universal problem in people’s daily life, new frameworks with better performance on privacy disclosure detection are needed. The aim of our work is to provide a more effective MLTC-based privacy disclosure detection algorithm to facilitate fine-grained text privacy detection. As mentioned before, current MLTC privacy disclosure models are limited by their lack of consideration of the relationship between texts and labels. In order to improve the performance of privacy-disclosing post detection, we propose a model that combines three different sources of relevant information, namely the text information, the label-to-text correlation and the label-to-label correlation, to detect privacy-disclosing online posts more comprehensively. Our model extracts the text representations through a double-attention mechanism, as Xiao et al. (2019) did, which measures the contribution of each word to each privacy-relevant label. The label-to-label correlation is incorporated into the final text representation via a graph convolutional network (GCN). We propose a new feature fusion mechanism assisted by the GCN to make the fused feature more comprehensive. We utilize the label-to-label correlation to obtain the proposed compensation coefficients from both the self-attention and the label-attention text representations. We summarize the main contributions of our work as follows:

  • A new privacy disclosure detection model with multi-label text classification is proposed. Our model presents a new fine-grained privacy disclosure detection algorithm and outputs multiple privacy-aware labels as the possible leaked privacy. From the perspective of the detection performance, our model provides a better solution to the fine-grained privacy disclosure detection on the UGC.

  • Our proposed model considers three different sources of relevant information for the MLTC task: the input text itself, the label-to-text correlation, and the label-to-label correlation.

  • A new feature fusion mechanism assisted by a GCN is proposed to construct comprehensive text representations with the guidance of the label-to-label correlation. The idea of compensation coefficients is proposed in the feature fusion mechanism, which reflects the compensation relationship between self-attention and label-attention.

  • A series of experiments on a public privacy-disclosing tweet dataset showed that our proposed model outperformed selected state-of-the-art models significantly and consistently. Our code has been released to help others conduct follow-up research (see Footnote 1).

The rest of the paper is organized as follows. Section 2 introduces the related work. Section 3 elaborates the proposed MLTC-based model for privacy detection. Section 4 shows and discusses the experiment results. Section 5 concludes our work and discusses the future work. Section 6 makes statements on financial or non-financial interests that are directly or indirectly related to the work submitted for publication.

2 Related work

2.1 Privacy disclosure analysis

The problem of online privacy disclosures has attracted the attention of many researchers. Some researchers studied this problem based on analysis of user profiles (Biega et al. 2017; Eslami et al. 2017; Huang and Paul 2019) or privacy settings of user accounts (Raber and Krüger 2018; Sanchez et al. 2020). Biega et al. (2017) proposed a privacy-aware framework that leverages solidarity in a large community to scramble user interaction histories, in order to disturb the collection of information from user profiles by online service providers. To minimize users’ privacy risks, Eslami et al. (2017) proposed an alternative solution, where posts of different users are split and merged into synthetic mediator profiles. Raber and Krüger (2018) studied privacy settings of user accounts by observing the context factors and personality measures that can be used to predict the correct privacy level out of seven privacy levels. Sanchez et al. (2020) considered how to model users’ privacy preferences for data sharing and processing in the IoT and fitness domain, paying specific attention to GDPR compliance.

Some other researchers such as Tran et al. (2016) and Mao et al. (2011) also proposed classifiers to detect privacy disclosures in user-generated online posts. Tran et al. (2016) proposed Privacy-CNH, a binary classification framework that utilizes hierarchical features, including both object and convolutional features, in a deep learning model to detect whether a photo is private or not. Mao et al. (2011) analysed privacy disclosures on Twitter by building binary classifiers to detect three types of privacy disclosure: divulging vacation plans, tweeting under the influence of alcohol, and revealing medical conditions. However, all these studies focused on privacy disclosure detection at a coarse-grained level. They used frameworks or classifiers to implement relatively simple analyses of privacy disclosures, normally based on less comprehensive privacy categories, and were therefore unable to cover some specific privacy disclosure scenarios.

In order to achieve finer-grained analysis, Song et al. (2018) proposed a taxonomy-guided multi-task learning model to detect what personal aspects of online users are disclosed in online posts. They also constructed a dataset of privacy-disclosing tweets covering 32 privacy-relevant personal aspects. Similarly, Chen et al. (2020) proposed GrHA, a fine-grained privacy detection network, to improve the performance of the model proposed in (Song et al. 2018). The above two proposed methods aim to improve the prediction results by considering label co-occurrences, but they did not consider label-to-text correlations explicitly.

2.2 Multi-label text classification

Traditional machine learning methods (Kumar and Daumé III 2012; Jacob et al. 2008) have been widely used to deal with MLTC tasks. Kumar and Daumé III (2012) proposed the GO-MTL model, which uses a grouping and overlap mechanism to enhance the semantic correlations in MLTC tasks. Likewise, Jacob et al. (2008) studied clustered multi-task learning for MLTC tasks. Although these machine learning methods utilize multiple hand-crafted features to enhance the semantic representations in MLTC tasks, they overlook the deep semantic features shared between the input text and the multiple labels.

In recent years, deep learning has made great progress. Deep models such as CNNs (Liu et al. 2017; Kurata et al. 2016; Kim 2014) and RNNs (Liu et al. 2016; Chen et al. 2017) have therefore been used to implement end-to-end MLTC. In more recent studies, researchers have also proposed attention-based models such as DocBERT (Adhikari et al. 2019), as well as other methods such as SGM (Yang et al. 2018) and LSAN (Xiao et al. 2019), to consider the label-to-text correlation in the MLTC problem. Adhikari et al. (2019) proposed the DocBERT model as a much simpler BERT model with competitive accuracy at a far more modest computational cost for MLTC tasks. Yang et al. (2018) addressed the MLTC problem by automatically capturing the correlations between labels as well as the most informative words when predicting different labels. Xiao et al. (2019) used self-attention and label-attention to obtain better representations of the input text in MLTC tasks. Label co-occurrences are a vital source of information when dealing with the MLTC problem. More specifically, some labels often appear together with other labels due to their semantic relation. However, most existing methods focus only on optimizing the process of feature extraction and do not consider label co-occurrences. By utilizing a GCN, Ma et al. (2021) proposed LDGN (label-specific dual graph neural network) to improve the MLTC representations by including label co-occurrences. Although they considered label co-occurrences to a certain extent, their method is limited in how the GCN is combined with the feature extraction module: the GCN is used only to optimize the text representation with label co-occurrences, ignoring the diversity of the text representation and the labels’ guidance on fusing different feature vectors.

3 Proposed method

In this section, we introduce the GCN-based double attention network, as shown in Fig. 2. The network includes four major components: (1) an input text feature encoder that transforms the input text into word-level semantic vectors; (2) a double-attention text representation component that enhances the important word representations of the text combining both text information and label information; (3) a GCN-assisted feature fusion mechanism that utilizes the label-to-label correlation acquired by GCN to guide the double-attention information fusion process; and (4) a label probability output component that predicts the probabilities of various privacy-relevant labels.

Fig. 2 The architecture of the proposed network

3.1 Problem formulation

Let \(\mathbb {D}=\left\{ (x_i, y_i)\right\} _{i=1}^N\) denote the set of N labelled texts, where \(x_i\) represents the i-th input text and \(y_i\in \{0,1\}^L\) represents its corresponding label vector. Here, L denotes the total number of privacy-relevant labels. The goal of the proposed method is to learn the output probability of each label from the input text, so as to identify the most relevant labels.
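To make the label vector concrete, the following minimal sketch encodes a set of label names as a 0/1 vector of length L. The five-label vocabulary is a hypothetical subset (the names are borrowed from the examples in the Introduction); the dataset used in this paper has L = 32 labels, and the helper name `multi_hot` is an assumption for illustration only.

```python
# Hypothetical subset of the label vocabulary; the real dataset has 32 labels.
LABELS = ["Health condition", "Treatment", "Occupation", "Salary", "Current location"]


def multi_hot(active_labels, vocabulary=LABELS):
    """Encode a set of label names as a 0/1 vector y of length L."""
    active = set(active_labels)
    unknown = active - set(vocabulary)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if name in active else 0 for name in vocabulary]


# A post that discloses both the occupation and the salary of its author.
y = multi_hot({"Occupation", "Salary"})
print(y)  # [0, 0, 1, 1, 0]
```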

3.2 Input text feature encoder

Given a text \(x_i\) containing M words \((x_i=\left\{ w_{i1}, w_{i2}, \cdots , w_{iM}\right\} )\), the word2vec method (Le and Mikolov 2014) is adopted to obtain the embedding vector based on the input, which is denoted as \(\mathbf {E_s} \in \mathbb {R}^{M \times d_1}\), where \(d_1\) denotes the embedding dimension.

For fair comparisons, we used the same feature extraction structure as the baseline models (Chen et al. 2020; Xiao et al. 2019; Ma et al. 2021), namely a bidirectional long short-term memory (BiLSTM) network (Zhou et al. 2016), to process the embedded vectors. The formula is as follows:

$$\begin{aligned} \begin{aligned}&\textbf{H}=\left\{ \overrightarrow{\textbf{H}}_r, \overleftarrow{\textbf{H}}_l\right\} ,\\ \end{aligned} \end{aligned}$$
(1)

where \(\overrightarrow{\textbf{H}}_r, \overleftarrow{\textbf{H}}_l \in \mathbb {R}^{M\times d_2}\) represent the forward and backward text representations, respectively. The whole text can be represented as \(\textbf{H} \in \mathbb {R}^{M\times 2d_2}\).

3.3 Double-attention text representation

We use a double attention mechanism to generate text- and label-specific representations from the output of the BiLSTM. A self-attention model is adopted to capture the long-term dependence of words in \(\textbf{H}\). Meanwhile, to extract the text attention from the corresponding labels, a label-specific attention model is used as the supplementary information.

3.3.1 Self-attention model

Self-attention models have shown considerable merits in assessing the importance of word representations. Therefore, we adopt a self-attention mechanism (Lin et al. 2017) to reinforce the semantic representation of the text based on the word-to-word correlations. Different from traditional self-attention algorithms, the self-attention sentence embedding algorithm (Lin et al. 2017) uses multiple hops of attention calculated from the LSTM outputs \(\textbf{H}\) to focus on different aspects of the meaning of the sentence. Since the output has L labels, we use L attention hops, so that the self-attention weights reflect the effect of each of the M words on each of the L labels. The calculation of attention weights can be described as follows:

$$\begin{aligned} \textbf{A}_s={\text {softmax}}\left( \textbf{W}_{s2} \tanh \left( \textbf{W}_{s1} \textbf{H}^T\right) \right) , \end{aligned}$$
(2)

where \(\textbf{A}_s \in \mathbb {R}^{L \times M}\) are the self-attention weights that indicate the effect of each word on each label, and \(\textbf{W}_{s1} \in \mathbb {R}^{d_3 \times 2d_2}, \textbf{W}_{s2} \in \mathbb {R}^{L \times d_3}\) are the parameters to be trained. Then, the attention weights are utilized to update the text representation:

$$\begin{aligned} \textbf{Q}_s=\textbf{A}_s \times \textbf{H}^T. \end{aligned}$$
(3)
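Equations (2) and (3) can be illustrated with a small dependency-free sketch. It assumes that \(\textbf{H}\) is stored as an M × 2d_2 list of rows, that the softmax is taken over the M words for each label, and that the product in Eq. (3) is arranged so that the result has one 2d_2-dimensional row per label. The toy sizes and all weight values are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def softmax_rows(a):
    """Numerically stable softmax applied to every row."""
    out = []
    for row in a:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def self_attention(H, W_s1, W_s2):
    """Return (A_s, Q_s) for H: M x 2d2, W_s1: d3 x 2d2, W_s2: L x d3."""
    hidden = [[math.tanh(v) for v in row] for row in matmul(W_s1, transpose(H))]  # d3 x M
    A_s = softmax_rows(matmul(W_s2, hidden))  # L x M, each row sums to 1 over the words
    Q_s = matmul(A_s, H)                      # L x 2d2 (assumed orientation)
    return A_s, Q_s


# Toy sizes (hypothetical): M = 3 words, 2*d2 = 4, d3 = 2, L = 2 labels.
H = [[0.1, 0.2, 0.3, 0.4],
     [0.5, 0.1, 0.0, 0.2],
     [0.3, 0.3, 0.1, 0.1]]
W_s1 = [[0.2, -0.1, 0.4, 0.0],
        [0.1, 0.3, -0.2, 0.5]]
W_s2 = [[0.6, -0.4],
        [-0.3, 0.7]]
A_s, Q_s = self_attention(H, W_s1, W_s2)
```

Each row of `A_s` is a distribution over the words of the text, so each row of `Q_s` is a label-specific weighted average of the BiLSTM states.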

3.3.2 Label-attention model

Apart from obtaining text attention from the text itself, the label-attention model (Xiao et al. 2019) is adopted to extract text attention from the corresponding labels. The labels’ semantic information is acquired with the word2vec method, which is denoted as \(\textbf{E}_l \in \mathbb {R}^{L \times d_1}\).

To capture a better semantic representation with the guidance of the output labels, the label-attention mechanism computes the attention weights from the relationship between the labels and the text as follows: \(\textbf{A}_l=\textbf{E}_l \times \textbf{H}^T\), where \(\textbf{A}_l \in \mathbb {R}^{L \times M}\) are the label-specific attention weights that indicate the effect of each word on each label. These label-specific attention weights are then utilized to enhance the label-aware information in the text semantic representation: \(\textbf{Q}_l=\textbf{A}_l \times \textbf{H}^T\).
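The label-attention step can be sketched in the same way. The example below assumes that the label embeddings have already been brought to the same dimension as the rows of \(\textbf{H}\) (so that the product with \(\textbf{H}^T\) is defined) and that \(\textbf{H}\) is stored as an M × 2d_2 list of rows; both are assumptions of this sketch, and the integer toy values are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def label_attention(H, E_l):
    """Return (A_l, Q_l) for H: M x 2d2 and E_l: L x 2d2 (assumed aligned dimensions)."""
    A_l = matmul(E_l, transpose(H))  # L x M: similarity of each label to each word
    Q_l = matmul(A_l, H)             # L x 2d2: label-aware text representation
    return A_l, Q_l


# Toy example: M = 2 words, 2*d2 = 2, L = 2 labels.
H = [[1, 0],
     [0, 2]]
E_l = [[1, 1],
       [0, 1]]
A_l, Q_l = label_attention(H, E_l)
print(A_l)  # [[1, 2], [0, 2]]
print(Q_l)  # [[1, 4], [0, 4]]
```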

3.4 GCN-assisted feature fusion

In this section, the GCN-assisted feature fusion mechanism is described to construct comprehensive text representations with the guidance of the label-to-label correlation.

We use a GCN framework to extract a label-to-label correlation matrix. With the guidance of the correlation matrix, we enhance the text representations by utilizing the proposed compensation coefficients to implement the algorithm of feature fusion.

3.4.1 GCN-based label-to-label correlation extraction

Graph convolutional networks (GCNs) (Kipf and Welling 2016) were proposed to better model the relationships among the nodes of a graph. A GCN uses an adjacency matrix to characterize the graph structure and a convolutional network to capture the correlations among different nodes, and outputs a correlation matrix. In our work, we aim to extract label co-occurrences through a GCN. A label co-occurrence refers to the simultaneous occurrence of two or more labels in the same text. For example, the two labels “Salary” and “Occupation” have a high probability of co-occurrence due to their semantic relation (i.e., an occupation is normally associated with a salary). Therefore, we utilize the GCN to transform such label-to-label relationships (inferred from label co-occurrences and their semantic relationships) into mathematical representations.

As Fig. 3 shows, the output labels are represented as a weighted label graph \((\textbf{V},\textbf{E})\), where each node represents a label embedding and each edge’s weight refers to the co-occurrence frequency of the two adjacent labels. More specifically, each node is initialized to the embedded vector of the corresponding label, and each edge weight is calculated as the co-occurrence frequency of the two labels represented by the two adjacent nodes, based on information in the training set. In Fig. 3, the symbol \(\#\) represents the number of occurrences. For example, \(\#(a)\) represents the number of tweets with the label a in the training set and \(\#(a,b)\) represents the number of tweets with both labels a and b in the training set. We use \(\textbf{P}\) to represent the initial co-occurrence adjacency matrix. According to Chen et al. (2020), considering the noisy co-occurrences caused by the sparse real-world dataset, the initial co-occurrence adjacency matrix \(\textbf{P}\) should be binarized and revised as follows:

$$\begin{aligned} a_j^k= {\left\{ \begin{array}{ll}\frac{u}{\sum _{x=1}^L p_j^k}, &{} \text { if } j \ne k, \\ 1-u, &{} \text { if } j=k,\end{array}\right. } \end{aligned}$$
(4)

where \(p_j^k\) represents the co-occurrence frequency of label j to label k and \(a_j^k\) represents the revised co-occurrence frequency. u represents the trade-off parameter that balances the weights between the label itself and its correlated labels. We use \(\textbf{A}\) to represent the revised adjacency matrix. In our work, we use the same revised adjacency matrix as Chen et al. (2020) did. The trade-off parameter is set to 0.2.
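The construction of the label graph can be sketched as follows. This is one possible reading rather than the exact procedure: it assumes that the initial edge weight is the conditional frequency #(a,b)/#(a), that binarization uses a threshold `tau` (a hypothetical value that is not given in the text), and that the off-diagonal mass u of Eq. (4) is shared equally among the labels that survive binarization. Only u = 0.2 comes from the text.

```python
def cooccurrence(label_sets, num_labels):
    """Count #(a) and #(a, b) over a list of per-tweet label-index sets."""
    single = [0] * num_labels
    pair = [[0] * num_labels for _ in range(num_labels)]
    for labels in label_sets:
        for a in labels:
            single[a] += 1
            for b in labels:
                if a != b:
                    pair[a][b] += 1
    return single, pair


def revised_adjacency(single, pair, u=0.2, tau=0.4):
    """Binarize the conditional frequencies and re-weight each row (assumed reading of Eq. 4)."""
    n = len(single)
    A = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Labels k whose conditional frequency #(j, k) / #(j) reaches the threshold.
        neighbours = [k for k in range(n)
                      if k != j and single[j] > 0 and pair[j][k] / single[j] >= tau]
        for k in neighbours:
            A[j][k] = u / len(neighbours)  # share u among the correlated labels
        A[j][j] = 1.0 - u                  # weight kept for the label itself
    return A


# Toy training set with L = 3 labels (hypothetical annotations).
tweets = [{0, 1}, {0, 1}, {0}, {2}, {1, 2}]
single, pair = cooccurrence(tweets, 3)
A = revised_adjacency(single, pair)
```

In this toy example labels 0 and 1 co-occur in two of the three tweets that contain either of them, so they are linked in both directions, whereas label 2 follows label 1 too rarely (1 of 3 tweets) to pass the assumed threshold.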

Then, a GCN is adopted to update the label-to-label correlation representations from the previous representations and the adjacency matrix containing co-occurrence probabilities. The GCN propagation is calculated as follows:

$$\begin{aligned} \textbf{C}^{(l+1)}=\sigma \left( \textbf{AC}^{(l)} \textbf{W}^{(l)}_g\right) , \end{aligned}$$
(5)

where \(\textbf{C}^{(l)} \in \mathbb {R}^{L \times d_4^{(l)}}\) represents the input label-to-label correlation representations for the l-th GCN layer, \(\sigma\) denotes the activation function (LeakyReLU is adopted here), \(\textbf{A}\) is the revised adjacency matrix, and \(\textbf{W}^{(l)}_g \in \mathbb {R}^{d_4^{(l)} \times d_4^{(l+1)}}\) denotes the transformation matrix to be learned for the l-th layer.

Our GCN contains two layers. The embedding size of the second layer is set to \(2d_2\) to align with the dimension of the output from the double-attention model. The correlation matrix is thus obtained from the output of the second layer, which is denoted as \(\textbf{C}^{\text {out}}\in \mathbb {R}^{L \times 2d_2}\).
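The two-layer propagation of Eq. (5) can be sketched as below. The negative slope of the LeakyReLU (0.2) is an assumption, since the text does not state it, and the tiny matrices are hypothetical values chosen so that the intermediate results are easy to check by hand.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def leaky_relu(x, slope=0.2):
    # The slope is an assumed value; the text only names the activation.
    return x if x >= 0 else slope * x


def gcn_layer(A, C, W):
    """One propagation step: sigma(A C W)."""
    return [[leaky_relu(v) for v in row] for row in matmul(matmul(A, C), W)]


# Toy sizes (hypothetical): L = 2 labels, input dim 2, hidden dim 3, output dim 2.
A = [[0.8, 0.2],
     [0.2, 0.8]]          # revised adjacency matrix
C0 = [[1.0, 0.0],
      [0.0, 1.0]]         # initial label embeddings
W0 = [[1.0, -1.0, 0.0],
      [0.0, 1.0, 1.0]]    # first-layer weights
W1 = [[1.0, 0.0],
      [0.0, 1.0],
      [1.0, 1.0]]         # second-layer weights

C1 = gcn_layer(A, C0, W0)     # L x 3
C_out = gcn_layer(A, C1, W1)  # L x 2, plays the role of C^out
```

Here the first row of `C1` is approximately [0.8, -0.12, 0.2]: the negative pre-activation -0.6 is scaled by the assumed slope instead of being clipped to zero.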

Fig. 3 Construction of the initial weighted label graph

3.4.2 Feature fusion guided by label-to-label correlation

As mentioned above, we obtain text representations that capture the text semantic information (from self-attention) and the label-to-text correlation (from label-attention), and represent the label-to-label correlation through a GCN. The text semantic information uses the self-attention mechanism to increase the weight of key words or phrases based on the semantics of the input text itself. Meanwhile, the label-to-text correlation provides improved text representations through the label-attention mechanism, which is based on the labels’ semantic representations. Both text representations therefore re-weight the words of the input text to emphasize its key parts. However, they are based on different semantic information (the text itself and the labels’ semantics), and the parts they emphasize are different. It is therefore important to fuse these two representations in order to obtain a more comprehensive semantic representation. To this end, we propose a cross-attention model that utilizes the label-to-label correlation matrix to guide the fusion of the output features from the double-attention model. The experimental results in Section 4 show that our model outperforms the selected state-of-the-art methods.

Our method aims to enhance the weak part of the representations output by the different attention models and to utilize the label-to-label correlation to fuse these output features better. More specifically, the output of the self-attention mechanism enhances the key words or phrases according to the context semantics of the input text yet lacks the enhancement from label-text correlation features, while the output of the label-attention mechanism enhances the key words or phrases according to the label semantics yet lacks the enhancement from text semantic features. Therefore, with the guidance of a GCN, we aim to acquire the complementary feature vectors of these two representations. We use the proposed compensation coefficients guided by the GCN to quantify the extent of this compensation. First, we calculate the cross-attention weights, denoted by \(\textbf{W}_l, \textbf{W}_s \in \mathbb {R}^L\), which indicate the compensation coefficients of each representation. The model’s output can be described as follows:

$$\begin{aligned} \begin{aligned}&\textbf{W}_l=f\left( \textbf{C}^{\text {out}} \textbf{Q}_s^T \textbf{W}_{a1}\right) ,\\&\textbf{W}_s=f\left( \textbf{C}^{\text {out}} \textbf{Q}_l^T \textbf{W}_{a2}\right) ,\\&\textbf{W}_l+\textbf{W}_s=\textbf{1}, \end{aligned} \end{aligned}$$
(6)

where \(\textbf{W}_{a1},\textbf{W}_{a2} \in \mathbb {R}^L\) are parameters to be trained, f represents the sigmoid function, \(\textbf{1}\) represents an all-one vector, and the third equation makes \(\textbf{W}_l\) and \(\textbf{W}_s\) satisfy the normalization constraint. Then, according to the compensation coefficients, the final text representation for the i-th label can be obtained as \(\textbf{Q}_i=\textbf{W}_{li}\textbf{Q}_{li}+\textbf{W}_{si}\textbf{Q}_{si}\). The final text representation output by the proposed model is \(\textbf{Q}=\{\textbf{Q}_i\}_{i=1}^L \in \mathbb {R}^{L \times 2d_2}\).
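A minimal sketch of the fusion step in Eq. (6) is given below. The text only states the constraint that the two coefficient vectors sum to one; dividing each sigmoid output by the sum of the pair is one common way to enforce it and is an assumption of this sketch, as are the function names and the toy values.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def raw_coefficients(C_out, Q, w):
    """Compute f(C_out Q^T w): one sigmoid score per label."""
    return [sigmoid(sum(dot(c_i, q_j) * w_j for q_j, w_j in zip(Q, w))) for c_i in C_out]


def fuse(C_out, Q_s, Q_l, W_a1, W_a2):
    """Return (W_l, W_s, Q) with W_l + W_s = 1 and Q_i = W_l[i] Q_l[i] + W_s[i] Q_s[i]."""
    raw_l = raw_coefficients(C_out, Q_s, W_a1)  # label-attention weight, driven by Q_s
    raw_s = raw_coefficients(C_out, Q_l, W_a2)  # self-attention weight, driven by Q_l
    W_l = [a / (a + b) for a, b in zip(raw_l, raw_s)]  # assumed normalization
    W_s = [1.0 - a for a in W_l]
    Q = [[wl * ql + ws * qs for ql, qs in zip(row_l, row_s)]
         for wl, ws, row_l, row_s in zip(W_l, W_s, Q_l, Q_s)]
    return W_l, W_s, Q


# Toy sizes (hypothetical): L = 2 labels, 2*d2 = 2.
C_out = [[0.5, -0.2], [0.1, 0.3]]
Q_s = [[1.0, 2.0], [0.0, 1.0]]
Q_l = [[2.0, 0.0], [1.0, 1.0]]
W_a1 = [0.3, -0.1]
W_a2 = [0.2, 0.4]
W_l, W_s, Q = fuse(C_out, Q_s, Q_l, W_a1, W_a2)
```

Because the two coefficients of each label are positive and sum to one, every row of `Q` is a convex combination of the corresponding rows of `Q_l` and `Q_s`.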

3.5 Label probability prediction

After obtaining the fused text representation, we feed \(\textbf{Q}\) into a fully connected layer for the label probability prediction to produce the prediction result \(\hat{y}=f\left( \textbf{Q}\textbf{W}_o\right)\), where f represents the sigmoid function and \(\textbf{W}_o \in \mathbb {R}^{2d_2}\) are the parameters to be trained.

After comparing the predicted labels \(\hat{y}\) with the ground-truth \(y \in \{0,1\}^L\), the proposed model is trained with the cross entropy loss as follows:

$$\begin{aligned} \mathcal {L}=-\sum _{l=1}^L \left[ y_l \log \left( \hat{y}_l\right) +\left( 1-y_l\right) \log \left( 1-\hat{y}_l\right) \right] . \end{aligned}$$
(7)
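The prediction layer and the training loss can be sketched as follows. The loss is written in the conventional negative log-likelihood form, so that lower values are better; the clamping constant `eps` is an assumption added for numerical safety, and the toy inputs are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict(Q, W_o):
    """Map the fused representation Q (L x 2d2) to one probability per label."""
    return [sigmoid(sum(q * w for q, w in zip(row, W_o))) for row in Q]


def cross_entropy(y, y_hat, eps=1e-12):
    """Binary cross-entropy summed over the L labels."""
    loss = 0.0
    for target, prob in zip(y, y_hat):
        prob = min(max(prob, eps), 1.0 - eps)  # avoid log(0)
        loss -= target * math.log(prob) + (1 - target) * math.log(1.0 - prob)
    return loss


# Toy example (hypothetical): L = 2 labels, 2*d2 = 2.
Q = [[0.0, 0.0], [2.0, 1.0]]
W_o = [0.5, -1.0]
y_hat = predict(Q, W_o)  # both dot products are 0, so both probabilities are 0.5
loss = cross_entropy([1, 0], y_hat)
```

With both probabilities equal to 0.5 the loss is 2 ln 2, roughly 1.386, which is the value an uninformative predictor attains on two labels.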

4 Experimental results

To evaluate our proposed model, we conducted numerous experiments on a public dataset of privacy-disclosing tweets and compared the performance of our model with selected state-of-the-art methods in terms of key performance metrics. Furthermore, we verified the effect of each component in our model with corresponding ablation tests and component analysis. Finally, we used our proposed model to test some concrete tweet examples to demonstrate the practicability of the proposed model.

4.1 Experimental setup

4.1.1 Dataset used

We evaluated our proposed model on the public dataset of privacy-disclosing tweets introduced in (Song et al. 2018), which includes 11,368 tweets, each annotated with one or more privacy-relevant labels representing 32 privacy-oriented personal aspects. Figure 4 illustrates the 32 categories of privacy in the dataset. In the dataset, personal privacy is first divided into eight groups: “healthcare”, “life milestones”, “personal attributes”, “relationship”, “activities”, “location”, “emotion” and “neutral statements”. The first seven groups represent seven general privacy groups, and the last group, “neutral statements”, represents those tweets that do not disclose any category of privacy. These eight groups form a higher-level categorization of privacy-related information, which covers most of the personal privacy disclosures we can observe in the real world. Furthermore, the eight privacy groups are subdivided into 32 finer-grained privacy categories, which describe different types of privacy-related information more specifically. Our experiments are based on these 32 privacy-oriented personal aspects, and each label represents one of them. To the best of our knowledge, no other public dataset offers a comparable level of richness and comprehensiveness, considering the size of the dataset and the number of privacy-oriented personal aspects covered. Table 1 shows the number of tweets with a specific quantity of unique personal aspects. On average, a tweet is annotated with 1.31 personal aspects.

Fig. 4 Illustration of 32 categories of privacy used in our experiments

Table 1 The number of tweets with a specific quantity of unique personal aspects, as annotated in the Twitter dataset

4.1.2 Evaluation metrics

Following the settings of previous work (Chen et al. 2020), we use average precision (Avg-prec), one-error (One-err), precision at top K (P@K) and S@K for performance evaluation, which are explained as follows:

Average precision (Avg-prec)    Average precision evaluates, for each input text, how highly the ground-truth labels are placed in the ranked list of predicted labels (Nguyen et al. 2013).

One-error (One-err)    One-error measures the fraction of texts for which the top-ranked predicted personal aspect is not in the ground truth (Zhang and Zhou 2007).

P@K    P@K refers to the average precision of label predictions among the top K recommended results.

S@K    S@K refers to the mean probability that a correct personal aspect is captured within the top K recommended results (Song et al. 2018).
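For a single tweet, these four metrics can be sketched as below. The code follows the usual ranking-based definitions, which is an assumption: the cited works may differ in details such as tie handling. The dataset-level figures would be the means of these per-tweet values, and the scores in the example are hypothetical.

```python
def ranking(scores):
    """Label indices sorted from the highest to the lowest predicted score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def one_error(scores, truth):
    """1.0 if the top-ranked label is wrong, else 0.0."""
    return 0.0 if ranking(scores)[0] in truth else 1.0


def precision_at_k(scores, truth, k):
    """Fraction of the top-k labels that are in the ground truth."""
    return sum(1 for i in ranking(scores)[:k] if i in truth) / k


def success_at_k(scores, truth, k):
    """1.0 if at least one ground-truth label appears in the top k."""
    return 1.0 if any(i in truth for i in ranking(scores)[:k]) else 0.0


def average_precision(scores, truth):
    """Mean, over the ground-truth labels, of the precision at each label's rank."""
    hits, total = 0, 0.0
    for rank, label in enumerate(ranking(scores), start=1):
        if label in truth:
            hits += 1
            total += hits / rank
    return total / len(truth)


# Toy tweet with L = 4 labels; labels 1 and 2 are the ground truth.
scores = [0.1, 0.9, 0.4, 0.7]
truth = {1, 2}
print(ranking(scores))                   # [1, 3, 2, 0]
print(one_error(scores, truth))          # 0.0
print(precision_at_k(scores, truth, 1))  # 1.0
print(success_at_k(scores, truth, 1))    # 1.0
```

In this example label 1 is ranked first and label 2 third, so P@3 is 2/3 and the average precision is (1 + 2/3) / 2, about 0.83.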

4.1.3 Parameter settings

For fair comparisons, we split the dataset in our experiments in the same way as in previous work (Song et al. 2018; Chen et al. 2020). The experimental results were obtained through the 10-fold cross-validation.

We split the training set into a training subset and a validation subset with a ratio of 8:1. We selected the best parameter configuration based on the validation performance, i.e., the hyper-parameters were tuned based on evaluation metrics calculated on the validation subset. To obtain the word embeddings and label embeddings, we utilized the word2vec method to convert texts into 300-dimensional vectors, i.e., \(d_1=300\). The BiLSTM hidden dimension is set to \(d_2=300\). The hyper-parameter of the self-attention mechanism is set to \(d_3=200\). Furthermore, our model uses a 2-layer GCN with a hidden dimension of 450. The batch sizes searched were 16, 32, 64, and 128, and the learning rates searched were 0.1, 0.01, 0.001, and 0.0001. According to the validation performance, we took 64 as the batch size, and used the Adam optimizer (Kingma and Ba 2015) to minimize the loss with an initial learning rate of 0.001. We use floating-point operations (FLOPs) and multiply-accumulate operations (MACs) to measure the computational complexity of the proposed model. The measurements indicate that the proposed model requires 12.61G FLOPs and 1.59M MACs.

4.2 Baseline models

First, we compared our proposed model with several methods for predicting privacy disclosures in online posts, including five shallow learning methods and four deep learning methods. To further demonstrate our proposed method’s performance, we compared it with two recent state-of-the-art MLTC models. Therefore, we used the following eleven models as baselines.

  • SVM (Cortes and Vapnik 1995): A classical machine learning model that concatenates the privacy-oriented features into a single vector and learns each personal aspect individually.

  • MTL-Lasso (Tibshirani 1996): A multi-task learning (MTL) method with Lasso, which applies the \(l_1\) penalty to the regression objective function.

  • GO-MTL (Kumar and Daumé III 2012): A model that uses a grouping and overlap mechanism to learn the semantic correlations among personal aspects.

  • CMTL (Jacob et al. 2008): The clustered multi-task learning (CMTL) which assumes personal aspects can be clustered into several groups and each group can be learned together.

  • TOKEN (Song et al. 2018): The latent group MTL that utilizes the pre-defined personal aspect taxonomy to learn the group-sharing and aspect-specific latent features of personal aspects simultaneously.

  • TextRNN (Giles et al. 1994): An RNN-based model which uses an RNN and logistic regression for privacy disclosure detection.

  • TextCNN (Kim 2014): A CNN-based model which also uses CNN and logistic regression (similar to TextRNN) for privacy disclosure detection.

  • D-TOKEN (Song et al. 2018): An end-to-end extension of TOKEN, which replaces the hand-crafted features with representations learned automatically by a hierarchical attentive network (HAN).

  • GrHA (Chen et al. 2020): A HAN-based privacy detection model which uses a graph-regularization mechanism to enhance the representations of label co-occurrences.

  • LSAN (Xiao et al. 2019): A label-specific attention network model based on self-attention and label-attention mechanisms.

  • LDGN (Ma et al. 2021): A label-specific dual graph network model which contains label-attention and dual graph neural network.

4.3 Experimental results and discussion

Table 2 shows the performance metrics of all the compared methods, all based on the same dataset. For LSAN and LDGN, the two most recent baseline models, the experimental results were obtained from our own experiments. For the other baseline models, the performance figures were taken from Chen et al. (2020), which were obtained using the same dataset and experimental settings as ours. The results show that our method outperformed all the baseline models, which supports the effectiveness of the double-attention mechanism and the GCN-assisted feature fusion mechanism.

Table 2 Performance comparison with selected state-of-the-art methods on the dataset used. Some of the baseline results are taken directly from Chen et al. (2020)

Across all the evaluated models, the deep learning methods achieved better results than the shallow learning methods, which shows the importance of neural networks for extracting text features. Among the deep models, TextRNN, TextCNN and D-TOKEN are less effective because they focus only on the features of the text and ignore the relationship between text and labels. GrHA and LSAN improve the results to a certain extent because they use attention mechanisms to extract correlations in the texts. However, GrHA ignores the label-to-text correlation and uses the GCN directly to introduce label co-occurrences rather than to assist the feature fusion process. LSAN does not consider label co-occurrence, which adversely affects its final results. LDGN uses label attention and a dual graph neural network to compensate for the missing label co-occurrence information. However, our proposed model outperforms LDGN because it processes the label-to-label correlation through the GCN-assisted feature fusion mechanism, which uses compensation coefficients to guide the fusion of text representations, whereas LDGN only uses the dot product operation.

In conclusion, the proposed network outperforms the shallow models, the deep embedding models and the label-attention-based models. This improvement demonstrates the effectiveness of the double-attention mechanism and the proposed GCN-assisted feature fusion mechanism.

4.4 Ablation tests

A series of ablation tests was conducted to show the contribution of each module in the proposed network. The proposed model has three functional modules: the self-attention module (S), the label-attention module (L) and the GCN-assisted feature fusion module (G). In the ablation tests, we experimented with all six possible combinations of the three modules: S, L, SL (which is effectively LSAN), SG, LG, and SLG (which is our model). Note that G cannot be used alone.
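The count of six follows from enumerating the non-empty subsets of the three modules and dropping the subset that contains G alone. A minimal Python sketch of that enumeration is given here; the helper name `ablation_variants` is ours and is used for illustration only.

```python
from itertools import combinations

MODULES = ("S", "L", "G")  # self-attention, label-attention, GCN-assisted fusion


def ablation_variants(modules=MODULES):
    """Return every non-empty module combination except G on its own."""
    variants = []
    for size in range(1, len(modules) + 1):
        for combo in combinations(modules, size):
            name = "".join(combo)
            if name == "G":  # the text notes that G cannot be used alone
                continue
            variants.append(name)
    return variants


print(ablation_variants())
```

The three modules have seven non-empty subsets, so excluding G alone leaves the six variants evaluated in the ablation tests.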

As Table 3 shows, Model LG outperformed Model L and Model SG outperformed Model S, which shows the contribution of the GCN-assisted feature fusion module. However, these improvements are slight, which indicates that the GCN-assisted feature fusion module reaches its full effect only together with the double-attention mechanism. Model SL performed better than Models LG and SG, which indicates that text representation remains the core process of privacy MLTC. Model LG outperformed Model SG, which suggests that the label-attention mechanism captures the features of texts and labels more effectively and more accurately than the self-attention mechanism. Our proposed model (SLG) achieved the best performance on all metrics, showing that combining all three sources of information is indeed effective.

Table 3 Ablation tests of our proposed method using six different possible combinations of the three key components

4.5 Component analysis

To further illustrate the performance of the proposed model, we analysed each of its components in more detail and present several samples selected from the privacy dataset we used.

4.5.1 Label attention weights

We can use heat maps to show the label attention weights. Fig. 5 shows such a heat map for several samples from the test set of our dataset. The shade of the red bar represents the label attention weight of each word (darker = larger weight), according to the double-attention mechanism. For example, the most significant words for the label “occupation” are “a coach”. For the label “current location”, the label attention mechanism focuses on names of places such as “Washington DC”. Generally speaking, the label attention mechanism is capable of extracting important information from the input text, which benefits the subsequent classification module.

Fig. 5 The visualization of label attention weights
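To make the notion of per-word label attention weights concrete, the sketch below scores each token against one label and normalizes the scores with a softmax. It is a simplified illustration in the style of label-attention networks, not the exact scoring function of the proposed model; the function name, the 2-d vectors and the label embedding are all invented for the example.

```python
import math


def label_attention_weights(token_vectors, label_vector):
    """Softmax over tokens of the dot product between each token and the label."""
    scores = [sum(t * l for t, l in zip(tok, label_vector)) for tok in token_vectors]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy 2-d token vectors, invented for illustration only.
tokens = ["I", "am", "a", "coach"]
token_vectors = [[0.1, 0.0], [0.0, 0.1], [0.2, 0.1], [2.0, 1.5]]
occupation = [1.0, 1.0]  # hypothetical embedding of the label "occupation"

weights = label_attention_weights(token_vectors, occupation)
for token, weight in zip(tokens, weights):
    print(f"{token:>6}: {weight:.3f}")
```

Each row of a heat map such as the one in Fig. 5 corresponds to one such weight vector, with one weight per word for a given label.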

4.5.2 GCN-assisted feature fusion

To show the effectiveness of the GCN-assisted feature fusion visually, we can also use a heat map representing label co-occurrences. One example is given in Fig. 6, which shows that the label “occupation” correlates highly with the label “salary”, and the label “graduation” correlates highly with the label “education”. In addition, the label “education” correlates with the label “graduation” to some extent. On the other hand, the label “passing away of relatives” is almost unrelated to the other labels because it lacks semantic connections with them. The example demonstrates that the GCN-based model can extract label-to-label relationships from the graph structure quite effectively.

Fig. 6 The visualization of labels’ co-occurrence adjacency matrix (O = “occupation”, S = “salary”, G = “graduation”, E = “education”, C = “career promotion”, and P = “passing away of relatives”)
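This section does not restate how the co-occurrence adjacency matrix is built. One common construction, assumed here purely for illustration, estimates the conditional probability that label j appears given that label i appears. The sketch below implements that assumption on invented toy annotations; the function name and the data are hypothetical.

```python
def cooccurrence_matrix(samples, labels):
    """A[i][j] = P(label j present | label i present), estimated by counting."""
    index = {label: k for k, label in enumerate(labels)}
    n = len(labels)
    pair_counts = [[0] * n for _ in range(n)]
    label_counts = [0] * n
    for sample in samples:
        present = [index[label] for label in sample]
        for i in present:
            label_counts[i] += 1
            for j in present:
                if i != j:
                    pair_counts[i][j] += 1
    return [
        [pair_counts[i][j] / label_counts[i] if label_counts[i] else 0.0
         for j in range(n)]
        for i in range(n)
    ]


labels = ["occupation", "salary", "graduation", "education"]
# Invented toy annotations; each inner list is the label set of one post.
samples = [
    ["occupation", "salary"],
    ["occupation", "salary"],
    ["occupation"],
    ["graduation", "education"],
    ["education"],
    ["education"],
]
matrix = cooccurrence_matrix(samples, labels)
for label, row in zip(labels, matrix):
    print(f"{label:>10}: {[round(v, 2) for v in row]}")
```

Under this assumed construction the matrix is not symmetric, which is one way a pair such as “graduation” and “education” can show a strong correlation in one direction and a weaker one in the other.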

To provide further evidence of the effectiveness of our GCN-based method, we also compared the performance of two groups of GCN-based modules: our proposed GCN-assisted feature fusion module and the more common dot-product-based GCN modules. For the latter, we considered three possible modules: Dot-S (the dot-product-based model with self-attention only), Dot-L (the dot-product-based model with label attention only), and Dot-SL (the dot-product-based model with double attention). The comparison results in Table 4 show that our proposed GCN-based module outperformed all three dot-product-based modules. Unlike the dot-product-based modules, our module uses the label-to-label correlation matrix to guide the fusion of the outputs of the double-attention network, which can yield a better text representation.

Table 4 The performance comparison of models based on our GCN-based module and three dot-product-based modules

4.5.3 Number of GCN layers

The performance of a GCN depends on the number of GCN layers. To study how the number of layers affects the performance, we conducted additional experiments with \(1,\ldots ,5\) GCN layers, represented by GCN-1, \(\ldots\), GCN-5, respectively. Table 5 shows that the model with two GCN layers achieved the best classification result. In comparison, the model with only one GCN layer performed worse, which can be explained by a GCN that is too shallow being unable to extract the label-to-label correlation effectively. The model’s performance also dropped as the number of GCN layers increased beyond two. This is likely caused by overfitting, since a GCN that is too deep may learn the label-to-label correlation too specifically, which harms its generalizability. Based on these results, we recommend using two GCN layers for our model.

Table 5 The evaluation of performance on different numbers of GCN layers
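To illustrate what varying the number of GCN layers means, the sketch below stacks the standard propagation step H ← ReLU(Â H W) a configurable number of times on a three-label toy graph. It is an illustration under simplifying assumptions, not the trained model: the adjacency matrix is invented, Â is obtained by adding self-loops and row-normalizing, and identity matrices stand in for the learned weights so that the effect of depth is easy to read.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_with_self_loops(adj):
    """Add self-loops, then scale each row so that it sums to 1."""
    n = len(adj)
    with_loops = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                  for i in range(n)]
    return [[v / sum(row) for v in row] for row in with_loops]


def gcn_forward(adj, features, weights):
    """Apply one GCN layer per weight matrix: H <- ReLU(A_hat @ H @ W)."""
    a_hat = normalize_with_self_loops(adj)
    hidden = features
    for w in weights:
        hidden = matmul(matmul(a_hat, hidden), w)
        hidden = [[max(0.0, v) for v in row] for row in hidden]
    return hidden


# Toy chain of three labels: 0 - 1 - 2 (labels 0 and 2 never co-occur).
adjacency = [[0.0, 1.0, 0.0],
             [1.0, 0.0, 1.0],
             [0.0, 1.0, 0.0]]
# One-hot label features and identity weights (placeholders for learned ones).
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

for depth in range(1, 6):  # GCN-1 ... GCN-5
    out = gcn_forward(adjacency, identity, [identity] * depth)
    print(f"GCN-{depth}: contribution of label 2 to label 0 = {out[0][2]:.3f}")
```

With one layer, a label only receives information from its direct neighbours; with two or more layers, it also receives information from labels further away in the graph. The sketch says nothing about which depth generalizes best, which is what Table 5 measures.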

4.6 Case study

To demonstrate the practical usefulness of our proposed model, we applied it to several example tweets that are not included in the dataset. To avoid potential privacy disclosures by us, we only use anonymous tweets for this part. For better illustration, the tested tweets were chosen to cover multiple common privacy categories. For clarity, we only present the tweets that were correctly classified by our model.

Table 6 presents several tweets, covering ten kinds of privacy aspects, that show the effect of our proposed model. For the first seven tweets, our model correctly captured the aspects of the privacy disclosure, which demonstrates the practicality of our proposed model. For example, the third tweet may disclose the travel destination of the user, so the model outputs the label “place planning to go” as a reminder. The sixth tweet explains where the user obtained their bachelor’s degree, so it may disclose the privacy category of “education background” according to our model. Therefore, Twitter users and the platform (Twitter) can use these kinds of reminders as a reference to avoid unintended privacy disclosures. The last tweet does not reveal any personal privacy aspect, so our model classifies it into the category of “Neutral statement”.

Table 6 Case study

Furthermore, as Table 6 shows, when a tweet may disclose multiple categories of privacy information, our fine-grained privacy disclosure detection model can handle it because it performs multi-label classification, which is an advantage over binary coarse-grained privacy disclosure detection models. For example, the detection results of the first test tweet in Table 6 include two privacy aspects, “occupation” and “graduation”, while the detection results of the fourth tweet include “age” and “health condition”.
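The multi-label decision can be sketched as follows: each label receives its own score, and every label whose sigmoid probability reaches a threshold is reported, so one post can receive several labels. The threshold of 0.5, the function name and the scores below are assumptions made for illustration and are not taken from the model described here.

```python
import math


def predict_labels(logits, labels, threshold=0.5):
    """Return every label whose sigmoid probability reaches the threshold."""
    predicted = []
    for label, logit in zip(labels, logits):
        probability = 1.0 / (1.0 + math.exp(-logit))
        if probability >= threshold:
            predicted.append(label)
    return predicted


labels = ["occupation", "graduation", "age", "health condition"]
logits = [2.1, 0.7, -1.8, -3.0]  # invented per-label scores for one post

print(predict_labels(logits, labels))
```

A binary coarse-grained detector would reduce the same post to a single yes/no decision, whereas this per-label thresholding keeps the individual privacy aspects visible.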

5 Conclusions and future work

A new privacy disclosure detection model is proposed in this paper. The proposed model integrates the text information, the label-to-text correlation and the label-to-label correlation for detecting privacy disclosures in the input text. For the first time, a GCN-assisted feature fusion mechanism is proposed to carry out the text feature fusion process under the guidance of the label-to-label correlation. During the process of feature fusion, compensation coefficients are used to help fuse the self-attention and label-attention features. Based on a dataset of privacy-disclosing tweets, our experimental results showed that our model outperformed a number of selected state-of-the-art models and that the improved performance comes from the new design elements we introduced. A number of example tweets were used to demonstrate the practical usefulness of the proposed model. The results show that our proposed model can be used to support the development of privacy protection tools that alert online users and online platforms about unintended privacy disclosures.

In this paper, our experiments are based on a single dataset covering 32 privacy-oriented personal aspects (Song et al. 2018), because this dataset is the best privacy-disclosing dataset we could find. However, using only a single dataset makes it difficult to judge how generalizable our results are. In addition, although the dataset we used covers a rich set of personal aspects, the coverage can still be extended to more personal aspects. Therefore, more datasets for privacy disclosure detection need to be constructed so that our work can be further validated on multiple datasets. Meanwhile, our model aims to detect privacy disclosures in text-only UGC. However, non-textual information in UGC such as images and videos can often disclose privacy information, too. Thus, in our future work, we will investigate the construction of a multi-modal privacy disclosure detection model supporting both visual and textual information.