Abstract
We propose a novel privacy-preserving algorithm for semi-supervised learning on graphs. In real-world networks, such as networks of disease infection among individuals, links (contacts) and labels (infections) are often highly sensitive. Although traditional semi-supervised learning methods play an important role in network data analysis, they fail to protect such sensitive information. Our solution, privacy-preserving label propagation (PPLP), predicts the labels of partially labeled graphs without disclosing labels or links, by incorporating cryptographic techniques into the label propagation algorithm. Even when the labels in the graph are kept private, the accuracy of PPLP is equivalent to that of label propagation that is allowed to observe all labels in the graph. Empirical analysis showed that our solution is more scalable than existing privacy-preserving methods. Results on human contact networks showed that our protocol takes only about 10 seconds of computation and that no sensitive information is disclosed during protocol execution.
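For orientation, the non-private baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of one common form of label propagation (row-normalised weights with labeled nodes clamped after every step), not the PPLP protocol itself and not necessarily the exact variant used in the paper. It contains no cryptography, and the function name, arguments, and iteration count are assumptions made for the example.

```python
def label_propagation(weights, labels, num_classes, iterations=100):
    """Plain (non-private) label propagation on a weighted graph.

    weights: n x n list of lists of non-negative edge weights (hypothetical input format).
    labels:  list of length n; a class index for labeled nodes, None for unlabeled ones.
    Returns a list with the predicted class index of every node.
    """
    n = len(weights)

    # Row-normalise the weight matrix so each row sums to 1 (or stays all zero).
    trans = []
    for row in weights:
        total = sum(row)
        trans.append([w / total if total > 0 else 0.0 for w in row])

    # Initial scores: one-hot for labeled nodes, uniform for unlabeled nodes.
    scores = []
    for y in labels:
        if y is None:
            scores.append([1.0 / num_classes] * num_classes)
        else:
            scores.append([1.0 if c == y else 0.0 for c in range(num_classes)])

    for _ in range(iterations):
        # Each node takes the weighted average of its neighbours' scores.
        new_scores = [
            [sum(trans[i][j] * scores[j][c] for j in range(n))
             for c in range(num_classes)]
            for i in range(n)
        ]
        # Clamp labeled nodes back to their known labels.
        for i, y in enumerate(labels):
            if y is not None:
                new_scores[i] = [1.0 if c == y else 0.0 for c in range(num_classes)]
        scores = new_scores

    # Predict the class with the highest score for every node.
    return [max(range(num_classes), key=lambda c: scores[i][c]) for i in range(n)]


# Usage: a path graph 0 - 1 - 2 - 3 with only the two end nodes labeled.
path = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
predicted = label_propagation(path, [0, None, None, 1], num_classes=2)
```

In this toy graph each unlabeled node ends up with the label of the nearer labeled endpoint. Note that the sketch reads every edge weight and every label in the clear, which is exactly the disclosure the proposed protocol is designed to avoid.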
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Arai, H., Sakuma, J. (2011). Privacy Preserving Semi-supervised Learning for Labeled Graphs. In: Gunopulos, D., Hofmann, T., Malerba, D., Vazirgiannis, M. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2011. Lecture Notes in Computer Science, vol 6911. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23780-5_18
DOI: https://doi.org/10.1007/978-3-642-23780-5_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23779-9
Online ISBN: 978-3-642-23780-5
eBook Packages: Computer Science (R0)