Full Text:   <3479>

CLC number: TP391.7

On-line Access: 

Received: 2008-04-08

Revision Accepted: 2008-08-10

Crosschecked: 2009-04-29

Cited: 3

Clicked: 2954

Citations:  Bibtex RefMan EndNote GB/T7714

-   Go to

Article info.
1. Reference List
Open peer comments

Journal of Zhejiang University SCIENCE A 2009 Vol.10 No.7 P.952~963


Distributed anonymous data perturbation method for privacy-preserving data mining

Author(s):  Feng LI, Jin MA, Jian-hua LI

Affiliation(s):  School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200030, China

Corresponding email(s):   atlas.lee@gmail.com

Key Words:  Privacy-preserving data mining (PPDM), Distributed data mining, Data perturbation

Feng LI, Jin MA, Jian-hua LI. Distributed anonymous data perturbation method for privacy-preserving data mining[J]. Journal of Zhejiang University Science A, 2009, 10(7): 952~963.

@article{title="Distributed anonymous data perturbation method for privacy-preserving data mining",
author="Feng LI, Jin MA, Jian-hua LI",
journal="Journal of Zhejiang University Science A",
publisher="Zhejiang University Press & Springer",

%0 Journal Article
%T Distributed anonymous data perturbation method for privacy-preserving data mining
%A Feng LI
%A Jin MA
%A Jian-hua LI
%J Journal of Zhejiang University SCIENCE A
%V 10
%N 7
%P 952~963
%@ 1673-565X
%D 2009
%I Zhejiang University Press & Springer
%DOI 10.1631/jzus.A0820320

T1 - Distributed anonymous data perturbation method for privacy-preserving data mining
A1 - Feng LI
A1 - Jin MA
A1 - Jian-hua LI
J0 - Journal of Zhejiang University Science A
VL - 10
IS - 7
SP - 952
EP - 963
%@ 1673-565X
Y1 - 2009
PB - Zhejiang University Press & Springer
ER -
DOI - 10.1631/jzus.A0820320

Privacy is a critical requirement in distributed data mining. Cryptography-based secure multiparty computation is a main approach for privacy preserving. However, it shows poor performance in large scale distributed systems. Meanwhile, data perturbation techniques are comparatively efficient but are mainly used in centralized privacy-preserving data mining (PPDM). In this paper, we propose a light-weight anonymous data perturbation method for efficient privacy preserving in distributed data mining. We first define the privacy constraints for data perturbation based PPDM in a semi-honest distributed environment. Two protocols are proposed to address these constraints and protect data statistics and the randomization process against collusion attacks: the adaptive privacy-preserving summary protocol and the anonymous exchange protocol. Finally, a distributed data perturbation framework based on these protocols is proposed to realize distributed PPDM. Experiment results show that our approach achieves a high security level and is very efficient in a large scale distributed environment.

Darkslateblue:Affiliate; Royal Blue:Author; Turquoise:Article


[1] Agrawal, D., Aggarwal, C.C., 2001. On the Design and Quantification of Privacy Preserving Data Mining Algorithms. Proc. 20th ACM SIGMOD-SIGACT-SIGART Symp. on Principles of Database Systems, p.247-255.

[2] Agrawal, R., Srikant, R., 2000. Privacy-preserving data mining. ACM SIGMOD Record, 29(2):439-450.

[3] Ashley, P., Hada, S., Karjoth, G., 2003. The Enterprise Privacy Authorization Language (EPAL 1.1), IBM. Available from: http://www.zurich.ibm.com/security/enterprise-privacy/epal/

[4] Ashrafi, M.Z., Taniar, D., Smith, K., 2003. Towards Privacy Preserving Distributed Association Rule Mining. Distributed Computing-IWDC, p.279-289.

[5] Beaver, D., 1991. Foundations of secure interactive computing. CRYPTO, 1991:377-391.

[6] Bertino, E., Fovino, I.N., Provenza, L.P., 2005. A framework for evaluating privacy preserving data mining algorithms. Data Min. Knowl. Discov., 11(2):121-154.

[7] Chaum, D., Crepeau, C., Damgard, I., 1988. Multiparty Unconditionally Secure Protocols. Proc. 20th Annual ACM Symp. on Theory of Computing, p.11-19.

[8] Chawla, S., Dwork, C., McSherry, F., Smith, A., Wee, H., 2005. Toward Privacy in Public Databases. Theory of Cryptography Conf., p.363-385.

[9] Cramer, R., Damgard, I., Nielsen, J.B., 2001. Multiparty Computation from Threshold Homomorphic Encryption. Proc. EUROCRYPT, p.280-300.

[10] Cranor, L., Langheinrich, M., Marchiori, M., Presler-Marshall, M., Reagle, J. (Eds.), 2002. The Platform for Privacy Preferences 1.0 (P3P1.0) Specification. W3C. Available from: http://www.w3.org/TR/P3P/

[11] CSA (Canadian Standards Association), 2004. Privacy Code. Available from: http://www.csa.ca/standards/privacy/Default.asp?laguage=English

[12] Evfimievski, A., Srikant, R., Agrawal, R., Gehrke, J., 2004. Privacy preserving mining of association rules. Inf. Syst., 29(4):343-364.

[13] Fienberg, S.E., McIntyre, J., 2004. Data swapping: variations on a theme by dalenius and reiss. Priv. Statist. Datab., 3050:14-29.

[14] Fukasawa, T., Wang, J., Takata, T., Miyazaki, M., 2004. An Effective Distributed Privacy-preserving Data Mining Algorithm. Intelligent Data Engineering and Automated Learning (IDEAL), p.320-325.

[15] Goldreich, O., Micali, S., Wigderson, A., 1987. How to Play Any Mental Game or a Completeness Theorem for Protocols with Honest Majority. 19th ACM Symp. on the Theory of Computing, p.218-229.

[16] Kargupta, H., Das, K., Liu, K., 2007. Multi-party, privacy-preserving distributed data mining using a game theoretic framework. LNCS, 4702:523.

[17] Liew, C.K., Choi, U.J., Liew, C.J., 1985. A data distortion by probability distribution. ACM Trans. Datab. Syst., 10(3):395-411.

[18] Paillier, P., 1999. Public-key cryptosystems based on composite degree residuosity classes. Advances in Cryptology EUROCRYPT, 99:223-238. Available from: http://www.springerlink.com/content/kwjvf0k8fqyy2h3d/

[19] Rizvi, S.J., Haritsa, J.R., 2002. Maintaining Data Privacy in Association Rule Mining. Proc. 28th Int. Conf. on Very Large Data Bases, 28:682-693.

[20] Sweeney, L., 2002. Achieving k-anonymity privacy protection using generalization and suppression. Int. J. Uncert., Fuzz. and Knowl.-based Syst., 10(5):571-588.

[21] Yao, A.C., 1986. How to Generate and Exchange Secrets. Proc. 27th IEEE Symp. on Foundations of Computer Science, p.162-167.

[22] Zhang, P., Tong, Y.H., Tang, S.W., Yang, D.Q., 2005. Privacy Preserving Naive Bayes Classification Advanced Data Mining and Applications. 3584:744-752.

Open peer comments: Debate/Discuss/Question/Opinion


Please provide your name, email address and a comment

Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou 310027, China
Tel: +86-571-87952783; E-mail: cjzhang@zju.edu.cn
Copyright © 2000 - Journal of Zhejiang University-SCIENCE