Privacy Preserving Data Publishing

1 Privacy Preserving Data Publishing
The 7th Post Graduate Conference of Computer Engineering (cPGCON-2018)
Paper ID: XX
Track: Wireless Networks and Communications
Presented by: Mr/Ms XYZ
Guided by: Prof/Dr XYZ
College Name: XYZ
College Code: XX

2 Contents
Introduction and Motivation
Our Contribution
Literature Survey
Our Proposed Approach
Methodology of Evaluation
Performance Result Analysis
Conclusions and Future Work
References
cPGCON-2018, Track=xx, Paper ID=xx, College code=xx

3 Introduction
Various organizations, e.g. medical and insurance providers, have collected large amounts of data for research and analysis.
This data contains sensitive personal information.
Privacy-related incidents are reported in [1-7][9-40][50-64].

4 Introduction (continued)
Attempts to preserve privacy are described in [3-4][10-29][75][79][80-81][ ][180].

5 Motivation
Across the anonymization approaches [3][10-29][49][74-77][79][80-81], we observe a tradeoff between privacy and information loss.
We notice that the k-anonymity model suffers from information loss due to generalization and suppression, and cannot maintain diversity among the values of the sensitive attribute.
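
The diversity problem noted above can be illustrated with a small check. The following is a minimal sketch, assuming records are dictionaries and that the table has already been generalized; the function names and the toy table are hypothetical and do not come from the slides.

```python
from collections import defaultdict


def equivalence_classes(records, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[attr] for attr in quasi_identifiers)
        groups[key].append(record)
    return groups


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every equivalence class holds at least k records."""
    groups = equivalence_classes(records, quasi_identifiers)
    return all(len(group) >= k for group in groups.values())


def is_l_diverse(records, quasi_identifiers, sensitive, l):
    """True if every equivalence class has at least l distinct sensitive values."""
    groups = equivalence_classes(records, quasi_identifiers)
    return all(
        len({record[sensitive] for record in group}) >= l
        for group in groups.values()
    )


# Hypothetical, already generalized toy table (not taken from the Adult dataset).
table = [
    {"age": "20-29", "sex": "F", "occupation": "Sales"},
    {"age": "20-29", "sex": "F", "occupation": "Sales"},
    {"age": "30-39", "sex": "M", "occupation": "Sales"},
    {"age": "30-39", "sex": "M", "occupation": "Tech-support"},
]
qi = ["age", "sex"]

two_anonymous = is_k_anonymous(table, qi, 2)
two_diverse = is_l_diverse(table, qi, "occupation", 2)
```

Here the table is 2-anonymous but not 2-diverse: both records in the first class share the same occupation, so anyone known to fall in that class has their sensitive value revealed.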

6 Our Contribution
We propose a sensitive-attribute-based clustering approach for the k-anonymity model. It aims to minimize information loss, minimize disclosure risk, and maintain diversity among the values of the sensitive attribute.

7 Literature Survey
Various approaches [1-20][30-50][53-77] have been proposed in the literature for privacy preserving data mining (PPDM).
Traditional approaches [68-73][75] can disclose data through inferences drawn from the original data.

8 An Illustration
Give an example that shows how the proposed approach solves the problem and why its solution is better.

9 Our Proposed Approach/Mathematical Model
The central outline of the proposed algorithm is as follows.
Step 1: Load the database.
Step 2: Identify and classify the attributes in the database as identifier, quasi-identifier, or sensitive attribute.
Step 3: …
Step 4: …

10 Our Proposed Approach (continued)
Flow diagram (labels in order): Database; Initial Solution; Encoding-Grouping; Solution Encoding; Distance Matrix; Grouping with the k and l parameters; Objective Function; Bacterial Foraging Optimization (Chemotaxis, Reproduction, Elimination-Dispersal).
Note: Draw a pictorial representation of your proposed system.
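
The three Bacterial Foraging Optimization phases named in the diagram can be sketched generically. This is a simplified illustration on a continuous toy objective, not the grouping encoding or objective function of the proposed approach, which the slides do not specify. The function name, all parameter names and their default values are assumptions, and a step is kept only when it lowers the cost, which is a greedy simplification.

```python
import math
import random


def bfo_minimize(cost, dim, bounds, n_bacteria=10, n_chemotaxis=20,
                 n_swim=4, n_reproduction=4, n_elimination=2,
                 step_size=0.1, p_eliminate=0.25, seed=0):
    """Simplified BFO loop: chemotaxis, reproduction, elimination-dispersal."""
    if n_bacteria % 2:
        raise ValueError("n_bacteria must be even so the population can split")
    rng = random.Random(seed)
    low, high = bounds

    def random_position():
        return [rng.uniform(low, high) for _ in range(dim)]

    def random_direction():
        vec = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        return [v / norm for v in vec]

    population = [random_position() for _ in range(n_bacteria)]
    best_pos = list(min(population, key=cost))
    best_cost = cost(best_pos)

    for _ in range(n_elimination):
        for _ in range(n_reproduction):
            health = [0.0] * n_bacteria  # accumulated cost, lower is healthier
            for _ in range(n_chemotaxis):
                for i in range(n_bacteria):
                    current = cost(population[i])
                    direction = random_direction()  # tumble: pick a new heading
                    for _ in range(n_swim):  # swim while the cost keeps falling
                        candidate = [min(high, max(low, x + step_size * d))
                                     for x, d in zip(population[i], direction)]
                        candidate_cost = cost(candidate)
                        if candidate_cost >= current:
                            break
                        population[i], current = candidate, candidate_cost
                    health[i] += current
                    if current < best_cost:
                        best_pos, best_cost = list(population[i]), current
            # Reproduction: the healthier half splits, the other half is dropped.
            order = sorted(range(n_bacteria), key=lambda i: health[i])
            survivors = [population[i] for i in order[: n_bacteria // 2]]
            population = [list(p) for p in survivors] + [list(p) for p in survivors]
        # Elimination-dispersal: relocate some bacteria to random positions.
        for i in range(n_bacteria):
            if rng.random() < p_eliminate:
                population[i] = random_position()
    return best_pos, best_cost


def sphere(x):
    """Toy objective with its minimum at the origin."""
    return sum(v * v for v in x)


best_position, best_value = bfo_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
```

To apply the same loop to anonymization, a position would have to encode a grouping of records and the cost would be the objective function from the diagram; both are left open here because the slides do not define them.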

11 Methodology of Evaluation
We compare our proposed approach with state-of-the-art clustering approaches, viz.
Kabir et al. [17]: systematic clustering algorithm, 2011
Byun et al. [12]: greedy k-member algorithm, 2007

12 Methodology of Evaluation (continued)
We implement the approach in Visual Basic 6.0 with Microsoft Access 2007 and run it on a 3.2 GHz Intel Core 2 Duo machine with 2 GB RAM, under Microsoft Windows XP Professional.

13 Metrics of Evaluation
We evaluate our proposed approach with respect to information loss and execution time. We run it for k = 20, 40, 60, 80 and 100.
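
The slides do not define the information loss metric. As a hedged sketch, the code below assumes one form common in clustering-based k-anonymity work for numeric quasi-identifiers: the size-weighted sum of each cluster's value range, normalized by the range over the whole table. Execution time is measured with a wall-clock timer. All function names and the toy clusters are hypothetical.

```python
import time


def cluster_information_loss(cluster, domain_ranges):
    """Cluster size times the sum of normalized value ranges per attribute."""
    loss = 0.0
    for j, domain_range in enumerate(domain_ranges):
        values = [record[j] for record in cluster]
        if domain_range > 0:  # a constant attribute contributes no loss
            loss += (max(values) - min(values)) / domain_range
    return len(cluster) * loss


def total_information_loss(clusters):
    """Sum the loss over all clusters, using ranges taken over the whole table."""
    records = [record for cluster in clusters for record in cluster]
    n_attrs = len(records[0])
    domain_ranges = [
        max(r[j] for r in records) - min(r[j] for r in records)
        for j in range(n_attrs)
    ]
    return sum(cluster_information_loss(c, domain_ranges) for c in clusters)


def timed(func, *args):
    """Return the function result together with the elapsed seconds."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


# Toy example with a single numeric attribute (age); the table range is 60 - 20 = 40.
clusters = [[(20,), (30,)], [(40,), (60,)]]
loss, seconds = timed(total_information_loss, clusters)
# loss == 1.5: 2 * (10 / 40) + 2 * (20 / 40)
```

Under this definition, tighter clusters need less generalization and therefore score a lower loss, which is why larger k values tend to increase it.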

14 Test Application
Write the pseudocode of your proposed algorithm in Courier New, size 22.

15 Data Set
We use the Adult dataset from the UCI Machine Learning Repository, which has 14 attributes. Of these, we retain only Age, Race, Marital-status, Sex, fnlwgt and Occupation. Occupation is taken as the sensitive attribute.
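
The attribute selection above can be sketched as a simple projection. This sketch assumes that the five retained attributes other than Occupation act as quasi-identifiers, which the slides do not state explicitly. The lowercase column names follow the UCI Adult attribute names, and the sample record and function name are made up for illustration.

```python
# Assumed roles: the slides only name Occupation as the sensitive attribute.
QUASI_IDENTIFIERS = ["age", "race", "marital-status", "sex", "fnlwgt"]
SENSITIVE = "occupation"


def project(record, quasi_identifiers=QUASI_IDENTIFIERS, sensitive=SENSITIVE):
    """Keep only the quasi-identifiers and the sensitive attribute."""
    kept = list(quasi_identifiers) + [sensitive]
    return {name: record[name] for name in kept}


# Illustrative record with made-up values; unused columns are dropped.
raw = {
    "age": 41,
    "workclass": "Private",
    "fnlwgt": 120000,
    "education": "Bachelors",
    "marital-status": "Married-civ-spouse",
    "occupation": "Sales",
    "race": "White",
    "sex": "Female",
}
reduced = project(raw)
```

Keeping the roles in one place makes it easy to pass the same attribute lists to the clustering and evaluation steps.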

16 Performance Result and Analysis
Figure 1: Information loss for the Adult Dataset

17 Our Observations
It is indeed feasible to form clusters based on the sensitive attribute so as to minimize the disclosure risk with lower information loss.
During the evaluation, we notice that our algorithm is sometimes affected when the real dataset contains many similar values of the sensitive attribute. In that case, it becomes easier for the miner to identify an individual.

18 Conclusion
We proposed a sensitive-attribute-based clustering approach for the k-anonymity model. The empirical evaluation shows that it is feasible to achieve lower information loss at k ≤ 40 rather than by setting a higher value of k.

19 Future Work
In this work, we focused on clustering a static, centralized database in privacy preserving data mining. However, databases are growing tremendously through use of the Internet. Our future work is therefore to extend the k-anonymity and l-diversity models, using the BFO algorithm, to dynamic and distributed databases.

20 References
[1] A. I. Anton, Q. He and D. L. Baumer, “Inside JetBlue’s privacy policy violations”, IEEE Security and Privacy, Vol. 2, No. 6, 2004.
[2] R. Agrawal and R. Srikant, “Privacy preserving data mining”, ACM SIGMOD Record, Vol. 29, No. 2, 2000.
[3] Y. Lindell and B. Pinkas, “Privacy preserving data mining”, Journal of Cryptology, Vol. 15, No. 3, 2002.
[4] F. D. Schoeman, “Philosophical dimensions of privacy: an anthology”, Cambridge University Press, 1984.
[5] G. J. Walters, “Human Rights in an information age: a philosophical analysis”, Chapter 5, University of Toronto Press, 2002.
[6] J. Zhan, “Using cryptography for privacy protection in data mining system”, In: Proceedings of the 1st WICI International Workshop on Web Intelligence Meets Brain Informatics (WImBI), LNCS 4845, 2007.

21 Thank You

