Privacy Preservation in Data Mining through Geometric Transformation

Date
2018-04
Publisher
KNUST
Abstract
Data mining offers a number of benefits. However, because it operates on massive volumes of data stored in databases or other electronic formats, it raises concerns about ethics and privacy violations. Privacy-preserving data mining (PPDM) algorithms aim to mine significant information from large amounts of data while protecting sensitive information against unauthorized use. This work tests and experiments with two main PPDM algorithms, focusing on numerical attributes that are considered confidential. The dilation algorithm applies multiplicative data perturbation, and the translation algorithm applies additive data perturbation. After a series of experiments, the results were validated using statistics, tree structure, receiver operating characteristic (ROC) curves and confusion matrices. The levels of privacy provided by the two PPDM algorithms were also quantified. The algorithms could be described as restrictive, so a Privacy Enhancement Procedure (PEP) was instituted to improve the levels of privacy they provide. After PEP was implemented, the level of privacy was found to improve, but the improvement depends heavily on the values chosen for the lower and upper bounds. Knowledge discovery was carried out using the University of Waikato's open-source data mining software, the Waikato Environment for Knowledge Analysis (WEKA). It can be concluded that the PPDM algorithms used in this work can provide a good level of privacy when adopted to protect the privacy of data mining participants. For example, dilation changed the lowest age and income values in the original dataset from 69 and 2100 to 15 and 728, respectively.
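The two perturbations named in the abstract can be illustrated with a minimal sketch. It assumes a single constant factor or offset per attribute, which the abstract does not specify. The function names, the sample values, and the chosen factor and offset are hypothetical and are not taken from the thesis.

```python
def dilate(values, factor):
    """Multiplicative perturbation: scale every value by the same factor."""
    return [v * factor for v in values]


def translate(values, offset):
    """Additive perturbation: shift every value by the same offset."""
    return [v + offset for v in values]


# Hypothetical confidential attributes (illustrative values only).
ages = [69, 72, 80]
incomes = [2100, 3500, 5000]

# Hypothetical parameters; the thesis's actual values are not stated here.
masked_ages = dilate(ages, 0.5)
masked_incomes = translate(incomes, -1000)

print(masked_ages)
print(masked_incomes)
```

Under these assumptions, a positive dilation factor preserves the ordering of the values and a translation preserves the differences between them, which is one reason such transformations can hide the raw values while leaving some mining results largely intact.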
Description
A Thesis submitted to the Department of Computer Science, College of Science in partial fulfilment of the requirements for the degree of MASTER OF SCIENCE