CHAPTER ONE
INTRODUCTION
1.1 BACKGROUND OF THE STUDY
Positive patient identification is the foundation of effective healthcare. It allows the right care to be delivered to every patient in light of his or her individual needs. Recently, ECRI Institute analysts found that patient identification issues were common in healthcare, and that these mistakes have critical patient safety and financial implications. According to Michael (2012), 7 to 10 percent of patients are misidentified during medical record searches. In addition, around 6 percent of those patients suffer largely preventable adverse events, such as wrong-side surgery or incorrect procedures, medication errors, radiation exposure, blood transfusion reactions, radiology errors, or laboratory errors (Michael, 2012).

When a patient walks through the door of the emergency room, if the correct medical chart with the correct patient information is not accessed, there can be serious repercussions. Diagnosis and treatment is a complex process. Even seemingly minor inaccuracies can lead to big mistakes, because caregivers are basing many high-risk/high-reward treatments on that information. Data, such as past medical history or medication and allergy lists, can be easily omitted or inaccurately listed if a patient is not identified properly.

Patient misidentification commonly occurs when a staff member begins a new patient chart and certain imperative information is missing. When this happens without a physician’s knowledge, it could seriously impact a patient’s health. For example, if a patient has a severe allergy to IV contrast but the patient’s medical chart does not note this, he or she could experience a life-threatening allergic reaction when the physician orders a CT scan with IV contrast. (Bártlová et al., 2015)
Another common danger is inaccurate medication lists. Consider anticoagulants, for example. These are important and potentially life-saving medications that can prevent strokes or heart attacks. But if a caregiver is unaware that a patient is taking anticoagulants, they may prescribe another, seemingly unrelated medication (antibiotics for treatment of a minor infection, for example) that could interact with the anticoagulant therapy and cause life-threatening hemorrhaging. (Sean, 2016)
The availability of an accurate past medical history is another crucial piece of a patient medical record. When a physician evaluates a patient’s signs and symptoms but does not have access to their complete medical history (or, even worse, accidentally views another patient’s medical history due to improper patient identification upon registration), there is a chance of misdiagnosis and mistreatment, which can lead to potentially serious medical consequences. If a patient has a history of stomach ulcers and their doctor starts them on aspirin or ibuprofen, they could have a massive, life-threatening GI bleed. If a patient has a history of blood clots in the leg and starts hormone therapy, they are at an increased risk of developing more clotting, including a pulmonary embolism (a blood clot in the lungs) that could be life threatening. (Sean, 2016)
Even in this age of technology, clerical errors play a significant role in patient misidentification. Patients may have multiple duplicate charts throughout a healthcare organization due to simple typographical errors, name misspellings, inaccurate birthdates, language barriers, misinterpretations, misunderstandings, and communication errors between hospital staff, patients, outside caregivers, and family members. In addition, patients may sometimes be confused and unable to provide accurate information due to delirium, shock, dementia, psychosis, intoxication, or drug overdose, or they may even intentionally give inaccurate information for purposes of fraud.

Particularly in emergency situations with fast-paced triage, acutely ill patients, and oftentimes overcrowded environments, it can be difficult for hospital registrars to obtain accurate information and correctly identify the patient presenting for care. In these cases, in order to get the patient registered as quickly as possible, the registrar, rather than struggling for a prolonged time (precious moments when someone is in respiratory distress or bleeding profusely), may choose to simply create a new chart rather than delay registration or risk picking the wrong patient’s chart and creating an overlay chart (one that has two different patients’ information tangled together in one chart). This, of course, results in a chart that includes none of the patient’s history and leads to all of the problems described above, compounding the situation further. (Bártlová et al., 2015 and Sean, 2016)
It’s important to remember that, in addition to the effect on patient safety, misidentification of patients also has a large financial effect. For example, if a doctor doesn’t know a patient has had a test in the past, he or she may order another CT scan, exposing the patient to unnecessary radiation, risk, and costs for both the patient and the healthcare system. Excessive and duplicate studies are a major problem in the practice of medicine, and without accurate patient identification, it is difficult to properly assess exactly who has had which studies and who needs a new test. Hospitals spend large amounts of money per year in human resources and information technology to sift through patient records after the fact and try to merge duplicate charts and separate out overlay charts. Furthermore, intentional patient misidentification (fraud) and unintentional incorrect patient identifiers lead to major financial losses for hospitals and the healthcare system as a whole, in part due to patient harm, liability and adverse events, inefficiencies in billing, and insurance claims denials. (Bártlová et al., 2015 and Sean, 2016)
1.2 STATEMENT OF THE PROBLEM
Misidentification of patients is a common problem that many hospitals face on a daily basis. Patient misidentification is one of the leading causes of medical errors and medical malpractice in hospitals, and it has been recognized as a serious risk to patient safety.

Recent studies have shown that an increasing number of medical errors are primarily caused by adverse drug events which are caused directly or indirectly by incorrect patient identification. In recognition of the increasing threat to patient safety, it is important for hospitals to prevent these medical errors from happening by adopting a suitable patient identification system that can improve upon current safety procedures.

In a nutshell, patient misidentification causes the following problems:
Medication errors and blood transfusion errors,
Testing errors,
Wrong-person procedures,
Discharge of infants to the wrong families,
Errors in phlebotomy and surgical interventions.
Patient misidentification is a widely reported problem in medical literature. For example, the National Patient Safety Agency quoted this problem as a “significant risk in the NHS” (Thomas and Evans, 2009).
1.3 AIM AND OBJECTIVES OF THE STUDY
The aim of this project is to develop and implement a biometrics-based system for patient identification that improves on the manual method of identifying a patient. The objective of the project is to develop a system that should be able to:
Reduce or possibly eliminate patient misidentification in hospitals.
Increase the accuracy of the patient identification system.
Transform the manual process of patient identification into a computerized system through an improved method.
Guide doctors and nurses in the accurate identification of patient-related issues.
Eliminate paper costs and provide all patient reports on demand.
Reduce the risk of patient misidentification.

1.4 SIGNIFICANCE OF THE STUDY
The fingerprint-based patient identification system will be of great benefit, as it identifies patients accurately and retrieves their correct medical records. The biometric identification will also create a one-to-one link between patients’ identities and their medical records. The biometric patient identification solution will enable healthcare providers to improve patient safety. By positively identifying patients, physicians ensure that the right care is provided to the right patient. Smoother, more accurate recordkeeping also increases revenue cycle efficiency by reducing duplicate medical records, overlays, and insurance fraud; enhances patient satisfaction by accelerating the patient check-in process; reduces the risk of identity theft posed by conventional patient identifiers; and integrates seamlessly with existing electronic medical record (EMR) solutions.

1.5 SCOPE OF THE STUDY
Since the topic of patient misidentification is very broad, this thesis concentrates on the technical aspects of the design, implementation, and evaluation of a patient identification system – while providing only references for further reading concerning the medical background of this topic. Therefore, the information in this thesis is of technical nature and aimed at readers with a background in medical informatics or IT managers working in healthcare institutions.

This project aims to produce a working prototype, including both hardware and software, as a proof-of-concept system. The prototype will need to demonstrate:
Identifying a patient using biometrics with a high degree of confidence.
Software implementing the data mining techniques required to match captured patterns against known patterns. In addition, the proof of concept will include a Web portal to show results to an operator such as a healthcare professional.
Using the biometric system to confirm a patient’s identity. This includes writing software that prompts the patient to scan their vein patterns and confirms the patient’s identity via the use of biometrics.
1.6 METHODOLOGY OF THE STUDY
Ndunagu (2004) defined methodology as “a way of thinking about and studying social reality”, while Potter (1996) defined methodology as “strategies that lay out the means for achieving the goals of research”. Both defined methods as the procedures and techniques used to reach the study’s goal. Potter (1996) sums up the inter-relationship and differences by stating: “Methodologies are the blueprints; methods are the tools”. The research methodology used helps to ensure that a thorough study of the present system is carried out, thus helping the project research team to completely understand the modus operandi of the existing system, so as to know how the new system should be structured and which functionalities it needs to address the problems discovered. This also helps to determine whether the existing system should be totally overhauled or only improved. Hence, after duly considering the above reasons, and out of the software engineering methodologies for transforming ideas into a working system, which include prototyping, expert system methodology and usability engineering methodology, this work adopts the steps of the structured systems analysis and design methodology (SSADM). SSADM is a methodology used in the analysis and design stages of system development. The steps include:
i. Problem identification
ii. System design
iii. System implementation and maintenance.

The proposed system as a whole is intended to eliminate the problems of the current system. As a stand-alone application, it enables the identification of patients using fingerprint biometrics. The proposed system is very easy to use and practical. It addresses the issue of complexity by providing a simple stand-alone application that can be easily used and understood by users at the hospital. The proposed system is designed to resemble the current system; the main change is the platform, which moves from a manual one to an automated one that uses patient fingerprints for identification. The reason is that new systems are best built around an existing system, so that the administration can avoid spending much effort manually identifying patients by card number. The foundation of this system is fingerprint biometrics and a well-organized database in which each table serves a distinct patient record-keeping function. A strength of the proposed system is that any information entered into the database is automatically shared with the related information already entered.
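The database organization described above can be illustrated with a minimal sketch. It assumes SQLite as the storage engine, and the table names, column names, and helper functions are all hypothetical choices made for illustration; the text does not specify a schema. The `UNIQUE` constraint on the template table is one possible way to express a one-to-one link between a patient and a stored fingerprint template.

```python
import sqlite3


def create_schema(conn):
    """Create three hypothetical tables: patients, templates, and records."""
    conn.executescript(
        """
        CREATE TABLE patient (
            patient_id INTEGER PRIMARY KEY,
            full_name  TEXT NOT NULL,
            birth_date TEXT NOT NULL
        );
        -- UNIQUE on patient_id: at most one stored template per patient.
        CREATE TABLE fingerprint_template (
            patient_id INTEGER NOT NULL UNIQUE REFERENCES patient(patient_id),
            template   BLOB NOT NULL
        );
        CREATE TABLE medical_record (
            record_id  INTEGER PRIMARY KEY,
            patient_id INTEGER NOT NULL REFERENCES patient(patient_id),
            note       TEXT NOT NULL
        );
        """
    )


def register_patient(conn, full_name, birth_date, template):
    """Insert a patient and the extracted template; return the new patient id."""
    cur = conn.execute(
        "INSERT INTO patient (full_name, birth_date) VALUES (?, ?)",
        (full_name, birth_date),
    )
    patient_id = cur.lastrowid
    conn.execute(
        "INSERT INTO fingerprint_template (patient_id, template) VALUES (?, ?)",
        (patient_id, template),
    )
    return patient_id


def add_record(conn, patient_id, note):
    """Attach a medical record entry to an already registered patient."""
    conn.execute(
        "INSERT INTO medical_record (patient_id, note) VALUES (?, ?)",
        (patient_id, note),
    )


def records_for(conn, patient_id):
    """Return all notes for one patient, oldest first."""
    rows = conn.execute(
        "SELECT note FROM medical_record WHERE patient_id = ? ORDER BY record_id",
        (patient_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

In this sketch, once a fingerprint match yields a `patient_id`, every record is retrieved through that single key, so a second chart for the same person is not created at lookup time.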

CHAPTER TWO
LITERATURE REVIEW
2.1 CHALLENGES IN PATIENT IDENTIFICATION
Uniquely identifying patients in the health system has eluded Nigerian health sector players. Digital health tools are being deployed to address various challenges in the Nigerian health system, with little sign of reaching scale. Despite global interest in digital identity systems and their potential to improve health outcomes, little progress has been recorded in Nigeria. At the time of writing, Nigeria has a convoluted patient identity system, with patient identity local to the health facility and occasionally to the department.
The health system in Nigeria is weak and faces many daunting challenges. Poor patient records persist in spite of immense investment in the recent decade. Routine data generated by health facilities often cannot be relied on for planning or for critical decision-making. Health facilities are increasingly using digital health tools with limited scalability. This holds true regardless of how the word ‘scale’ is defined. Multiple duplicates of individual patient records have plagued the health system in general and digital tools in particular. The scalability of these tools has been hampered by the lack of a unique patient identity system. Evidence shows that society’s most vulnerable remain the ones most often without any form of identifier. They are frequently financially excluded and do not have access to essential social benefits, including health. (National Populations Commission, 2008; National Populations Commission, 2013; Federal Ministry of Health, 2016; World Bank Group, 2016).
In Nigeria, identification systems for patients at most health facilities (primary, secondary or tertiary) are local to the health facility. This is the case regardless of whether ownership is private or public. Patient data are currently scattered across various departments and facilities, and every health institution uses its own identifiers, which cannot be used beyond the facility or sometimes the department. The numbering scheme often cannot be understood beyond the health facility that generated it. Consider a hypothetical case of a pregnant woman, ‘Uduak’, who registers at a Primary Health Center (PHC) close to her. A patient number is generated for her at her first visit; if she gets tested at the clinic’s laboratory, another identifier may be created, depending on the health facility. If Uduak requires specialist care and has to be referred, she may get another registration at the referral center. If she chooses to change health facility for any reason, either because she wants to deliver close to her relatives or simply needs medical care while travelling, she will get yet another registration. All of this happens even when she remembers or has her registration data from previous health facilities. (Chukwu, 2017)
The medical literature routinely exposes the need for significant changes in the delivery of healthcare. Medical errors result in at least 44,000 unnecessary deaths every year in the United States, with the most vulnerable patients, such as the elderly or chronically ill, bearing the brunt of these errors (Weingart et al., 2000). In the UK, around 5% of patients admitted annually experienced some sort of medical error, which in turn has a measurable financial impact, costing around £1 billion in additional bed days (Murphy and Kay, 2004). While medical errors occur in many areas of healthcare, such as diagnostic and surgical procedures, adverse drug reactions and laboratory tests, accurate and efficient patient identification is a fundamental aspect of all these procedures. For example, particularly in blood transfusion, patient misidentification can have catastrophic effects. In the blood transfusion setting, patient misidentification is the single most important contributing factor to mistransfusion; it is common enough that the risk of mistransfusion is considerably greater than that of transmission of HIV by blood, and the identification process is actually deteriorating over time (Murphy and Kay, 2004).
Bártlová et al. (2015) carried out a study to assess the opinions of nurses regarding patient safety associated with patient misidentification. The investigation focused on actual patient misidentification as well as loss of patient materials (e.g., blood samples, X-rays, etc.). These are problems often associated with patient identification methods and/or confusing patients with the same surname assigned to the same ward. Misidentification incidents pose a considerable threat to patient health, especially when the confusion extends to the operating room. Their objective was to identify the potential causes of patient misidentification and to offer solutions to correct the issue. A survey, as part of a sociological investigation, was carried out through the use of questionnaires. The selected sample included, in accordance with the needs of the project and the methodology of the Institute for Health Care Information and Statistics of the Czech Republic, registered nurses working shifts on inpatient wards. The study took place across the Czech Republic between September 15 and 30, 2013. The sample consisted of 772 registered nurses.
According to the results of Bártlová et al. (2015), the potential for patient misidentification (PM) was described as non-negligible by 38.8% of respondents, and 33.1% of nurses admitted problems associated with patient misidentification. Respondents reported that the greatest potential for patient misidentification was associated with patients having the same surname staying on the same ward. The study shows that registered nurses regard patient misidentification as a likely event. The statistics suggest that education, changes in protocols, and new technologies are needed to improve the precision of patient identification. (Bártlová et al., 2015)
2.2 BIOMETRICS IDENTIFICATION
There are many biometrics in use today and a range of biometrics that are still in the early stages of development. Biometrics can, therefore, be divided into two categories: those that are currently in use across a range of environments and those still in limited use or under development, or still in the research realm.
2.2.1. Biometrics Currently in Use across a Range of Environments
2.2.1.1. Fingerprint
A fingerprint is the pattern of ridges and valleys on the tip of a finger and is used for personal identification. Because of its outstanding features of universality, permanence, uniqueness, accuracy and low cost, fingerprint-based recognition has become the most popular and reliable technique and is currently the leading biometric technology (Jain et al. 2004). There is archaeological evidence that the ancient Assyrian and Chinese civilizations used fingerprints as a form of identification as early as 7000 to 6000 BC. Henry Faulds in 1880 laid the scientific foundation of modern fingerprint recognition by introducing minutiae features for fingerprint matching (Maltoni et al. 2003). Current fingerprint recognition techniques can be broadly classified as minutiae-based, ridge feature-based, correlation-based and gradient-based (Aggarwal et al. 2008).

Most automatic fingerprint identification systems employ techniques based on minutiae points. Although the minutiae pattern of each finger is quite unique, noise and distortion during the acquisition of the fingerprint and errors in the minutiae extraction process result in a number of missing and spurious minutiae (Chikkerur et al. 2006). To overcome the difficulty of reliably obtaining minutiae points from a poor-quality fingerprint image, the ridge feature-based method is used. Ridges are the raised lines that form the pattern on a fingertip. This method uses ridge features such as the orientation and frequency of ridges, ridge shape and texture information for fingerprint matching. However, ridge feature-based methods suffer from low discrimination capability (Maltoni et al. 2003). Correlation-based techniques superimpose two fingerprint images and compute the correlation (at the intensity level) between corresponding pixels for different alignments. These techniques are highly sensitive to non-linear distortion, skin condition, finger pressure and alignment (Yousiff et al. 2007). Most of these techniques use minutiae for alignment first.
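To make the idea of minutiae-based matching concrete, the following is a minimal sketch rather than any cited author's algorithm. It assumes that each minutia is an `(x, y, angle)` tuple, that the two prints have already been aligned, and that the tolerance values are arbitrary illustrative defaults. Missing and spurious minutiae simply remain unpaired, which lowers the score.

```python
import math


def angle_diff(a, b):
    """Smallest absolute difference between two angles, in radians."""
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def match_score(template, query, dist_tol=10.0, angle_tol=math.radians(20)):
    """Greedily pair minutiae that agree in position and direction.

    Returns the fraction of paired minutiae relative to the larger set,
    so unpaired (missing or spurious) minutiae reduce the score.
    """
    used = set()  # indices of query minutiae that are already paired
    matched = 0
    for x1, y1, t1 in template:
        best_index, best_dist = None, dist_tol
        for j, (x2, y2, t2) in enumerate(query):
            if j in used:
                continue
            d = math.hypot(x1 - x2, y1 - y2)
            if d <= best_dist and angle_diff(t1, t2) <= angle_tol:
                best_index, best_dist = j, d
        if best_index is not None:
            used.add(best_index)
            matched += 1
    larger = max(len(template), len(query))
    return matched / larger if larger else 0.0
```

A real matcher would also have to estimate the rotation and translation between the two prints before pairing; that step is deliberately left out of this sketch.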

The smooth flow pattern of ridges and valleys in a fingerprint can also be viewed as an oriented texture. Jain et al. (2000) describe a global texture descriptor called ‘FingerCode’ that utilizes both global and local ridge descriptions for an oriented texture such as fingerprints. A variation of this method is used by Chikkerur, who uses localized texture features of minutiae, and another by Zhengu, who uses texture correlation matching. Further, Aggarwal et al. (2008) proposed a gradient-based approach that captures textural information by dividing each minutia neighbourhood into several local regions, for which histograms of oriented gradients are then computed to characterize the textural information around each minutia location. Recently, Jhat et al. (2011) proposed that the texture feature of energy of a fingerprint can be used for fingerprint identification.

2.2.1.2. Face recognition
Its ease of use and non-intrusiveness have made face recognition one of the most popular biometrics. A summary of the existing techniques for human face recognition can be found in Zhao et al. (2003). Further, a survey of existing face recognition technologies and challenges is given in Abate et al. (2007). A number of algorithms have been proposed for face recognition. Such algorithms can be divided into two categories: geometric feature-based and appearance-based. Appearance-based methods include Eigenfaces, Independent Component Analysis (ICA), Kernel Principal Component Analysis (KPCA), Kernel Fisher Discriminant Analysis (KFDA), General Discriminant Analysis (GDA), Neural Networks and Support Vector Machine (SVM). An inherent drawback of appearance-based methods is that recognition of a face under a particular lighting and pose can be performed reliably only when the face has previously been seen under similar circumstances. Further, in appearance-based methods the captured features are global features of the face images, and facial occlusion is often difficult to handle in these approaches. Geometric feature-based methods are robust against variations in illumination and viewpoint but very sensitive to the feature extraction process. They analyze explicit local facial features and their geometric relationships. Geometric feature-based methods include the Active Shape Model, Elastic Bunch Graph matching and Local Feature Analysis (LFA) (Penev and Atick 2002).

According to Mccool et al. (2008), recognition of faces from still or 2D images is a difficult problem, because illumination, pose and expression changes in the images create great statistical differences, and the identity of the face itself becomes overshadowed by these factors. To overcome this problem, 3D face recognition has been proposed; it has the potential to overcome feature localization, pose and illumination problems, and it can be used in conjunction with 2D systems. Research using 3D face data to identify humans was first published by Cartoux. The 3D face data encodes the structure of the face and so is inherently robust to pose and illumination variations. Applying HMMs to 3D face identification was first attempted by Achermann. A recent advance for 3D face identification has been to show the applicability of the Gaussian Mixture Model (GMM) parts-based approach (Mccool et al. 2008). The drawbacks of 3D face recognition include high cost and decreased ease of use for laser sensors, low accuracy for other acquisition types, and the lack of sufficiently powerful algorithms.

2.2.1.3. The Iris
The iris is a thin circular diaphragm that lies between the cornea and the lens of the human eye. A survey of current iris recognition technologies is available in Bowyer et al. (2008). Flom and Safir first proposed the concept of automated iris recognition. It was John Daugman who implemented a working automated iris recognition system (Daugman, 2003). Though Daugman’s system is the most successful and most well-known, many other systems have also been developed. An automatic segmentation algorithm based on the circular Hough transform is employed by Wildes. Boles and Boashash extracted iris features using a 1-D wavelet transform. Sanchez-Avila and Sanchez-Reillo further developed the iris representation method proposed by Boles.
Lim et al. (2001) extracted the iris feature using a 2-D Haar wavelet transform, and Park et al. (2003) utilized directional filter banks to extract the normalized directional energy as a feature. Kumar et al. (2003) employed correlation filters. Recently, Ma et al. proposed two iris recognition methods, one using multi-channel Gabor filters and the other using circular symmetric filters. Later, they proposed an improved method based on characterizing key local variations with a particular class of wavelets, recording a position sequence of local sharp variation points in these signals as features. Several other methods have also been developed for iris recognition. Chen et al. (2006) proposed using Daugman’s 2-D Gabor filter with quality measure enhancement. Du et al. (2006) proposed using 1-D local texture patterns, and Sun et al. (2005) proposed using moment-based iris blob matching.

2.2.1.4. Hand geometry
Hand geometry refers to the geometric structure of the hand, which is composed of the lengths of the fingers, the widths of the fingers, the width of the palm, and so on. The advantages of a hand geometry system are that it is a relatively simple method that can use low-resolution images and provides high efficiency with great user acceptance (Jain et al. 1999). A brief survey of reported systems for hand-geometry identification can be found in Sanchez-Reillo et al. (2000). An elaborate survey on hand geometry identification is given in Dutan (2009). Geometrical features of the hand constitute the bulk of the hand features adopted in most hand recognition systems. One advantage is that geometrical features are more or less invariant to the global positioning of the hand and to the individual planar orientations of the fingers. The numerous geometrical measures include lengths, widths, areas, and perimeters of the hand, fingers and palm. Jain et al. (2005) have shown that hand geometrical features alone are not sufficiently discriminative. Therefore, for more demanding applications one must resort to alternative features such as global hand shape, appearance and/or texture. Jain et al. (2005) thus use 16 axes predetermined with the aid of five pegs. Sanchez-Reillo et al. (2007) use a similar set of geometric features, containing the widths of the four fingers measured at different latitudes, the lengths of the three fingers, and the palm. Wong and Shi (2009), in addition to finger widths, lengths and interfinger baselines, employ the fingertip regions. Bulatov et al. (2010) describe a peg-free system in which 30 geometrical measures are extracted from the hand images. In addition to widths, perimeters and areas of the fingers, they also incorporate the radii of inscribing circles of the fingers.

The other approach in hand geometry identification is contour-based (Jain and Duta, 2002). The contour is completely determined by the black-and-white image of the hand and can be derived from it by means of simple image-processing techniques. It can be modelled by features that capture more details of the shape of the hand than the standard geometrical features do. Accordingly, various techniques have been proposed to obtain and mathematically represent these hand features (Alexandra et al. 2002). Yoruk et al. (2006) introduced a more accurate and detailed representation of the hand using the Hausdorff distance of the hand contour, and Independent Component Analysis (ICA).
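As an illustration of the contour-based idea, the sketch below computes the standard symmetric Hausdorff distance between two contours given as lists of `(x, y)` points. It shows only the distance measure itself, under the assumption that the contours have already been extracted and registered; it is not a reconstruction of the full method of Yoruk et al. (2006), and the function names are illustrative.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from any point in a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))
```

A small distance means that every point of each contour lies close to some point of the other, so two images of the same hand would be expected to score lower than images of different hands. The measure is sensitive to single outlying points, which is one reason practical systems normalize or smooth the contour first.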

2.3 FINGERPRINT BASED IDENTIFICATION AND RECOGNITION SYSTEMS
Human fingerprints have been discovered on a large number of archaeological artifacts and historical items. The English plant morphologist Nehemiah Grew published the first scientific paper reporting his systematic study of the ridge, furrow, and pore structure of fingerprints, and a detailed description of the anatomical formations of fingerprints was made by Mayer. Konda (2010) stated that biometrics is an automated method that recognizes people based on their physical and behavioral characteristics, and is a field used to authenticate a certain individual’s characteristics, recognize a person’s identity, or study a person’s measurable characteristics (Pankanti, 2000). People have unique fingerprints that do not change, and fingerprints consist of the ridges and furrows of a finger’s surface. Fingerprints can be categorized according to key patterns that include loops, whorls and arches.

Fingerprint matching is the process used to determine whether two fingerprints come from the same finger. One fingerprint is stored in the database and the other is the person’s current fingerprint. A minutia point refers to a local characteristic such as the end point of a ridge. In principle, the best way to compare fingerprints is to compare all the visual information in them. However, this is impractical: comparing all visual information requires too much data, which makes it unsuitable for a commercial system. Actual commercial systems do not store the fingerprint itself, but the characteristic points of the fingerprint and codes related to the positions of these points. Since only these characteristics are stored, they cannot be turned back into fingerprint images, and therefore cannot be used as evidence in legal proceedings (Geng, 2012).

Josphineleela (2012) proposed a system in which attendance is taken using fingerprints. This system can be used for students and staff. The fingerprint is taken as the input for attendance management, and the system is organized into the following modules: pre-processing, minutiae extraction, reconstruction, fingerprint recognition, and report generation. A novel fingerprint reconstruction algorithm is used in this system. In 2013, Seema and Satoa proposed a new system for employee attendance using fingerprints. In this system, fingerprint identification is done using a minutiae extraction technique, and the system automates the whole process of taking attendance. To check a fingerprint, it compares one fingerprint template with all templates stored in the database, and it does likewise for every person, which takes more time.
Neha (2013) designed a fingerprint recognition based identification system for student identification. The system is designed for taking attendance in institutes like NIT Rourkela. In this system, fingerprint template matching time is reduced by partitioning the database. A fingerprint scanner is used to input the fingerprints of teachers and students into the computer software.
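The difference between comparing a query against every stored template and searching only one partition can be sketched as follows. The partitioning key used here (a coarse pattern class such as loop, whorl or arch) is an assumption chosen for illustration; the source does not state how the cited system partitions its database. The class name, the toy `jaccard` similarity, and the threshold are likewise hypothetical.

```python
from collections import defaultdict


def jaccard(a, b):
    """Toy similarity between two feature sets, standing in for a real matcher."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


class PartitionedTemplateStore:
    """Templates grouped by a coarse key so a search scans one group only."""

    def __init__(self):
        self._bins = defaultdict(list)

    def enroll(self, person_id, pattern_class, template):
        """Store a template in the bin for its pattern class."""
        self._bins[pattern_class].append((person_id, template))

    def identify(self, pattern_class, query, similarity, threshold):
        """Return (best matching id or None, number of comparisons made)."""
        best_id, best_score, comparisons = None, threshold, 0
        for person_id, template in self._bins.get(pattern_class, []):
            comparisons += 1
            score = similarity(template, query)
            if score >= best_score:
                best_id, best_score = person_id, score
        return best_id, comparisons
```

The trade-off in this design is that a query assigned to the wrong pattern class is never compared against its true template, so a practical system would need a fallback search over the remaining partitions.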
2.4 FACE RECOGNITION AND VERIFICATION SYSTEM
Yohei et al. (2005) proposed a system in which student attendance is monitored by continuous observation. Continuous observation is the method of using video streaming to collect the students’ sitting positions, presence, status and other information. The Active Student Detecting (ASD) approach estimates whether a student is sitting on a seat by applying background subtraction and inter-frame subtraction to the image from a sensing camera on the ceiling.
Wei et al. (2009) came up with the half-face template face detection method. In a classroom, a camera captures the video, and sometimes this video contains only half of a student’s face. The half-face template can capture side-face images at large angles, which improves the accuracy of side-face detection. This method decreases the time complexity of face detection and accommodates faces at greater angles. The half-face template increases the speed of face detection.
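Background subtraction and inter-frame subtraction can be sketched as follows. This is a minimal illustration that assumes frames are grayscale images stored as lists of rows and that a seat is a rectangular region of the ceiling camera image; the thresholds and the rule for combining the two differences are assumptions, not the published ASD procedure.

```python
def changed_fraction(frame, reference, region, threshold=25):
    """Fraction of pixels in `region` that differ from `reference` by more than `threshold`.

    region is (top, left, bottom, right) with bottom/right exclusive.
    """
    top, left, bottom, right = region
    total = changed = 0
    for r in range(top, bottom):
        for c in range(left, right):
            total += 1
            if abs(frame[r][c] - reference[r][c]) > threshold:
                changed += 1
    return changed / total if total else 0.0

def seat_occupied(background, previous, current, region,
                  bg_min=0.3, motion_min=0.05):
    """Hypothetical decision rule: occupied if the seat differs from the empty
    background (background subtraction) or shows motion since the previous
    frame (inter-frame subtraction)."""
    bg_diff = changed_fraction(current, background, region)
    motion = changed_fraction(current, previous, region)
    return bg_diff >= bg_min or motion >= motion_min

empty = [[0] * 4 for _ in range(4)]
seated = [[200] * 4 for _ in range(4)]
seat = (0, 0, 4, 4)
```

Here `seat_occupied(empty, seated, seated, seat)` flags the seat through the background difference even though nothing moved between the last two frames.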

Senthamil et al. (2014) presented a face recognition based attendance marking system. In this work, the team sought to determine the attendance, positions and face descriptions of students in a classroom lecture by proposing an attendance management system based on face detection. The system estimates the presence and location of each student by continuous observation and recording. The results of their preliminary experiment show that continuous observation improved the performance of attendance estimation. Patil et al. (2014) designed and implemented a student attendance recording system using face recognition with GSM. It was tested with different face images. The system works properly with its different panels; all windows run separately and in parallel. If face recognition is to compete as a viable biometric for authentication, a further order of improvement in recognition score is necessary. Under controlled conditions, when lighting and pose can be controlled, this may be possible. It is more likely that future improvement will rely on making better use of video information and employing fully 3D face models.
Chintalapati et al. (2013) developed an Automated Attendance Management System based on a face recognition algorithm. The system uses a face detection and recognition algorithm that automatically detects the student when he enters the classroom and marks the attendance by recognizing him. This technique is intended to handle threats such as spoofing. The problem with this approach is that it captures only one student image at a time as he enters the classroom; thus it is time consuming and may distract the student. Sajid et al. (2014) came up with a conceptual model. Their model captures the image from a fixed camera in the classroom. The noise in the image is reduced, and Gabor filters or jets are used for extracting the facial fiducial points of every detected face. The calculated facial measurements are matched against the data stored in the database. All of this computation is performed on the server. Humans have a diverse set of facial expressions, which can reduce the accuracy of facial recognition software.
Fernandes et al. (2013) analyzed and reviewed the current face recognition algorithms in order to deduce a new and robust algorithm. They used the ORL and SHEFFIELD databases to analyze the performance of combinations of appearance-based methods such as Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). PCA works better when the images are captured with no disturbance. The paper inferred that PCA is better than LDA at recognizing individuals even with background disturbance, since it took a shorter time for recognition. On this evidence, PCA and its variants were judged the best of the facial recognition algorithms compared. Srivastava (2013) suggested using Emgu CV, a cross-platform .NET wrapper for the OpenCV image processing library. It allows OpenCV functions to be called from .NET compatible languages such as C#, VB, VC++ and IronPython. The software proposed there takes images from a CCTV camera instead of using a database of still images. Most web cameras face the problem of non-uniform lighting, since they depend on natural light and cannot have an artificial lighting source. The grayscale images from the camera must be of the same size so that their histograms can be equalized. This equalization is crucial for better performance under natural lighting.
Fuzail et al. (2014) presented a face detection system for recording the attendance of students in class. An automatic attendance management system is an essential tool for any LMS. Most existing systems are time consuming and require semi-manual work from the instructor or students. This approach aims to solve these issues by integrating face detection into the process. Although the method still lacks the capability to identify each student present in class, there is much room for enhancement. Because the authors implement a modular approach, individual modules can be improved until an acceptable detection and identification rate is reached. Another issue to be considered in the future is a process to ensure users’ privacy: whenever an image is stored on servers, it must be impossible for an unauthorized person to use that image.
Gopala et al. (2015), in “Implementation of Automated Attendance System using Face Recognition”, envisioned an automated attendance system for the purpose of reducing the errors that occur in the conventional (manual) attendance taking system. The aim is to automate attendance and build a system that is useful to an institution or organization. It provides an efficient and accurate method of taking attendance in the office environment that can replace the old manual methods. The technique is secure, reliable and readily available for use. No dedicated hardware is needed to install the system in the office; it can be constructed using a camera and a computer.
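Histogram equalization of a grayscale image can be sketched in a few lines. The example below uses the standard cumulative-distribution remapping on an image stored as a list of rows; it is written in plain Python for illustration and is not the Emgu CV or OpenCV call that the cited software would use.

```python
def equalize_histogram(image, levels=256):
    """Return a histogram-equalized copy of a grayscale image.

    image is a list of rows of integers in 0..levels-1.
    """
    hist = [0] * levels
    for row in image:
        for v in row:
            hist[v] += 1

    # Cumulative distribution of intensities.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)
    total = running
    cdf_min = next((c for c in cdf if c > 0), 0)

    # A constant (or empty) image cannot be stretched; return it unchanged.
    if total == cdf_min:
        return [list(row) for row in image]

    scale = levels - 1
    lookup = [
        round((c - cdf_min) * scale / (total - cdf_min)) if c > 0 else 0
        for c in cdf
    ]
    return [[lookup[v] for v in row] for row in image]

dark = [[10, 20], [30, 40]]
stretched = equalize_histogram(dark)
```

The four dark intensities are spread over the full range, which is the effect that compensates for dim or uneven natural lighting before recognition.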
2.5 SUMMARY
An attempt has been made to review existing work on biometric implementations, with a view to identifying the current tools used in their various applications. Fingerprint technology is so far the most suitable and reliable approach for the system development, as it addresses security and prevents misidentification among patients. It is less prone to error than the existing method in use and hence can be deployed to solve the problem of patient misidentification at hospitals.

CHAPTER ONE

INTRODUCTION
1.1 Introduction
Recently, anonymity in location-based services has attracted a great deal of attention. This is because evolving location detection devices, coupled with pervasive connectivity, have enabled many types of location-based services. To access location-based services, users have to disclose their spatial information, which threatens their privacy. Therefore, location privacy concepts become mandatory to ensure users’ acceptance of location-based services. Various solution approaches have been proposed and developed to handle this challenge. However, not all attacks are catered for by the existing approaches, and there is a need for privacy protection approaches that consider user habits and preferences.
1.2 Background
Owing to the fast developments in location technologies such as the Global System for Mobile Communication (GSM), Radio Frequency Identification (RFID), Wireless Fidelity (Wi-Fi) and the Global Positioning System (GPS), mobile devices are usually fitted with geolocation and wireless communication capabilities. The latest improvements in pervasive devices have led to the creation of a new group of services known as Location Based Services (LBS), which are personalized to the current location of the user sending the query to the service. An LBS can be defined as a service that takes as input the current location of a user (generally acquired through a mobile device carried by this user) and tailors its output depending on the acquired location data (Gambs et al., 2013).
LBS can access, combine, and transform contextual information and more specifically location information, in order to personalise the service provided to the user. For example, LBS can be used for resource discovery, path finding, real time social applications or location-based gaming. When people use LBS to support them in their daily tasks, their position is usually acquired automatically through mobile equipment they carry with them, thus these systems continuously monitor and reveal information about the location of their users as the position of these mobile systems is essentially the same as the users of such systems (Gambs et al., 2013).
Users with location-aware mobile devices can issue location-based snapshot or continuous queries to a database server at any time and from anywhere. Examples of snapshot queries include “Where is my nearest petrol station?” and “What are the restaurants within one mile of my location?”, while examples of continuous queries include “Continuously report my nearest police car” and “Continuously report the taxis within one mile of my car” (Mokbel et al. 2006).
Although location-based services promise safety and convenience, they threaten the security and privacy of their customers. Spatial information privacy is the ability to prevent unauthorized entities from accessing the spatial location information of a user. With untrustworthy servers, an adversary may access sensitive information about an individual based on the location-based queries he issues. For example, an adversary may learn a user’s habits and interests by knowing the places he searches for (Chow & Mokbel 2006).
Due to the nature of spatial queries, LBS needs the user position in order to process his requests. LBS makes spatial data available to the users through one or more location servers (LS) that index and answer user queries on them. Examples of spatial queries could be “where is the closest hospital to my current location?” or “Which and where is the nearest Uber taxi and how long will it take it to reach me?” In order for the LS to be able to answer such questions, it needs to know the position of the querying user. There exist many algorithms for efficient spatial query processing, but the main challenge in the LBS industry is of a different nature. In particular, users are reluctant to use LBSs, since revealing their position may link to their identity. Even though a user may create a fake identity to access the service, his location alone may disclose his actual identity. A privacy problem arises in LBS when the user is concerned with the possibility that an attacker may connect the user’s identity with the information contained in the service requests, including location and any other publicly available information such as telephone directories. In general, the association between the real identity of the user issuing an LBS request and the request itself as it reaches the service provider can be considered a privacy threat. User privacy may be threatened because of the sensitive nature of accessed data e.g. inquiring for pharmacies that offer medicines for diseases associated with a social stigma, or asking for nearby addiction recovery groups. Another source of threats comes from less sensitive data e.g. gas station, shops, restaurants, that may reveal the user’s interest and shopping needs, resulting in a flood of unsolicited advertisements through e-coupons and personal messages (Mouratidis & Yiu 2012).
To solve this problem, the K-anonymity concept is adopted. When a user wishes to pose a query, he sends his location to a trusted server, the anonymizer (AZ), through a secure connection. The latter obfuscates his location, replacing it with an Anonymizing Spatial Region (ASR) that encloses the user. The ASR is then forwarded to the LS. Not knowing exactly where the user is, the LS retrieves (and reports to the AZ) a Candidate Set (CS) that is guaranteed to contain the query results for any possible user location inside the ASR. The AZ receives the CS and reports to the user the subset of candidates that corresponds to his original query. In order for the AZ to produce valid ASRs, users send location updates through their secure connections whenever they move (Mouratidis & Yiu 2012).
The ASR construction at the AZ (i.e., the anonymization process) abides by the user’s privacy requirements. Specifically, given an anonymity degree k specified by the user, the ASR satisfies two properties: (1) it contains the user and at least k - 1 other users, and (2) even if the LS knew the exact locations of all users in the system, it would not be able to infer with a probability higher than 1/k who among those included in the ASR is the querying one.
Given the ASR, the LS must produce an inclusive and minimal CS. Inclusiveness demands that the CS is a superset of the user’s query results; this property ensures that the user receives accurate and complete answers. Minimality, on the other hand, requires that the CS contains the minimum number of data objects without violating inclusiveness. Minimality ensures that the CS transmission (from the LS to the AZ) and its filtering at the AZ do not incur unnecessary communication and processing overheads (Mouratidis & Yiu 2012).
The K-anonymity approach is appropriate for preserving the privacy of users who request LBS services. The advantage of this approach is that it incurs low communication cost between the client and the anonymizer. However, the adversary’s knowledge and capabilities and the user’s needs are not considered, which limits the privacy protection. An attacker may use additional knowledge about the user (habits, regular behavior, interests) to infer the query source. This proposal is to formulate an effective and efficient approach to protect the privacy of users from such context linking attacks.
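The AZ and LS roles described above can be sketched for a simple range query ("all points of interest within a given radius of me"). The sketch assumes planar coordinates and a rectangular ASR built from the querier and the k - 1 nearest other users; all function names are hypothetical. Note that this naive nearest-neighbour cloaking only satisfies the first ASR property (at least k users inside); it does not by itself guarantee the 1/k bound against an LS that knows every user's location.

```python
import math

def build_asr(users, querier, k):
    """AZ side: bounding box of the querier and the k-1 nearest other users.

    users maps user id -> (x, y). Returns (min_x, min_y, max_x, max_y).
    """
    qx, qy = users[querier]
    others = sorted(
        (uid for uid in users if uid != querier),
        key=lambda uid: math.hypot(users[uid][0] - qx, users[uid][1] - qy),
    )
    if len(others) < k - 1:
        raise ValueError("not enough users to reach anonymity degree k")
    members = [querier] + others[: k - 1]
    xs = [users[u][0] for u in members]
    ys = [users[u][1] for u in members]
    return (min(xs), min(ys), max(xs), max(ys))

def min_dist_to_rect(point, rect):
    """Smallest distance from a point to an axis-aligned rectangle."""
    x, y = point
    x1, y1, x2, y2 = rect
    dx = max(x1 - x, 0.0, x - x2)
    dy = max(y1 - y, 0.0, y - y2)
    return math.hypot(dx, dy)

def candidate_set(pois, asr, radius):
    """LS side: every POI that is within `radius` of some point of the ASR."""
    return {name for name, p in pois.items()
            if min_dist_to_rect(p, asr) <= radius}

def filter_results(pois, candidates, location, radius):
    """AZ side: keep only the candidates that answer the original query."""
    x, y = location
    return {name for name in candidates
            if math.hypot(pois[name][0] - x, pois[name][1] - y) <= radius}

users = {"u": (0, 0), "a": (2, 0), "b": (0, 2), "c": (10, 10)}
pois = {"p1": (1, -1), "p2": (3, 3), "p3": (9, 9)}
asr = build_asr(users, "u", k=3)
cs = candidate_set(pois, asr, radius=1.5)
answer = filter_results(pois, cs, users["u"], radius=1.5)
```

For a range query this CS is both inclusive and minimal: a POI belongs to it exactly when some location inside the ASR would have it in its result.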
1.3 Problem Statement
A context linking attack is one in which the adversary has prior context information about the user in addition to the spatio-temporal information. The adversary can exploit personal context knowledge such as preferences or interests. Assume that each user is interested in certain types of queries, e.g., traffic conditions, restaurants or pubs. An attacker may use this additional knowledge to infer the query source. Furthermore, the attacker may gather knowledge through observation. For instance, if a user is using pseudonyms and the attacker can see the observed user, then the attacker can retrace all prior locations of the user for the same pseudonym by a single correlation (Wernke et al. 2014). When the user’s position is revealed, it can reveal his regular habits and routines, and when he deviates from them. Depending upon the information and who learns it, the ramifications could range from annoying to embarrassing to downright dangerous. Robbery, mugging and stalking cases have been linked to attacks on users’ location privacy (Ozer 2011).
Such attacks require a privacy approach that considers user habits, regular user behavior and user interests (Wernke et al. 2014; Shivaprasad, Li & Zou 2016). Ghinita et al. (2010) propose that, for such an approach, users are classified into groups according to their interests. Spatial diversity would then take these groups into account when forming the Anonymizing Spatial Region (ASR); i.e., an ASR should contain users with similar interests, from the same group.
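One way to read this proposal is that the anonymity set is drawn only from the querier's own interest group. The sketch below illustrates that reading under stated assumptions: each user carries a single hypothetical interest label, coordinates are planar, and the region is a bounding box. It is an illustration of the idea, not the formulation of Ghinita et al.

```python
import math

def group_aware_asr(users, querier, k):
    """Build an ASR from the querier and the k-1 nearest users of the same group.

    users maps user id -> (x, y, interest_group). Returns (members, box),
    or None when the group is too small to reach anonymity degree k.
    """
    qx, qy, group = users[querier]
    peers = [uid for uid, (_, _, g) in users.items()
             if g == group and uid != querier]
    if len(peers) < k - 1:
        return None
    peers.sort(key=lambda uid: math.hypot(users[uid][0] - qx,
                                          users[uid][1] - qy))
    members = [querier] + peers[: k - 1]
    xs = [users[u][0] for u in members]
    ys = [users[u][1] for u in members]
    return members, (min(xs), min(ys), max(xs), max(ys))

users = {
    "u": (0, 0, "traffic"),
    "a": (1, 0, "pubs"),        # nearest user, but a different interest group
    "b": (3, 4, "traffic"),
    "c": (6, 0, "traffic"),
}
result = group_aware_asr(users, "u", k=3)
```

The nearest user "a" is skipped because a pub query from that region would point away from the traffic group, so the region grows to cover "b" and "c" instead. The trade-off is a larger ASR, and therefore a larger candidate set, than plain nearest-neighbour cloaking.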
1.4 Major Objective
The overall objective of this research is to develop an approach that enhances user privacy against personal context linking attacks in Location Based Services.
1.5 Specific Objectives
1. Investigate various models and techniques that have been used in protecting privacy in location based services and identify the challenges.
2. Formulate an approach that protects the users from personal context linking attacks and more specifically observation attacks. The approach will extend k-anonymity to consider the user preferences and interests.
3. Evaluate the approach for efficiency and effectiveness, validating that it offers improved protection.
1.6 Research questions
1. What are the challenges in protecting privacy in LBS?
2. What are the existing approaches and techniques that have been used in protecting privacy in location based services?
3. How will the approach be formulated and implemented?
4. How will the approach be evaluated?
1.7 Scope
The study will examine privacy in location-based services and various techniques and models used in protecting privacy. It will specifically consider the personal context linking attacks. The important features to examine will include k-anonymity concept and formation of ASR with consideration of user preferences.
This research will take a period of five months.
1.8 Justification
User location privacy is a major concern in today’s mobile applications, and significant research has been dedicated to addressing this issue (Shivaprasad, Li & Zou 2016). Despite all the efforts made to enhance the location privacy of sensitive user information, context linking attacks are still an open issue and need more attention.
Currently, only the Private Information Retrieval (PIR) mechanism can resist observation attacks (Shivaprasad, Li & Zou 2016). The drawback of the PIR mechanism is its computation and communication overhead, which makes it hard to implement on handheld/portable devices. It is also very difficult to apply for service providers such as Google Maps, as these handle more real-time data.
Unlike PIR, k-anonymity incurs low communication cost between the user and the location-based database server. However, it is unknown whether k-anonymity can resist personal context linking attacks. This research is intended to build on the solution that Ghinita et al. (2010) proposed for this kind of attack.


CHAPTER TWO
2. LITERATURE REVIEW
2.1 Location Based Services
LBS have recently attracted significant attention due to their potential to transform mobile communications and to enable a range of highly personalized and context-aware services. Since their early beginnings, with origins in the E911 project in the U.S. in the late 1990s and the first location-tracking functionalities introduced in Japan in 2001, Location-Based Services have made considerable progress. Today new applications of LBS are limited only by the technology and the creativity of service developers, and their number is growing on a monthly basis.
2.1.1 The GPS and emergence
2.1.2 Communication Model
2.1.3 LBS Major Components
If the user wants to use a location based service, different infrastructure elements are necessary. In Figure 3 the five (4+1) basic components and their connections are shown:
• Mobile Devices: A tool for the user to request the needed information. The results can be given by speech, using pictures, text and so on. Possible devices are PDAs, mobile phones, laptops and so on, but the device can also be a navigation unit of a car or a toll box for road pricing in a truck.
• Communication Network: The second component is the mobile network, which transfers the user data and service request from the mobile terminal to the service provider and then the requested information back to the user.
• Positioning Component: For the processing of a service, the user position usually has to be determined. The user position can be obtained either by using the mobile communication network or by using the Global Positioning System (GPS). Further possibilities to determine the position are WLAN stations, active badges or radio beacons. The latter positioning methods can especially be used for indoor navigation, for example in a museum. If the position is not determined automatically, it can also be specified manually by the user.
2.1.4 Applications of LBS

2.2 LBS system Model
The most common LBS system model consists of three components, namely mobile user devices, location servers, and clients (Wernke et al. 2014). The mobile device of a user is equipped with an integrated position sensor to determine the current user position. Mobile devices send their position information to a location anonymizer. The location anonymizer is a trusted third party that acts as a middle layer between the mobile user and the location-based database server in order to: (1) receive the user’s exact location along with his/her privacy profile, (2) blur the user’s exact location into a cloaked area, and (3) send the cloaked area to the location-based database server.
The privacy-aware query processor is embedded inside the location-based database server (LS) to tune its functionality to deal with anonymous queries and cloaked areas rather than exact location information.
Clients query the LS for user positions in order to implement a certain location-based service. The LS grants clients access to the stored positions based on an access control mechanism.

Figure 2.1: The centralized model for privacy in LBSs
2.3 Challenges of Location Based Services
Although LBS promise safety and convenience, they threaten the privacy and security of their users. The privacy threat comes from the fact that LBS providers rely mainly on an implicit assumption that the user agrees to reveal his/her private location to get LBS. In other words, the user trades his/her privacy with the service. If the user wants to keep his/her private location information, the user has to turn off the location-aware device and (temporarily) unsubscribe from the service. With potentially untrusted servers, such a service subscription model poses several privacy threats to the user. For example, an employer may check on his/her employee’s behavior by knowing the places where the employee visits and the time of each visit, the personal medical records can be inferred by knowing which clinic a person visits, or someone can track the locations of his/her ex-friends. In many real-life cases, people abuse GPS devices to stalk personal locations, and many people worry about their location privacy when they are using LBS. Unfortunately, the traditional approach of pseudonymity (i.e., using a fake identity) is not applicable to LBS, as the location information of a person can directly lead to the true identity. For example, asking about the nearest Pizza restaurant to the location of my house using a fake identity will reveal my true identity, as a resident of the house. In fact, many web-based tools are available to translate a location into a street address (e.g., Google Maps) and find the resident of a street address (e.g., Intelius).
2.4 Privacy in Location Based Services
Privacy is a human right and should be respected whenever users interact with electronic systems. LBSs are no exception. When making use of LBSs, users expose their locations and queries. Both can be exploited by attackers to infer users’ private information, and such malicious inference in turn threatens users’ privacy in LBSs. First, locations can serve as a piece of auxiliary information for peeking into users’ personal lives. For instance, hospitals are public places, and the location of a hospital itself does not carry any sensitive information about users. However, it becomes sensitive when the function of hospitals and the purpose of people visiting them are taken into account. An appearance in a cancer centre reveals that a person may suffer from a serious health problem. In order to avoid the abuse of inferred personal information, users desire the protection of location privacy in LBSs. Second, even if where users are located does not reveal any sensitive information, their queries may still put their privacy at risk.
In order to systematically discuss the effectiveness of the different approaches protecting location privacy, we first need to know which information the user actually wants to protect, i.e., his privacy goal. Second, we need to know what kind of information is available to an attacker, in order to analyze how an attacker could use this information to infer private user information with respect to the defined protection goal.
2.4.1 Protection Goals
There are three protection goals: user identity, spatial information, and temporal information. The protection goal of the user defines which attributes of the information must be protected and which can be revealed. Each attribute controls the amount of shared information and thus directly affects user location privacy. The adversary can use the attributes as deductive knowledge to reduce user privacy. Location privacy can be achieved by fulfilling some or all of the goals given below.
2.4.1.1 Spatial Information Privacy
Spatial information privacy is the ability to prevent unauthorized entities from accessing the spatial location information of a user. By choosing the granularity of spatial information, the level of privacy can be varied according to a user’s requirement. For example, the user might be willing to provide coarse location information, such as the name of the city, to a service provider, but might want to share a more specific location, such as latitude and longitude values, with her friends. Another important goal of spatial information privacy is to hide the identity of the user’s location. For instance, knowing that a user entered an HIV care center would reveal private information about her health status: the adversary can deduce that the user may be HIV positive.
2.4.1.2 Temporal Information Privacy
Temporal information is related to the point in time when spatial information becomes available. One privacy-preserving method is to delay a user’s spatial information so that the time and location of the user cannot be related. The level of privacy can be managed by controlling the temporal resolution of the user’s location. For example, if a user wants to share a location that she has visited but does not want to reveal that she is not at home, her location update can be delayed by a certain amount of time, per her requirement. Hence, by delaying the user’s location update by “x” amount of time instead of providing a real-time update, the user’s private information can be protected. However, Liu et al. (2012) found that even sporadic location information under pseudonym protection is exposed to location privacy threats.
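Delaying updates by "x" amount of time can be sketched as a small buffer that holds each location until it is old enough to publish. The class name and the use of plain numeric timestamps in seconds are assumptions made for illustration.

```python
from collections import deque

class DelayedPublisher:
    """Hold location updates back until they are at least `delay_seconds` old."""

    def __init__(self, delay_seconds):
        self.delay = delay_seconds
        self._pending = deque()   # (timestamp, location), oldest first

    def record(self, timestamp, location):
        """Store a new update; timestamps are assumed to be non-decreasing."""
        self._pending.append((timestamp, location))

    def release(self, now):
        """Return, in order, every update whose delay has elapsed by `now`."""
        released = []
        while self._pending and now - self._pending[0][0] >= self.delay:
            released.append(self._pending.popleft())
        return released

publisher = DelayedPublisher(delay_seconds=3600)
publisher.record(0, "museum")
publisher.record(1800, "cafe")
early = publisher.release(now=3000)
on_time = publisher.release(now=3600)
later = publisher.release(now=6000)
```

An observer of the published stream learns where the user was an hour ago, not where she is now, so the update no longer reveals that she is currently away from home.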
2.4.1.3 User Identity Privacy
Another important goal of privacy protection is to hide user identity. A user’s identity can be the name, social security number, address, or any such unique information. However, an attacker can still identify a user by correlating the user’s location information and the type of location.
2.4.2 Privacy approaches
Numerous privacy approaches have been proposed to preserve the privacy of user information in LBS. Although mechanisms such as anonymization and spatial obfuscation provide location privacy by hiding identity or reporting fake locations, they ignore the adversary’s knowledge about the user’s access pattern and the algorithm, and they disregard the optimal attack, in which an attacker designs an inference attack to reduce his calculation error. Based on the protection goals/attributes and the attacker’s knowledge, the fundamental principles of the privacy approaches used in LBS are discussed below. The following principles are discussed: position dummies, mix zones, k-anonymity, spatial obfuscation, coordinate transformation, encryption, and position sharing.
2.4.2.1 Mix Zone
One of the mechanisms to protect users’ privacy is the mix zone. The main idea of the mix zone mechanism is to define mix zone areas in which user positions are concealed, so that their positions are unknown within these areas. This mechanism uses a trusted location server. A user’s identity is mixed with many others within the zone by changing pseudonyms whenever the user enters a mix zone. Thus, it protects the user’s identity even when an adversary tries to trace the ingress and egress points of a mix zone, and the adversary cannot link the various pseudonyms of the users (Shivaprasad, Li & Zou 2016).
The example that follows, as discussed by Beresford & Stajano (2004), explains how this mechanism prevents attackers from tracking long-term user movements. Consider a scenario in which three users move through a simple mix zone. The attacker can record the points at which users cross between zones and then make use of past information from nearby zones to model an attack based on users’ movement, giving the attacker a chance of matching entering and exiting users with high reliability. These ingress/egress points and the times of user movement generate a movement matrix. By observing these ingress and egress movements, an attacker can try to reconstruct the correct mapping between these events (the mapping between new and old pseudonyms). This mapping can be represented as a bipartite graph of the possible pairings of ingress and egress pseudonyms. The attacker cannot model an attack with this knowledge alone, though; he needs to measure the probabilities of these mappings and then find the most probable perfect matching.
Palanisamy and Liu (2014) propose a mix zone approach that can be used for road networks. This mechanism considers the different context-sensitive information that an adversary may use to obtain the complete geometrical trajectory and temporal constraints.
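The attacker's matching step can be sketched by enumerating every one-to-one pairing of ingress and egress pseudonyms and weighting each by how plausible its transit times are. The Gaussian weighting around an expected transit time is purely an assumption for illustration; the cited work derives its probabilities from an observed movement matrix.

```python
import math
from itertools import permutations

def mapping_probabilities(ingress, egress, expected_transit, spread=1.0):
    """Probability of each ingress-to-egress pseudonym mapping.

    ingress and egress map pseudonym -> event time. The likelihood of one
    pairing is a hypothetical Gaussian in (transit time - expected_transit).
    Returns {((in_pseudonym, out_pseudonym), ...): probability}.
    """
    ins = list(ingress)
    outs = list(egress)

    def likelihood(i, o):
        transit = egress[o] - ingress[i]
        if transit <= 0:          # cannot leave before entering
            return 0.0
        return math.exp(-((transit - expected_transit) ** 2)
                        / (2 * spread ** 2))

    weights = {}
    for perm in permutations(outs):
        w = 1.0
        for i, o in zip(ins, perm):
            w *= likelihood(i, o)
        weights[tuple(zip(ins, perm))] = w

    total = sum(weights.values())
    if total == 0:
        return {}
    return {mapping: w / total for mapping, w in weights.items()}

ingress = {"a": 0.0, "b": 1.0}    # old pseudonyms and entry times
egress = {"x": 5.0, "y": 6.0}     # new pseudonyms and exit times
probs = mapping_probabilities(ingress, egress, expected_transit=5.0)
best = max(probs, key=probs.get)
```

Enumerating permutations is only feasible for a handful of users. The more users mix in the zone, and the more similar their transit times, the flatter this distribution becomes, which is exactly the protection the mix zone aims for.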
2.4.2.2 K-Anonymity
Pfitzmann & Kohntopp (2001) define anonymity as “the state of being not identifiable within a set of subjects, the anonymity set”. It was first discussed in relational databases, where published data (e.g., census, medical) should not be linked to specific persons (Panos et al. 2007).
k-anonymity is a widespread general privacy concept not restricted to location privacy. It provides the guarantee that in a set of k objects (in the author’s case, mobile users) the target object is indistinguishable from the other k - 1 objects. Thus, the probability of identifying the target user is 1/k. The idea of this framework is that a user reports to a client a cloaked region containing her position and the positions of k - 1 other users instead of her exact position (Gruteser & Grunwald 2003). Due to inherent delay, this approach may not be suitable for services that need a quick response.
Wairimu (2016) proposes a framework that protects the privacy of a user who moves and continuously sends rectangular regions containing her location to the LS.
The basic concept of k-anonymity has been extended by various approaches to increase its efficiency, for example the clique cloak approach, personalized k-anonymity, historical k-anonymity, and l-diversity.
The personalized k-anonymity model addresses users having different privacy preferences depending on context and desired level of privacy. It points out a few drawbacks of existing anonymization mechanisms, such as requiring users to specify different values of k at different times and requiring a large value of k to be successful, which leads to poor quality of service (QoS) in LBS applications.
In the l-diversity mechanism, a user’s location is unidentifiable within a collection of l different locations such as hospitals, restaurants, cinemas, pubs, etc. The l-diversity mechanism ensures that the l user positions are not only different but also located far enough from each other. If they are not differentiated in this way, an adversary might learn the victim’s location with little imprecision, because all the user positions might belong to the same semantic location.
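A check for this property can be sketched as follows, assuming each reported position is tagged with the semantic category of the place it falls in and that "far enough" means a minimum pairwise distance. Both the data layout and the pairwise rule are assumptions made for illustration.

```python
import math
from itertools import combinations

def satisfies_l_diversity(places, l, min_separation):
    """True if `places` spans at least l distinct semantic categories and
    every pair of positions is at least `min_separation` apart.

    places is a list of (category, x, y) tuples.
    """
    categories = {category for category, _, _ in places}
    if len(categories) < l:
        return False
    return all(
        math.hypot(x1 - x2, y1 - y2) >= min_separation
        for (_, x1, y1), (_, x2, y2) in combinations(places, 2)
    )

spread_out = [("hospital", 0, 0), ("restaurant", 100, 0), ("cinema", 0, 100)]
clustered = [("hospital", 0, 0), ("pub", 1, 1)]
```

The `clustered` set has two categories but fails for `l=2` with a separation of 50, since both positions sit in practically the same spot and give the adversary a precise location anyway.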
2.4.2.3 Spatial Obfuscation
The spatial obfuscation approach preserves location privacy by reducing the precision of position information: the user sends a circular region, rather than an exact position, to the location server when accessing an LBS (Ardagna 2007). The advantage of spatial obfuscation approaches is that they provide location privacy without a fully trusted location server (LS). In addition, the user only has to define an obfuscation area according to her preference.
However, this advantage comes at a cost: location service providers do not receive a precise user location, which decreases the quality of service. This approach also does not protect the user’s identity; it only degrades the quality of the position information in order to protect the mobile user’s position.
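A circular obfuscation region of this kind can be sketched as follows. The example assumes planar coordinates and chooses the circle centre uniformly at random within the chosen radius of the true position, so that the true position is not simply the centre. This is an illustrative choice and not the specific method of Ardagna (2007).

```python
import math
import random

def obfuscate(x, y, radius, rng=random):
    """Return (centre_x, centre_y, radius) of a circle that contains (x, y).

    Hypothetical helper: the centre is sampled uniformly from the disc of
    the given radius around the true position.
    """
    angle = rng.uniform(0.0, 2.0 * math.pi)
    # sqrt keeps the sample uniform over the disc area, not clustered at the centre
    distance = radius * math.sqrt(rng.random())
    return (x + distance * math.cos(angle), y + distance * math.sin(angle), radius)
```

A larger radius gives more privacy but, as noted above, a less precise position for the service provider.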
2.4.2.4 Dummies
This approach achieves location privacy by misleading an adversary: the user sends multiple dummy events (false positions) through an event injection method. An essential advantage of this approach is that the user herself can generate dummies without any need for other TTP components. However, it is challenging to create dummies that cannot be distinguished from the true user position, in particular if an adversary has additional context information such as a map and can track the user over longer periods (Wernke et al. 2014).
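The dummy approach can be sketched in a few lines. This is a deliberately naive illustration with a hypothetical function name: dummies are drawn uniformly from a rectangular area, which is precisely the kind of dummy an adversary with a map could filter out.

```python
import random

def with_dummies(true_position, n_dummies, bounds, rng=random):
    """Return the true position mixed with n_dummies random false positions.

    bounds: (min_x, min_y, max_x, max_y) of the area to draw dummies from
    (an assumption for this sketch).
    """
    min_x, min_y, max_x, max_y = bounds
    positions = [
        (rng.uniform(min_x, max_x), rng.uniform(min_y, max_y))
        for _ in range(n_dummies)
    ]
    positions.append(true_position)
    rng.shuffle(positions)  # hide which entry is the real one
    return positions
```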
2.4.2.5 Private Information Retrieval (PIR)
Cryptographic location privacy mechanisms utilise private information retrieval (PIR) to provide and strengthen location privacy. In a PIR approach, the location server answers a user’s queries when she accesses an LBS without learning anything about the query data. The advantage of the PIR mechanism is that a compromised location server cannot disclose any user location information. The disadvantage, however, is that the location server cannot execute computations on the shares, particularly range queries.
A further drawback of the PIR mechanism is its computation and communication overhead, which makes it hard to implement on handheld or portable devices. It is also very difficult for service providers such as Google Maps to adopt, because they handle more real-time data (Shivaprasad, Li & Zou 2016).
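To make the idea of answering a query without learning it concrete, the following sketch shows the classic two-server XOR-based PIR scheme. It is not taken from the works cited above and rests on assumptions: two servers hold identical copies of the database, they do not collude, and all records have the same length. All function names are hypothetical.

```python
import secrets

def make_queries(n_records, index):
    """Client: build one index subset per server; each looks random on its own."""
    subset_1 = {i for i in range(n_records) if secrets.randbits(1)}
    subset_2 = subset_1 ^ {index}  # differs from subset_1 only at the wanted index
    return subset_1, subset_2

def server_answer(database, subset):
    """Server: XOR together the requested records (all of equal length)."""
    answer = bytes(len(database[0]))
    for i in subset:
        answer = bytes(a ^ b for a, b in zip(answer, database[i]))
    return answer

def recover(answer_1, answer_2):
    """Client: XOR of both answers cancels everything except the wanted record."""
    return bytes(a ^ b for a, b in zip(answer_1, answer_2))
```

Each server touches on average half of the database for every query, which illustrates the computation overhead discussed above.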
2.4.3 Background Knowledge Based Attacks
Attacker knowledge can be defined as an adversary’s prior knowledge of the system together with access to technologies that enable him to capture events (Shivaprasad, Li & Zou 2016). Attacker knowledge is categorised along two dimensions, namely temporal information and context information.
The temporal dimension considers whether the attacker has access only to a single user position or can also access historic information. In the first case, the attacker knows only a single snapshot of a user position. This is a common assumption for many privacy approaches. In the second case, the attacker knows a set of multiple position updates collected over time or even a whole movement trajectory. Such information could be revealed, for instance, by a compromised LS or a compromised client. In particular, if an LS is compromised, the attacker might also obtain historic position information of several users.
In the context dimension the attacker has additional context knowledge beyond spatio-temporal information. For instance, an advanced attacker might have additional context information provided by a phone book, statistical data, a map, etc. The attacker can use this information in addition to the known user positions. Attacks that exploit context information are called context linking attacks; they include the probability distribution attack, map matching, and the personal context linking attack.
The probability distribution attack is based on gathered traffic statistics and environmental context information. The attacker tries to derive a probability distribution function of the user position over the obfuscation area. If the probability is not uniformly distributed, the attacker can identify areas where the user is located with high probability.
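The attack can be illustrated with a discretised obfuscation area. This sketch assumes the area is split into cells and that the attacker holds a density value per cell, for example from traffic statistics; the cell names and the function are hypothetical.

```python
def position_distribution(cells, density):
    """Normalise per-cell density into a probability for each cell of the area.

    cells: iterable of cell ids in the obfuscation area.
    density: dict of cell id -> observed density (missing cells count as 0).
    """
    cells = list(cells)
    weights = {cell: float(density.get(cell, 0.0)) for cell in cells}
    total = sum(weights.values())
    if total == 0:
        # no statistics available: fall back to the uniform assumption
        return {cell: 1.0 / len(cells) for cell in cells}
    return {cell: weight / total for cell, weight in weights.items()}
```

When the resulting distribution is far from uniform, the cell with the highest probability is the attacker's best guess, and the effective privacy is lower than the size of the area suggests.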
In map matching the obfuscation area is restricted to certain locations where users can be located by removing all the irrelevant areas. For instance, a map could be used to remove areas like lakes from the obfuscation area, which effectively shrinks the obfuscation area size below the intended size. The attacker can also use semantic information provided by the map such as points of interest or type of buildings (bars, hospitals, residential building, etc.) to further restrict the effective obfuscation area size.
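The shrinking effect of map matching can be sketched on a grid. The example assumes the obfuscation area and the unreachable areas (such as lakes) are both given as sets of grid cells; this representation and the function name are illustrative assumptions.

```python
def effective_area(obfuscation_cells, unreachable_cells):
    """Return the cells that remain after map matching and the remaining fraction."""
    area = set(obfuscation_cells)
    if not area:
        raise ValueError("obfuscation area must not be empty")
    remaining = area - set(unreachable_cells)
    return remaining, len(remaining) / len(area)
```

A remaining fraction of 0.75, for instance, means the user is effectively hidden in only three quarters of the area she intended.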
2.4.3.1 Personal Context Linking Attacks
This is an attack based on personal context knowledge about individual users, such as user preferences or interests. For instance, assume it is known that a user visits a pub on a regular basis at a certain time and that he uses a simple obfuscation mechanism to protect his location privacy. Then an attacker can increase the precision of an obtained obfuscated position by reducing the obfuscation area to the locations of pubs within it.
The proposed solution to this problem is to group users into partitions according to their areas of interest (e.g., users who frequently query about restaurants, pubs, or night clubs). Then, when a query is issued, the corresponding anonymized spatial region (ASR) is generated with users from the same interest group as the query source, such that each user in the ASR is equally likely to have issued the query.
A special kind of personal context linking attack is the observation attack, where the attacker has user knowledge gathered through observation. For instance, if a user is using pseudonyms and the attacker can see the observed user, then the attacker can retrace all prior locations of the user for the same pseudonym by a single correlation.
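The pub example can be written out as a short sketch of the attacker's reasoning. It assumes a circular obfuscation area and a list of points of interest given as (name, category, x, y) tuples; the data layout and the function name are hypothetical.

```python
from math import hypot

def candidate_locations(centre, radius, pois, category):
    """Names of POIs of the given category inside the circular obfuscation area."""
    cx, cy = centre
    return [
        name
        for name, poi_category, x, y in pois
        if poi_category == category and hypot(x - cx, y - cy) <= radius
    ]
```

If only one pub falls inside the area, the obfuscation offers no protection against an attacker who knows the user's habit.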
2.5 Legal Privacy Protections

CHAPTER THREE
3. RESEARCH METHODOLOGY
3.1 Introduction
This chapter presents the methodology that will be used in the research. It covers the research method, design, testing techniques, tools for analysis, technology for development, proposed approach and target population.
3.2 Research Design
In this study the researcher will use the descriptive survey design, because a descriptive survey study can investigate possible relationships between two or more variables. The descriptive survey design is ideal since it is concerned with making an accurate assessment of the incidence, distribution and relationships of the phenomenon (Edwards & Gillies 2006). According to Gay (1981), descriptive research is a process of collecting data in order to answer questions concerning the current status of the subject under study.
3.3 Target Population and sample
The population of interest in this study will be mobile phone users in Uganda who use location-based services applications to search for Points of Interest (POI). The researcher will adopt a simple random sampling technique. Simple random sampling is a sampling design in which k distinct items are selected from the n items in the population in such a way that every possible combination of k items is equally likely to be the sample selected (Thompson, 2012).
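Simple random sampling as defined above can be sketched with the Python standard library. The wrapper name and the optional seed are assumptions made for reproducibility of the illustration; random.sample draws k distinct items without replacement.

```python
import random

def simple_random_sample(population, k, seed=None):
    """Draw k distinct items so that every k-subset is equally likely."""
    rng = random.Random(seed)
    return rng.sample(list(population), k)
```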
3.4 Instrumentation
The research study will use primary data collected through a questionnaire. According to Kothari (1984), primary data is original information collected for the first time. To ensure the reliability of the data collected, the questionnaire will be pre-tested to determine whether respondents understand the questions correctly; where questions do not seem clear enough, the necessary adjustments will be made. The questionnaire will be distributed to mobile phone users and will contain both open-ended and closed-ended questions. A questionnaire is chosen because of its simplicity of administration and high reliability, as advocated by Babbie (1993). The items on the questionnaire will be developed on the basis of the objectives of the study. Responses to the questionnaire will be collected through mobile data collection (ODK).
3.5 Reliability and Validity of the instrument
Reliability is a measure of the degree to which a research instrument yields consistent results or data after repeated trials. The form of reliability considered here is test-retest reliability: the same result should be obtained on test 1 and on test 2 when the two tests are administered after a time lapse. The retest therefore involves two administrations of the measurement instrument (Yin, 2003). The sample for the pre-test should be small; hence four mobile phone users will be used to pre-test the instrument.
3.6 Development of Proposed Approach
An anonymizer system will be developed using Scala (a JVM-based language). The anonymizer will cloak queries by forming ASRs according to user preferences or interests. An Android mobile application with a user interface that lets mobile users search for POI will also be created. The mobile application will be developed using the Android developer tools.
3.7 System Architecture

3.7.1 Mobile Clients
Mobile devices include mobile phones, PDAs, and other devices with positioning capabilities, such as laptops. First, each mobile device computes its physical location using its GPS or WiFi component. Mobile users will specify the Point of Interest they wish to search for in the user interface of the proposed system on their mobile device. The mobile client will then send the query (search term) to the anonymizer.
3.7.2 Anonymizer
The anonymizer will receive the locations of all the mobile clients. The physical location, computed by the mobile device, is sent to the anonymizer server together with the query. The anonymizer sends to the LBS an Anonymized Spatial Region (ASR) instead of the actual user location. This procedure is called cloaking. Cloaking hides the actual location within a k-anonymous ASR, which is an area that encloses the client that issued the query as well as at least k – 1 other users.
A cloaking algorithm that groups users into partitions according to their areas of interest will be developed.
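An interest-based cloaking step of this kind might look like the sketch below. It is an illustration under assumptions and not the algorithm to be developed: users are held in a dictionary of user id to (x, y, interest), the ASR is a bounding rectangle, and the k – 1 companions are the nearest users with the same interest.

```python
from math import hypot

def interest_cloak(query_user, users, k):
    """Return (member_ids, asr) for a k-anonymous ASR within one interest group.

    Returns None when fewer than k-1 other users share the interest.
    All names and the data layout are hypothetical.
    """
    qx, qy, interest = users[query_user]
    peers = [
        (uid, x, y)
        for uid, (x, y, user_interest) in users.items()
        if user_interest == interest and uid != query_user
    ]
    if len(peers) < k - 1:
        return None  # request cannot be anonymized at this privacy level
    peers.sort(key=lambda p: hypot(p[1] - qx, p[2] - qy))
    members = [(query_user, qx, qy)] + peers[: k - 1]
    xs = [m[1] for m in members]
    ys = [m[2] for m in members]
    asr = (min(xs), min(ys), max(xs), max(ys))
    return sorted(m[0] for m in members), asr
```

Returning None for an unsatisfiable request keeps failed anonymizations explicit, which is useful when counting successful and unsuccessful requests.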
3.7.3 LBS
The LBS provides access to location data sources, for example Google Places. The LBS receives the ASR and, without knowing exactly where the user is, retrieves and reports to the anonymizer (AZ) a Candidate Set (CS) that is guaranteed to contain the query results for any possible user location inside the ASR. The AZ receives the CS and reports to the user the subset of candidates that corresponds to her original query.
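The exchange between the LBS and the AZ can be sketched for one simple query type. The example assumes a range query ("all POIs within a given radius of the user"), a rectangular ASR, and POIs given as (name, x, y) tuples. Expanding the ASR by the radius on every side yields a superset of the results for any position inside the ASR. The function names are hypothetical.

```python
from math import hypot

def lbs_candidate_set(asr, radius, pois):
    """LBS side: POIs inside the ASR expanded by the query radius on every side."""
    min_x, min_y, max_x, max_y = asr
    return [
        poi
        for poi in pois
        if min_x - radius <= poi[1] <= max_x + radius
        and min_y - radius <= poi[2] <= max_y + radius
    ]

def anonymizer_filter(user_position, radius, candidate_set):
    """AZ side: keep only candidates that answer the query at the true position."""
    ux, uy = user_position
    return [
        poi for poi in candidate_set if hypot(poi[1] - ux, poi[2] - uy) <= radius
    ]
```

The LBS never sees the true position; only the AZ, which already knows it, performs the final filtering.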
3.7.4 Spatio-temporal database
The spatio-temporal database will manage data received from the LBS by supporting the corresponding query functionalities. Spatiotemporal data is data whose spatial locations and/or extents are updated dynamically over time.
Contrary to traditional database systems, a spatiotemporal database must be able to manage the dynamic properties of spatiotemporal objects efficiently. As spatiotemporal objects constantly change, a spatiotemporal database usually receives intensive object updates in a short period of time. To reduce the number of object updates, a sampling-based updating method or a velocity-based updating method is usually adopted. The sampling-based updating method samples the spatial extent/location of a spatiotemporal object either periodically or whenever the spatial difference to the previous sample is greater than a pre-defined threshold. In the velocity-based updating method, a spatiotemporal object reports its location along with the velocity to a spatiotemporal database. The object does not issue a new report unless there is a velocity change. Besides the updating methods, spatiotemporal databases also employ novel access methods to efficiently store and update spatiotemporal objects (Mokbel et al. 2003).
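The two updating methods can be contrasted in a short sketch. It assumes planar positions and velocities given as (x, y) tuples; the function names and the tolerance parameter are illustrative and not taken from Mokbel et al.

```python
from math import hypot

def needs_sample_update(previous_sample, current_position, threshold):
    """Sampling-based: report when the object moved more than the threshold."""
    dx = current_position[0] - previous_sample[0]
    dy = current_position[1] - previous_sample[1]
    return hypot(dx, dy) > threshold

def needs_velocity_update(reported_velocity, current_velocity, tolerance=0.0):
    """Velocity-based: report only when the velocity has changed."""
    dvx = current_velocity[0] - reported_velocity[0]
    dvy = current_velocity[1] - reported_velocity[1]
    return hypot(dvx, dvy) > tolerance

def predicted_position(reported_position, reported_velocity, elapsed):
    """Database side: extrapolate the position from the last velocity report."""
    return (
        reported_position[0] + reported_velocity[0] * elapsed,
        reported_position[1] + reported_velocity[1] * elapsed,
    )
```

Between reports, the database answers queries from predicted_position, so an object moving at constant velocity generates no updates at all.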
3.8 Experiment and Test
According to Stenneth, L. & Philip, S. (2012), success rate is one of the most important evaluation criteria. The main goal of any anonymization server is to maximize the number of messages that can be successfully anonymized with the personalized quality of service and privacy requirements desired. The success rate will be measured as the ratio of the number of successfully anonymized requests to the total number of incoming mobile requests. A success rate of 100% will imply that all the requests sent by the mobile clients are safely anonymized.
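The ratio can be written out directly. The function name and the input validation are assumptions added for illustration.

```python
def success_rate(successful_requests, total_requests):
    """Fraction of incoming requests that were successfully anonymized."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0 <= successful_requests <= total_requests:
        raise ValueError("successful_requests must lie between 0 and total_requests")
    return successful_requests / total_requests
```

For example, 45 successfully anonymized requests out of 50 incoming requests give a success rate of 0.9, that is, 90%.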
3.8.1 Performance measure
According to Stenneth, L. & Philip, S. (2012), an algorithm with a lower cloaking time performs better, because the cloaking time is a measure of the temporal complexity. Efficient cloaking implies that the algorithm spends less time processing the incoming requests from the mobile clients.
JMeter will be used for simulation by sending a given number of HTTP requests to the anonymizer server.
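Cloaking time for a single request could be measured as in the sketch below. The wrapper is hypothetical and takes any cloaking function as an argument; it uses the standard library's monotonic high-resolution timer.

```python
import time

def timed_cloak(cloak, request):
    """Run cloak(request) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = cloak(request)
    elapsed = time.perf_counter() - start
    return result, elapsed
```

Averaging the elapsed time over many requests gives a more stable figure than a single measurement.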
3.9 Conclusion or summary
This chapter has presented the methodology that will be used in the research. The researcher will adopt a descriptive survey design, which is preferred because it can investigate possible relationships between two or more variables and is concerned with making an accurate assessment of the incidence, distribution and relationships of the phenomenon. The proposed approach will be implemented in an anonymizer system, which will be developed using Scala, and a mobile application, which will be developed using the Android developer tools. An experiment will be conducted to evaluate the approach for efficiency and effectiveness. The evaluation criteria will be success rate and performance (cloaking time).