
TeamErlich/venter_response

In R:

source('no_face_value.R')
success_rate = test_venter(rounds, n)

n is the number of people in each group, and rounds is the number of simulation rounds. success_rate is the identifiability power of the demographic identifiers.

The function runs a simple procedure that matches the Venter et al. definition of identifiability. In each round, the function generates sex, age, and self-reported ethnicity labels for n people according to the distributions in the Venter paper. It then takes the first person to be the person of interest and checks whether this person is unique among the n people. If the combination of labels for this person is unique, it says: "Success!"
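The procedure described above can be sketched roughly as follows. This is an illustration only, not the contents of no_face_value.R: the function name simulate_uniqueness and the label distributions (sex_probs, age_range, eth_probs) are placeholder assumptions, not the distributions from the Venter paper.

```r
# Hypothetical sketch of the round-by-round uniqueness test.
# All distributions below are placeholders, NOT the Venter et al. values.
simulate_uniqueness <- function(rounds, n,
                                sex_probs = c(F = 0.5, M = 0.5),
                                age_range = 18:80,
                                eth_probs = c(A = 0.5, B = 0.3, C = 0.2)) {
  successes <- 0
  for (i in seq_len(rounds)) {
    # Draw one label of each kind for each of the n people.
    sex <- sample(names(sex_probs), n, replace = TRUE, prob = sex_probs)
    age <- sample(age_range, n, replace = TRUE)
    eth <- sample(names(eth_probs), n, replace = TRUE, prob = eth_probs)

    # Combine the three labels into one key per person.
    labels <- paste(sex, age, eth, sep = "|")

    # The first person is the person of interest; the round is a success
    # if nobody else in the group shares the same label combination.
    if (sum(labels == labels[1]) == 1) {
      successes <- successes + 1
    }
  }
  successes / rounds  # fraction of rounds in which the person was unique
}
```

Calling simulate_uniqueness(1000, 10) returns a value between 0 and 1; with the placeholder distributions it will not reproduce the numbers from test_venter.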

Venter had a team of 30 researchers who developed fancy face morphology predictions, voice signatures, and many other sophisticated algorithms. Using the same success criterion as above, they reported a success rate of 80% for a group of n=10.

You are about to test a re-identification procedure that uses only age, sex, and ethnic group. These labels are not protected by HIPAA, and the procedure took me less than an hour to develop. Try running:

source('no_face_value.R')
success_rate = test_venter(1000, 10)

and see the success rate of my simple procedure.
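As a rough cross-check on a simulated rate: if every person's combined label is drawn independently from the same distribution with category probabilities p_k, the chance that the first person is unique among n people is the sum over k of p_k * (1 - p_k)^(n - 1). The helper below, expected_uniqueness, is a hypothetical name and not part of no_face_value.R; the probabilities in the example call are placeholders.

```r
# Hypothetical helper: closed-form probability that one person is unique
# among n people, assuming independent labels with category probabilities p.
expected_uniqueness <- function(p, n) {
  stopifnot(all(p >= 0), abs(sum(p) - 1) < 1e-8, n >= 1)
  # The person falls in category k with probability p[k]; each of the
  # other n - 1 people avoids that category with probability 1 - p[k].
  sum(p * (1 - p)^(n - 1))
}

# Example with placeholder values: 100 equally likely label combinations.
expected_uniqueness(rep(1 / 100, 100), 10)
```

The more finely the labels split the population (more categories, more even probabilities), the closer this value gets to 1 for a fixed n.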
