Create a social network graph based on coappearance in images. Given a large volume of images of people this tool:
- Extracts faces
- Clusters faces based on similarity
- Creates a social network graph based on co-appearance in images
The three steps above correspond to the black arrows on the left of the diagram below:
face_network.extract_faces(source_dir, age_gender=False)
This function extracts all faces from a directory of images using Dlib’s face detector, and must be run prior to further analysis.
This function creates a new folder called “Face Network” in your image directory. When a face is identified, it is cropped and stored in a new folder “source_dir/Face Network/Faces/”. Given “Image.jpg” containing two faces, this function will save two cropped faces: “face1_Image.jpg” and “face2_Image.jpg”. Facial encodings (128-dimensional vectors used for clustering and matching similar faces) are stored in a file called “FaceDatabase.h5”.
face_network.cluster(source_dir, algorithm='DBSCAN', iterations=1, initial_eps=0.45, max_distance=45)
Once faces are extracted, similar faces are clustered together. This function uses a density-based clustering algorithm (DBSCAN) to identify clusters of similar faces in the list of facial encodings. Starting with loose clustering parameters, the function iteratively decreases the neighborhood distance parameter. In each iteration, facial similarity within clusters is evaluated. Dense clusters are extracted, and sparse clusters are assigned to be re-evaluated in the next iteration. When an iteration returns no new clusters, the function returns a dataframe containing facial encodings grouped into clusters based on similarity.
Rows in the FaceDatabase.h5 file now contain a unique numeric identifier, grouping similar faces into clusters. If the “mosaic” option is enabled, an image composed of all of the face tiles in a given cluster is created:
face_network.network(photo_dir, scale=10)
Having identified individuals across multiple pictures, this function generates a force directed graph based on co-appearance in images. Each individual is a node, and each co-appearance is an edge.
A file called “Image_Network.html” is created in "photo_directory/Face Network/Data/".
The graph can be opened in a web browser and is fully interactive. Hovering over a node will display a tooltip showing the cluster’s unique identifier. This corresponds to the filenames of the mosaics generated in the previous step:
Given a folder called “photo_dir” with five images, the following process extracts faces, clusters similar faces, and creates a co-appearance network:
import face_network
photo_dir="Users/oballinger/Downloads/photo_dir"
face_network.extract(photo_dir, age_gender=True)
face_network.cluster(photo_dir, algorithm='DBSCAN', iterations=10, initial_eps=0.44, max_distance=40)
face_network.network(photo_dir, size=20)
The diagram below shows the file structure of the resulting outputs: