Skip to content

Supplementary data for On the Structural Properties of Social Networks andtheir Measurement-calibrated Synthetic Counterparts

License

Notifications You must be signed in to change notification settings

marcessz/Social-Networks

Repository files navigation

Social-Networks

Supplementary data for the paper On the Structural Properties of Social Networks and their Measurement-calibrated Synthetic Counterparts - M. Nagy, R. Molontay (2019)

How to Cite

@inproceedings{nagy2019structural,
  title={On the Structural Properties of Social Networks and their Measurement-calibrated Synthetic Counterparts},
  author={Nagy, Marcell and Molontay, Roland},
  booktitle={2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)},
  pages={584--588},
  year={2019},
  organization={IEEE}
  }

Source

The graphs are collected from the following sources:

Summary of Networks

Domain Description Range of network size Number of networks
Friendship Online social networks of
friendships (mostly Facebook)
324-23,613
(median: 4,065)
58
Communication Retweet, email and
reply networks
96-33,696
(avg: 4,687
43
Collaboration Co-authorship and collaboration
networks (mostly scientific)
86-21,363
(median: 553)
19

Graph Measurements

The data folder contains the spreadsheets which contain the calculated metrics of the 120 real networks.

The calculated metrics are the following:

  • Assortativity (the Pearson correlation coefficient of degree between pairs of linked nodes)
    The assortativity coefficient is given by where the term is the mass function of the distribution of the remaining degrees (degree of the nodes minus one) and j and k indicate the remaining degrees. Furthermore, refers to the mass function of the joint probability distribution of the remaining degrees of the two vertices. Finally, denotes the variance of the remaining degree distribution with mass function i.e.
  • Average clustering coefficient (the average local clustering coefficient. The local clustering coefficient of a node quantifies how close its neighbours are to being a clique)
    The local clustering coefficient of vertex v is the fraction of pairs of neighbors of v that are connected over all pairs of neighbors of v. Formally: where is the neighbourhood of the node v i.e. the vertices adjacent to v.

The average (local) clustering coefficient of a G graph is defined as: where n is the size of the graph.

  • Average degree (the mean of the degrees)
  • Average path length (average number of steps along the shortest paths for all possible pairs of nodes)
    A path is a sequence of edges which connect a sequence of vertices. The distance between the vertices u and v is the length (number of edges) of the shortest path connecting them. The average path length of a graph G of size n is defined as:
  • Density (measures how dense the graph is)
    Graph density D is the ratio of the number of edges of the graph divided by the number of edges of a complete graph with the same number of vertices, i.e:
  • Global clustering coefficient (the number of closed triplets over the total number of triplets (both open and closed)),
  • Four interval degree probabilities introduced in this paper (quantifies the degree distribtuion of the graph)
  • Largest eigenvector centrality (eigenvector centrality is a measure of the influence of a node in a network),
  • Maximum degree (the maximum of the degrees)
    For a graph, the vector of eigenvector centralities satisfies the eigenvector equation , where is the largest eigenvalue of the graph's adjacency matrix . In other words, for a connected undirected graph, the vector of eigenvector centralities is given by the (suitably normalized) eigenvector of corresponding to the largest eigenvalue of the adjacency matrix.
  • Maximum vertex betweenness centrality (betweenness centrality is a measure of centrality in a graph based on shortest paths)
    The betweenness centrality of a node v is given by the expression: where is the total number of shortest paths from node s to node t and is the number of those paths that pass through v.
  • Maximum edge betweenness centrality (betweenness centrality is a measure of centrality in a graph based on shortest paths)
    The edge betweenness centrality of an edge is the number of shortest paths between pairs of vertices that run along it. In other words this is analogous to the previously defined , but here we consider an edge instead of a node.
  • Number of edges
  • Number of nodes (size of the network)
  • Pseudo diameter (diameter is the greatest distance between any pair of vertices. Pseudo diameter is an approximation of the diameter)

The correlation heatmaps of the metrics together with some analysis can be found here.

A detailed description of the dataset and the metrics can be found in Data-driven Analysis of Complex Networks and their Model-generated Counterparts

About

Supplementary data for On the Structural Properties of Social Networks andtheir Measurement-calibrated Synthetic Counterparts

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published