-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathREADME.Rmd
80 lines (58 loc) · 2.41 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# gdim
<!-- badges: start -->
[![R-CMD-check](https://github.com/RoheLab/gdim/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/RoheLab/gdim/actions/workflows/R-CMD-check.yaml)
[![Codecov test coverage](https://codecov.io/gh/RoheLab/gdim/branch/main/graph/badge.svg)](https://app.codecov.io/gh/RoheLab/gdim?branch=main)
[![CRAN status](https://www.r-pkg.org/badges/version/gdim)](https://CRAN.R-project.org/package=gdim)
<!-- badges: end -->
`gdim` estimates graph dimension using cross-validated eigenvalues, via the graph-splitting technique developed in <https://arxiv.org/abs/2108.03336>. Theoretically, the method works by computing a special type of cross-validated eigenvalue which follows a simple central limit theorem. This allows users to perform hypothesis tests on the rank of the graph.
## Installation
You can install `gdim` from CRAN with:
``` r
install.packages("gdim")
# to get the development version from GitHub:
install.packages("pak")
pak::pak("RoheLab/gdim")
```
## Example
`eigcv()` is the main function in `gdim`. The single required parameter for the function is the maximum possible dimension, `k_max`.
In the following example, we generate a random graph from the stochastic block model (SBM) with 1000 nodes and 5 blocks (as such, we would expect the estimated graph dimension to be 5).
```{r}
library(fastRG)
B <- matrix(0.1, 5, 5)
diag(B) <- 0.3
model <- sbm(
n = 1000,
k = 5,
B = B,
expected_degree = 40,
poisson_edges = FALSE,
allow_self_loops = FALSE
)
A <- sample_sparse(model)
```
Here, `A` is the adjacency matrix.
Now, we call the `eigcv()` function with `k_max=10` to estimate graph dimension.
```{r eigcv}
library(gdim)
eigcv_result <- eigcv(A, k_max = 10)
eigcv_result
```
In this example, `eigcv()` suggests `k=5`.
To visualize the result, use `plot()` which returns a `ggplot` object. The function displays the test statistic (z score) for each hypothesized graph dimension.
```{r}
plot(eigcv_result)
```
## Reference
Chen, Fan, Sebastien Roch, Karl Rohe, and Shuqi Yu. “Estimating Graph Dimension with Cross-Validated Eigenvalues.” ArXiv:2108.03336 [Cs, Math, Stat], August 6, 2021. https://arxiv.org/abs/2108.03336.