Merge pull request #1 from hackseq/master
GitHub Pages
dy-lin authored Oct 22, 2019
2 parents a9604ef + 98c310e commit 510b690
Showing 11 changed files with 997 additions and 467 deletions.
11 changes: 8 additions & 3 deletions R/TextRankAnalysis.Rmd
@@ -1,8 +1,10 @@
---
title: "TextRank"
author: "Lucia Darrow"
-date: "October 19, 2019"
-output: html_document
+date: "19/10/2019"
+output:
+  html_document:
+    toc: true
---

```{r setup, include=FALSE}
@@ -14,12 +16,15 @@ knitr::opts_chunk$set(echo = TRUE)
library(data.table)
library(textrank)
library(udpipe)
library(tidyverse)
library(stringr)
library(magrittr)
```


```{r import}
-termResults <- fread("./data/NGS.csv")
+termResults <- fread("../data/NGS.csv")
termResults2019 <- termResults %>% filter(Year == 2019)
```
79 changes: 65 additions & 14 deletions R/bigram_relationships.Rmd
@@ -1,8 +1,10 @@
---
title: "Bigram Relationships"
author: "Shannon Lo"
-date: "19 Oct 2019"
-output: html_document
+date: "19/10/2019"
+output:
+  html_document:
+    toc: true
---

```{r Setup, include=FALSE}
@@ -22,6 +24,7 @@ library(tidygraph)
```

The following function(s) are defined below:
+
* **visualize_bigrams**: extracts bigrams from a text field, calculates frequency of bigrams, and creates a bigram plot to visualize relationships between words
    + *df_name*: name of dataframe that contains the text field of interest
    + *textfield*: name of text field (ie. column name)
@@ -78,6 +81,7 @@ visualize_bigrams <- function(df_name, textfield, topic_title){
```

Read in the following data sources:
+
* **Web scraped data**: this CSV contains information on journals and was created using *database_parallel2.R*
* **Topics and search terms**: this CSV contains 10 topics and 3 search terms for each topic
```{r Read Data, results='hide'}
@@ -129,17 +133,64 @@ df_varcall <- df %>%
```

Create bigram plots to visualize the relationships between two words. The darker lines represent higher frequency of occurrence.
-```{r Create Bigram Plots}
-#visualize_bigrams(df,abstract, "All Topics")
-visualize_bigrams(df_assembly, abstract, "Assembly")
-visualize_bigrams(df_databases, abstract, "Databases")
-visualize_bigrams(df_epigenetics, abstract, "Epigenetics")
-visualize_bigrams(df_geneexp, abstract, "Gene Expression")
-visualize_bigrams(df_genomeann, abstract, "Genome Annotation")
-visualize_bigrams(df_phylogenetics, abstract, "Phylogenetics")
-visualize_bigrams(df_seqal, abstract, "Sequence Alignment")
-visualize_bigrams(df_sequence, abstract, "Sequencing")
-visualize_bigrams(df_strucpred, abstract, "Structural Prediction")
-visualize_bigrams(df_varcall, abstract, "Variant Calling")

## Assembly

```{r Create Bigram Assembly}
# visualize_bigrams(df,abstract, "All Topics")
visualize_bigrams(df_assembly, abstract, "")
```

## Databases

```{r Create Bigram Database}
visualize_bigrams(df_databases, abstract, "")
```

## Epigenetics

```{r Create Bigram Epigenetics}
visualize_bigrams(df_epigenetics, abstract, "")
```

## Gene Expression

```{r Create Bigram Gene Expression}
visualize_bigrams(df_geneexp, abstract, "")
```

## Genome Annotation

```{r Create Bigram Genome Annotation}
visualize_bigrams(df_genomeann, abstract, "")
```

## Phylogenetics

```{r Create Bigram Phylogenetics}
visualize_bigrams(df_phylogenetics, abstract, "")
```

## Sequence Alignment

```{r Create Bigram Alignment}
visualize_bigrams(df_seqal, abstract, "")
```

## Sequencing

```{r Create Bigram Sequencing}
visualize_bigrams(df_sequence, abstract, "")
```

## Structural Prediction

```{r Create Bigram Structural Prediction}
visualize_bigrams(df_strucpred, abstract, "")
```

## Variant Calling

```{r Create Bigram Variant Calling}
visualize_bigrams(df_varcall, abstract, "")
```
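
The body of `visualize_bigrams` is collapsed in this diff, so its actual implementation is not visible here. As a rough sketch only, and assuming the function relies on the tidytext package (an assumption; the diff itself only shows `tidygraph` being loaded), the "extract bigrams and calculate their frequency" step it describes might look like the following. The `toy` data frame and the `bigram_counts` name are hypothetical stand-ins, not names from the repository.

```r
library(dplyr)
library(tidyr)
library(tidytext)

# Hypothetical stand-in for the scraped data; the real `df` is read from a CSV.
toy <- tibble(abstract = c(
  "genome assembly of long reads",
  "long reads improve genome assembly"
))

bigram_counts <- toy %>%
  # Split each abstract into overlapping two-word tokens.
  unnest_tokens(bigram, abstract, token = "ngrams", n = 2) %>%
  # Put the two words of each bigram into separate columns.
  separate(bigram, c("word1", "word2"), sep = " ") %>%
  # Drop bigrams in which either word is a common stop word.
  filter(!word1 %in% stop_words$word, !word2 %in% stop_words$word) %>%
  # Count how often each remaining word pair occurs.
  count(word1, word2, sort = TRUE)

print(bigram_counts)
```

A table of this shape (`word1`, `word2`, `n`) could then be turned into a graph and plotted, with `n` mapped to edge darkness, which would match the description that darker lines represent a higher frequency of occurrence.
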
148 changes: 106 additions & 42 deletions R/bigram_relationships.html

Large diffs are not rendered by default.

10 changes: 6 additions & 4 deletions R/general_vis.Rmd
@@ -1,8 +1,10 @@
---
title: "General Visualizations"
-author: "Jasmine and Diana"
+author: "Jasmine Lai and Diana Lin"
date: "20/10/2019"
-output: html_document
+output:
+  html_document:
+    toc: true
---
```{r, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, error = FALSE, message = FALSE, warning = FALSE, fig.width=12, fig.height=8)
@@ -74,7 +76,7 @@ for (yr in 2003:2019) {
```

The data is limited to the establishment of PLoS in 2003
-```{r racing bars animate, echo=TRUE}
+```{r racing bars animate, echo=FALSE}
p <- ordered_df %>%
ggplot(aes(ordering, group = topic))+
geom_tile(aes(y = cum_total/2,
@@ -89,7 +91,7 @@ p <- ordered_df %>%
transition_states(Year, transition_length = 8, state_length = 4, wrap = FALSE) +
ease_aes("cubic-in-out") +
#aesthetics
-labs(subtitle = "Trends in sequening methods",title = "Year {closest_state}", y = "cumulative total papers") +
+labs(subtitle = "Trends in sequencing methods",title = "Year {closest_state}", y = "cumulative total papers") +
theme(plot.background = element_blank(),
legend.position = "none",
axis.ticks.y = element_blank(),
41 changes: 9 additions & 32 deletions R/general_vis.html

Large diffs are not rendered by default.
