-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG] stretch() not stretching $blocks in GRanges #85
Comments
Hi Miguel, So How are you expecting blocks to be updated? Are you expecting the total width of the block to be extended by 10? |
Here's a few ways you could do this: suppressPackageStartupMessages(library(plyranges))
test_path <- system.file("tests", package = "rtracklayer")
bed_file <- file.path(test_path, "test.bed")
gr <- read_bed(bed_file)
newGR <- stretch(anchor_5p(gr), 10)
# add stretch amount to each element by the number of blocks,
# naive if not completely divisible (off by one)
ans <- newGR %>%
mutate(blocks = resize(blocks, width(blocks) + 10 / lengths(blocks)))
ans
#> GRanges object with 5 ranges and 5 metadata columns:
#> seqnames ranges strand | name score itemRgb
#> <Rle> <IRanges> <Rle> | <character> <numeric> <character>
#> [1] chr7 127471197-127472373 + | Pos1 0 #FF0000
#> [2] chr7 127472364-127473540 + | Pos2 2 #FF0000
#> [3] chr7 127473521-127474697 - | Neg1 0 #FF0000
#> [4] chr9 127474698-127475874 + | Pos3 5 #FF0000
#> [5] chr9 127475855-127477031 - | Neg2 5 #0000FF
#> thick blocks
#> <IRanges> <IRangesList>
#> [1] 127471197-127472363 1-303,501-703,1068-1170
#> [2] 127472364-127473530 1-255,668-1172
#> [3] 127473531-127474697 1-1177
#> [4] 127474698-127475864 1-1177
#> [5] 127475865-127477031 1-1177
#> -------
#> seqinfo: 2 sequences from an unspecified genome; no seqlengths
width(ans)
#> [1] 1177 1177 1177 1177 1177
sum(width(ans$blocks))
#> [1] 609 760 1177 1177 1177
even_block_strech <- function(blocks, extend) {
part <- PartitioningByEnd(blocks)
lens <- width(part)
div <- extend %/% lens
rem <- extend %% lens
extends <- rep(div, times = lens)
extends[end(part)] <- extends[end(part)] + rem
# unlist and resize
collapsed <- unlist(blocks)
collapsed <- resize(collapsed, width(collapsed) + extends)
relist(collapsed, part)
}
ans <- newGR %>%
mutate(blocks = even_block_strech(blocks, 10))
ans
#> GRanges object with 5 ranges and 5 metadata columns:
#> seqnames ranges strand | name score itemRgb
#> <Rle> <IRanges> <Rle> | <character> <numeric> <character>
#> [1] chr7 127471197-127472373 + | Pos1 0 #FF0000
#> [2] chr7 127472364-127473540 + | Pos2 2 #FF0000
#> [3] chr7 127473521-127474697 - | Neg1 0 #FF0000
#> [4] chr9 127474698-127475874 + | Pos3 5 #FF0000
#> [5] chr9 127475855-127477031 - | Neg2 5 #0000FF
#> thick blocks
#> <IRanges> <IRangesList>
#> [1] 127471197-127472363 1-303,501-703,1068-1171
#> [2] 127472364-127473530 1-255,668-1172
#> [3] 127473531-127474697 1-1177
#> [4] 127474698-127475864 1-1177
#> [5] 127475865-127477031 1-1177
#> -------
#> seqinfo: 2 sequences from an unspecified genome; no seqlengths
width(ans)
#> [1] 1177 1177 1177 1177 1177
sum(width(ans$blocks))
#> [1] 610 760 1177 1177 1177
last_block_stretch <- function(blocks, extend) {
part <- PartitioningByEnd(blocks)
zeros <- rep_len(0L, sum(width(part)))
zeros[end(part)] <- extend
# unlist and resize
collapsed <- unlist(blocks)
collapsed <- resize(collapsed, width(collapsed) + zeros)
relist(collapsed, part)
}
ans <- newGR %>%
mutate(blocks = last_block_strech(blocks, 10))
#> Error in last_block_strech(blocks, 10): could not find function "last_block_strech"
ans
#> GRanges object with 5 ranges and 5 metadata columns:
#> seqnames ranges strand | name score itemRgb
#> <Rle> <IRanges> <Rle> | <character> <numeric> <character>
#> [1] chr7 127471197-127472373 + | Pos1 0 #FF0000
#> [2] chr7 127472364-127473540 + | Pos2 2 #FF0000
#> [3] chr7 127473521-127474697 - | Neg1 0 #FF0000
#> [4] chr9 127474698-127475874 + | Pos3 5 #FF0000
#> [5] chr9 127475855-127477031 - | Neg2 5 #0000FF
#> thick blocks
#> <IRanges> <IRangesList>
#> [1] 127471197-127472363 1-303,501-703,1068-1171
#> [2] 127472364-127473530 1-255,668-1172
#> [3] 127473531-127474697 1-1177
#> [4] 127474698-127475864 1-1177
#> [5] 127475865-127477031 1-1177
#> -------
#> seqinfo: 2 sequences from an unspecified genome; no seqlengths
width(ans)
#> [1] 1177 1177 1177 1177 1177
sum(width(ans$blocks))
#> [1] 610 760 1177 1177 1177 Created on 2020-09-30 by the reprex package (v0.3.0) |
Let me know if that's what you had in mind, and I'll close the issue. |
Sorry I didn't explain the expected outcome. I'm not expecting an even stretch, nor the last block without considering the GRange strand. I was expecting only the blocks containing the 3' UTR to be extended. For the case of positive-strand GRanges that would be the last block, while for negative-stranded it will be the first. The complement function would be to extend the 5' UTR, in which the opposite ending blocks are extended along with the GRanges. If $blocks awareness is not expected to be added to stretch() in future plyranges versions, I think your answer will already help me to do what I want. It's ok if you close the issue 👍 Thanks for your speedy follow-up. |
Ok that makes way more sense! So I think if this is urgent, you could modify the
What do you think? |
Sounds good, although not sure if it will be general enough, especially without considering the use of the anchor_3p() or anchor_5p() functions. If anchoring is not considered, even stretching of blocks will most probably give unexpected outcomes. Perhaps a specialized function will work better, in this case. |
Hello Stuart.
I want to use the stretch() function to extend the 5' UTR and 3'UTR ends of a GRanges object as loaded by using the plyranges::read_bed() function, but stretch() is not extending the $blocks IRanges contained in the GRanges.
Following is a reproducible example using the in-package data.
stretch() not stretching gr$blocks
Setup
Reproducible example
Loading a bed file using read_bed()
Extending 3 prime end works fine for GRanges main start and end
coordinates.
Total width is extended
width(gr)
width(newGR)
But blocks size remain the same
Session info
--
Is there a function I'm missing that does what I am expecting (blocks size update)?
Thanks for the help
Best
Miguel
The text was updated successfully, but these errors were encountered: