New unimod developments #7
A quick input from my colleagues (not sure how this can be implemented):
|
While I didn't work on this since the EuroBioc 2018 I now pushed my draft with the classes into the ModificationClass branch. There I have a complex class hierarchy implemented:
The user just needs to call the universal constructor:
x <- "GKHNICHHGFBEOGMFHHLBRRQPQARHKSKFRMIJAGCSDGDQOHGRSKAMTSMBLDAIMGFAGENRGSPB"
m <- c(
UnimodModification("Met-loss+Acetyl:P-M"),
UnimodModification("Acetyl:K"),
UnimodModification("Acetyl:N-term"),
UnimodModification("Acetyl:S", fixed=FALSE, globalIndex=30, siteIndex=3:5)
# 30 40 49 54 70
)
m
sapply(m, class)
# [1] "ModificationFixedRegExp" "ModificationFixedRegExp"
# [3] "ModificationFixed" "ModificationNterm"
# [5] "ModificationVariable"
lapply(m , .countSites, pepseq=x)
# [[1]]
# [1] 0
#
# [[2]]
# [1] 0
#
# [[3]]
# [1] 4
#
# [[4]]
# [1] 1
#
# [[5]]
# [1] 0 1 2 3 4 5
.pepseq(m[[1]], x)
# [1] "GKHNICHHGFBEOGMFHHLBRRQPQARHKSKFRMIJAGCSDGDQOHGRSKAMTSMBLDAIMGFAGENRGSPB"
deltaMass(m, x)
# [[1]]
# [1] 0
#
# [[2]]
# [1] 0
#
# [[3]]
# [1] 168.0423
#
# [[4]]
# [1] 42.01056
#
# [[5]]
# [1] 0.00000 42.01056 84.02113 126.03169 168.04226 210.05282
@tfkillian: Feel free to completely remove this draft or build upon it. You are welcome to ask any questions. I am happy to help and review your code if you wish. (I wrote this on the train journey back home from EuroBioc2018 and haven't touched it since. Unfortunately there is no documentation and there are no unit tests in the ModificationClass branch.)
Some notes I had taken at the EuroBioc 2018 (some may be outdated or do not correspond to the draft in the
2 use cases: modify mass and modify sequence (somehow coupled)
modify mass
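The mass use case can be sketched in a few lines. This is only an illustration: `variableDeltaMass()` is a hypothetical helper that is not part of the ModificationClass branch, and the acetyl delta of 42.010565 Da is assumed to be the monoisotopic Unimod value. For a variable modification with k candidate sites, the possible mass shifts are 0 to k times the delta, which is the pattern visible in the last `deltaMass()` result.

```r
## Hypothetical helper, not part of the ModificationClass branch.
variableDeltaMass <- function(pepseq, site, delta) {
    ## positions of the single-letter site; gregexpr() returns -1 if absent
    hits <- gregexpr(site, pepseq, fixed = TRUE)[[1L]]
    k <- sum(hits > 0L)
    ## mass shifts for 0, 1, ..., k modified sites
    (0:k) * delta
}

## "ASKSS" has three serines, so four possible mass shifts are returned
variableDeltaMass("ASKSS", site = "S", delta = 42.010565)
```

Which of the sites carries the modification does not matter here, only how many of them do.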
modify sequence
Some modifications like Met-loss modify the sequence, e.g. remove M:
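As a rough illustration of a sequence-modifying case, here is a sketch that assumes Met-loss only applies to a methionine at the very start of the sequence. The helper name `metLoss()` is hypothetical and not taken from the draft.

```r
## Hypothetical helper illustrating Met-loss; not taken from the draft.
metLoss <- function(pepseq) {
    ## drop a leading M only; internal or trailing methionines stay untouched
    sub("^M", "", pepseq)
}

metLoss(c("MPEPTIDEM", "PEPTIDE"))
# [1] "PEPTIDEM" "PEPTIDE"
```

Because the sequence changes, any mass calculation has to run on the modified sequence, which is why the two use cases are coupled.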
Problems
User Interface for Modification Input
Currently all unimod modifications are in a
… and the user just needs to specify the Id:
To add custom modifications the user just needs to supply a site (which
For fixed modification on a specific site we would need an additional
For variable modifications more columns would be useful: maximal number
User Interface for output
While fixed all/fixed specific modifications produce just one
While for the mass calculation it doesn’t matter if the 1st, 2nd, 3rd, … |
@pavel-shliaha these are use cases we are indeed considering. Thanks for the input. |
@pavel-shliaha sorry for ignoring your input in my first comment:
That's what I called a fixed and localized modification and is supported by the
That's what I called a variable modification and should be supported by the |
Let me start by providing a few more implementation details:
setClassUnion("CharOrInt", c("character", "integer"))
.Modification <- setClass("Modification",
                          slots = c(modification = "character",
                                    n = "integer",
                                    position = "CharOrInt"))
## Constructor, setting defaults and checking inputs
Modification <- function(mod, n = NA_integer_, pos = NA_integer_) {
    if (missing(mod))
        stop("Please define your modification")
    ## check that mod is a valid modification
    ## terminal modifications are positional by definition
    if (grepl("N-term", mod)) pos <- "N-term"
    if (grepl("C-term", mod)) pos <- "C-term"
    if (anyNA(n)) {
        n <- NA_integer_
    } else {
        if (length(n) == 1 && n <= 0)
            stop("Can't define an absent modification")
        if (any(n < 0))
            stop("Can't define a negative number of modifications")
    }
    .Modification(modification = mod, n = n, position = pos)
}
And now some definitions:
modType <- function(object) {
type <- rep(FALSE, 4)
names(type) <- c("variable", "fixed", "multiple", "positional")
if (!is.na(object@position)) type["positional"] <- TRUE
if (!anyNA(object@n)) {
if (length(object@n) == 2 && identical(object@n, 0:1)) type["variable"] <- TRUE
if (length(object@n) == 1 && object@n > 0) type["fixed"] <- TRUE
if (length(object@n) > 1 && all(object@n > 0)) type["multiple"] <- TRUE
}
type
}
setMethod("show", "Modification",
function(object) {
type <- modType(object)
cat(object@modification, " is ",
paste(names(type[type]), collapse = ", "),
".\n", sep = "")
})
Demonstration:
> Modification("Acetyl:N-term")
Acetyl:N-term is positional.
> Modification("Acetyl:C-term", n = 2L)
Acetyl:C-term is fixed, positional.
> Modification("Acetyl:K", n = 0:1)
Acetyl:K is variable.
> Modification("Acetyl:C-term", n = 0:1)
Acetyl:C-term is variable, positional.
> Modification("Methyl:S", pos = 2L, n = 0:3)
Methyl:S is positional.
> Modification("Methyl:S", pos = 2L, n = 1L)
Methyl:S is fixed, positional.
> Modification("Methyl:S", pos = 2L, n = 0:1)
Methyl:S is variable, positional.
The examples I consider here are a simplification compared to what is defined in
@sgibb what do you think? |
That's possibly true. While your current approach looks very simple, it would work for most of the cases, I guess. And I really like your fixed, variable, multiple, positional approach. But what if the user wants a modification of the second K? (your
I don't really like the |
Indeed, I didn't consider a
But is this a real use case/requested feature? Re the int/char slot, we could use 0 or 1 and Inf. What do you think of that? |
Thinking back at this, I think we could drop the |
I more or less assumed it because of this comment: lgatto/MSnbase#167 (comment)
Mh, that would just change
I agree with your:
setClass("Modification",
         slots = c(modification = "character",
                   position = "integer",
                   n = "integer"))
It is much simpler than my class hierarchy and currently I can't think of an example where it wouldn't work. |
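To make the simplified three-slot design a bit more concrete, here is a minimal sketch with defaults and a validity check. The class name `SimpleModification`, the `NA` prototypes, and the specific checks are assumptions made for illustration; they are not part of either draft.

```r
library(methods)

## Hypothetical name, chosen to avoid clashing with the classes discussed above.
.SimpleModification <- setClass("SimpleModification",
    slots = c(modification = "character",
              position = "integer",
              n = "integer"),
    ## assumed defaults: no fixed position, unspecified count
    prototype = list(position = NA_integer_, n = NA_integer_),
    validity = function(object) {
        msg <- character()
        if (length(object@modification) != 1L)
            msg <- c(msg, "'modification' must be a single string")
        if (!anyNA(object@n) && any(object@n < 0L))
            msg <- c(msg, "'n' must not be negative")
        if (length(msg)) msg else TRUE
    })

## n = 0:1 marks a variable modification in the sense used in this thread
m <- .SimpleModification(modification = "Methyl:S", position = 2L, n = 0:1)
```

With only integer slots there is no need for a class union, at the cost of having to encode the termini some other way.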
Hi @sgibb - following up on our conversation about `unimod` at the EuroBioc2018 conference, I would suggest the following class (see also #6).
where
- `modification` refers to a specific modification in `unimod::modifications`.
- `position` is an optional `integer` defining where in a peptide sequence a modification is expected. The default would be `NA`, for a modification that can be present on any of the amino acids of that peptide.
- `n` is an `integer` (0 - n) where `0:1` defines a variable modification, `1` a fixed modification, and `1 - n` the case where 1 to n modifications can be present (@pavel-shliaha's example of methyl, dimethyl, trimethyl).
From there on we could build a `ModificationList`. What do you think?
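One way such a `ModificationList` could look is sketched below. Everything here is an assumption for illustration: the element class simply follows the three slots described above, and the container is a plain list with a validity check rather than anything agreed on in this issue.

```r
library(methods)

## Assumed element class, following the three slots proposed above.
setClass("Modification",
    slots = c(modification = "character",
              position = "integer",
              n = "integer"))

## Hypothetical container: a plain list restricted to Modification objects.
setClass("ModificationList",
    contains = "list",
    validity = function(object) {
        ok <- vapply(object@.Data, is, logical(1L), class2 = "Modification")
        if (all(ok)) TRUE else "all elements must be Modification objects"
    })

## Hypothetical constructor collecting its arguments into the list
ModificationList <- function(...) new("ModificationList", list(...))

ml <- ModificationList(
    new("Modification", modification = "Acetyl:K",
        position = NA_integer_, n = 0:1),   # variable, anywhere
    new("Modification", modification = "Methyl:S",
        position = 2L, n = 1:3))            # 1 to 3 methylations at position 2
length(ml)
```

Extending `list` keeps `lapply()` and `[[` working on the container, so per-modification functions can be applied without extra methods.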
@tfkillian, who's starting to work with me, will have a go once we agree that this is the way forward.