Add optional signal input for gABC scores #64

Open
nictru opened this issue Oct 18, 2023 · 2 comments
Assignees
Labels
enhancement New feature or request

Comments

@nictru
Collaborator

nictru commented Oct 18, 2023

STARE can use activity measures to improve its affinity calculations; the specific type of measure is not really important. To make use of this, we can implement an optional input for condition-specific signal files and then query the signal values for the STARE input regions.
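As a rough illustration of querying signal values per region, here is a minimal sketch. It assumes the signal comes as bedGraph-like intervals (chrom, start, end, value) and that bases without coverage count as zero; the function name `mean_signal` and the data layout are hypothetical and not part of STARE or this pipeline.

```python
from typing import List, Tuple

Interval = Tuple[str, int, int]                # (chrom, start, end), half-open as in BED
SignalInterval = Tuple[str, int, int, float]   # bedGraph-like: interval plus a value


def mean_signal(region: Interval, track: List[SignalInterval]) -> float:
    """Length-weighted mean signal over a region.

    Assumption: bases not covered by any track interval contribute 0.
    """
    chrom, start, end = region
    if end <= start:
        return 0.0
    total = 0.0
    for t_chrom, t_start, t_end, value in track:
        if t_chrom != chrom:
            continue
        # Number of bases shared by the region and this signal interval
        overlap = min(end, t_end) - max(start, t_start)
        if overlap > 0:
            total += overlap * value
    return total / (end - start)


# Toy data, not taken from any real experiment
track = [("chr1", 0, 100, 2.0), ("chr1", 100, 200, 4.0)]
regions = [("chr1", 50, 150), ("chr1", 150, 250), ("chr2", 0, 100)]
signals = [mean_signal(r, track) for r in regions]
```

A linear scan like this is only meant to show the idea; for real tracks an indexed format or an interval tree would probably be needed.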

Currently we can take any kind of BED file as input, since only the first 3 columns are relevant. In this sense, narrowPeak and broadPeak files are also BED-like files. We could also give the user the option to select one column of the input peak files as signal values. The problem with this approach is that the signal is not continuous, and we potentially need signal values for regions with no matching peak. We could take the average of neighbouring peaks in this case, but I do not think this makes sense biologically.
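To make the gap problem concrete, here is a small sketch of the column-based option. It assumes tab-separated narrowPeak/broadPeak lines, where the 7th column (index 6) is `signalValue`, and it takes the maximum over overlapping peaks as an arbitrary choice. The helper names `parse_peaks` and `region_signal` are made up for this example. Regions without any overlapping peak return `None`, which is exactly the case that needs a decision.

```python
from typing import Iterable, List, Optional, Tuple

Peak = Tuple[str, int, int, float]  # (chrom, start, end, signal)


def parse_peaks(lines: Iterable[str], signal_col: int = 6) -> List[Peak]:
    """Read BED-like lines and take one column (0-based index) as the signal."""
    peaks = []
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue  # skip empty lines and header lines
        fields = line.rstrip("\n").split("\t")
        peaks.append(
            (fields[0], int(fields[1]), int(fields[2]), float(fields[signal_col]))
        )
    return peaks


def region_signal(region: Tuple[str, int, int], peaks: List[Peak]) -> Optional[float]:
    """Maximum signal of all peaks overlapping the region, or None if there is none."""
    chrom, start, end = region
    hits = [
        signal
        for p_chrom, p_start, p_end, signal in peaks
        if p_chrom == chrom and min(end, p_end) - max(start, p_start) > 0
    ]
    return max(hits) if hits else None


# Toy narrowPeak lines, not real data
lines = [
    "chr1\t100\t200\tp1\t500\t.\t12.5\t-1\t-1\t50",
    "chr1\t300\t400\tp2\t300\t.\t7.0\t-1\t-1\t40",
]
peaks = parse_peaks(lines)
with_peak = region_signal(("chr1", 150, 350), peaks)
without_peak = region_signal(("chr1", 200, 300), peaks)
```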

@nictru nictru added the enhancement New feature or request label Oct 18, 2023
@nictru nictru self-assigned this Oct 18, 2023
@mlist

mlist commented Oct 18, 2023

This is something to ask Marcel: how do they deal with this? But if there are no peaks, should we not set the contribution to zero?

@nictru
Collaborator Author

nictru commented Oct 18, 2023

So by design we currently have 3 ways of determining potentially active regions based on the input peak files:

  1. INSIDE: Take peaks as they are (ATAC-Seq mostly)
  2. BETWEEN: If two peaks are at most k apart, use the region between them
  3. INCL_BETWEEN: Like 2, but also include the original peaks

By definition, in the second case we do not have any peaks in the investigated regions. Since this is the method we used for the HM-ChIP-Seq analyses, this is quite relevant.
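The three modes can be sketched roughly as follows, for peaks on a single chromosome. This is an illustration under assumptions, not the pipeline's actual implementation: the function name `candidate_regions` is hypothetical, "distance" is taken to be the gap between the end of one peak and the start of the next, and INCL_BETWEEN is returned as separate peak and gap regions rather than merged ones.

```python
from typing import List, Tuple

Region = Tuple[int, int]  # (start, end) on one chromosome, half-open


def candidate_regions(peaks: List[Region], mode: str, k: int) -> List[Region]:
    """Derive potentially active regions from peaks (illustrative only)."""
    peaks = sorted(peaks)
    if mode == "INSIDE":
        return peaks
    # Gaps between consecutive peaks that are at most k apart (assumed definition)
    gaps = [
        (left[1], right[0])
        for left, right in zip(peaks, peaks[1:])
        if 0 < right[0] - left[1] <= k
    ]
    if mode == "BETWEEN":
        return gaps
    if mode == "INCL_BETWEEN":
        return sorted(peaks + gaps)
    raise ValueError(f"unknown mode: {mode}")


# Toy peaks, not real data
peaks = [(100, 200), (250, 300), (1000, 1100)]
inside = candidate_regions(peaks, "INSIDE", k=100)
between = candidate_regions(peaks, "BETWEEN", k=100)
incl_between = candidate_regions(peaks, "INCL_BETWEEN", k=100)
```

In this toy case, BETWEEN yields only the gap (200, 250), which by construction overlaps none of the input peaks, so a peak-column signal would have nothing to report for it.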

PS: For INSPECT we also want to further shrink the investigated regions using various enhancer localization methods, such as eHMM, which would require even more local activity information.
