Add VCF to JSON conversion #60
Labels
enhancement
New feature or request
from typing import Optional

from pydantic import BaseModel, Field, validator  # pydantic v1-style validator API


class Model(BaseModel):
    chr: str = Field(..., description="The chromosome where the structural variant (SV) is located")
    start_position: int = Field(..., description="The start position of the SV on the chromosome", ge=0)
    end_position: int = Field(..., description="The end position of the SV on the chromosome", ge=0)
    end_chr: Optional[str] = Field(None, description="The end chromosome where the SV is located, if different from the start chromosome")
    end_chr_start_position: Optional[int] = Field(None, description="The start position of the SV on the end chromosome, if different from the start chromosome", ge=0)
    end_chr_end_position: Optional[int] = Field(None, description="The end position of the SV on the end chromosome, if different from the start chromosome", ge=0)
    sv_type: str = Field(..., description="The type of the SV, such as deletion, duplication, inversion, translocation, etc.")
    size: int = Field(..., description="The size of the SV in base pairs", ge=0)
    caller: str = Field(..., description="The name of the tool or algorithm that detected the SV")
    qc: str = Field(..., description="The quality control status of the SV")
    genotype: str = Field(..., description="The genotype of the SV, such as 0/0, 0/1, 1/1, etc.")
    relevant_genes: Optional[list[str]] = Field(None, description="The list of genes that are affected by the SV")
    population_frequency: Optional[float] = Field(None, description="The frequency of the SV in the general population, if available", ge=0, le=1)
    repeat_content: Optional[bool] = Field(None, description="Whether the SV is located in a repeat region or not")

    # Attached only to the last of the three end_chr fields: `values` holds just the
    # fields validated so far, so checking on an earlier field would never see the later ones.
    @validator('end_chr_end_position', always=True)
    def check_end_chr(cls, v, values):
        end_chr = values.get('end_chr')
        end_start = values.get('end_chr_start_position')
        # If end_chr is not None, then end_chr_start_position and end_chr_end_position must also be not None
        if end_chr is not None and (end_start is None or v is None):
            raise ValueError('end_chr_start_position and end_chr_end_position must be specified if end_chr is not None')
        # If end_chr is None, then end_chr_start_position and end_chr_end_position must also be None
        if end_chr is None and (end_start is not None or v is not None):
            raise ValueError('end_chr_start_position and end_chr_end_position must be None if end_chr is None')
        return v
Can you also post an example of a JSON entry used to create this model?
{
  "chr": "chr1",
  "start_position": 123456,
  "end_position": 123789,
  "end_chr": null,
  "end_chr_start_position": null,
  "end_chr_end_position": null,
  "sv_type": "deletion",
  "size": 333,
  "caller": "Manta",
  "qc": "PASS",
  "genotype": "0/1",
  "relevant_genes": ["BRCA1"],
  "population_frequency": 0.001,
  "repeat_content": false
}
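To make the rule behind the three `end_chr` fields concrete, here is a minimal standard-library sketch that applies the same all-or-none check to a parsed JSON entry. The function name `check_entry` and the size check (a deletion's `size` equals `end_position - start_position`) are assumptions for illustration, not part of the proposed model.

```python
import json

END_FIELDS = ("end_chr", "end_chr_start_position", "end_chr_end_position")


def check_entry(entry: dict) -> list[str]:
    """Return a list of problems found in one SV entry (empty if none)."""
    problems = []
    # The three end_chr fields must be either all set or all null.
    set_flags = [entry.get(name) is not None for name in END_FIELDS]
    if any(set_flags) and not all(set_flags):
        problems.append("end_chr fields must be all set or all null")
    # Assumption: for a same-chromosome deletion, size equals end - start.
    if not any(set_flags) and entry.get("sv_type") == "deletion":
        span = entry["end_position"] - entry["start_position"]
        if entry["size"] != span:
            problems.append(f"size {entry['size']} != span {span}")
    return problems


raw = """
{
  "chr": "chr1", "start_position": 123456, "end_position": 123789,
  "end_chr": null, "end_chr_start_position": null, "end_chr_end_position": null,
  "sv_type": "deletion", "size": 333, "caller": "Manta", "qc": "PASS",
  "genotype": "0/1", "relevant_genes": ["BRCA1"],
  "population_frequency": 0.001, "repeat_content": false
}
"""
entry = json.loads(raw)
print(check_entry(entry))  # []
```

Setting only `end_chr` (for example to `"chr2"`) while leaving the two positions null would make the first check report a problem.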
Some simple notes I already have after seeing this:
These are just suggestions; please let me know what you think 😃
Description of feature
Add VCF to JSON conversion, with certain filters, for use in visualizing the SVs.
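As a rough illustration of what such a conversion could look like, the sketch below turns tab-separated VCF data lines into dictionaries using the field names from the model in this thread. Several things here are assumptions rather than decisions made in the issue: the function names, the `SV_TYPE_NAMES` mapping, taking the genotype from the first sample column, the PASS-only filter, and relying on the `SVTYPE`, `END`, and `SVLEN` INFO keys. Breakend records without `END` and any gene or frequency annotation are not handled.

```python
import json

# Hypothetical mapping from VCF SVTYPE codes to the names used in the example entry.
SV_TYPE_NAMES = {"DEL": "deletion", "DUP": "duplication", "INV": "inversion"}


def parse_info(info: str) -> dict:
    """Split a VCF INFO column into a dict; flag keys without '=' map to True."""
    result = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        result[key] = value if sep else True
    return result


def vcf_record_to_entry(line: str, caller: str) -> dict:
    """Convert one VCF data line into an SV entry dict (same-chromosome SVs only)."""
    cols = line.rstrip("\n").split("\t")
    info = parse_info(cols[7])
    start, end = int(cols[1]), int(info["END"])
    genotype = None
    if len(cols) > 9:
        # Pair FORMAT keys with the first sample's values and pick GT.
        genotype = dict(zip(cols[8].split(":"), cols[9].split(":"))).get("GT")
    sv_type = info.get("SVTYPE", "")
    return {
        "chr": cols[0],
        "start_position": start,  # VCF POS is kept as-is (1-based)
        "end_position": end,
        "sv_type": SV_TYPE_NAMES.get(sv_type, sv_type),
        "size": abs(int(info.get("SVLEN", end - start))),
        "caller": caller,  # not stored per record, so passed in by the user
        "qc": cols[6],
        "genotype": genotype,
    }


def convert_vcf_lines(lines, caller, pass_only=True):
    """Skip header and blank lines; optionally keep only records with FILTER == PASS."""
    entries = []
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        entry = vcf_record_to_entry(line, caller)
        if pass_only and entry["qc"] != "PASS":
            continue
        entries.append(entry)
    return entries


vcf_lines = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE1",
    "chr1\t123456\t.\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=123789;SVLEN=-333\tGT\t0/1",
    "chr2\t5000\t.\tN\t<DUP>\t.\tLowQual\tSVTYPE=DUP;END=6000;SVLEN=1000\tGT\t0/1",
]
entries = convert_vcf_lines(vcf_lines, caller="Manta")
print(json.dumps(entries, indent=2))
```

Only the first record survives the PASS filter here; passing `pass_only=False` would keep both.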