Manifest content hash computation times out #6123

@dsotirho-ucsc

Computing the content hash for a manifest took over 30 seconds, and the Lambda invocation timed out after 31.04 seconds. No reindex was in progress at the time.

CloudWatch logs (anvilprod):

[
    {
        "@timestamp": "2024-04-02 18:22:46.909",
        "@message": "START RequestId: 25cf6787-dc1c-415f-a5c2-bd788d467dfb Version: $LATEST\n"
    },
…
    {
        "@timestamp": "2024-04-02 18:22:46.925",
        "@message": "[DEBUG]\t2024-04-02T18:22:46.925Z\t25cf6787-dc1c-415f-a5c2-bd788d467dfb\t
                     azul.service.manifest_service\tComputing content hash for manifest using 
                     filters Filters(explicit={
                         'donors.organism_type': {
                             'is': ['Homo sapiens', None]
                         }, 
                         'files.file_format': {
                             'is': ['.vcf.gz', '.cram', '.crai', '.tbi', '.md5', '.idat', '.csi',
                                    '.gtc', '.tar', '.bam', '.svs', '.bai', '.txt', '.tab.gz', 
                                    '.bigWig', '.junc.gz', '.fastq.gz', '.idat.gz', '.csv', 
                                    '.bed.gz', '.tsv.gz', '.gvcf.gz', '.interval_list', '.txt.gz',
                                    '.bw', '.bedpe', '.alignment_summary_metrics', 
                                    '.arrays_control_code_summary_metrics', '.loupe', 
                                    '.bait_bias_detail_metrics', '.bait_bias_summary_metrics', 
                                    '.crosscheck', '.duplicate_metrics', '.insert_size_metrics', 
                                    '.pre_adapter_detail_metrics', '.pre_adapter_summary_metrics', 
                                    '.quality_distribution_metrics', '.fingerprinting_summary_metrics', 
                                    '.h5', '.detail_metrics', '.summary_metrics', '.tsv', '.mtx', 
                                    '.md5sum', '.bcf', '.vcf', '.vcf.bgz', '.raw_wgs_metrics', 
                                    '.wgs_metrics', '.cloupe', '.html', '.ped', '.log', '..gz', '.bed', 
                                    '.wdl', '.bb', '.egt', '.xlsx', '.tar.gz', None, '.gct.gz', 
                                    '.variant_calling_summary_metrics', '.R', '.fai', '.fasta', 
                                    '.h5ad', '.dict', '.jpg', '.md', '.pdf', '.readme', '.chain', 
                                    '.docx', '.fingerprintcheck', '.gct', '.sizes', 
                                    '.variant_calling_detail_metrics', '.xls', '.zip', '.bim', '.fam', 
                                    '.gff3.gz', '.gtf', '.gtf.gz', '.ipynb', '.parquet', '.png']
                         }
                     }, 
                     source_ids={
                         '0af50ad2-1e53-4cc1-ba71-280b04a9f488',
                         'df0b99c1-1b1d-482d-930c-dea8d6101f78',
                         '3047ec0b-695e-4c1f-8c20-bf060b5f72aa',
                         '1167858d-f9d9-4dbc-bbcb-473ac21eb823',
                         '2f8a866a-b38a-447c-b2d4-a22d998e659a',
                         '3ac713b5-3645-4381-ac66-ecbc281a2ab8',
                         …
                         "
    },
    {
        "@timestamp": "2024-04-02 18:22:46.929",
        "@message": "[INFO]\t2024-04-02T18:22:46.929Z\t25cf6787-dc1c-415f-a5c2-bd788d467dfb\t
                     elasticsearch\tMaking POST request to
                     https://vpc-azul-index-anvilprod-ggipah4skn2ftt47u4xgvydzqm.us-east-1.es.amazonaws.com:443/azul_v2_anvilprod_anvil4_files_aggregate/_search"
    },
    {
        "@timestamp": "2024-04-02 18:22:46.929",
        "@message": "[INFO]\t2024-04-02T18:22:46.929Z\t25cf6787-dc1c-415f-a5c2-bd788d467dfb\t
                     elasticsearch\t… with request body b'{\"query\":{\"bool\":{\"must\":[{
                     \"constant_score\":{\"filter\":{\"bool\":{\"should\":[{\"terms\":{
                     \"contents.donors.organism_type.keyword\":[\"...'"
    },
    {
        "@timestamp": "2024-04-02 18:23:17.948",
        "@message": "2024-04-02T18:23:17.948Z 25cf6787-dc1c-415f-a5c2-bd788d467dfb 
                     Task timed out after 31.04 seconds\n\n"
    },
    {
        "@timestamp": "2024-04-02 18:23:17.948",
        "@message": "END RequestId: 25cf6787-dc1c-415f-a5c2-bd788d467dfb\n"
    },
    {
        "@timestamp": "2024-04-02 18:23:17.948",
        "@message": "REPORT RequestId: 25cf6787-dc1c-415f-a5c2-bd788d467dfb\t
                     Duration: 31038.88 ms\tBilled Duration: 31000 ms\tMemory Size: 2048 MB\tMax Memory Used: 239 MB\t\n"
    }
]
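
To make it easier to see where the time went, here is a minimal sketch that computes the gaps between the log entries above. The timestamps are copied from the excerpt; the short labels and the `gaps` helper are illustrative only and are not part of Azul.

```python
from datetime import datetime

# Timestamps copied from the CloudWatch entries above.
# The labels are our own shorthand, not text from the log.
events = [
    ("START", "2024-04-02 18:22:46.909"),
    ("computing content hash", "2024-04-02 18:22:46.925"),
    ("POST _search to Elasticsearch", "2024-04-02 18:22:46.929"),
    ("task timed out", "2024-04-02 18:23:17.948"),
]

FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def gaps(events):
    """Return (from_label, to_label, seconds) for each consecutive pair."""
    parsed = [(label, datetime.strptime(ts, FORMAT)) for label, ts in events]
    return [
        (a_label, b_label, (b_time - a_time).total_seconds())
        for (a_label, a_time), (b_label, b_time) in zip(parsed, parsed[1:])
    ]


for a, b, seconds in gaps(events):
    print(f"{a} -> {b}: {seconds:.3f} s")
```

The first two gaps are a few milliseconds each. The remaining 31.019 seconds fall between the `_search` request being logged and the timeout. The excerpt alone does not show whether the request was still pending or whether something stalled after a response arrived, since nothing is logged in that window.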

Labels

-[priority] Medium
manifests ([subject] Generation and contents of manifests)
scale limit ([subject] An upper bound to the size or complexity of a task, request or operation)
