couldn't find matching transcriptome, returning non-ranged SummarizedExperiment (alignment mode)

Greetings, 

I am following this vignette: https://bioconductor.org/packages/release/bioc/vignettes/tximeta/inst/doc/tximeta.html#Pre-computed_checksums.

I am trying to use tximeta on data from salmon's alignment-based mode and I am getting a `couldn't find matching transcriptome, returning non-ranged SummarizedExperiment` error.

I believe I am using the most recent versions of Bioconductor and Tximeta. I am using tximeta 1.16.0 and BiocVersion 3.16.

```
> sessionInfo()
R version 4.2.1 (2022-06-23)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux 8.6 (Ootpa)

Matrix products: default
BLAS/LAPACK: /usr/lib64/libopenblaso-r0.3.15.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8   
 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] BiocGenerics_0.44.0 tximeta_1.16.0     

loaded via a namespace (and not attached):
  [1] ProtGenerics_1.30.0           bitops_1.0-7                  matrixStats_0.63.0            bit64_4.0.5                  
  [5] filelock_1.0.2                progress_1.2.2                httr_1.4.4                    GenomeInfoDb_1.34.3          
  [9] tools_4.2.1                   utf8_1.2.2                    R6_2.5.1                      DBI_1.1.3                    
 [13] lazyeval_0.2.2                withr_2.5.0                   tidyselect_1.2.0              prettyunits_1.1.1            
 [17] bit_4.0.5                     curl_4.3.3                    compiler_4.2.1                cli_3.4.1                    
 [21] Biobase_2.58.0                xml2_1.3.3                    DelayedArray_0.24.0           rtracklayer_1.58.0           
 [25] readr_2.1.3                   rappdirs_0.3.3                stringr_1.4.1                 digest_0.6.30                
 [29] Rsamtools_2.14.0              XVector_0.38.0                pkgconfig_2.0.3               htmltools_0.5.3              
 [33] MatrixGenerics_1.10.0         dbplyr_2.2.1                  fastmap_1.1.0                 ensembldb_2.20.2             
 [37] rlang_1.0.6                   rstudioapi_0.14               RSQLite_2.2.18                shiny_1.7.3                  
 [41] BiocIO_1.8.0                  generics_0.1.3                jsonlite_1.8.3                vroom_1.6.0                  
 [45] BiocParallel_1.32.1           dplyr_1.0.10                  RCurl_1.98-1.9                magrittr_2.0.3               
 [49] GenomeInfoDbData_1.2.9        Matrix_1.5-3                  Rcpp_1.0.9                    S4Vectors_0.36.0             
 [53] fansi_1.0.3                   lifecycle_1.0.3               stringi_1.7.8                 yaml_2.3.6                   
 [57] SummarizedExperiment_1.28.0   zlibbioc_1.44.0               BiocFileCache_2.4.0           AnnotationHub_3.4.0          
 [61] grid_4.2.1                    blob_1.2.3                    parallel_4.2.1                promises_1.2.0.1             
 [65] crayon_1.5.2                  lattice_0.20-45               Biostrings_2.66.0             GenomicFeatures_1.50.2       
 [69] hms_1.1.2                     KEGGREST_1.38.0               pillar_1.8.1                  GenomicRanges_1.50.1         
 [73] rjson_0.2.21                  codetools_0.2-18              biomaRt_2.52.0                stats4_4.2.1                 
 [77] XML_3.99-0.12                 glue_1.6.2                    BiocVersion_3.16.0            BiocManager_1.30.19          
 [81] tzdb_0.3.0                    png_0.1-7                     vctrs_0.5.1                   httpuv_1.6.6                 
 [85] purrr_0.3.5                   assertthat_0.2.1              cachem_1.0.6                  mime_0.12                    
 [89] xtable_1.8-4                  restfulr_0.0.15               AnnotationFilter_1.22.0       later_1.3.0                  
 [93] tibble_3.1.8                  GenomicAlignments_1.34.0      AnnotationDbi_1.60.0          memoise_2.0.1                
 [97] IRanges_2.32.0                tximport_1.26.0               ellipsis_0.3.2                interactiveDisplayBase_1.36.0
```

How I generated my data from paired fastq files...

```
# define variables
REFERENCE_ENSEMBLE_FASTA="reference_data/mouse/GRCm39/fasta/Mus_musculus.GRCm39.dna.primary_assembly.fa" 
REFERENCE_ENSEMBLE_TRANSCRIPT="reference_data/mouse/GRCm39/fasta/Mus_musculus.GRCm39.cdna.all.fa" 
REFERENCE_ENSEMBLE_CHR_GTF="reference_data/mouse/GRCm39/annotations/Mus_musculus.GRCm39.108.chr.gtf"
GFFREAD_TRANSCRIPT_FILE_CHR="reference_data/mouse/GRCm39/fasta/Mus_musculus.GRCm39.108.chr_GFFREAD.fa"
SALMON_TRANSCRIPTS_INDEX="reference_data/mouse/GRCm39/salmon_transcripts_index/"

# create transcript file
gffread -w $GFFREAD_TRANSCRIPT_FILE_CHR -g $REFERENCE_ENSEMBLE_FASTA $REFERENCE_ENSEMBLE_CHR_GTF

# create star index folder
STAR --runThreadN 20 --runMode genomeGenerate --genomeDir $STAR_INDEX_DIR_CHR --genomeFastaFiles $REFERENCE_ENSEMBLE_FASTA --sjdbGTFfile $REFERENCE_ENSEMBLE_CHR_GTF --sjdbOverhang 75

# align each fastq files (paired) to reference, make bam files
STAR --runThreadN 18 --genomeDir $STAR_INDEX_DIR_CHR --readFilesIn $fq1 $fq2 --outSAMtype BAM SortedByCoordinate --quantMode TranscriptomeSAM GeneCounts --readFilesCommand zcat --outFileNamePrefix $aligned_sample_dir_chr --outReadsUnmapped Fastx

#create Salmon index
salmon index -t $REFERENCE_ENSEMBLE_FASTA -i $SALMON_TRANSCRIPTS_INDEX --decoys decoys.txt -k 31 -p 12

# quantify aligned bam files with salmon
salmon quant -t $GFFREAD_TRANSCRIPT_FILE_CHR -l A -p 10 -a $aligned_sample_dir_chr"Aligned.toTranscriptome.out.bam" -o $salmon_quant
```

```
suppressPackageStartupMessages(library('BiocVersion', lib.loc='/home/reichenbee/franco_lab/tools/R421'))

#reading in sample information
csvfile <- read.csv('experimental_groups.txt', header=T, sep='\t')

coldata <- csvfile
coldata$files <- file.path('salmon_quant_CHR', coldata$sample, 'quant.sf') #add file names/location
colnames(coldata)[1] <- 'names'
coldata <- coldata[14, ] # take one sample
coldata

#   names  group genotype                quant                         files
# 14   B69 cancer     XXXX salmon_quant_CHR/B69 salmon_quant_CHR/B69/quant.sf

se <- tximeta(coldata)

# importing quantifications
# reading in files with read_tsv
# 1 
# couldn't find matching transcriptome, returning non-ranged SummarizedExperiment
```

Based on this older post: https://github.com/mikelove/tximeta/issues/38, I went into one of the `cmd_info.json` files and placed  "index" and path to the salmon index directory inside. 

```
{
    "salmon_version": "1.9.0",
    "targets": "reference_data/mouse/GRCm39/fasta/Mus_musculus.GRCm39.108.chr_GFFREAD.fa",
    "libType": "A",
    "threads": "10",
    "alignments": "star_aligned_sample_files_CHR/B69/Aligned.toTranscriptome.out.bam",
    "output": "salmon_quant_CHR/B69/",
    "auxDir": "aux_info",
    "index": "salmon_transcripts_index/"
}
```
I re-ran it and received the same error message. 

I did try adding hash strings as suggested, but I was not really sure how to organize them inside the `aux_info/meta_info.json` file and got errors like this:

```
Error in parse_con(txt, bigint_as_char) : 
  parse error: invalid object key (must be a string)
          bc4deb376dcf88a2e45878911e", } 
                     (right here) ------^
```

Aside from using an incorrect reference file, I am out of error origin ideas. Any information that can resolve this error is greatly appreciated.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

couldn't find matching transcriptome, returning non-ranged SummarizedExperiment (alignment mode) #72

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Participants

couldn't find matching transcriptome, returning non-ranged SummarizedExperiment (alignment mode) #72

Description

Activity

mikelove commented on Dec 5, 2022

errcricket commented on Dec 5, 2022

mikelove commented on Dec 5, 2022

errcricket commented on Dec 5, 2022

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Participants

Issue actions