float and string error #220
Comments
Hmm, it looks like it's possibly one of the FRQ columns, based on the line
that's throwing the error (mtag.py line 357, it looks like).
If you are willing to make a tiny tweak to the code, you could add a print
of "GWAS_int.dtypes" on the line right before the error and that could tell
us what data types have been assigned to each column. Alternatively, you
could check your freq columns in your data to confirm that they are all
truly numeric.
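As a rough sketch of that dtype check (this is not MTAG's own code): the toy frame below stands in for `GWAS_int`, with `FRQ1` deliberately stored as strings to mimic a column that pandas could not parse as numbers. The helper name `non_numeric_columns` and all sample values are made up for illustration.

```python
import pandas as pd

# Toy stand-in for MTAG's merged GWAS_int frame; the values are invented.
gwas_int = pd.DataFrame({
    "SNP": ["rs1", "rs2", "rs3"],
    "FRQ0": [0.12, 0.40, 0.75],
    # Strings on purpose: this is what a badly parsed frequency column looks like.
    "FRQ1": pd.Series(["0.10", "0.35", "NA"], dtype=object),
})


def non_numeric_columns(df, columns):
    """Return the subset of `columns` whose dtype is not numeric."""
    return [c for c in columns if not pd.api.types.is_numeric_dtype(df[c])]


# Same information as printing GWAS_int.dtypes right before the failing line.
print(gwas_int.dtypes)

# Narrow the report down to the columns that are expected to be numeric.
print(non_numeric_columns(gwas_int, ["FRQ0", "FRQ1"]))
```

Any frequency column reported here is a candidate for the `1. - ...` subtraction failing with a float/str `TypeError`.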
On Thu, Oct 3, 2024 at 10:47 AM zhong156 wrote:
Hello! I hope you are doing well! I am trying to use FinnGen GWAS summary statistics to run MTAG, but I get this error:
Trait 2: Dropped 64331 SNPs for duplicate values in the "snp_name" column
Dropped 975022 SNPs due to strand ambiguity, 6388280 SNPs remain in intersection after merging trait1
Dropped 8179 SNPs due to inconsistent allele pairs from phenotype 2. 6287048 SNPs remain.
unsupported operand type(s) for -: 'float' and 'str'
Traceback (most recent call last):
  File "mtag.py", line 1577, in <module>
    mtag(args)
  File "mtag.py", line 1343, in mtag
    DATA_U, DATA, args = load_and_merge_data(args)
  File "mtag.py", line 357, in load_and_merge_data
    GWAS_int.loc[snps_to_flip, freq_name + str(p)] = 1. - GWAS_int.loc[snps_to_flip, freq_name + str(p)]
  File "/home/zhong156/.conda/envs/2024.02-py311/py27_env/lib/python2.7/site-packages/pandas/core/ops.py", line 1583, in wrapper
    result = safe_na_op(lvalues, rvalues)
  File "/home/zhong156/.conda/envs/2024.02-py311/py27_env/lib/python2.7/site-packages/pandas/core/ops.py", line 1533, in safe_na_op
    lambda x: op(x, rvalues))
  File "pandas/_libs/algos.pyx", line 690, in pandas._libs.algos.arrmap
  File "/home/zhong156/.conda/envs/2024.02-py311/py27_env/lib/python2.7/site-packages/pandas/core/ops.py", line 1533, in <lambda>
    lambda x: op(x, rvalues))
  File "/home/zhong156/.conda/envs/2024.02-py311/py27_env/lib/python2.7/site-packages/pandas/core/ops.py", line 148, in rsub
    return right - left
TypeError: unsupported operand type(s) for -: 'float' and 'str'
Analysis terminated from error at Thu Oct 3 10:31:05 2024
Total time elapsed: 6.0m:30.12s
I have tried making the chr, bpos, freq, z, pval columns numeric but I still get this error. I was wondering if you would know what part of the data is causing this issue. Thank you so much!
Thank you so much for your reply! I added a print of the data types and the second frequency column (FRQ1) was interpreted as object, but when I read my data into R it is numeric.
I'm not sure what the differences are between Python and R in terms of how
they read things in and convert to numeric values. R is apparently making
some judgment that a particular input maps to some numeric value that
Pandas / Python does not feel as confident about. Regardless, it looks
like something is maybe up with one or more values in the frequency column
in whatever your second data file is. I'd go through that and look for
NaNs or string / character values.
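One way to hunt for those values, sketched under assumptions: the column below is a made-up stand-in for the second file's frequency column, and `find_unparseable` is a hypothetical helper name. It leans on `pd.to_numeric(..., errors="coerce")`, which turns anything it cannot parse into NaN, so entries that were present before conversion but are NaN afterwards are the suspects.

```python
import pandas as pd


def find_unparseable(series):
    """Return the entries of `series` that pd.to_numeric cannot convert.

    Entries that were already missing (NaN/None) are not included.
    """
    converted = pd.to_numeric(series, errors="coerce")
    bad_mask = converted.isna() & series.notna()
    return series[bad_mask]


# Hypothetical frequency column with one stray token and one missing value.
freq = pd.Series(["0.10", "0.35", "rs123", None, "0.80"], dtype=object)

print(find_unparseable(freq))   # entries that are present but not numeric
print(int(freq.isna().sum()))   # number of entries that are missing outright
```

When reading the real file, passing `dtype=str` to `pd.read_csv` keeps the raw tokens intact, so the offending rows can be inspected exactly as they appear on disk.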
On Thu, Oct 3, 2024 at 2:31 PM zhong156 wrote:
Thank you so much for your reply! I added print datatype and the frequency
was interpreted as object:
SNP              object
CHR0              int64
BP0               int64
FRQ0            float64
A10              object
A20              object
Z0              float64
P0              float64
N0                int64
strand_ambig       bool
CHR1              int64
BP1             float64
FRQ1             object
A11              object
A21              object
Z1              float64
P1              float64
N1              float64
flip_snps1         bool
dtype: object
but I read my data into R and it is numeric:
sapply(data1, class)
      snpid         chr        bpos        freq          a1          a2           z        pval           n
"character"   "integer"   "integer"   "numeric" "character" "character"   "numeric"   "numeric"   "integer"
Do you know why this happens? Thank you so much!
Thank you so much! There were some missing values in the snp column and it worked after I removed them.
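For anyone hitting the same thing, a minimal sketch of that kind of cleanup, assuming the summary statistics are loaded into a pandas DataFrame. The frame and its values are invented; `snpid` and `freq` follow the column names shown in the R output earlier in the thread, and the exact steps the poster took are not known.

```python
import pandas as pd

# Invented summary-statistics rows; the second one has no SNP identifier.
sumstats = pd.DataFrame({
    "snpid": ["rs1", None, "rs3"],
    "freq": [0.10, 0.20, 0.30],
})

# Drop rows whose SNP identifier is missing before handing the file to MTAG.
cleaned = sumstats.dropna(subset=["snpid"]).reset_index(drop=True)

print(len(sumstats) - len(cleaned), "row(s) dropped")
```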