Is long format preferable for the environmental data matrix? Should the example data set dnam_ex be transposed?
#17
Labels: question
When it comes to the matrix containing the environmental data, are long data matrices more favourable than wide matrices?
I'm guessing that in EWAS data sets with DNAm data, the number of individuals/samples will usually be smaller than the number of probes/CpGs. If this is correct, and if long data is in fact less demanding to process than wide data, it would make sense as a rule to have the rows of the environmental data matrix represent the probe IDs. (I don't know if this "long vs. wide" rule applies in HaplinMethyl or to ffdata; let me know if I'm wrong 😊).
In dnam_ex, the row names are the individual/sample IDs and the column names are the probe/CpG IDs (according to the vignettes). Should we transpose dnam_ex so that the rows represent the probe IDs instead? The example data set is not large enough for this to make a difference in practice, but perhaps we should change it for illustrative purposes?
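To make the proposed change concrete, here is a minimal base R sketch of what transposing would mean. The object toy is a hypothetical stand-in that I made up for illustration, not the real dnam_ex, which may be stored differently (for example as ff-backed data), so this only shows the orientation change and says nothing about performance in HaplinMethyl.

```r
# Hypothetical stand-in for dnam_ex: 4 samples (rows) x 6 probes (columns).
# The names "id1".."id4" and "cg1".."cg6" are invented for this example.
set.seed(1)
toy <- matrix(
  runif(4 * 6),
  nrow = 4,
  dimnames = list(paste0("id", 1:4), paste0("cg", 1:6))
)

dim(toy)    # 4 6: samples x probes (the current layout of the example data)

# Transpose so that rows are probes and columns are samples.
toy_t <- t(toy)
dim(toy_t)  # 6 4: probes x samples (the proposed layout)

# The values are unchanged; only the orientation differs.
identical(toy["id2", "cg5"], toy_t["cg5", "id2"])  # TRUE
```

For ordinary in-memory R matrices, storage is column-major, so reading one full column touches contiguous memory while reading one full row does not. Whether the same consideration carries over to the way HaplinMethyl stores and reads its data is exactly what I'm unsure about.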
It might also be a good idea to explicitly mention in the package documentation and the vignettes that you can have
and offer some pointers regarding which of the formats users should use.