output-style from report #65

Open
strengejacke opened this issue Feb 14, 2020 · 14 comments
Labels
enhancement 💥 Implemented features can be improved or revised feature idea 🔥 New feature or request

Comments

@strengejacke
Member

strengejacke commented Feb 14, 2020

I think we can, and should, improve the output style of the model tables that report produces. Currently, it looks like this:

library(report)
library(magrittr)
data(iris)

lm(Sepal.Length ~ Petal.Length + Species, data=iris) %>%
  report() %>%
  table_long() 
#> Parameter         | Coefficient |   SE | CI_low | CI_high |     t | df_error |    p | Std_Coefficient |    Fit
#> --------------------------------------------------------------------------------------------------------------
#> (Intercept)       |        1.50 | 0.19 |   1.12 |    1.87 |  7.93 |      146 | 0.00 |            1.50 |       
#> Petal.Length      |        1.93 | 0.14 |   1.66 |    2.20 | 13.96 |      146 | 0.00 |            1.93 |       
#> Speciesversicolor |       -1.93 | 0.23 |  -2.40 |   -1.47 | -8.28 |      146 | 0.00 |           -1.93 |       
#> Speciesvirginica  |       -2.56 | 0.33 |  -3.21 |   -1.90 | -7.74 |      146 | 0.00 |           -2.56 |       
#>                   |             |      |        |         |       |          |      |                 |       
#> AIC               |             |      |        |         |       |          |      |                 | 106.23
#> BIC               |             |      |        |         |       |          |      |                 | 121.29
#> R2                |             |      |        |         |       |          |      |                 |   0.84
#> R2 (adj.)         |             |      |        |         |       |          |      |                 |   0.83
#> RMSE              |             |      |        |         |       |          |      |                 |   0.33

Created on 2020-02-14 by the reprex package (v0.3.0)

Things that can be improved

  1. CIs can be collapsed into one column, like in model_parameters().
  2. Column Std_Coefficient is identical to Coefficient.
  3. My main concern is the fit indices, which appear as additional rows for an additional column. I think we can change the style here, having
  • top left: headline, maybe formula, or "linear regression" or so
  • top right: fit indices
  • bottom: coefficient table

For the layout of 3), I have something like the Stata output in mind (without the table for sums of squares):

hqdefault

or

image
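The layout proposed in 3) could be prototyped roughly as follows. This is only a sketch in base R: the helper name `stata_like_summary()` and its `left_width` argument are hypothetical and not part of report, and RMSE is assumed here to be the square root of the mean squared residual.

```r
# Hypothetical helper, not part of report: headline on the top left,
# fit indices on the top right, coefficient table at the bottom.
stata_like_summary <- function(model, left_width = 45) {
  s <- summary(model)
  left <- c(
    "Linear regression",
    paste(deparse(formula(model)), collapse = " "),
    "", "", ""
  )
  right <- c(
    sprintf("AIC       = %8.2f", AIC(model)),
    sprintf("BIC       = %8.2f", BIC(model)),
    sprintf("R2        = %8.2f", s$r.squared),
    sprintf("R2 (adj.) = %8.2f", s$adj.r.squared),
    # Assumption: RMSE computed as sqrt of the mean squared residual
    sprintf("RMSE      = %8.2f", sqrt(mean(residuals(model)^2)))
  )
  # flag = "-" left-justifies the headline lines, so the fit indices
  # start at the same column on every line
  header <- paste0(formatC(left, width = left_width, flag = "-"), right)
  coef_table <- capture.output(print(round(coef(s), 3)))
  c(header, "", coef_table)
}

m <- lm(Sepal.Length ~ Petal.Length + Species, data = iris)
out <- stata_like_summary(m)
cat(out, sep = "\n")
```

The function returns the lines as a character vector rather than printing directly, which would make it easier to test or to reuse in a print() method.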

@DominiqueMakowski
Member

Mmh, I thought the output was using the same pipeline as model_parameters(), so that it would automatically format, for instance, the CI column 🤔 but it's true that I got lost in the endless calls of methods. Happy that your fresh and hawk-like eye finds things to improve.

@strengejacke strengejacke added enhancement 💥 Implemented features can be improved or revised feature idea 🔥 New feature or request labels Mar 19, 2020
@strengejacke
Member Author

Some points have been resolved:

library(report)
library(magrittr)
data(iris)

lm(Sepal.Length ~ Petal.Length + Species, data=iris) %>%
  report() %>%
  table_long() 
#> Parameter            | Coefficient |   SE |             CI |     t |  df |      p | Coefficient (std.) |    Fit
#> ---------------------------------------------------------------------------------------------------------------
#> (Intercept)          |        3.68 | 0.11 | [ 3.47,  3.89] | 34.72 | 146 | < .001 |               1.50 |       
#> Petal.Length         |        0.90 | 0.06 | [ 0.78,  1.03] | 13.96 | 146 | < .001 |               1.93 |       
#> Species [versicolor] |       -1.60 | 0.19 | [-1.98, -1.22] | -8.28 | 146 | < .001 |              -1.93 |       
#> Species [virginica]  |       -2.12 | 0.27 | [-2.66, -1.58] | -7.74 | 146 | < .001 |              -2.56 |       
#>                      |             |      |                |       |     |        |                    |       
#> AIC                  |             |      |                |       |     |        |                    | 106.23
#> BIC                  |             |      |                |       |     |        |                    | 121.29
#> R2                   |             |      |                |       |     |        |                    |   0.84
#> R2 (adj.)            |             |      |                |       |     |        |                    |   0.83
#> RMSE                 |             |      |                |       |     |        |                    |   0.33

Created on 2020-09-18 by the reprex package (v0.3.0)

Now only 3) remains. And we should add the CI level to the column name as well...

@DominiqueMakowski Maybe we should just copy the print() method from model_parameters() to report as well?

@DominiqueMakowski
Member

@DominiqueMakowski Maybe we should just copy the print() method from model_parameters() to report as well?

Yes, the only thing that we wanted to do, and that was implemented very early on, is to color-code the values. We had, for instance, green/red for the coefficient depending on the direction, and white/yellow for CIs excluding 0 and p-values < 0.1 (and pd > 95%). But we could bake that directly into parameters down the line as well.
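As a rough illustration of the color-coding idea, here is a minimal base R sketch using raw ANSI escape codes. The function names `color_by_sign()` and `highlight_p()` and the default `threshold` are assumptions made for this example; they are not the implementation that existed in report, and the colors only render in terminals that support ANSI codes.

```r
# Illustrative helpers (hypothetical names, not part of report or parameters).
# They wrap formatted numbers in ANSI escape codes.
color_by_sign <- function(x, digits = 2) {
  txt <- formatC(x, format = "f", digits = digits)
  code <- ifelse(x >= 0, "32", "31") # 32 = green, 31 = red
  paste0("\033[", code, "m", txt, "\033[0m")
}

highlight_p <- function(p, threshold = 0.05, digits = 3) {
  txt <- formatC(p, format = "f", digits = digits)
  # 33 = yellow for values below the (assumed) threshold, plain text otherwise
  ifelse(p < threshold, paste0("\033[33m", txt, "\033[0m"), txt)
}

cat(color_by_sign(c(0.90, -1.60)), "\n")
cat(highlight_p(c(0.001, 0.234)), "\n")
```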

@bwiernik
Contributor

bwiernik commented Apr 6, 2021

I would suggest putting the fit indices in rows below the table, in the Coefficient column (with SEs, p values, etc. as relevant). Something like this:

library(report)
library(magrittr)
data(iris)

lm(Sepal.Length ~ Petal.Length + Species, data=iris) %>%
  report() %>%
  table_long() 
#> Parameter            | Coefficient |   SE |             CI |     t |  df |      p | Coefficient (std.)
#> ------------------------------------------------------------------------------------------------------
#> (Intercept)          |        3.68 | 0.11 | [ 3.47,  3.89] | 34.72 | 146 | < .001 |               1.50
#> Petal.Length         |        0.90 | 0.06 | [ 0.78,  1.03] | 13.96 | 146 | < .001 |               1.93
#> Species [versicolor] |       -1.60 | 0.19 | [-1.98, -1.22] | -8.28 | 146 | < .001 |              -1.93
#> Species [virginica]  |       -2.12 | 0.27 | [-2.66, -1.58] | -7.74 | 146 | < .001 |              -2.56
#> ------------------------------------------------------------------------------------------------------
#> AIC                  |      106.23 |      |                |       |     |        |
#> BIC                  |      121.29 |      |                |       |     |        |
#> R2                   |        0.84 |      |                |       |     |        |
#> R2 (adj.)            |        0.83 |      |                |       |     |        |
#> RMSE                 |        0.33 |      |                |       |     |        |

That is compact and permits reporting of intervals with the same format as coefficients. It would also play nicely with RMarkdown output.
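One way such a table might be assembled is to stack the fit indices under the coefficient rows in a single data frame, leaving the other columns missing. The sketch below uses base R for the numbers and insight::export_table() for printing (if insight is installed). The column selection is simplified, and RMSE is assumed to be the square root of the mean squared residual.

```r
m <- lm(Sepal.Length ~ Petal.Length + Species, data = iris)
s <- summary(m)

# Coefficient rows (simplified: no CI or standardized coefficients)
coefs <- data.frame(
  Parameter = rownames(coef(s)),
  Coefficient = coef(s)[, "Estimate"],
  SE = coef(s)[, "Std. Error"],
  t = coef(s)[, "t value"],
  p = coef(s)[, "Pr(>|t|)"],
  row.names = NULL
)

# Fit indices go into the Coefficient column; other cells stay NA
fit <- data.frame(
  Parameter = c("AIC", "BIC", "R2", "R2 (adj.)", "RMSE"),
  Coefficient = c(
    AIC(m), BIC(m), s$r.squared, s$adj.r.squared,
    sqrt(mean(residuals(m)^2)) # assumption: RMSE definition
  ),
  SE = NA_real_, t = NA_real_, p = NA_real_
)

combined <- rbind(coefs, fit)

# export_table() prints missing values as empty cells by default
if (requireNamespace("insight", quietly = TRUE)) {
  cat(insight::export_table(combined), sep = "\n")
} else {
  print(combined)
}
```

Because the fit indices share the Coefficient column, an SE or interval for, say, R2 could later be filled into the same row without changing the table shape.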

@rempsyc
Member

rempsyc commented Sep 3, 2022

Just a thought: the vertical separator lines look neat and are a good visual aid for separating the cells. However, I have a small laptop screen, and those lines sometimes make the output much wider, so that output that normally fits on my screen ends up wrapping at the end of my console, which breaks the alignment and makes the whole thing unreadable. That said, I feel like that might be more of a "me" problem (and I should probably get an external monitor). Combining the 95% CI was a good call in making it narrower, though.

@strengejacke
Member Author

I think you can pass the sep argument via the print() methods.

library(parameters)
m <- lm(Sepal.Width ~ Species, data = iris)

model_parameters(m)
#> Parameter            | Coefficient |   SE |         95% CI | t(147) |      p
#> ----------------------------------------------------------------------------
#> (Intercept)          |        3.43 | 0.05 | [ 3.33,  3.52] |  71.36 | < .001
#> Species [versicolor] |       -0.66 | 0.07 | [-0.79, -0.52] |  -9.69 | < .001
#> Species [virginica]  |       -0.45 | 0.07 | [-0.59, -0.32] |  -6.68 | < .001
#> 
#> Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
#>   using a Wald t-distribution approximation.

print(model_parameters(m), sep = "")
#> Parameter           Coefficient  SE        95% CIt(147)     p
#> -------------------------------------------------------------
#> (Intercept)                3.430.05[ 3.33,  3.52] 71.36< .001
#> Species [versicolor]      -0.660.07[-0.79, -0.52] -9.69< .001
#> Species [virginica]       -0.450.07[-0.59, -0.32] -6.68< .001
#> 
#> Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
#>   using a Wald t-distribution approximation.

Created on 2022-09-03 with reprex v2.0.2

I'm not sure which IDE you use, but with vscode you can easily create shortcuts to show/hide/expand panels. I usually use a 3-column layout (also on my laptop), and either expand the console as needed, or switch the console and editor panels from side-by-side to top-bottom. But anyway, we should ensure that all arguments that we have in export_table() are passed down to that function from report's print() methods.

Here are examples with a heavily minimized window:

screensize1.mp4
screensize2.mp4
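Passing arguments down to export_table() could look roughly like the sketch below. The class name `report_table_sketch` and its print method are hypothetical and only illustrate forwarding `...`; this is not how report's actual print() methods are written. The example is guarded so that it only runs when insight is installed.

```r
# Hypothetical class and method, for illustration only.
print.report_table_sketch <- function(x, ...) {
  # Everything in `...` (e.g. sep, digits, format) is forwarded unchanged
  # to insight::export_table().
  out <- insight::export_table(as.data.frame(x), ...)
  cat(out, sep = "\n")
  invisible(x)
}

tab <- structure(
  head(iris[, 1:3], 3),
  class = c("report_table_sketch", "data.frame")
)

if (requireNamespace("insight", quietly = TRUE)) {
  print(tab)             # default separator
  print(tab, sep = "  ") # narrower output, no vertical lines
}
```

Note that sep = "  " (two spaces) keeps the columns readable, whereas sep = "" glues adjacent cells together, as in the output above.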

@rempsyc
Member

rempsyc commented Sep 3, 2022

Ok, let's take the extreme case of correlation(mtcars) |> summary(). Actually, a reprex is less useful here because GitHub adds a horizontal scroll bar instead of wrapping. And the print(sep = "") trick is very cool, but it doesn't seem to work with correlation for some reason.

library(correlation)
correlation(mtcars) |>
  summary() |> 
  print(sep = "")
#> # Correlation Matrix (pearson-method)
#> 
#> Parameter |    carb |    gear |       am |       vs |     qsec |       wt |     drat |       hp |     disp |      cyl
#> ---------------------------------------------------------------------------------------------------------------------
#> mpg       |  -0.55* |    0.48 |   0.60** |   0.66** |     0.42 | -0.87*** |  0.68*** | -0.78*** | -0.85*** | -0.85***
#> cyl       |   0.53* |   -0.49 |   -0.52* | -0.81*** |   -0.59* |  0.78*** | -0.70*** |  0.83*** |  0.90*** |         
#> disp      |    0.39 |  -0.56* |   -0.59* | -0.71*** |    -0.43 |  0.89*** | -0.71*** |  0.79*** |          |         
#> hp        | 0.75*** |   -0.13 |    -0.24 | -0.72*** | -0.71*** |   0.66** |    -0.45 |          |          |         
#> drat      |   -0.09 | 0.70*** |  0.71*** |     0.44 |     0.09 | -0.71*** |          |          |          |         
#> wt        |    0.43 |  -0.58* | -0.69*** |   -0.55* |    -0.17 |          |          |          |          |         
#> qsec      | -0.66** |   -0.21 |    -0.23 |  0.74*** |          |          |          |          |          |         
#> vs        |  -0.57* |    0.21 |     0.17 |          |          |          |          |          |          |         
#> am        |    0.06 | 0.79*** |          |          |          |          |          |          |          |         
#> gear      |    0.27 |         |          |          |          |          |          |          |          |         
#> 
#> p-value adjustment method: Holm (1979)

Created on 2022-09-03 by the reprex package (v2.0.1)

And thanks for the tip of resizing my console to full screen width. It works in RStudio too (which I use), and I might have used that before and then just stopped bothering with it haha. I should probably learn the keyboard shortcuts for that though...

@strengejacke
Member Author

Adding another example of tabular output styles:
https://www.statsmodels.org/stable/index.html

image

@rempsyc
Member

rempsyc commented Feb 2, 2023

Probably unpopular opinion, but I think the vertical lines (in our printing method) are ugly. I'm fully aware that this is a strong bias because I've fully introjected APA style conventions. But I still think our outputs should actually look more like this 😈

library(bruceR)

model = lm(Temp ~ Month + Day + Wind + Solar.R, data=airquality)
print_table(model)
#> ──────────────────────────────────────────────
#>              Estimate    S.E.      t     p    
#> ──────────────────────────────────────────────
#> (Intercept)    68.770 (4.391) 15.662 <.001 ***
#> Month           2.225 (0.441)  5.047 <.001 ***
#> Day            -0.084 (0.070) -1.194  .234    
#> Wind           -1.003 (0.176) -5.695 <.001 ***
#> Solar.R         0.027 (0.007)  3.991 <.001 ***
#> ──────────────────────────────────────────────

Created on 2023-02-02 with reprex v2.0.2

https://psychbruce.github.io/bruceR/reference/print_table.html

Edit: I'm also OK with tabular output.

@strengejacke
Member Author

Yes, feel free to modify the output style! We have a certain style in, say, parameters, and I don't see why we need to copy this in report as well. It's ok to make it different here. See https://easystats.github.io/insight/reference/export_table.html, you can change the sep argument to remove vertical lines.
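As a minimal sketch of what removing the vertical lines via sep might look like, the example below builds a plain coefficient table in base R and prints it with insight::export_table() using a two-space separator. The table construction is an assumption made for illustration; report's own tables contain more columns.

```r
m <- lm(Temp ~ Month + Day + Wind + Solar.R, data = airquality)

# Coefficient matrix from summary() turned into a data frame with a
# Parameter column
tab <- as.data.frame(coef(summary(m)))
tab <- cbind(Parameter = rownames(tab), tab)
rownames(tab) <- NULL

if (requireNamespace("insight", quietly = TRUE)) {
  # sep = "  " replaces the default " | " column separator
  cat(insight::export_table(tab, sep = "  ", digits = 3), sep = "\n")
}
```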

@DominiqueMakowski
Member

and I don't see why we need to copy this in report as well. It's ok to make it different here.

Agreed. And APA-like style as a default makes sense I'd say

@bwiernik
Contributor

bwiernik commented Feb 2, 2023

The main reason for the vertical lines is that they make the output a valid markdown table. We should prioritize making tables that are formatted properly when compiled to html, word, or pdf.

@rempsyc
Member

rempsyc commented Feb 3, 2023

We should prioritize making tables that become formatted properly when compiled to html, word, or pdf

Good point. But where or for whom should we prioritize such tables? For the pkgdown website? It seems not to be working as expected there: https://easystats.github.io/report/articles/report.html#grouped-dataframes

Or do you mean for users that compile reports to html, word, or pdf for other reasons? What is the added benefit from using html_document: df_print: kable for all those formats which will apply to all data frames?

Regardless, the table pasted as text (not code) does format correctly, but how do you get the same result when rendering from within a code chunk to these formats? It does not seem to be picked up automatically. Even using results = "asis", I'm not getting the same result.

report table

@strengejacke
Member Author

For markdown, we have the format argument in export_table(). See https://easystats.github.io/insight/articles/display.html
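A minimal sketch of the format argument, assuming insight is installed: with format = "markdown", export_table() returns markdown table lines instead of the console layout. Written out with cat() inside a chunk that uses results = "asis", these lines should be picked up as a regular markdown table by the renderer; the exact rendering will depend on the output format.

```r
tab <- head(iris[, 1:3], 3)

if (requireNamespace("insight", quietly = TRUE)) {
  md <- insight::export_table(tab, format = "markdown")
  # One markdown line per element; in a results = "asis" chunk this is
  # emitted as raw markdown rather than as code output
  cat(md, sep = "\n")
}
```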
