Passing named lists to `.SDcols` / `.SD`

Is there any scope/appetite for supporting multiple `.SD` and `.SDcols`?

**Motivation:** I frequently encounter situations where I need to perform different aggregation tasks on distinct column groups. One group of columns will be aggregated as means, another group will be aggregated as medians, yet another group will be aggregated as sums, etc. In these cases, only one group can be passed through the convenience features of `.SD(cols)`, while the other group(s) must all be aggregated manually. 

Here's a simple (and somewhat ill-advised) example that illustrates the mechanics:

``` r
library(data.table)
d = as.data.table(iris)

d[, 
  c(lapply(.SD, sum),
    list(Petal.Length = mean(Petal.Length), Petal.Width = mean(Petal.Width))), 
 .SDcols = patterns("^Sepal"), 
 by = Species]
#>       Species Sepal.Length Sepal.Width Petal.Length Petal.Width
#> 1:     setosa        250.3       171.4        1.462       0.246
#> 2: versicolor        296.8       138.5        4.260       1.326
#> 3:  virginica        329.4       148.7        5.552       2.026
```
Here the summed Sepal columns get the `.SD` convenience treatment, while I have to manually take the mean of the Petal columns separately (and name them; see also https://github.com/Rdatatable/data.table/issues/1227#issuecomment-844760088).

My proposal is to allow something like this instead:

```r
d[, 
  c(lapply(.SD, sum), lapply(.SD2, mean)), 
  .SDcols = patterns("^Sepal"), .SDcols2 = patterns("^Petal"),
  by = Species]
```

I'm assuming here that you can match the relevant subsets based on the index (`.SD2` =>  `.SDcols2`). If this is easy to do for one additional subset, then in principle it seems possible for any additional subsets (`.SD3` => `.SDcols3`, etc). Of course, this may impose some small overhead that doesn't pass the cost-benefit test.

Feel free to close if this seems undesirable / too much work. Thanks for considering!

**Update.** Related:  https://github.com/Rdatatable/data.table/issues/1063#issuecomment-748159218 and possibly https://github.com/Rdatatable/data.table/issues/4970

Proposed solution in the former is:

```r
s_cols = grep("^Sepal", names(d), value = TRUE)
p_cols = grep("^Petal", names(d), value = TRUE)
d[, 
  {
   SD = unclass(.SD)
   c(lapply(SD[s_cols], sum), lapply(SD[p_cols], mean))
  },
  .SDcols = c(s_cols, p_cols),
  by = Species]
#>       Species Sepal.Length Sepal.Width Petal.Length Petal.Width
#> 1:     setosa        250.3       171.4        1.462       0.246
#> 2: versicolor        296.8       138.5        4.260       1.326
#> 3:  virginica        329.4       148.7        5.552       2.026
```

<details><summary>SessionInfo</summary>
<p>

```r
> sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Arch Linux

Matrix products: default
BLAS/LAPACK: /usr/lib/libopenblas_haswellp-r0.3.13.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     
```

</p>
</details>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Passing named lists to `.SDcols` / `.SD` #5020

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Passing named lists to .SDcols / .SD #5020

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Passing named lists to `.SDcols` / `.SD` #5020