Support for categorical attributes #26

Open
meakbiyik opened this issue Apr 20, 2021 · 1 comment
Labels
Difficulty: High (expected workload is several months)
Status: Blocked (issue requires other issues, e.g. bugs or dependencies, to be resolved before it can be addressed)
Type: Feature Request (issue is about adding a new feature)

Comments

meakbiyik commented Apr 20, 2021

An idea borrowed from SAOM: when a statistic is tested over a categorical (e.g. character) attribute, SAOM creates multiple binary attributes from it and tests them all together, using a sensible naming scheme. As far as I remember, it looked like this (though I might be mistaken):

ego.gender.male -> ...
ego.gender.female -> ...

We can of course achieve this by manually one-hot encoding these variables, but that would produce very long formulas that are harder to read, and it requires a lot of preprocessing work. I am also not sure whether this could be optimized if done on goldfish's side. In any case, I would really like to be able to add an effect like this:

formula <- dep_var ~ ego_categorical(days_of_week)

This would also be very helpful with global attributes (as mentioned in #25)!
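For reference, the manual one-hot encoding mentioned above can be sketched in base R. This is only an illustration: the `actors` table, its `gender` column, and the `gender.<level>` naming are assumptions made for the example, not goldfish's or SAOM's actual behaviour.

```r
# Hypothetical actor table; labels and values are illustrative only.
actors <- data.frame(
  label  = c("a1", "a2", "a3", "a4"),
  gender = c("male", "female", "female", "male"),
  stringsAsFactors = FALSE
)

# model.matrix() without an intercept yields one 0/1 column per level.
dummies <- model.matrix(~ gender - 1, data = actors)

# Rename "gendermale" to "gender.male" so the names resemble the scheme above.
colnames(dummies) <- sub("^gender", "gender.", colnames(dummies))

# Attach the binary columns to the original table.
actors_encoded <- cbind(actors, as.data.frame(dummies))
```

Each binary column then has to be referenced separately in the formula, which is where the long formulas come from. Keeping every level also makes the columns sum to one per actor, so in models with an intercept one level is usually dropped as the reference category.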

auzaheta added the Difficulty: High, Status: Blocked, and Type: Feature Request labels on Apr 21, 2021
jhollway self-assigned this on Apr 14, 2022
jhollway (Collaborator) commented May 4, 2022

This could be handled in preprocessing where necessary, similar to how it is managed in network_reg() in {migraph}?
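A preprocessing step along these lines might look like the sketch below. It does not reproduce what network_reg() does in {migraph}; `expand_categorical()` is a hypothetical helper, the `weekday` column is invented for the example, and the generated terms assume an `ego()` effect that accepts a single `data$column` attribute.

```r
# Hypothetical helper (not part of goldfish or migraph): expand one
# categorical column into 0/1 columns and build the matching formula terms.
expand_categorical <- function(data, column, data_name,
                               effect = "ego", drop_first = TRUE) {
  values <- factor(data[[column]])
  lvls <- levels(values)
  if (drop_first) lvls <- lvls[-1]  # first level serves as the reference
  new_cols <- paste(column, lvls, sep = ".")
  for (i in seq_along(lvls)) {
    data[[new_cols[i]]] <- as.integer(values == lvls[i])
  }
  # One term per binary column, e.g. "ego(actors$weekday.Tue)".
  terms <- sprintf("%s(%s$%s)", effect, data_name, new_cols)
  list(data = data, terms = terms)
}

# Illustrative data only.
actors <- data.frame(
  label   = c("a1", "a2", "a3", "a4"),
  weekday = c("Mon", "Tue", "Wed", "Tue"),
  stringsAsFactors = FALSE
)

expanded <- expand_categorical(actors, "weekday", data_name = "actors")
actors <- expanded$data

# Assemble the long formula programmatically instead of typing it out.
formula <- reformulate(expanded$terms, response = "dep_var")
```

Generating the terms keeps the user-facing call short while the expanded formula stays explicit, which may be a workable stopgap until something like `ego_categorical()` exists.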


3 participants