Take component size into account for Q13(a) and Q14(a) variants #197

Open
szarnyasg opened this issue Oct 13, 2022 · 2 comments

szarnyasg commented Oct 13, 2022

To make the queries really challenging, the parameter generator (paramgen) could select components that are roughly similar in size and, if possible, reasonably large.
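
A minimal sketch of what such a selection could look like, not the actual paramgen code. It assumes a table `cc` with one row per person and the `component` and `count` columns seen in the factor tables; the `person_id` column, the `min_size` and `max_ratio` thresholds, and both function names are hypothetical. It uses Python's built-in `sqlite3` on toy data so that it runs standalone; the SQL itself is plain enough that it should carry over to DuckDB.

```python
import sqlite3


def toy_cc(sizes):
    """Build an in-memory `cc` table from a {component: size} dict (toy data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE cc (person_id INTEGER, component INTEGER, "count" INTEGER)'
    )
    person_id = 0
    for component, size in sizes.items():
        for _ in range(size):
            # One row per person, carrying the size of that person's component.
            conn.execute("INSERT INTO cc VALUES (?, ?, ?)", (person_id, component, size))
            person_id += 1
    return conn


def similar_component_pairs(conn, min_size, max_ratio):
    """Return (component_a, component_b, size_a, size_b) tuples for pairs of
    distinct components that each have at least `min_size` members and whose
    sizes differ by at most a factor of `max_ratio`, largest pairs first."""
    query = """
        WITH sizes AS (
            SELECT DISTINCT component, "count" AS size
            FROM cc
            WHERE "count" >= ?
        )
        SELECT a.component, b.component, a.size, b.size
        FROM sizes a
        JOIN sizes b
          ON a.component < b.component      -- report each unordered pair once
         AND a.size <= ? * b.size
         AND b.size <= ? * a.size
        ORDER BY a.size + b.size DESC, a.component, b.component
    """
    return conn.execute(query, (min_size, max_ratio, max_ratio)).fetchall()


conn = toy_cc({1: 4, 2: 3, 3: 1, 4: 1})
pairs = similar_component_pairs(conn, min_size=2, max_ratio=1.5)
```

Person pairs for the query parameters could then be drawn from the two components of a selected pair. If the result is empty for a given scale factor, the thresholds are too strict for that data set.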

szarnyasg self-assigned this Oct 13, 2022
szarnyasg changed the title from "Take component size into account Q13(a) and Q14(a) variants" to "Take component size into account for Q13(a) and Q14(a) variants" Oct 13, 2022

szarnyasg commented Oct 13, 2022

Getting two large components may not be possible. For example, in the SF10,000 data set, most people are in a single large connected component (CC), while the rest are all isolated nodes.

D create or replace table cc as select * from read_parquet('factors-sf10000/parquet/raw/composite-merged-fk/personKnowsPersonConnected/*.parquet');
D select distinct component, count from cc order by count desc limit 5;
┌────────────────┬──────────┐
│   Component    │  count   │
├────────────────┼──────────┤
│ 14             │ 26519261 │
│ 10995121104613 │ 1        │
│ 10995121125221 │ 1        │
│ 10995121148539 │ 1        │
│ 10995121161903 │ 1        │
└────────────────┴──────────┘
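
The same check can be condensed into a per-data-set summary, which makes it easy to see whether a second large component exists at all. This is only a sketch under assumptions: it expects the `cc` table layout used in the query above (one row per person, with `component` and `count`), the `person_id` column and both function names are hypothetical, and it runs on toy data through Python's built-in `sqlite3` rather than DuckDB.

```python
import sqlite3


def toy_cc(sizes):
    """Build an in-memory `cc` table from a {component: size} dict (toy data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE cc (person_id INTEGER, component INTEGER, "count" INTEGER)'
    )
    person_id = 0
    for component, size in sizes.items():
        for _ in range(size):
            conn.execute("INSERT INTO cc VALUES (?, ?, ?)", (person_id, component, size))
            person_id += 1
    return conn


def component_size_summary(conn):
    """Summarise the component size distribution stored in `cc`."""
    sizes = [
        row[0]
        for row in conn.execute(
            'SELECT "count" FROM (SELECT DISTINCT component, "count" FROM cc) '
            'ORDER BY "count" DESC'
        )
    ]
    return {
        "components": len(sizes),
        "largest": sizes[0] if sizes else 0,
        "second_largest": sizes[1] if len(sizes) > 1 else 0,
        "singletons": sum(1 for size in sizes if size == 1),
    }


# Toy data shaped like the result above: one big component plus isolated nodes.
summary = component_size_summary(toy_cc({14: 5, 101: 1, 102: 1}))
```

A `second_largest` value of 1 means that every component other than the largest is an isolated node, so no pair of large components can be selected for that data set.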

szarnyasg commented

See the related issue for the BI workload.
