Comparing array subset checks #125

gabteles

No description provided.

@mblumtritt

mblumtritt commented May 16, 2017

Your code above depends heavily on the sample arrays used. Using these samples

ARRAY1 = [*(400..500)].freeze
ARRAY2 = [*(1..500)].freeze

will change your results completely. I would prefer a more real-world sample for the test, to avoid misleading results…

@Arcovion
Collaborator

Arcovion commented May 16, 2017

Maybe (ARRAY1 & ARRAY2).size == ARRAY1.size is faster - also try the set library; convert both #to_set and check with #subset?
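
For reference, the two suggestions above could look roughly like the sketch below. The helper names are hypothetical and exist only for this example. One caveat worth hedging: `Array#&` drops duplicates, so comparing against the raw `size` would report false for a subset array that repeats a value; this sketch compares against the count of unique elements instead, which is an adjustment to the expression as written in the comment.

```ruby
require 'set'

# Hypothetical helper names, used only in this sketch.

# Intersection-size check. Array#& removes duplicates, so compare against
# the number of unique elements rather than the raw size.
def subset_by_intersection?(sub, sup)
  (sub & sup).size == sub.uniq.size
end

# Set-based check using the standard set library.
def subset_by_set?(sub, sup)
  sub.to_set.subset?(sup.to_set)
end

small = [*(400..500)].freeze
large = [*(1..500)].freeze

puts subset_by_intersection?(small, large)
puts subset_by_set?(small, large)
```

Both variants treat the arrays as sets, so repeated values in the candidate subset count once.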

@gabteles
Author

@mblumtritt What would you suggest as a real-world example? Would it be better if I generated an array filled with random integers and kept it static in the test, or something like that?

@Arcovion Nice, I'll add both.

Thank you guys. (:

@mblumtritt

@gabteles Do you mean something like this?

ARRAY1 = [*(400..500)].shuffle.freeze
ARRAY2 = [*(1..500)].shuffle.freeze

Then we have another problem: how should the set and the subset be weighted against each other 🤔 Or would you prefer a completely random set to compare?…

In the "real world" I would check whether a completely different algorithm could be faster. For example, what about sorting both in descending order first? This might be faster if both sets are already nearly sorted…

I'm hesitant to give "general advice" for this subset problem. In the "real world" it always depends… ;)
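
A minimal sketch of the sort-first idea, assuming the elements are mutually comparable. The helper name is hypothetical, and the sketch sorts ascending rather than descending purely for readability; the walk works the same way in either direction. Whether it beats the other approaches depends on the input, as noted above.

```ruby
# Hypothetical helper: true when every distinct value in `sub` also
# appears in `sup`. Sorts both arrays, then walks them in step.
def subset_by_sorting?(sub, sup)
  a = sub.sort
  b = sup.sort
  j = 0
  a.each do |x|
    # Advance through the sorted superset until we reach or pass x.
    j += 1 while j < b.size && b[j] < x
    return false if j >= b.size || b[j] != x
  end
  true
end

small = [*(400..500)].shuffle
large = [*(1..500)].shuffle
puts subset_by_sorting?(small, large)
```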

@gabteles
Author

gabteles commented Jun 1, 2017

@mblumtritt Yeah, I agree that the weighting between set and subset is a big problem for the benchmark. I'm not sure how to handle it.

Earlier I meant something like this:

  • Generate random arrays (kept small here for readability; the real test could use bigger arrays):
$ irb
2.4.1 :001 > 10.times.map { rand(100) }
 => [20, 83, 86, 53, 63, 52, 50, 43, 18, 94] 
2.4.1 :002 > 10.times.map { rand(100) }
 => [27, 93, 84, 86, 71, 89, 38, 48, 37, 22]
  • Use them statically in the code:
ARRAY1 = [20, 83, 86, 53, 63, 52, 50, 43, 18, 94].freeze
ARRAY2 = [27, 93, 84, 86, 71, 89, 38, 48, 37, 22].freeze

Using them statically guarantees that performance won't be improved or degraded by the randomness of shuffle or rand at runtime. But it still suffers from the weight problem.
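
One possible way to keep the data reproducible without pasting literals, and to look at several subset/superset weightings at once, is a seeded generator. This is only a sketch: the seed value, the value range, the size pairs, and the `sample_pair` helper are all assumptions made for illustration, and reproducibility is only expected when rerunning on the same Ruby version.

```ruby
# Fixed seed (arbitrary value, an assumption of this sketch) so every run
# builds the same arrays without hard-coding them.
SEED = 42

# Hypothetical helper: returns [subset, superset] with reproducible contents.
# The subset is drawn from the superset, so it is always a true subset.
def sample_pair(subset_size, superset_size, seed: SEED)
  rng = Random.new(seed)
  superset = Array.new(superset_size) { rng.rand(1_000) }
  subset = superset.sample(subset_size, random: rng)
  [subset.freeze, superset.freeze]
end

# A few subset/superset weightings to compare side by side.
[[10, 1_000], [100, 1_000], [900, 1_000]].each do |sub_n, sup_n|
  subset, superset = sample_pair(sub_n, sup_n)
  puts "#{sub_n}/#{sup_n}: subset? #{(subset - superset).empty?}"
end
```

The arrays are built once, before any timing starts, so the cost of `rand` and `sample` stays out of the measured code.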
