-
Notifications
You must be signed in to change notification settings - Fork 273
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature request: dynamically adding checks during execution? #222
Comments
I might have something to help you here. You're correct in that check50 creates a dependency DAG prior to executing checks. And because checks run in a separate process, it's impossible, or at least very difficult, to change the DAG from within a check. However, to create this DAG, check50 will import the checks module. That means any code that is in that module will run. You can use this to dynamically create checks prior to running checks. In fact I've done so in an extension of my own at: Here I use one check ( @check50.check()
def exists():
...
for cell in cells:
# Execute the cell
try:
execute(cell)
exception = ""
passed = True
except check50.Failure as f:
exception = str(f)
passed = False
# If it's a test cell, record result
if check_jupyter.is_test_cell(cell):
results.append((check_jupyter.get_cell_id(cell), passed, exception))
# Pass down the results to the actual checks
return tuple(results) Then I dynamically create checks, that each depend on for test_id in get_test_ids("module 1.ipynb"):
check = create_check(test_id)
# Register the check with check50
check = check50.check(dependency=exists)(check)
# Add the check to global module scope
globals()[test_id] = check The dynamically created checks themselves just look up their results and behave accordingly. def check(results):
for id, passed, exception in results:
if id == test_id and not passed:
raise check50.Failure(exception)
return results This makes it so that I don't have to write a check50 check for each unittest, the set of unittests is run exactly once, and from the students perspective, each unittest is just a check50 test. Hopefully this helps! |
Thanks @Jelleas this is very close to what I had in mind. The problem with this for my particular use case is that getting the list of test IDs before running any checks is not possible. The thing is that unit tests may not even compile, because the student's submission may not match the expected method signatures. This is why I believe it makes sense to have one check for compiling the unit test (and possibly running them all) to be able to complain about a non-conformant submission without blowing up check50. Does that make sense? At the moment, the only ways around this problem I can think of would be to include compiled test classes into the problem set and hope that the produced |
I've decided to go include compiled tests, run them once and interpret the junit xml report by looking for all testcases, then introducing one check, as you propose. I am observing some really strange behaviour trying to reproduce your approach. Basically, I've hard-coded the list of test cases for now, and am trying to create checks as you do in The check itself does nothing, has check.__name__ = "4"
check.__doc__ = "ItemTest__getPrice passes" The output of
Now, if I omit line 90 or line 93 in your code, then none of these child-checks run. (both register the check globally, why are both necessary?). If I include both lines, I see the error above, but only for the last of the generated child checks! Any help would be much appreciated; Thanks a lot! |
check50 will create a directory (inside a temporary directory) for each check, named after the check. So for
Apparently, the directory (or file)
What line 90 does is manually call the
But because decorators in Python are essentially just functions that produce functions themselves , we can transform the above to:
Line 93 goes ahead, and puts the newly created function in the
You can now call the function
Underneath the hood check50 uses the
For this to work, each check needs to have a |
Thanks again for your quick response Jelle!
Quoting Jelle van Assema (2020-07-04 20:23:33)
I am observing some really strange behaviour trying to reproduce your
approach.
Could you comment on this?
check50 will create a directory (inside a temporary directory) for each check,
named after the check. So for check.__name__ = "4", check50 will try to create
the directory 4 within a temporary directory (happens to be /tmp/tmpw3sw2n7o in
this case). Looks like that's failing now with the message:
FileExistsError: [Errno 17] File exists: '/tmp/tmpw3sw2n7o/4'
I understand (from reading the check50 sources).
Apparently, the directory (or file) 4 already exists. My guess is, without
seeing the rest of the code, that there's probably two checks with the same
__name__. In that case, renaming one should fix it.
That's the thing. There is no other check with that name.
I suspect that the one that I have is registered more than once for some reason, but I don't see why that would only affect the one last check in the sequence.
I have posted a stripped down version of my code here in case you want to have a look:
https://gist.github.com/pazz/f109183b56cf8919e1efd358f3b7cc9a
Now, if I omit line 90 or line 93 in your code, then none of these
child-checks run. (both register the check globally, why are both
necessary?).
What line 90 does is manually call the check50.check decorator, that would
normally be used to mark checks in check50 like so:
@check50.check(dependency=exists)
def my_check():
...
But because decorators in Python are essentially just functions that produce
functions themselves , we can transform the above to:
def my_check():
...
my_check = check50.check(dependency=exists)(my_check)
Yes, I was referring to this line inside the definition of the decorator, which seems to register the check in a global list called "_check_names".
https://github.com/cs50/check50/blob/b8b140ffe0beb701b801e70071916b7d3207add7/check50/runner.py#L122
I was wondering why, if the check is known there, it needs to be registered as well under the pset module.
But that was just a side issue and a design decision, which is fine.
I am not sure how to debug my pset module during set up except by calling check50._api.log, so I cannot be 100% sure, but while I was playing with check50.run, it sometimes looked as if my child-checks (which only log in the linked gist) caused their parent to run multiple times:
Only the parent called run, and I saw it logged multiple times in the check50 output).
Botton line is that I suspect my checks are executed more than once, but not all of them.
|
I tried to replicate the error from this snippet, but I couldn't exactly. I did however run into another issue that might be the underlying cause of this. Because you are running the following inside the module, and not inside a function (such as
All variables become globals, including the variable
That means there are now two globals pointing to the same check, check 4 in your case. Both checks will try to create a directory called Should be a simple fix, just wrap this bit of code in a function and call the function instead:
|
Oh, looks like I'm blocking a PR that should make this easier. Long story short, a recent release accidentally inverted the logic for check50's verbose mode and through that Jumping back to
|
Of course! that was exactly what I was missing and it works just as expected now. |
Could you explain what's going on here? import check50
@check50.check()
def item_exists():
pass
@check50.check()
def item_still_exists():
pass
def init():
print("init called") It seems that with every new
|
This is semi-true, check50 runs checks in parallel by default through multi-processing. But because processes are relatively expensive to start, check50 uses a pool of them through The ability to easily disable multi-processing has been on our minds for some time, but there wasn't been a compelling enough use-case yet to push it up the list of priorities. But perhaps this is one! For now, if you would like to opt-out, you can set the environment variable import os
os.environ["CHECK50_WORKERS"] = "1" Note that this still runs your checks in a separate process from the tool itself. That means your checks module will be imported twice, once by the tool, and once by the process running your checks. |
That sounds like a plausible explanation but your fix seems to have no effect:
I get the exact same result when removing the first two lines of |
The bug I described above is somewhat unrelated to the original feature request. Regarding dynamically adding checks: I noticed that the dependency DAG is in fact a tree: you cannot declare more than one dependency per check. This means instead of precomputing a full dependency tree, one could in principle only keep a global todo-queue, which contains only unfinished parent checks but not its children. |
You're right, with the notebooks the time spent on creating the dynamic checks is rather negligible and as such I wasn't concerned with how many times it actually runs. I can see how that is different in your situation. The problem at hand is, currently each "check/worker" process needs to have all possible checks in memory. Because the "manager/main" process can call for any check to run in any process. This is how the current model with a Now my solution was to rediscover and recreate each check every time. But in your case, perhaps it's best if you discover checks once (run the shell command once), but then leverage something persistent like the file system to communicate the results of the discovery. Then have each check process dynamically create checks from that. Something like: https://gist.github.com/Jelleas/21278df92f40e4804bd74084a2a3dbc0 I did have to add support for this in check50. This currently lives in a feature branch over at https://github.com/cs50/check50/tree/run_dir. So to try out my gist, please install check50 via:
|
As to why |
I would like to make this possible, and I would like to natively support dynamic checks too. Such that we don't have to "hack" it in. I do also like the idea of being able to create checks only once they become relevant, a collapsible list of sorts. Where you only start to show checks for some part of an assignment once students reach that part. But this all causes a lot of friction with multiprocessing. In particular, once a check starts to dynamically create other checks, we'd then need to ensure those checks are run in the same process as the check that created them. That's a level of control that we don't have with Keeping that in mind, a bit of future talk, I propose a new decorator perhaps. That signals a priori, this check can create other checks and cannot be run through our current model of
But perhaps baby-steps first, I think a lot of headaches and gotchas in this thread have to do with multiprocessing. And being able to opt-out of MP would address most of the issues. Something like:
Edit: just to note, it would always be two processes (a manager and a worker) to catch any timeouts and any potential crashes. |
Hello once more, and sorry for spamming you with feature requests without contributing much first.
I am teaching an undergraduate "Intro to OOP" class, with exercises in Java. In order to facilitate check50 checks for structural inspection (class hierarchy, or method signatures), I wrote a small extension that allows to run junit5 unit tests from check50. At this point, its use is rather cumbersome due to the fact that one has to manually add check50 checks for every unit test on its own unless one is fine with one big check for all unit tests in a test class.
It'd be great if I could dynamically add checks to the dependency tree at execution time.
This would allow to first compile a (junit) test class, then extract the contained unit tests and interpret them as individual check50 checks for marking. In fact, I can easily run all unit test in one go and retroactively interpret the outcome. So it would even suffice to make check50 aware of new checks after the results are computed. But for the sake of consistent student feedback Id'be be great to at least make it look as if regular checks have been run.
It looks to me as if check50's runner class computes the dependency DAG of checks before executing them.
Any pointers would be much appreciated.
Thanks again for an awesome tool,
P
The text was updated successfully, but these errors were encountered: