Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Plugin based mechanism for Dataset definitions #284

Open
awilliamson opened this issue Feb 12, 2018 · 4 comments
Open

Plugin based mechanism for Dataset definitions #284

awilliamson opened this issue Feb 12, 2018 · 4 comments

Comments

@awilliamson
Copy link
Contributor

The idea would be to encapsulate datasets. All datasets share a common interface, for generators, evaluators, etc. When we create datasets, the same type of code is executed, but leads to an ugly long if chain.
The common use-case seems to be that people will want to make their own dataset, or to support future object datasets. In these instances modifying every mention within the source code can be tedious and error-prone.
I propose the use of a the plugin pattern. This allows those who so wish to drag a folder in, be self-loaded, self-identified. Then a mechanism ( I want to say strategy ) will select the appropriate plugin if loaded, and invoke whatever method via the interface. E.g getGenerators.

I spoke briefly to @hgaiser about this, and he recommended I make this issue, so that @de-vri-es can have a look.

I have been working on this as my project requires a confidential dataset, and keeping up with development changes is getting difficult with multiple files needing to be edited. I have made my own Dataset Class, as it is very similar to VOC standard but needed tweaking.

There is currently a library for handling a plugin system, which is minimal and written in standard python. I found very little in the way of Python Plugin systems in recent times, all seem to date 4+ years ago.

@de-vri-es
Copy link
Contributor

de-vri-es commented Feb 12, 2018

I think this is a good idea. I expect it will significantly clean up the scripts and remove a lot of code duplication between them. I do have a few concerns, but I think those should not be a problem:

  • I don't want to extend the API of generators themselves to support this. In fact, I kinda want to reduce the API (but that is another topic). Instead, this could be implemented as separate objects/functions that produce generators (and it sounds like you were also thinking along these lines).
  • The main focus of this repository should be the network itself. I think a system for this is useful, but it should first and foremost be simple and easy to maintain. If it becomes too complex it might make sense to develop it as a separate project that also works with other networks (but this would probably complicate it even further).
  • This is a bit personal, but I'm not a fan of overly OOy names like FooStrategyFactoryObserver or things like that (I'm exaggerating, but you get the point I think). I prefer passing callable objects (which could be regular functions) as arguments rather than having users derive some base class. To me this has two advantage: it prevents any names like FooStrategyFactoryObserver, but it also helps to keep boilerplate code to a minimum when implementing a new dataset. Of course, callable objects aren't always flexible enough so this not a strict rule.
  • I'd like to provide early feedback on the API and implementation.

I prefer github for discussions on code because the threshold is lower for random people to read it and join in. We can discuss on slack in person to get things going though. I'll summarize here from time to time if it feels important.

@hgaiser
Copy link
Contributor

hgaiser commented Feb 15, 2018

Should we close this issue and continue discussion on the PR?

@awilliamson
Copy link
Contributor Author

@hgaiser Perhaps leave it open until PR resolved? However, I see your point of discussing purely on the PR. But it would be nice to close this alongside PR resolution.

@de-vri-es
Copy link
Contributor

Agreed, lets keep the discussion at PR #291, but leave this open until it is actually resolved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants