As a data classifier, I would like the transition from one call type to another to be more performant #113

Open · jamesiarmes opened this issue Sep 12, 2022 · 1 comment

Comments

@jamesiarmes
Member

When a call type classification page is loaded, the data set file appears to be downloaded each time, and there is likely some additional parsing happening at the same time. We should look into this to determine whether it needs to happen on every load and, if so, how we can address the performance impact.
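One direction to investigate would be caching the downloaded and parsed file between requests instead of re-fetching it on every page load. A minimal sketch, assuming a Rails-style setup with `Rails.cache` available and a hypothetical `DataSet#remote_url` attribute (neither is confirmed here):

```ruby
require "csv"
require "open-uri"

# Hypothetical helper for illustration only: caches the parsed rows of a
# data set so the file is not downloaded and re-parsed on every
# classification page load. `DataSet#remote_url` and the CSV layout are
# assumptions, not the actual Classifyr models.
class DataSetCache
  def self.rows_for(data_set)
    Rails.cache.fetch("data_set/#{data_set.id}/rows", expires_in: 12.hours) do
      CSV.parse(URI.open(data_set.remote_url).read, headers: true).map(&:to_h)
    end
  end
end
```

Whether caching like this is safe depends on how often the underlying file can change, which is part of what we'd need to determine.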

This has presented itself in at least two data sets:

@jamesiarmes
Member Author

After looking into this a bit, I found that Classifyr generates the examples for each unique value lazily. This isn't a problem for small data sets, but for larger data sets generating these examples takes considerably longer.

We can look at improving the performance of generating these examples, but I also recommend moving this work to a background process and generating the full set of examples for a data set before classification begins.
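A rough sketch of what backgrounding could look like, assuming ActiveJob is available and using hypothetical `unique_values` / `generate_examples!` names to stand in for the existing lazy logic:

```ruby
# Hypothetical job for illustration: eagerly builds the examples for every
# unique value in a data set so classification pages only read precomputed
# records. The model and method names are placeholders, not the actual
# Classifyr API.
class GenerateExamplesJob < ApplicationJob
  queue_as :default

  def perform(data_set)
    data_set.unique_values.find_each do |value|
      # Assumed to wrap whatever the current lazy generation does per value.
      value.generate_examples!
    end
  end
end

# Enqueued once, e.g. after a data set finishes importing:
#   GenerateExamplesJob.perform_later(data_set)
```

The job would need to be idempotent so re-running it after a failed import doesn't duplicate examples.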
