How to contribute? #3

Open
Faldict opened this issue Jul 9, 2020 · 9 comments
Labels
good first issue Good for newcomers

Comments

@Faldict
Contributor

Faldict commented Jul 9, 2020

Hello,

I find this a really interesting project, and I would like to contribute. What can I start with?

@ashryaagr
Owner

Glad to hear that you found the project interesting.
The project is currently being developed by me and my mentors under the JSOC program by Julia Computing. After confirmation from my mentors, I will let you know shortly whether we can have external open-source contributions as well before our first release.
Thanks!

@ashryaagr
Owner

ashryaagr commented Jul 10, 2020

I have confirmed. We can have external open-source contributions :-)

Before starting on any contribution, I recommend that you first go through the documentation and understand the design of the package and concepts like the fairness tensor and wrappers.

Possible contributions can be

These are a few possible things I could think of. There might be more. Feel free to start a discussion if you have any suggestions, feedback, or issues. I am available on the Julia Slack workspace as "Ashrya Agrawal". You can join the workspace using https://slackinvite.julialang.org/

ashryaagr pinned this issue Jul 10, 2020

@vollmersj
Collaborator

@Faldict thanks for reaching out. The list above is quite comprehensive. Let us know where your interests and strengths lie and we can work something out. Happy to jump on a call.
Plots are great: Aequitas is doing a great job at this.

@Faldict
Contributor Author

Faldict commented Jul 13, 2020

Thanks for your response! I think I could start with adding the fairness datasets. My question here is: why does the dataset macro return the tuple (X, Y, Y_hat)? If I understand correctly, Y_hat is the prediction, which would require training a model on the dataset. Why not return the sensitive attributes directly?

@ashryaagr
Owner

The macro you are referring to provides toy data with only 10 rows. It returns (X, y, ŷ) just so that users can try out things like metrics without fitting an algorithm and predicting.

When adding macros for real datasets like COMPAS, German, Adult, etc., we do not need the macro to return ŷ, so we can simply return (X, y). It is going to be very similar to the macros available at https://github.com/alan-turing-institute/MLJBase.jl/blob/master/src/data/datasets.jl#L200 . Let me know if you need further clarification on this.

@Faldict
Contributor Author

Faldict commented Jul 13, 2020

Thanks for your clarification. I have added the COMPAS and Adult datasets. Do I need to write the test scripts for them?

Another question: when I install the package for testing, I get the following errors:

┌ Warning: julia version requirement for package MLJFlux not satisfied
└ @ Pkg.Operations /Users/sabae/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.2/Pkg/src/Operations.jl:225
ERROR: Unsatisfiable requirements detected for package Flux [587475ba]:
 Flux [587475ba] log:
 ├─possible versions are: [0.4.1, 0.5.0-0.5.4, 0.6.0-0.6.10, 0.7.0-0.7.3, 0.8.0-0.8.3, 0.9.0, 0.10.0-0.10.4, 0.11.0] or uninstalled
 ├─restricted to versions 0.10.4-0.10 by MLJFlux [094fc8d1], leaving only versions 0.10.4
 │ └─MLJFlux [094fc8d1] log:
 │   ├─possible versions are: 0.1.2 or uninstalled
 │   └─MLJFlux [094fc8d1] is fixed to version 0.1.2
 └─restricted to versions 0.10.3 by an explicit requirement — no versions left

As a result, I am not able to install the package.

@ashryaagr
Owner

Thanks a lot for working on the dataset macros. It would be great if you could also write tests (test/datasets/datasets.jl) for the datasets you add.
Another comment on your commit 1111e96: it would be better to download the datasets only when required. When the macro is called, we can check whether the data directory already contains the dataset. If it does not, the dataset is downloaded from the specified link. I will add an example macro and a corresponding test for another fairness dataset for your reference.
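
The download-on-demand check described above could look roughly like the following. This is only a minimal sketch under stated assumptions, not the package's actual implementation: the helper name `ensure_dataset`, the `data_dir` and `fetch` keyword arguments, and the default `data` directory are all hypothetical and chosen for illustration.

```julia
# Hypothetical helper; none of these names come from the package itself.
# Returns the local path of `fname` inside `data_dir`, downloading it from
# `url` only if the file is not already present.
function ensure_dataset(url::AbstractString, fname::AbstractString;
                        data_dir::AbstractString = joinpath(@__DIR__, "data"),
                        fetch = Base.download)
    mkpath(data_dir)                   # create the data directory if it is missing
    fpath = joinpath(data_dir, fname)
    if !isfile(fpath)                  # download only when the file is not cached
        fetch(url, fpath)              # `fetch(url, path)` must write the file to `path`
    end
    return fpath
end
```

A dataset macro could call such a helper first and then read the file at the returned path. Passing the download function as a keyword argument is a design choice that lets tests substitute a stub and avoid network access.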

I am not sure why this version incompatibility occurs on your system, but MLJFlux is not required by the package. In commit 9dee330, I removed inessential packages like MLJFlux from the dependencies. Please let me know if you still face any setup issues after pulling the changes.

@ashryaagr
Owner

@Faldict you might want to look at the macro I have added for the German credit data: https://github.com/ashryaagr/MLJFair.jl/blob/master/src/datasets/datasets.jl
The corresponding tests are available at https://github.com/ashryaagr/MLJFair.jl/blob/master/test/datasets/datasets.jl

I hope these make it easier for you to add the macros and tests for other fairness datasets.

@Faldict
Contributor Author

Faldict commented Jul 14, 2020

@ashryaagr Thanks a lot! I have fixed this problem.

ashryaagr added the good first issue label Jul 21, 2020

3 participants