Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FaIRv2.0 #85

Open
znicholls opened this issue Sep 22, 2020 · 12 comments
Open

FaIRv2.0 #85

znicholls opened this issue Sep 22, 2020 · 12 comments

Comments

@znicholls
Copy link
Contributor

This issue is intended to be a place to discuss how we deal with FaIR v2.0. I have a few questions about how we want to do this in practice, which shouldn't hold up @JGBroadbent's work but would be good to sort out sooner rather than later.

  1. Should FaIRv2.0 be in the same repository? My suggestion is yes, but that is predicated on the assumption that active development of FaIR1.X stops once 2.0 is sorted (we can still do bug fixes but no major improvements). If that assumption fails, I think we should start a new repository because trying to manage two 'masters' at once will be a nightmare.

If the answer to the above question is no, then things become relatively simple and the rest of my thoughts can be ignored. In case the answer is yes, I will continue.

  1. My suggestion would be that we use the FAIR-v2.0.0 branch as a quasi-master for v2.0 until active development of FaIR 1.X stops. It'll be a touch fiddly but should be fine in the end.
  2. For FaIRv2.0, my suggestion would be that we basically start again on the devops side (continuous integration, docs, releasing etc.), making everything a bit cleaner and more automated. This will be a bit of a learning curve, but a worthwhile one I think. I think it also makes sense because we're doing a complete re-write of the codebase, so may as well do the stuff around it while we're at it.
  3. When the time comes, this means we'll have a commit which removes all the FaIR 1.X stuff and replaces it with whatever state FaIR2.0 is in at that point. The history will look a bit odd, but I think that's the simplest way to do it.

Having said all that, the real question is what does everyone else think? e.g. @chrisroadmap @njleach

@njleach
Copy link
Collaborator

njleach commented Sep 24, 2020

First question: my thought is that it should be in the same repo (and this is what's currently stated in the update paper). I do completely agree with you on the two "masters" issue - it would be super confusing for any users.

Second questions:

  1. I completely agree.
  2. Strong agree.
  3. Yes.

You've pretty much described how I imagined the switchover would go so I don't really have anything else to add!

One related (but not urgent) question: where should the Excel version of the model go? I am keen on having a repo for it as well such that a centralised version exists that is (hopefully) correct; rather than having loads of slightly different versions floating around (as has happened in Oxford when I've given people copies of the model at different points...) If having it as a branch to this repo would work I'd be happy with that - but if not I'm happy to create a new repo (could call it EFaIR...?).

@znicholls
Copy link
Contributor Author

znicholls commented Sep 24, 2020

Great! Keen to hear what @chrisroadmap thinks too.

where should the Excel version of the model go?

My instinct would be in this repo too. Then we just need to find an automated way of checking that it is still somewhat correct. That will require a bit of head scratching, but I'm sure we'll find a way.

@njleach
Copy link
Collaborator

njleach commented Sep 24, 2020

Awesome - and yep for sure. I'll do a bit of research on testing excel files. I've now got a working version that is more or less identical to the python version (the timestepping is very slightly different because using what I did in python is tricky in Excel and makes the whole thing way less readable so they aren't bit-identical but they given pretty much the same results).

@chrisroadmap
Copy link
Collaborator

sorry slow - spinning plates with 1.6 for AR6

  1. I did create a new empty repo at https://github.com/OMS-NetZero/FAIR2 which I set up with some skeleton linting and testing which could be a useful outline. It looks like a lot of work has been done on the branch on this repo so we can delete that other one if it won't be used.
  2. Yes - and see 1. above.
  3. Yes fine, breaking history shouldn't be a big problem right as we're going from 1 to 2 major version and all historical versions will still be available on pypi
    3.1. corollary: I guess I'll be the only person interested in periodically fixing 1.X series

Excel version could go in this repo as long as we exclude it from pypi installs? My slight preference would be to put it into a different repo and provide a health warning that it's not likely to be as rigorously tested as the python version.

@znicholls
Copy link
Contributor Author

3. 3.1. corollary: I guess I'll be the only person interested in periodically fixing 1.X series

That would be my guess. It'll be a bit awkward for you as I think we'll have to either a) do it in a branch in this repo (running into the two master problem a bit but if it's only bug fixes maybe not so bad) or b) split out a FaIR1.X repo.

Excel version could go in this repo as long as we exclude it from pypi installs?

Yep, I think if we put it in a folder called e.g. excel-implementation then we can easily keep it all separate.

I'll do a bit of research on testing excel files

Maybe xlwings would give us a way to automate extracting values from an excel workbook? If we can get the values out, we can then write all the tests in Python which would make our life easy (and allow us to test fairly rigorously).

@njleach
Copy link
Collaborator

njleach commented Sep 29, 2020

Sounds good to me.

On xlwings, I thought that looked like the best (only?) option for automating checks of the Excel version. I've started to have a go and it seems like it could be used (in a limited capacity). As far as I can tell, xlwings could be used to extract values from a (static) workbook. So as far as I can see, we could definitely run automated testing to compare the output of a single (pre-defined) scenario between the Excel (default configuration) and python versions to ensure they are similar. What I currently can't see a way to do (outside of pretty laborious manual testing within Excel) would be testing over a range of scenarios or configurations to see if it breaks.

@chrisroadmap
Copy link
Collaborator

That would be my guess. It'll be a bit awkward for you as I think we'll have to either a) do it in a branch in this repo (running into the two master problem a bit but if it's only bug fixes maybe not so bad) or b) split out a FaIR1.X repo.

Very happy to retire FaIR 1.x once this is up and running and tested to verify similar behaviour. I'm keeping it going now to fall back on (and I haven't learned FaIR 2 yet)

@znicholls
Copy link
Contributor Author

xlwings could be used to extract values from a (static) workbook

Damn I was hoping it could be made dynamic... Alright in that case then I think we have to just test that one pre-defined scenario and configuration, and put the warning that everything else is untested. I don't see how we can do much better...

@JGBroadbent
Copy link
Collaborator

Hi All,

Just to let you know that a working (i.e. it passes my tests on my machine) version of FaIR 2.0.0 (minus the OpenSCM runner or any built in parameter sets), is now uploaded on my fork of the FAIR-v2.0.0 repo: https://github.com/JGBroadbent/FAIR/tree/FAIR-v2.0.0/fair/2.0.0 .

Let me know thoughts etc + what you'd like next in terms of functionality (beyond an OpenSCM runner and some cleaning up of my code, which are the next things on my agenda) or presets etc.

@rgieseke
Copy link
Contributor

Damn I was hoping it could be made dynamic... Alright in that case then I think we have to just test that one pre-defined scenario and configuration, and put the warning that everything else is untested. I don't see how we can do much better...

It seems openpyxl (https://openpyxl.readthedocs.io/en/stable/index.html) is another alternative for creating/reading XSLX files.
(Xlwings seems to be Mac/Windows only.)

On the testing side of different scenarios I imagine one could use a headless LibreOffice Calc version to check different scenarios within GitHub Actions or maybe use the Google Sheets API to upload or check files with different scenarios.

It seems Microsoft's Excel Viewer is deprecated (https://docs.microsoft.com/en-us/office/troubleshoot/excel/get-latest-excel-viewer), and it seems it doesn't calculate formulas anyhow (https://groups.google.com/forum/#!topic/spreadsheet-writeexcel/V8KRXnSQ58E).

@njleach
Copy link
Collaborator

njleach commented Sep 30, 2020

Thanks @rgieseke! This is really helpful - I'll have a look into these options. I'm a little concerned that some of the functions I've used to try and make the workbook more dynamic are going to make it incompatible with Calc / Sheets (I've tried opening it in in Calc and it doesn't work, with enough bits going wrong that it would be a lot of effort to make it compatible; though if this is crucial I possibly could make it work without the functions that break compatibility).

I've also realised I've hijacked this thread a little to discuss the Excel version - shall I open a separate issue to discuss this?

@rgieseke
Copy link
Contributor

As i understood one of the ideas was to have a model that could fairly easy be implemented in different languages/environments. In case of these major differences already, LibreOffice/Google Sheets would be different environments then, I guess ...

To test the Excel version programmatically, maybe it's possible to interact with Office 365 or install an Office demo version in a Windows-based GitHub Action ...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants