(Reviewed) Total Recall: flmake and the Quest for Reproducibility #18

Open
mmckerns opened this issue Jun 21, 2014 · 1 comment

Comments

@mmckerns

Reviewer:
Michael McKerns
Center for Advanced Computing Research
Division of Engineering and Applied Science
California Institute of Technology
Pasadena, California, USA

Area of Expertise: Duh, everything.

General Evaluation

  • Quality of the approach: meets
  • Quality of the writing: below
  • Quality of the figures/tables: meets

Specific Evaluation

  • Is the code made publicly available and does the article sufficiently describe how to access it?

    Code is publicly available, and the article provides a link to the homepage as a publication reference. It would be better if the link were provided in the article text.

  • Does the article present the problem in an appropriate context?

    Yes.

  • Is the content of the paper accessible to a computational scientist with no specific knowledge in the given field?

    What the heck does this question mean, really? Yes, the paper doesn't use a lot of jargon, but it does assume that the reader is at least a little cognizant of standards and practices in scientific computing. I believe the article is at a good level for computational scientists that may be dabbling in scientific computing, as typified by users and erstwhile developers of scientific python community software (i.e. hacks trying to get faculty positions).

  • Does the paper describe a well-formulated scientific or technical achievement?

    Yes.

  • Are the technical and scientific decisions well-motivated and clearly explained?

    Yes, extremely well-motivated. Technical and scientific decisions are only somewhat clearly explained, so I'd have to say, no. Portions of the article are meandering, confusing, and poorly written -- primarily, the abstract and the introduction. Since the abstract and introduction are the primary locations for providing a clear picture of the technical and scientific decisions that were made in the paper, this is where the article falls flat. There are other sections in the text, such as "Why Reproducibility is Important" or "Conclusions and Future Work", that may serve better as a clear motivation for the decisions in this article. The abstract and introduction are a steaming pile of verbiage and, it seems to this reviewer, were put together post-haste and pasted in front of an otherwise very well-written flmake user manual.

  • Are the code examples (if any) sound, clear, and well-written?

    Mostly. See Detailed Notes below.

  • Is the paper factually correct?

    As far as I can tell.

  • Is the language and grammar of sufficient quality?

    No. The bulk of the paper is well-written; however, portions of the text are elliptical. See "Detailed Notes" below.

  • Are the conclusions justified?

    Yes.

  • Is prior work properly and fully cited?

    Yes.

  • Should any part of the article be shortened or expanded? Please explain.

    The paper somewhat suffers from a dissociative identity disorder, as portions of it are clearly a user manual for flmake and portions of it are a discussion on reproducibility in scientific computing. The two sides of this paper are not well integrated, in general. The section on reproducibility could use the same level of examples that are in the first half of the paper. Much of the introduction could actually be cut, were the paper reorganized. The article should pick a central theme: is it an article on reproducibility, with flmake as a case study, or is it an article on the use of flmake, with an emphasis on the features of flmake that enable reproducibility?

  • In your view, is the paper fit for publication in the conference proceedings?

    This has the makings of an excellent paper, and contains some important work. However, in its current state, the paper is unfit for publication. Portions need a rewrite. Details follow.

Detailed Notes

Abstract

  • The tense is mixed.

    Best to pick past tense.

  • "Canonically, each of these tasks"

    which tasks? referring to the basic steps?

  • "However with the recent advent of flmake"

    oddly worded

  • "fully reproducible way"

    should define reproducibility in this context before using it

  • "to achieve such reproducibility a number of developments and abstractions were needed, some only enabled by Python"

    there is a lot wrong with this sentence. 'such reproducibility'... you didn't explain what reproducibility is, nor did you demonstrate such reproducibility. 'some only enabled by Python'... it dangles and is bad English.

  • "These methods were widely"

    Which methods?

  • "The process of writing flmake opens many questions"

    It wasn't likely the process. Maybe better "Writing flmake opened"

Introduction

  • "in a repeatable way [FLMAKE]"

    this is not defined, but sounds like 'automated'.

  • "none of the prior attempts have placed reproducibility as their primary concern"

    again, needs definition to have meaning here.

  • "This is in part because"

    What is 'This'?

  • "setup metadata required alterations to the build system"

    Yes, and? Why should I care? What's the big deal?

  • "The development of flmake... typically under its own version control"

    Because the build system works how? git? svn? Needs details.

  • "For each of the important tasks... stored directly in the description"

    Unclear if this is describing the 'old' way of doing things or the 'new' way.

  • "it fundamentally increases the scientific merit of FLASH simulations"

    I'd agree that a job builder and launcher that (1) captures metadata and parameters in a way that all information pertaining to executing a FLASH job is logged and fully available (the notebook concept), and (2) automates the workflow for FLASH simulations, is a huge benefit, and will likely increase the quality and reproducibility of work. This work is not only a nice feature, but possibly a significant advance for FLASH. The abstract and introduction do not clearly present it as such, and that is a major detriment to the paper. If I were not reviewing the article, I would have given up reading it before completing the introduction, and probably at the abstract.

  • "The methods described herein... the same reproducibility strategy... Thus flmake shows that reproducibility... command line utilites"

    What is this saying? What strategy? The lack of detail in this section makes it very confusing to the reader. Again, too much elliptical language, where driving the point home is needed.

Source & Project Paths Searching

  • "classic Sedov problem"

    Cite?

Dynamic Run Control

  • "update the flash.par file"

    What is flash.par?

Example Workflow

  • "Oops, it died... clean 1"

    This doesn't correspond to the text, 'create and run the simulation'. Text should explain what is happening in the example, if 'in-code' documentation cannot be sufficient.

Why Reproducibility is Important

  • "True to its part of speech"

    What does that mean? What does that refer to? Poor grammar.

  • "However, most scientists choose to not utilize these technologies. This is akin to a chemist not keeping a lab notebook."

    Excellent point. Poor grammar.

  • "this is in fact no greater than what is currently expected from scientists with regard to Statistics"

    Another good point, however possibly a counter-example to your argument. Misuse of statistics is also a huge issue in reproducibility, and this reviewer would argue that a majority of scientific papers in the last 50 years contain misused and/or incorrect statistics, and thus their conclusions may also be suspect.

Command Time Machine

  • "Modules inside of... another in a manner relevant to reproducibility."

    Could use an explicit example to clarify.

Conclusions and Future Work

  • "no previous system included a mechanism to non-destructively execute previous command incarnations similar to flmake reproduce"

    For FLASH or in general?

  • "software-in-science project"

    "project" should be plural ("projects")

@mmckerns changed the title from "(Reviewed) Total Recall: flake and the Quest for Reproducibility" to "(Reviewed) Total Recall: flmake and the Quest for Reproducibility" on Jun 21, 2014
@ahmadia
Member

ahmadia commented Jun 21, 2014

cc @scopatz
