Submission: phonfieldwork #385

Closed
13 of 31 tasks
agricolamz opened this issue Jun 20, 2020 · 39 comments

@agricolamz

agricolamz commented Jun 20, 2020

Submitting Author: George Moroz (@agricolamz)
Repository: https://github.com/agricolamz/phonfieldwork
Version submitted: 0.0.8
Editor: @melvidoni
Reviewer 1: @jonkeane
Reviewer 2: @nikopartanen
Archive: TBD
Version accepted: 2020-10-20


  • Paste the full DESCRIPTION file inside a code block below:
Package: phonfieldwork
Type: Package
Title: Linguistic Phonetic Fieldwork Tools
Version: 0.0.8
Depends: R (>= 3.5.0)
Imports:
    tuneR,
    phonTools,
    grDevices,
    utils,
    graphics,
    rmarkdown,
    xml2
Authors@R: c(person("George", "Moroz",
             role = c("aut", "cre"),
             email = "[email protected]",
             comment = c(ORCID="0000-0003-1990-6083")))
Description: There are a lot of different typical tasks that have to be solved during phonetic research and experiments. This includes creating a presentation that will contain all stimuli, renaming and concatenating multiple sound files recorded during a session, automatic annotation in 'Praat' TextGrids (this is one of the sound annotation standards provided by 'Praat' software, see Boersma & Weenink 2018 <http://www.fon.hum.uva.nl/praat/>), creating an html table with annotations and spectrograms, and converting multiple formats ('Praat' TextGrid, 'ELAN', 'EXMARaLDA', 'Audacity', subtitles '.srt', and 'FLEx' flextext). All of these tasks can be solved by a mixture of different tools (any programming language has programs for automatic renaming, and Praat contains scripts for concatenating and renaming files, etc.). 'phonfieldwork' provides a functionality that will make it easier to solve those tasks independently of any additional tools. You can also compare the functionality with other packages: 'rPraat' <https://CRAN.R-project.org/package=rPraat>, 'textgRid' <https://CRAN.R-project.org/package=textgRid>.
License: GPL (>= 2)
URL: https://CRAN.R-project.org/package=phonfieldwork, https://agricolamz.github.io/phonfieldwork/
BugReports: https://github.com/agricolamz/phonfieldwork/issues
Encoding: UTF-8
LazyData: true
RoxygenNote: 7.1.0
VignetteBuilder: knitr
Suggests: 
    knitr,
    testthat,
    covr,
    dplyr,
    tidyr,
    DT,
    lingtypology
Language: en-US

Scope

  • Please indicate which category or categories from our package fit policies this package falls under: (Please check an appropriate box below. If you are unsure, we suggest you make a pre-submission inquiry.):

    • data retrieval
    • data extraction
    • data munging
    • data deposition
    • workflow automation
    • version control
    • citation management and bibliometrics
    • scientific software wrappers
    • field and lab reproducibility tools
    • database software bindings
    • geospatial data
    • text analysis
  • Explain how and why the package falls under these categories (briefly, 1-2 sentences):

phonfieldwork is a tool that helps researchers create a sound annotation viewer from multiple sound files. I believe that a sound annotation viewer helps make phonetic research more reproducible, since it makes sound data widely available. It also reduces the time annotators spend on annotation in Praat, ELAN and EXMARaLDA, so it partially replaces their functionality.

  • Who is the target audience and what are scientific applications of this package?
Linguists and phoneticians; probably, in the future, specialists in bioacoustics.

  • Are there other R packages that accomplish the same thing? If so, how does yours differ or meet our criteria for best-in-category?
There are the packages 'rPraat' and 'textgRid' <https://CRAN.R-project.org/package=textgRid> that overlap in functionality with my package, but my package has wider coverage of tasks and was created specifically for the workflow described in the docs (https://agricolamz.github.io/phonfieldwork/).

  • If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.

Technical checks

Confirm each of the following by checking the box.

This package:

Publication options

  • Do you intend for this package to go on CRAN?
  • Do you intend for this package to go on Bioconductor?
  • Do you wish to automatically submit to the Journal of Open Source Software? If so:
JOSS Options
  • The package has an obvious research application according to JOSS's definition.
    • The package contains a paper.md matching JOSS's requirements with a high-level description in the package root or in inst/.
    • The package is deposited in a long-term repository with the DOI:
    • (Do not submit your package separately to JOSS)
MEE Options
  • The package is novel and will be of interest to the broad readership of the journal.
  • The manuscript describing the package is no longer than 3000 words.
  • You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see MEE's Policy on Publishing Code)
  • (Scope: Do consider MEE's Aims and Scope for your manuscript. We make no guarantee that your manuscript will be within MEE scope.)
  • (Although not required, we strongly recommend having a full manuscript prepared when you submit here.)
  • (Please do not submit your package separately to Methods in Ecology and Evolution)

Code of conduct

@annakrystalli
Contributor

Hello @agricolamz and many thanks for your submission.

We are discussing whether the package is in scope and need a bit more information.

  • If a package has overlapping functionality with other packages, we require it to demonstrate in the documentation how it is best in class. Could you add a more detailed comparison to the packages you mention in the README so we can evaluate?
  • The package is to be used with data from human subjects and also mentions the publication of data. rOpenSci have specific policies on Ethics, Data Privacy and Human Subjects Research so we would expect care to be incorporated into package workflows to protect personal data and adhere to what subjects have consented to. Could you elaborate on any steps the package takes to ensure it conforms to these policies? A separate vignette might at some point be useful but a section in the README addressing these issues will be fine for now.

@agricolamz
Author

agricolamz commented Jun 24, 2020

Hello @annakrystalli!

Thank you for your response.

other packages

In the README file I've mentioned rPraat and textgRid. The first one provides access to the functionality of Praat, one particular program for phonetic/phonological research, and its annotation system (TextGrid). The second one is limited to operations with Praat's annotation system (TextGrid).

My package has another perspective. The main goal of my package is to provide different tools for phonetic/phonological researchers. To do this, I provide several functions that are not connected to Praat but are really important for the process of phonetic/phonological research:

  • stimuli presentation creation (could be done with any programming language and probably even without one)
  • soundfile renaming and automatic merging (could be done with any programming language)
  • automatic annotation (connected to Praat in the sense that the result is a Praat TextGrid)
  • importing data from other (linguistic) annotation programs (ELAN (see the FRelan package), EXMARaLDA, Audacity, subtitles .srt, FieldWorks)
  • cutting sounds according to annotation (could be done with Praat and probably some other software)
  • creating a sound viewer that simplifies the research process and could become a standard for data publication (it is not rocket science, actually, but I haven't seen anybody doing it)
  • I have plans to enrich my package with similar video functionality in order to help sign language researchers

Summing up: my package is focused on making the research process straightforward, while rPraat and textgRid are focused on replicating the functionality of Praat in R.

All cutting and merging of sounds is based on tuneR, but I think there are more packages with the same functionality.

It is also worth mentioning that my package is not the only one that creates spectrograms; (Sueur 2018) lists other options: signal, tuneR, seewave, phonTools, monitoR, warbleR and soundgen.

A similar idea about the representation of gathered data (without sound) can be found in the LingView package.

ethics policy

Honestly, I hadn't thought about it. That doesn't mean that I don't care. It does mean that I need help on this topic, and whatever path this submission takes, I will appreciate any help and ideas about it. I will try to list here what I know about the topic.

Simple linguistic research or documentation looks like this: a researcher records some audio/video files of speech events and then extracts some linguistic features from them. Sometimes it is possible to conduct research in big cities with a mostly literate population; then consent is not a problem. But there is also a huge amount of work performed in the "field", where researchers go to small settlements with a mostly rural population. In most cases it is possible to receive consent, but it is hard to explain what it means.

In some countries there is an ethics commission that decides whether it is possible to conduct a study, but in some countries there is no policy on this question at all (e. g. in my own country, Russia).

Since my package makes it easier to publish multiple speech samples, that could be a problem: people could be identified (especially if they are from a small community). What can I do? I can add notes to my documentation and to the create_viewer() function about what kind of consent should be gathered in order to satisfy research standards. That is easy. I would be happy if somebody helped me, but I think I can gather this from other linguists.

I think that there are several things that a participant should agree on:

  1. conduct scientific research on gathered data;
  2. publication of gathered data;

If a participant agrees only to the first one, it would be useful to create an access restriction (e. g. password protection) on the generated html viewer. This would limit users to academic circles. Which rmarkdown password-protection strategy is better for this purpose? I have in mind the encryptedRmd package, but for now it doesn't work with my viewer result (I opened an issue here; maybe I will need to invent a workaround).

Another thing that could be done is modulating the speech signal in order to reduce the possibility of speaker identification.

I hope I have explained my position and my ideas, and I hope that the rOpenSci community can help to elaborate on this problem.

@elinw

elinw commented Jul 6, 2020

I think this is an interesting question.

Essentially this is a tool. Could the tool be used by someone who has not adequately protected their human participants from being identified or who has not obtained voluntary informed consent? Yes, however I don't think there is anything that can reasonably be done about that by the tool maker because that would really require (from my understanding) changes to the data collection process. Complicating this is the issue that while US-based researchers at federally funded institutions are required to comply with the Common Rule as well as the Belmont Report principles it was derived from, researchers in other places are subject to their local regulations. And non-researchers have no such requirements at all.

The Common Rule and the Belmont Report do not create bright lines, they talk about balancing competing principles: beneficence, justice and respect for persons are sometimes not in agreement. The harm must be weighed against the benefit. In the case of this kind of linguistic data, as I understand it, the potential harm is very much going to be context-dependent. First there is the issue of participants being identifiable, but, second, there could also be larger political and social issues that we have not thought about (for example when a language or culture is being suppressed). This is why respect for persons requires that all of the potential risks be explained to potential participants before data collection.

Overall, I do think that having some kind of notice about the need to obtain voluntary informed consent will at least foreground the issue for researchers. There could be a link to a larger document that describes the range of possible issues in more depth.

@agricolamz
Author

agricolamz commented Jul 6, 2020

I totally agree, but it means that the same kind of note should be put on every package that works with sound, pictures and video. And for now our concern is only about humans, but we could think about it on a greater scale: imagine that I recorded and published audio of a rhinoceros' mating signal that would make it easier for poachers to lure those endangered creatures... This note would also need an anti-piracy clause, in order to prevent publishing pieces of music. Even further: rmarkdown would need a note reminding users that it is inappropriate to use hate speech, and probably include some rules from the most popular social media.

@annakrystalli
Contributor

Thank you for your feedback @elinw. And thank you for your patience and response @agricolamz. As we encounter new applications involving human subject data, it might take a little longer and input from domain experts like @elinw might be required to help ensure our policy is fit for purpose.

So I'll summarise where I think we are at the moment:

Overall I think just some more detailed documentation regarding data protection for privacy and the handling of data according to informed consent will be enough to start the review process. In particular, drawing attention to what @agricolamz highlighted:

I think that there are several things that participant should agree on:

  • conduct scientific research on gathered data;
  • publication of gathered data;

I think you also made some good suggestions regarding encryption. Other, simpler options could be warnings, asking for confirmation before performing an action (e.g. publishing data), adding particular files to .gitignore, or even scanning data for potentially sensitive information (e.g. names).
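For illustration, asking for confirmation could be as simple as the sketch below (confirm_publish() is a hypothetical helper, not an existing phonfieldwork function):

confirm_publish <- function() {
  # Only prompt in interactive sessions; scripts stay non-blocking.
  if (interactive()) {
    ans <- readline("This will publish audio of human subjects. Continue? [y/N] ")
    if (!tolower(trimws(ans)) %in% c("y", "yes")) {
      stop("Publishing cancelled by user.", call. = FALSE)
    }
  }
  invisible(TRUE)
}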

I think a full review seems a better way to assess what strategy would be more useful and where throughout a typical workflow. So I suggest we ask reviewers to consider these points during review?

So, once some more documentation is added to the README on:

  • A short demo of package functionality (ideally the README should contain a short demo so potential users can skim functionality in the repo, then go to the longer vignette/tutorial for more details)
  • A summary of how the package compares to other packages and why it is best in class
  • Information and guidance on data privacy and informed consent (it's fine to just summarise the main points and link to more detailed information)

I am happy to pass it on to the handling editor.

@agricolamz
Author

agricolamz commented Jul 10, 2020

@annakrystalli, @elinw,

I've added a message to the .html-creating function that links to the vignette I've created in order to emphasize ethical issues. It would be nice if you could take a look.

Since my package is actually a bunch of small functions, I don't know how to make a small demo of it, but there is a link to the whole tutorial.

I also added to README a small summary and links to other packages.

@elinw

elinw commented Jul 10, 2020

Yes I think that the vignette is very good.

@annakrystalli
Contributor

Thanks @agricolamz.

@melvidoni will be your handling editor. It would be good to point reviewers to the above discussion to consider while they are reviewing.

@melvidoni
Contributor

melvidoni commented Jul 13, 2020

Editor checks:

  • Fit: The package meets criteria for fit and overlap
  • Automated tests: Package has a testing suite and is tested via Travis-CI or another CI service.
  • License: The package has a CRAN or OSI accepted license
  • Repository: The repository link resolves correctly

Editor comments

Hello @agricolamz thanks for submitting! I'm Dr Melina Vidoni, and I'll be your handling editor. Here are some basic checks for your package:

> spelling::spell_check_package()
DESCRIPTION does not contain 'Language' field. Defaulting to 'en-US'

And the output of goodpractice::gp():

── GP phonfieldwork ───────────────────────────────────────

It is good practice to

  ✖ write unit tests for all functions,
    and all package code in general. 0% of code
    lines are covered by test cases.

    R/add_leading_symbols.R:17:NA
    R/add_leading_symbols.R:18:NA
    R/annotate_textgrid.R:34:NA
    R/annotate_textgrid.R:35:NA
    R/annotate_textgrid.R:37:NA
    ... and 1105 more lines

  ✖ use '<-' for assignment instead of
    '='. '<-' is the standard, and R users and
    developers are used it and it is easier to read
    your code for them if you use '<-'.

    R\draw_spectrogram.R:73:5
    R\draw_spectrogram.R:74:16
    R\draw_spectrogram.R:75:31
    R\draw_spectrogram.R:76:32
    R\draw_spectrogram.R:78:11
    ... and 27 more lines

  ✖ avoid long code lines, it is bad for
    readability. Also, many people prefer editor
    windows that are about 80 characters wide. Try
    make your lines shorter than 80 characters

    R\add_leading_symbols.R:3:1
    R\annotate_textgrid.R:3:1
    R\annotate_textgrid.R:9:1
    R\annotate_textgrid.R:15:1
    R\annotate_textgrid.R:19:1
    ... and 131 more lines

  ✖ avoid sapply(), it is not type safe.
    It might return a vector, or a list, depending
    on the input data. Consider using vapply()
    instead.

    R\draw_spectrogram.R:93:11

  ✖ avoid the library() and require()
    functions, they change the global search path.
    If you need to use other packages, import them.
    If you need to load them explicitly, then
    consider loadNamespace() instead, or as a last
    resort, declare them as 'Depends' dependencies.

    R\create_viewer.R:113:1

  ✖ avoid 1:length(...), 1:nrow(...),
    1:ncol(...), 1:NROW(...) and 1:NCOL(...)
    expressions. They are error prone and result 1:0
    if the expression on the right hand side is
    zero. Use seq_len() or seq_along() instead.

    R\add_leading_symbols.R:18:40
    R\create_subannotation.R:46:10
    R\create_subannotation.R:54:23
    R\df_to_tier.R:52:31
    R\df_to_tier.R:61:31
    ... and 7 more lines

  ✖ fix this R CMD check ERROR: Running
    examples in 'phonfieldwork-Ex.R' failed The
    error most likely occurred in: > ### Name:
    srt_to_df > ### Title: Subtitles .srt file to
    dataframe > ### Aliases: srt_to_df > > ### **
    Examples > > srt_to_df(system.file("extdata",
    "test.srt", package = "phonfieldwork")) Error in
    strsplit(result[, "times"], " --> ") :
    non-character argument Calls: srt_to_df -> cbind
    -> do.call -> strsplit Execution halted
  ✖ fix this R CMD check WARNING: LaTeX
    errors when creating PDF version. This typically
    indicates Rd problems.
  ✖ fix this R CMD check ERROR: Re-running
    with no redirection of stdout/stderr. Hmm ...
    looks like a package Error in texi2dvi(file =
    file, pdf = TRUE, clean = clean, quiet = quiet,
    : pdflatex is not available Error in
    texi2dvi(file = file, pdf = TRUE, clean = clean,
    quiet = quiet, : pdflatex is not available Error
    in running tools::texi2pdf() You may want to
    clean up by 'rm -Rf
    C:/Users/e95207/AppData/Local/Temp/RtmpCIt1xR/Rd2pdfcf4c23af23e0'

Can you please fix these and advise when it is done? I will find reviewers after that.


Reviewers: @jonkeane and @nikopartanen
Due date: August, Friday 28th

@agricolamz
Author

Dear @melvidoni,

I have added tests, so there is now 82% coverage.

I decided to keep lines longer than 80 characters when they contain links (here is an example).

There is also one false alarm about the library() function: this command is inside a character string, so it is not an actual call.

@melvidoni
Contributor

Hello @agricolamz, thanks for the revision. It is good now.

The first reviewer will be @jonkeane. I am searching for a second reviewer now. The review deadline will be updated once I have both reviewers confirmed.

@agricolamz
Author

Wow, a Sign Language specialist! @jonkeane, I'm going to extend my package to work with video for sign languages one day.

@jonkeane

Hello 👋 I've already started (a very tiny bit) looking around in phonfieldwork, but I am excited to review it! I haven't done sign language research (or really linguistics in general) in a few years now, but when I saw this package in the issues it piqued my curiosity.

If you are thinking about video processing, you might want to check out signglossR, which might already have some of that functionality — I'm sure there's space for two packages, and the technical needs for sign vs. spoken language phonetics/phonology are different enough that it might make sense to keep them separate as well.

Anyway, I'm looking forward to what you've done already 😃

@agricolamz
Author

agricolamz commented Jul 21, 2020

Hello! @jonkeane, I knew about this package, but it has developed since the last time I saw it!

@melvidoni, may I propose my own reviewers? I know it is weird since I know them personally, but in case it is not weird:

@melvidoni
Contributor

Hello @agricolamz, thanks for the suggestions. I had already contacted another potential reviewer. I'll wait for their response, and if it is negative, I'll contact your suggested reviewers. Thank you!

@melvidoni
Contributor

Hello @agricolamz. I am still searching for reviewers. I will provide a review deadline once the second reviewer is found.

@melvidoni
Contributor

Thank you @jonkeane and @nikopartanen for agreeing to review the package.

The review due date is August, Friday 28th

@jonkeane

Package Review

  • Briefly describe any working relationship you have (had) with the package authors.
  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need clearly stating problems the software is designed to solve and its target audience in README
  • Installation instructions: for the development version of package and any non-standard dependencies in README

I needed to install pandoc-citeproc in order to get started, this should probably be listed in the SystemRequirements field of the DESCRIPTION + it would be nice to include it in the readme.

  • Vignette(s) demonstrating major functionality that runs successfully locally
  • [-] Function Documentation: for all exported functions in R help
    Yes, but see my note below: most of the entries could use some more expansion / description of what the functions do.
  • Examples for all exported functions in R Help that run successfully locally
  • Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).
For packages co-submitting to JOSS

The package contains a paper.md matching JOSS's requirements with:

  • A short summary describing the high-level functionality of the software
  • Authors: A list of authors with their affiliations
  • A statement of need clearly stating problems the software is designed to solve and its target audience.
  • References: with DOIs for all those that have one (e.g. papers, datasets, software).

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software been confirmed.
  • Performance: Any performance claims of the software been confirmed.
  • Automated tests: Unit tests cover essential functions of the package
    and a reasonable range of inputs and conditions. All tests pass on the local machine.
  • Packaging guidelines: The package conforms to the rOpenSci packaging guidelines

Final approval (post-review)

  • The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 10-12

  • Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.

Review Comments

Overall this was really interesting to review and I think {phonfieldwork} shows a lot of promise (and already goes a long way!) to help make some of the more repetitive tasks involved with phonetic and phonological work much easier and more reliable. I threw a few ELAN files that I had from my dissertation research at it and was impressed that it handled them easily. I even managed to read in a folder of ELAN files that ultimately had around 166k annotations in ~30 seconds with eaf_to_df() (and that was even over network storage, so some of that time is shuffling the files from a server!).

I have a few larger/more sweeping comments and ideas for changes that are more architectural or more broad and then at the bottom I went through the package function by function and have specific comments and suggestions about the code. Small note: I did some of this reviewing in two batches somewhat far apart and there were a few pushes to the github repo between the two, so if the line numbers don't match up or make sense, let me know and I'll look at them again and make sure I have the most up to date ones.

Large(r) scale comments

Think about making classes+methods for some of the analysis types that get read in

There are a number of places where you detect if an object is a TextGrid (typically by checking that the second element has "TextGrid" in it, examples: extract_intervals(), tier_to_df(), create_subannotation(), annotate_textgrid(), df_to_tier(), textgrid_to_df(), set_textgrid_names()). I think it might be beneficial to make a class for TextGrids (S3 is probably sufficient, but there are other class systems in R as well), and then you can use inherits(object, "TextGrid") instead of grepl("TextGrid", textgrid[2]). Additionally, seeing that check in a few places, you might also want to make a helper function that either returns the TextGrid object if it is a TextGrid or reads in the TextGrid if it's a path (a sketch follows below).

You could even make an object type for each of the analysis objects that you read in, and if you did this, you could also consider doing things like making as.data.frame() or write() methods for each of these so that their interface worked similar to other areas of R.

This isn't a requirement and what you have here shows that you don't need this to have the functionality you want, but I think it would clean up some of the duplicated code in the package.
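A minimal sketch of what that helper could look like, assuming an S3 class named "TextGrid" (as_textgrid() is a hypothetical name, not package code):

as_textgrid <- function(x) {
  # Pass an existing TextGrid object straight through.
  if (inherits(x, "TextGrid")) {
    return(x)
  }
  # Otherwise treat x as a path and read the file in.
  if (is.character(x) && length(x) == 1 && file.exists(x)) {
    tg <- readLines(x)
    class(tg) <- c("TextGrid", class(tg))
    return(tg)
  }
  stop("`x` must be a TextGrid object or a path to a .TextGrid file")
}

Every grepl("TextGrid", textgrid[2]) check could then become inherits(object, "TextGrid").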

Function restructuring

In a few places you have a pattern of one function that creates some output (like an .Rmd file) that is then rendered. I wonder if it would be better to split these functions apart, so that you have one function to make the source (.Rmd, .csv, etc.) and another that lightly wraps + renders + cleans up the source. Something like:

make_special_rmd <- function(...) {
  content <- do_things(here)        # placeholder for the real work
  file_path <- tempfile(fileext = ".Rmd")
  writeLines(content, file_path)    # write the generated source to the temp file

  return(file_path)
}

render_special_rmd <- function(...) {
  file <- make_special_rmd(...)
  on.exit(unlink(file))
  return(rmarkdown::render(file))
}

This would give you the best of both worlds: being able to generate the source with a function call, and being able to render directly with one function call, without needing an argument (render = TRUE) that changes lots of little things in the function and returns very different results (source files as opposed to rendered ones). These smaller functions would also be much easier to test and develop iteratively as you build on them.

Documentation improvements

Some of the help documents could have larger descriptions describing what they do, more details about how the arguments interact, what the function might be used for, etc.

It also might be nice to have some organization on the pkgdown website. Breaking help sections into topics would add a lot to discovery and seeing how the different components of the package fit together.

String manipulation and {glue}

There's a decent amount of string manipulation that I think might benefit from using the {glue} package. I know adding a dependency might not be worth it, but I bet using {glue} would make some of the string manipulations that you do easier to read and operate with.

A few general style suggestions

In a few places you use right assignment with lapply() (e.g. lapply(...) -> variable). This seems a bit out of step with the left-assign-only style in the rest of the package, as well as with most style guides that I know of.

Though the package satisfies the linters specified in {goodpractice}, there are a number of linting rules from {lintr} that aren't followed consistently. This isn't a huge deal, but fixing them (possibly with {styler}) would make the code even more readable.

Comments on specific functions/files

add_leading_symbols() — Does this function need to be exported? I'm not totally sure it does, but maybe for use in markdown documents?

audacity_to_df() — You could replace:

source <- unlist(strsplit(normalizePath(file_name), "/"))
source[length(source)]

with basename(file_name)

concatenate_soundfiles()

  • On lines 52—60, 62—69, and 73—76 it would be good to do the typical left assign into list, sound_attributes, and problems respectively.
  • On line 81: would message() be better than print() here?
  • On lines 127 and 129: these print()s should probably be warning()s

create_glossed_document()

  • You might consider using a flextext class here (if you're using S3, you can make it so that they are also data.frames; if you're using S4 or R6 or reference classes (all of which are probably overkill here), you could make this class inherit from data.frame). This way you could test that the object is of that class and write one method that checks that it has the columns you have specified there.
  • When you're checking if packages are available, you might consider using requireNamespace("dplyr", quietly = TRUE), which is recommended by Writing R Extensions
  • For the html versus docx format, you might consider specifying the default in the function definition as output_format = c("docx", "html") and then using match.arg(output_format), which will handle the errors if the value isn't one of those (see the sketch after this list). There is also arg_match() in rlang, which does something similar with a few extra niceties
  • On line 56-57 you might want to use system.file("rmarkdown", "templates", "glossed_document", "skeleton", package = "phonfieldwork") instead of paste0(.libPaths()[1], "/phonfieldwork/rmarkdown/templates/glossed_document/skeleton/skeleton.Rmd")
  • On line 66 you might consider moving this up to just after line 53 and wrapping it in on.exit(); this will ensure that the tmp file is removed even if the markdown rendering fails. (Though I admit there might be debugging reasons you want that not to happen, so you might want to wrap that in a condition if that's the case)
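To make the match.arg() suggestion concrete, here is a minimal sketch (the function body and signature are illustrative, not the package's actual code):

create_glossed_document <- function(flextext, output_format = c("docx", "html")) {
  # match.arg() picks "docx" when no value is supplied and
  # errors informatively on anything outside the allowed set.
  output_format <- match.arg(output_format)
  # ... build the document for the validated format ...
  output_format
}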

create_image_look_up()

  • It would be nice to describe in the documentation that img_src and img_caption should be the same length if they are both specified.

create_subannotation()

  • On line 46 the phonfieldwork:: is not necessary: df <- phonfieldwork::tier_to_df(tg, tier = tier) can be df <- tier_to_df(tg, tier = tier)

create_viewer()

  • When you're checking if packages are available, you might consider using requireNamespace("dplyr", quietly = TRUE), which is recommended by Writing R Extensions
  • Here as well, I would consider using on.exit() to mark files for clean up immediately after creating them (with an option for debugging to leave them around).
  • Would it be possible to create a report where some sounds have images, but others don't? (Or even one where it's just sounds and no pictures, or just pictures and no sounds?)
  • Is it possible to have mp3 audio files in the viewer?
  • I'm not sure if this is because I'm using Safari, or something about the images, but the images never seemed to show up (or would show up only on the map in the example) when clicking on the eye. I didn't dig too much to see what was going on here, but you might want to try it out in a few different browsers / settings.

draw_sound()

  • I think it might be beneficial to break this function up into a few different functions (at a high level, the sounds_from_folder path at the beginning should probably at least be its own function). That will also make each of those preprocessing/option-passing steps a bit easier to test.

eaf_to_df()

  • It might be nice to split out the conditional section when eafs_from_folder is not NULL into its own function. That way eaf_to_df() always operates on one file, and eafs_folder_to_df() is a thin wrapper around eaf_to_df() (basically what is now in that condition, but separated out; sketched below)
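A sketch of that split, assuming the hypothetical name eafs_folder_to_df():

eafs_folder_to_df <- function(path) {
  # Find all .eaf files and delegate each one to the single-file function.
  files <- list.files(path, pattern = "\\.eaf$", full.names = TRUE)
  do.call(rbind, lapply(files, eaf_to_df))
}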

exb_to_df()

  • Same as above: I would separate the one exb-to-df functionality and then wrap it for dealing with a folder of exbs in a separate function

extract_intervals()

  • tools::file_ext() might be more reliable for extracting the file extension on line 51

get_sound_duration()

  • Same with eaf_to_df() as above: I would separate the one-file sound duration functionality and then wrap it for dealing with a folder of sounds in a separate function

get_textgrid_names()

  • If you do decide to use a TextGrid object, this should be a method of that class, get_names.TextGrid() or the like.

rename_soundfiles()

  • If I'm reading this correctly, it looks like the translations are added on to the names of the stimuli (and don't replace them, for example). It might be nice to clarify that in the documentation.

srt_to_df()

  • On lines 44, 55, and 56, I see use of integers (e.g. c(3:4)) for indexing. This is probably ok since you define the result that is being indexed, but it might be nice for readability to use names here.

textgrid_to_df()

  • Same with eaf_to_df() as above: I would separate the one textgrid to dataframe functionality and then wrap it for dealing with a folder of text grids in a separate function
  • The large anonymous function on lines 53—69 might be better off in its own function (which also might make testing it a bit easier as well!)
  • A few more comments about what each step is doing would be helpful here.
  • I had at least one textgrid that had nuls in it, such that I had to add skipNul = TRUE to the readLines() call on line 33 to make this work (the warning I got was line 50 appears to contain an embedded nul, though that resulted in all of the lines being ""). I wonder if you should pass all of the options through (maybe via ... from textgrid_to_df() to readLines()).
  • You might also want to add a check that the lines in tg aren't all "" (like with the above error) and error proactively there. Otherwise, the error happens later on line 53, where split_text doesn't exist, which makes it harder to see that this was an encoding/input problem. (See the sketch after this list.)
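Here is a minimal sketch of both ideas together (not the package's actual code):

textgrid_to_df <- function(file_name, ...) {
  # Forwarding ... means callers can pass e.g. skipNul = TRUE to readLines().
  tg <- readLines(file_name, ...)
  # Fail early and informatively if reading produced only empty lines.
  if (all(tg == "")) {
    stop("All lines of '", file_name,
         "' are empty after reading; this usually indicates an encoding problem.")
  }
  # ... parsing continues as before ...
}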

tier_to_df()

  • On line 25, I don't think the phonfieldwork:: is necessary here.

The title() function in flextext_to_df.R seems to be used only inside of draw_sound(); I wonder if it should be moved to that file or to a separate utils.R file or something.

Tests
  • You don't need library() in each test file; {testthat} will handle making sure that's available in the right environment for you. context() is also not necessary (and will be deprecated in the third edition of testthat).
  • If your test-*.R files are named the same (after the test- bit) as your *.R source files, you can run them individually with cmd+shift+i (most of yours are, but there are a few exceptions)
  • In test-flextext-to-df.R, is there any reason you're referencing a file on GitHub? Even if this file shouldn't be tested on CRAN, it's still a good idea to have the fixture file in the tests directory. Also, you shouldn't need the second skip_on_cran() or the second test_that() embedded there; I would recommend either making a second test_that() block at the same level as the first or removing the embedding entirely.
  • In test-draw-spectrogram.R I would not embed those test_that() calls, and I would recommend taking out the second skip_on_cran(), which shouldn't be necessary there.
  • You might be interested in trying the {vdiffr} package for testing visual differences between runs for the various draw_* functions.
  • In test-df-to-tier.R you use expect_true(length(tg) == 104) when you could also use expect_length(tg, 104)

More minor comments

Formatting

I think you want an additional blank line above this list for it to format as a list:
https://github.com/agricolamz/phonfieldwork/blob/master/vignettes/phonfieldwork.Rmd#L97-L98

The tutorial link in the readme links to the pkgdown website, which is circular when it is used as the index page for pkgdown.

Installation

I needed to install pandoc-citeproc before R CMD check would run. Adding this to DESCRIPTION as a system dependency and a note in the readme would be nice!

@agricolamz
Author

Thank you, @jonkeane!

@melvidoni
Contributor

Thanks @jonkeane for the review! @agricolamz you can start checking it, or you can wait for @nikopartanen to finish their review. There are still 4 days left.

@nikopartanen

nikopartanen commented Aug 28, 2020

Package Review

  • I have met George Moroz many times. I'm familiar with his work, and have often used another R package of his, lingtypology, in my own work.
  • As the reviewer I confirm that there are no conflicts of interest for me to review this work

Documentation

The package includes all the following forms of documentation:

  • A statement of need clearly stating problems the software is designed to solve and its target audience in README
  • Installation instructions: for the development version of package and any non-standard dependencies in README
  • Vignette(s) demonstrating major functionality that runs successfully locally
  • Function Documentation: for all exported functions in R help
  • Examples for all exported functions in R Help that run successfully locally
  • Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software been confirmed.
  • Performance: Any performance claims of the software been confirmed.
  • Automated tests: Unit tests cover essential functions of the package
    and a reasonable range of inputs and conditions. All tests pass on the local machine.
  • Packaging guidelines: The package conforms to the rOpenSci packaging guidelines

Final approval (post-review)

  • The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 10

  • Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.

Review Comments

phonfieldwork is a very interesting and useful R package, and a very welcome addition to the tools we have available for working with speech data. I did this review in several sittings over a few weeks, so please let me know if some of my comments relate to something already outdated. I installed the package again from GitHub before submitting my review and tried to check that my comments are still valid.

As I understand it, the package has several larger areas where it can be used, which form the pipeline also described in the documentation: 1) creating stimuli sets for data collection, 2) organising data collected with these stimuli, 3) processing and further annotating the annotation files, 4) reading the resulting datasets into R, and 5) visualizing this data.

The package is designed so that the different parts of the pipeline work well together. Many of my recommendations relate to ways the package could be made more useful in slightly different scenarios, where the data is collected differently or the user already has some materials in a different format and wants to take advantage of the package's functionality later in the pipeline.

Creating stimuli and organizing the files

This functionality is very useful and well thought out. There aren't that many tools around for this kind of task. The package documentation gives a good explanation of the intended workflow, with notes about recording practices.

I recorded some test audio with Audacity following the documentation, and ended up with files called 1.wav, 2.wav and 3.wav, containing the words tip, tap and top. I used these in most of my testing within this part and the visualizations.

One problem I encountered with the rename_soundfiles function is that there doesn't seem to be any way to keep the original recording order intact. I end up with files tap.wav, tip.wav and top.wav. However, even when I have the backup, I don't have the information stored anywhere that 1.wav is now called tip.wav. It could be useful to have an argument that would rename the files but add a number following the original order, i.e. 01_tip.wav, 02_tap.wav and 03_top.wav. Or maybe the function could also write a log file into the backup directory that lists how the renaming was done. I'm just thinking about a scenario where we realize much later that something has gone wrong, and for whatever reason need to go back to the original recordings. As far as I understand, suffix = paste0("_", 1:3) currently would not keep the original stimuli order either? I think one could often want to record the same word under a somewhat different stimulus, and maybe for analysis it would be clearer to keep the original order.

I understand from the documentation that the user creates the stimuli, shows them to a research participant, records the stimuli as individual clips, and then ends up with a list of recordings that is the length of the stimuli list, having thereby already removed accidental double takes. This is fine, and in many cases a really well-functioning workflow, and the package does provide good help in doing this.

It can be useful to take a look at SpeechRecorder by BAS, which provides similar functionality. The advantage of this software is that one can record directly within a template and use a microphone through an audio interface connected to a laptop. The main problem I see in using phonfieldwork's stimuli list is that it isn't entirely clear from the documentation how the researcher should handle repeated or missed elicitations, and when the number of elicitations is in the hundreds of items, managing this can get really complicated. This is of course a problem that documentation can solve, and there are already good recommendations related to the recordings.

Maybe it would be useful to add a clear sequential id number to the stimuli list, so the user can easily listen, for example, to stimulus 75 and check that it is the same as the 75th file in their recording directory. I understand the page number in the stimuli file probably already fills this need, and maybe it is enough. Additionally, one could also specify in rename_soundfiles a new argument missing, where the user could basically say "for whatever reason we don't have recordings for stimuli 21, 34 and 94, rename things accordingly". Otherwise the user gets the error Number of files is smaller then number of stimuli.

Processing data

Annotating TextGrid files works very well, and the functions are well thought out and form a clear pipeline. Examples in the documentation are also clear.

It seems to me that there is no way to create a TextGrid file besides the concatenate_soundfiles function. I think, however, that in many situations we have annotations in some of the formats the package can read, and one would like to work further with the package's spectrogram visualization methods. When we want to edit the annotations at this level of exactness, having them in Praat is very beneficial. So possibly it could be useful if there was a function create_textgrid that would create an empty TextGrid file that the user could populate with the functions annotate_textgrid and create_subannotation. Here I'm thinking of a situation where the user would like to jump into phonfieldwork's pipeline more in the middle, i.e. writing intermediate Praat files from segments read from another corpus.

Separating the folder-processing side of concatenate_soundfiles from working with individual files could be good. Maybe the user could also provide a list of audio files to be concatenated? It is possible this wouldn't fit the logic of the pipeline the package is designed for, but I leave the idea here just in case.

Reading data into R

Reading datasets into R works really well. I tested extensively with both ELAN and EXMARaLDA corpora; the resulting data frame is well structured, and the package works without problems with different ELAN files. I must emphasize that these functions worked better with .eaf and .exb files than anything I have seen before, so I'm very happy with them.

While testing I used EXMARaLDA files from the Nganasan Spoken Language Corpus and the Selkup Language Corpus. I also used an individual FlexText file I found here, which belongs to the corpus The Documentation of Eibela: An archive of Eibela language materials from the Bosavi region. For testing ELAN I used various files I had on my laptop; as this is a very common format, it was easier to find different examples. Using ELAN files was also entirely free from adventures.

With some EXMARaLDA files I did encounter problems; this should reproduce my error:

download.file(url = "https://corpora.uni-hamburg.de/hzsk/de/islandora/object/transcript:slc-1.0.0_BEP_1973_Itja1_flk/datastream/EXB/BEP_1973_Itja1_flk.exb", destfile = "BEP_1973_Itja1_flk.exb")

phonfieldwork::exb_to_df("BEP_1973_Itja1_flk.exb")

Which results in:

Error in data.frame(tier = i, id = seq_along(content), content = content,  : 
  arguments imply differing number of rows: 1, 0

Maybe there could be some generic error message that prints the path of the file, says that the file cannot be parsed, and then shows the underlying error message. In a typical corpus creation workflow, the files we fail to parse are usually a very good indicator of the files that need to be checked. So the fact that some files fail to parse is just typical, and not necessarily a problem in itself; especially if a file is broken, it is better to fail than to read it wrong. I assume in some corpora there are just very unusual file structures, possibly related to old conversions from format to format, so it is impossible to predict everything inside those files. But there may be some patterns that are so common that checking what is going on could improve the function.
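For example, the wrapping could look like this sketch (safe_exb_to_df() is a hypothetical helper, not package code):

safe_exb_to_df <- function(file_name) {
  tryCatch(
    phonfieldwork::exb_to_df(file_name),
    error = function(e) {
      # Report which file failed, then surface the underlying parse error.
      stop("Cannot parse file: ", file_name, "\nUnderlying error: ",
           conditionMessage(e), call. = FALSE)
    }
  )
}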

I also encountered one FlexText file that I couldn't parse, which is here. This file doesn't seem to contain the word-level annotations, so that it doesn't work is not surprising, of course. If these kinds of files are more common (I have no idea; FlexText files are surprisingly hard to find!), it could help to have a specific error or warning message for this.

I think most of the time after the user reads these files into R, they will need to filter the data a bit and probably use pivot_wider to get the content they want to work with into the right columns. This isn't strictly speaking a matter for this package, but it could be nice to have an example of such a process in a vignette, especially if the goal is to make the package beginner-friendly. Something like:


download.file("https://corpora.uni-hamburg.de/hzsk/de/islandora/object/transcript:nslc-0.2_ChKD_72_ManyTents_flks/datastream/EXB/ChKD_72_ManyTents_flks.exb", destfile = "ChKD_72_ManyTents_flks.exb")

library(dplyr)
library(tidyr)

phonfieldwork::exb_to_df(file_name = "ChKD_72_ManyTents_flks.exb") %>%
    filter(tier_category %in% c("tx", "ps", "mp", "mc", "mb", "ge", "gr")) %>%
    select(-tier, -id, -tier_name, -tier_type) %>%
    pivot_wider(names_from = tier_category, values_from = c(content))

This way we end up with a nice data frame that has tokens, POS tags and morphological analysis organized so that we have one observation (a token here; it could also be the utterance) per row. It is impossible, of course, to predict which tiers the user wants to work with, which is why simple examples that point toward further possibilities could be useful. Or just mention that pivot_wider is something to look into for further processing and analysis of the resulting data frame.

Note: These corpora are distributed under a CC BY-NC-SA license, so using some other examples in the package vignette is certainly recommendable, as that license is not compatible with the package's. I assume using these files in my review and testing is unproblematic. I originally wanted to include a screenshot of the resulting dataframe, but I excluded it in order not to show actual corpus content.

I agree with the other reviewer that reading individual files and reading all the files in a directory should be somehow separated, especially since when the user reads in whole directories there should be good control over things like whether files are found recursively, etc.

Visualizations

These work well, and as described. I'm also personally extremely happy to be able to create good-looking spectrograms in R. These are just small suggestions that could possibly improve the usability.

Adding the annotations from a data frame into a visualization made with draw_sound works really beautifully. It took me quite a lot of time to understand that it is possible to pass a dataframe into the annotations argument. The help text now says a source for annotation files (e. g. TextGrid); a mention of a data frame should also be there. This is currently too easy to miss, so it should appear prominently in the documentation as well.

As a technical note, using a tibble instead of a data.frame seems to give an error here. I think it is important that the package not make a difference between the data.frame and tbl_df classes.

Bonus points for the Raven-style annotations; that's a great addition.

Further comments

I think create_glossed_document is a very useful function, but I'm not entirely sure this package is ultimately the best place for it. It's a bit aside from the current pipeline the package provides, although it is a really important tool. One way to connect it better would be to let it accept, instead of a FlexText file, a data frame with specific columns. Then it would link more clearly to the other capabilities of the package: reading the corpus, manipulating and analyzing it in R, and then converting some specific parts to glossed documents that can be used elsewhere.

When it comes to the HTML export, it could be worth checking out the Leipzig.js JavaScript library, as the glosses are much prettier when aligned correctly. However, Word export especially is something I've heard wished for a lot, so it's nice to see it exists and works well here.

@agricolamz
Author

Thank you, @nikopartanen!

@nikopartanen

You are more than welcome, @agricolamz! It was a pleasure to dive deeper into this package, and I probably haven't even gotten that far with it yet, but today was the deadline so I wanted to respect that. Have a good weekend!

@melvidoni
Contributor

Thanks @nikopartanen for the review!
@agricolamz Please let us know when the changes are up. Write a comment explaining them, so the reviewers can review the changes.

@melvidoni
Contributor

Hello @agricolamz, are there any updates on your reply to the reviewers?

@agricolamz
Author

Only small updates so far, but I will make more by the end of this week.

@agricolamz
Author

agricolamz commented Sep 29, 2020

Dear @jonkeane and @nikopartanen, I've made a lot of changes. Thank you again for your comments.

I'm mostly finished. The only big changes left are

  • to create additional functions instead of the ..._from_folder arguments;
  • to fix all draw_... tests according to @jonkeane's comments.

I will finish it during this week.

Here are my answers to some of your comments (for most of the others, I just changed everything accordingly).

For @jonkeane

pandoc-citeproc

I needed to install pandoc-citeproc in order to get started, this should probably be listed in the SystemRequirements field of the DESCRIPTION + it would be nice to include it in the readme.

It looks like pandoc-citeproc is a dependency of rmarkdown and not of phonfieldwork, so I'm not sure it should be in the phonfieldwork DESCRIPTION, but I added it to the Introduction section.

Think about making classes

I will keep it in mind, but for the time being I created a function read_textgrid() that is more general than the previous one, since it deals with encodings (thanks to @artemklevtsov for helping me with it).

Function restructuring

I need more time to think about it.

Documentation improvements

I added sections to the pkgdown reference page.

String manipulation and {glue}

I don't want additional dependencies, but maybe I will change my opinion in the future.

srt_to_df()

On lines 44, 55, and 56, I see use of integers (e.g. c(3:4)) for indexing. This is probably ok since you define the result that is being indexed, but it might be nice for readability to use names here.

I'm not sure what kind of names could be used here, since the common

Tests

If your test-*.R files are named the same (after the test- bit) as your *.R source files, you can run them individually with cmd+shift+i (most of yours are, but there are a few exceptions)

For some reason it doesn't work for me (I checked both with _ and - signs in function names). So I found and fixed only one mismatch.

Comments on specific functions/files

create_glossed_document()

You might consider using a flextext class here (if you're using S3, you can make it so that they are also data.frames; if you're using S4 or R6 or reference classes (all of which are probably overkill here), you could make this class inherit from data.frame). This way you could test that the object is of that class and write one method that checks that it has the columns you have specified there.

I'm not such a big fan of classes, but I will keep it in mind.

create_viewer()

Would it be possible to create a report where some sounds have images, but others don't? (Or even one where it's just sounds and no pictures, or just pictures and no sounds?)

Not in the current version, but I will consider making it possible in the future.

Is it possible to have mp3 audio files in the viewer?

I'm not sure if this is because I'm using Safari, or something about the images, but the images never seemed to show up (or would show up only on the map in the example) when clicking on the eye. I didn't dig too much to see what was going on here, but you might want to try it out in a few different browsers / settings.

It is a known problem. I will fix it soon.

textgrid_to_df()

I had at least one textgrid that had nuls in it, such that I had to add skipNul = TRUE to the readLines() call on line 33 to make this work (the warning I got was line 50 appears to contain an embedded nul, though that resulted in all of the lines being ""). I wonder if you should pass all of the options through (maybe via ... from textgrid_to_df() to readLines()).

May I have a look at it? I've implemented some encoding detection since that version; maybe it works now.

You might also want to add a check that the lines in tg aren't all "" (like with the above error) and error proactively there. Otherwise, the error happens later on line 53, where split_text doesn't exist, which makes it harder to see that this was an encoding/input problem.

I'm not sure about this: since I think I solved the possible encoding problems, this is not so important. But you are right, I need to consider all the weird possibilities.

For @nikopartanen

Creating stimuli and organizing the files

I added autonumber and logging arguments to the rename_soundfiles() function; both can be switched off. The autonumber argument adds a numeric prefix (with appropriate zero padding). The logging argument creates a logging.csv file that records the correspondences between old and new names.
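
A hedged usage sketch of those arguments (the exact signature may differ; see ?rename_soundfiles):

```r
library(phonfieldwork)
rename_soundfiles(stimuli = c("ba", "da", "ga"),
                  path = "sounds/",
                  autonumber = TRUE,  # zero-padded numeric prefixes
                  logging = TRUE)     # writes logging.csv with old/new names
```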

I added SpeechRecorder to the vignette (for some reason, I can't install it on my PC). It would be nice to eventually add an R function that creates a recording script for it.

I added the missing argument -- this is a really nice thought.

Processing data

I added a function create_empty_textgrid().
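
A hedged usage sketch (argument names are my assumption; see ?create_empty_textgrid for the actual signature):

```r
create_empty_textgrid(duration = 1.5, path = "sounds/")
```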

Reading data into R

I fixed exb_to_df(), so your example is now parsed into a table.

I added an error for those flextexts that don't have word-level annotations. A full fix looks like a bigger piece of work, so I will do it in the future.
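
The check might look roughly like this (a sketch; the column name tier_type and the message are assumptions about the internals):

```r
if (!any(df$tier_type == "words", na.rm = TRUE)) {
  stop("This flextext file contains no word-level annotation.")
}
```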

I added a new vignette about manipulating tables with the tidyverse. Grammar and spelling will be corrected soon.

Visualizations

I changed the documentation a little bit, in order to state explicitly that a data.frame can be used as the annotation source.
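
A hedged usage sketch of a data.frame as the annotation source (I assume draw_sound() takes an annotation argument; check ?draw_sound for the exact names):

```r
draw_sound(file_name = "s1.wav",
           annotation = textgrid_to_df("s1.TextGrid"))
```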

As a technical note, using a tibble instead of a data.frame seems to give an error here. I think it is important that the package make no difference between the data.frame and tbl_df classes.

Thank you for spotting this! Fixed it.
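
For context, a minimal illustration of the class difference that likely caused this (stated as an assumption): single-bracket column extraction drops to a vector for a data.frame but not for a tibble.

```r
df  <- data.frame(time_start = c(0, 1))      # hypothetical column name
tbl <- tibble::tibble(time_start = c(0, 1))
df[, "time_start"]   # numeric vector
tbl[, "time_start"]  # still a one-column tibble
as.data.frame(tbl)   # one defensive fix: coerce on entry
```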

Further comments

I think create_glossed_document() is a very useful function, but I'm not entirely sure this package is ultimately the best place for it. It sits a bit apart from the current pipeline the package provides, although it is a really important tool. One way to connect it better would be to let it take a data frame with specific columns instead of a FlexText file. Then it would link more clearly to the other capabilities of the package: reading the corpus, manipulating and analyzing it in R, and then converting some specific parts to glossed documents that can be used elsewhere.

That is a really nice thought; I added it as an issue.

About Leipzig.js: it is nice, but it has its own issues, which we started discussing long ago.

@agricolamz
Author

So, in order to separate the functions for reading individual files from those reading all the files in a directory, I created a function read_from_folder <- function(path, FUN){...} that reads every file of a given format from a folder using one of the existing phonfieldwork functions, e.g. read_from_folder("test", textgrid_to_df). What do you think: should I also create separate functions like textgrid_to_df_from_folder() and eaf_to_df_from_folder()?
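
A minimal sketch matching the signature described above (my guess at the implementation, not necessarily the package's):

```r
read_from_folder <- function(path, FUN) {
  files <- list.files(path, full.names = TRUE)
  do.call(rbind, lapply(files, FUN))  # row-bind one data.frame per file
}
# usage: read_from_folder("test", textgrid_to_df)
```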

@melvidoni
Contributor

Hello @jonkeane and @nikopartanen. Please take a look at the submitted revisions, following our guide. Let me know, within about three weeks at most, whether @agricolamz needs to complete another revision.

@jonkeane

Hello again,

I’ve reviewed the changes and the responses and I’m satisfied with both.

For the encoding issues, I was using some of the files I found in http://www.ims.uni-stuttgart.de/phonetik/helps/praat-scripting/scripts.zip that gave me some of the encoding issues I mentioned.

@melvidoni
Contributor

Thanks @jonkeane. @nikopartanen, did you have time to look at the changes?

@nikopartanen

Hello! Sorry for the delay! I tested the changes quite thoroughly last week when I used the package in one research task of my own, and I'm very satisfied with the way the package works now and how the ideas for changes were addressed. Thanks for your great work, @agricolamz!

@melvidoni
Contributor

Since both reviewers approved... Approved! Thanks @agricolamz for submitting and @jonkeane @nikopartanen for your reviews!

To-dos:

  • Transfer the repo to rOpenSci's "ropensci" GitHub organization under "Settings" in your repo. I have invited you to a team that should allow you to do so. You'll be made admin once you do. Tag me here so I can give you access.
  • Fix all links to the GitHub repo to point to the repo under the ropensci organization.
  • If you already had a pkgdown website and are ok relying only on rOpenSci central docs building and branding,
    • deactivate the automatic deployment you might have set up
    • remove styling tweaks from your pkgdown config but keep that config file
    • replace the whole current pkgdown website with a redirecting page
    • replace your package docs URL with https://docs.ropensci.org/package_name
    • In addition, in your DESCRIPTION file, include the docs link in the URL field alongside the link to the GitHub repository, e.g.: URL: https://docs.ropensci.org/foobar (website) https://github.com/ropensci/foobar
  • Fix any links in badges for CI and coverage to point to the ropensci URL. We no longer transfer Appveyor projects to ropensci Appveyor account so after transfer of your repo to rOpenSci's "ropensci" GitHub organization the badge should be [![AppVeyor Build Status](https://ci.appveyor.com/api/projects/status/github/ropensci/pkgname?branch=master&svg=true)](https://ci.appveyor.com/project/individualaccount/pkgname).
  • We're starting to roll out software metadata files to all ropensci packages via the Codemeta initiative; see https://github.com/ropensci/codemetar/#codemetar for how to include it in your package. After installing the package, it should be as easy as running codemetar::write_codemeta() in the root of your package.

Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them "rev"-type contributors in the Authors@R field (with their consent). More info on this here.

Welcome aboard! We'd love to host a post about your package - either a short introduction to it with an example for a technical audience or a longer post with some narrative about its development or something you learned, and an example of its use for a broader readership. If you are interested, consult the blog guide, and tag @stefaniebutland in your reply. She will get in touch about timing and can answer any questions.

We've put together an online book with our best practices and tips; this chapter starts the third section, which is about guidance for after onboarding. Please tell us what could be improved; the corresponding repo is here.

@agricolamz
Author

agricolamz commented Oct 21, 2020

@melvidoni, could you give me admin access? I think I've done the rest (except the blog post).

@melvidoni
Contributor

You should have access now @agricolamz!

@agricolamz
Author

Dear @stefaniebutland, I would like to write a blog post about phonfieldwork. May I get a publication date?

@stefaniebutland
Member

Excellent @agricolamz. There are a couple of options for publication dates, depending on how urgent it is for you.

  1. publish Tues Jan 19; submit draft by Tues Jan 12.
  2. publish Tues Dec 22; submit draft by Mon Dec 14, with agreement to have all edits done before Fri Dec 18

Blog post drafts are reviewed by rOpenSci Community Assistant @steffilazerte. Since your lingtypology post we have written https://blogguide.ropensci.org/ with detailed content and technical guidelines.

@agricolamz
Author

@stefaniebutland, thank you! I have already written my first draft, so I'll choose the second option.

@agricolamz
Author

@stefaniebutland, thank you! I have already written the blog post, so I'd take the second option. I created a pull request.
