-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve add.vertex.attribute.first.activity #135
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for this neat implementation for the 'first.activity' attribute, @klaraschlueter. Nice work! 👍
Except for the comments on the code, please work on the following commits:
- Commit 59172de: Please remove this commit by squashing it with the next one (733fa79) – while using the second commit's commit message.
- Commit 0081a62: Please remove the 's' from all verbs.
- Commit 8f342c3: Remove the forward references to the upcoming commits, just use backward references. You can write: "Tests for the other two cases will be added in upcoming commits."
- Commit 3f8ef22: analogously to 8f342c3.
util-networks-covariates.R
Outdated
#' One of \code{mails}, \code{commits}, and \code{issues}. [default: "mails"] | ||
#' @param name The attribute name to add [default: "first.activity"] | ||
#' @param activity.types The kinds of activity to use as basis. | ||
#' One ore more of \code{mails}, \code{commits} and \code{issues}. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default value is missing: [default: c("mails", "commits", "issues")]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and or
instead of ore
util-networks-covariates.R
Outdated
#' @param default.value The default value to add if a vertex has no matching value [default: NA] | ||
#' @param default.value The default value to add if a vertex has no matching value [default: NA]. | ||
#' @param compute.over.all Flag indicating that one the first activity over all given | ||
#' \code{activity.types} is of interest (instead of one value per type) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[default: FALSE]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition, there is something wrong in the sentence structure: "Flag indicating that one the first activity..." Do you mean only
instead of one
?
util-networks-covariates.R
Outdated
#' A correctly formatted dataframe can easily be created by the function \code{compute.first.activities.dataframe}. | ||
#' | ||
#' @return A list containing for each person in the given dataframe a list containing the first activity of all activity types in the given dataframe. | ||
list.over.all = function(first.activity.dataframe) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add some more inline documentation for this function.
util-networks-covariates.R
Outdated
#' A correctly formatted dataframe can easily be created by the functione \code{compute.first.activities.dataframe}. | ||
#' | ||
#' @return A list containing for each person in the given datafram a list containing for each activity type in the given dataframe the first activity. | ||
list.single.types = function(first.activity.dataframe) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add some more inline documentation for this function.
util-networks-covariates.R
Outdated
#' Helper function for first activity: Converts the given dataframe in a list containing the first activity per activity type and person. | ||
#' | ||
#' @param first.activity.dataframe A dataframe with the first activity data. The rows are persons, the coloums are activity type functions. | ||
#' A correctly formatted dataframe can easily be created by the functione \code{compute.first.activities.dataframe}. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
*function
util-networks-covariates.R
Outdated
name = "first.activity", | ||
aggregation.level = c("range", "cumulative", "all.ranges", | ||
"project.cumulative", "project.all.ranges", | ||
"complete"), | ||
default.value = NA) { | ||
default.value = NA, | ||
compute.over.all = FALSE) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to rename this parameter to something like single.value
or compute.overall.minimum
. @bockthom, what's your opinion on that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dislike single.value
as it does not really state what it is about.
compute.overall.minimum
sounds better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think talking about a minimum is appropriate here. The function encapsulates the computation of the first activity. There is no need to reveal to the user, that this is done by calculating the minimum of all his activities. If you want me to append the "what-are-we-computing", then I'd like to say "compute.overall.first.activity".
But in general, by appending something to the parameter name, I think the question we should answer is "over all what are we computing?", and not "what are we computing?", as the answer to the second question is the semantics of the whole function. Hence, I suggest "compute.overall.activity.types".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, @klaraschlueter. In hindsight, "minimum" is the wrong suffix. 🤦♂️ Thanks for pointing this out.
I really like your first suggestion: compute.overall.first.activity
. Maybe, even the abbreviated version would suffice: compute.overall.first
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, @klaraschlueter. If we really want to be precise here, then we should describe what the "over all" is related to. Without explanation, "over all" could mean "over all developers" or "over all aggregation levels" or "over all ranges", ... - So, when willing to be more precise, then we should state that the "over all" refers to the activity types and nothing else. Hence, the most precise name would be compute.first.activity.over.all.activity.types
. However, that is way to long and contains information which is already present in the function name.
What about something like take.first.of.all.activity.types
- sounds still a little bit strange, but is not too long and contains all necessary information. Are there better ideas?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like "take.first.of.all.activity.types"! Except, I think replacing "of" by "over" clarifies, that we are not computing a first activity of each type, but one over all. Would "take.first.over.all.activity.types" be okay?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think that all information needs to be incorporated in the parameter name, that is the reason why we have documentation, right? However, overall, I am fine with @klaraschlueter's last suggestion.
util-networks-covariates.R
Outdated
compute.attr = function(range, range.data, net) { | ||
df = get.first.activity.data(activity.types, range.data) | ||
if(compute.over.all) { | ||
return(list.over.all(df)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We definitely need to rename this function and alsolist.single.types
below: I propose to prefix both functions with get.first.activity.data.
to show the relation to the function get.first.activity.data
.
@bockthom, what are your thoughts on that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For clarification: We are talking about the function list.over.all
, not the inner function here ;)
Regarding the list.single.types
and list.over.all
: I agree on renaming those functions by adding the proposed prefix.
@clhunsen We should think about adding an additional sentence to the README.md file. Currently we only state:
To add further vertex attributes [...] please see the file `util-networks-covariates.R` for the set of corresponding functions to call
(Btw, there are two typos in the sentence in README.md as the a in "please" is missing and there is a wrong n in "covariates". We should fix that anyway).
I suggest to add information which functions are designated to be used by the end user - there are so many helper functions now, that it is not that easy to find out which functions are designated for the end user. What about the following:
To add further vertex attributes – which can only be done after constructing a network –, please see the functions `add.vertex.attribute.*` in the file `util-networks-covariates.R` for the set of corresponding functions to call.
Do you have other ideas how to clarify which of those functions are designated for the end user?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For clarification: We are talking about the function list.over.all, not the inner function here ;)
Yeah, that's the one. 😉😄
Regarding the list.single.types and list.over.all: I agree on renaming those functions by adding the proposed prefix.
👍
@clhunsen We should think about adding an additional sentence to the README.md file. Currently we only state:
To add further vertex attributes [...] please see the file
util-networks-covariates.R
for the set of corresponding functions to call
(Btw, there are two typos in the sentence in README.md as the a in "please" is missing and there is a wrong n in "covariates". We should fix that anyway).
I will do that in my upcoming PR. Your suggestion is fine. 👍
util-networks-covariates.R
Outdated
#' | ||
#' @return A data frame with rows named with persons and colums named with activity types, containing the time of the corresponding | ||
#' first activity as POSIXct. | ||
get.first.activity.data = function(activity.types = c("commits", "mails", "issues"), range.data) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please switch the two parameters in their order. Optional parameters should be always at the end of the parameter list.
util-networks-covariates.R
Outdated
|
||
#' Helper function for first activity: Converts the given dataframe in a list containing the first activity per activity type and person. | ||
#' | ||
#' @param first.activity.dataframe A dataframe with the first activity data. The rows are persons, the coloums are activity type functions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
*columns
util-networks-covariates.R
Outdated
#' @param first.activity.dataframe A dataframe with the first activity data. The rows are persons, the coloums are activity type functions. | ||
#' A correctly formatted dataframe can easily be created by the functione \code{compute.first.activities.dataframe}. | ||
#' | ||
#' @return A list containing for each person in the given datafram a list containing for each activity type in the given dataframe the first activity. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
*dataframe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for improving the computation of the first activity. Looks really good - and also looks like a lot of work, more than I would have expected to fix this vertex attribute.
However, one remark regarding your documentation of the functions:
Could you please move the default value to the end of the line. That is, even after the full stop?
It would be much easier to read if the default value is not part of the sentence but an individual part of the documentation. So, for example, I would expect the following format:
#' @param param.name The description of the param. [default: value]
I am sorry for being that picky.
util-networks-covariates.R
Outdated
) | ||
compute.attr = function(range, range.data, net) { | ||
df = get.first.activity.data(activity.types, range.data) | ||
if(compute.over.all) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here a space is missing after if
67a9925
to
8998645
Compare
Contains only basic computation. Tests aren't adapted. The function add.vertex.attribute.first.activity is able to handle multiple data sources now. The result is stored as a list containing all minimums, named with the corresponding data source. Signed-off-by: Klara Schlueter <[email protected]>
Concerns method add.vertex.attribute.first.activity. If no information is found for some person and data source, than the corresponding element of the list added as vertex atttribute is set to NA (instead of not being created). Signed-off-by: Klara Schlueter <[email protected]>
A flag of the function add.vertex.attribute.first.activity now indicates, if the first acticity should be computed and stored per single data source or "globally" over all given data sources. Signed-off-by: Klara Schlueter <[email protected]>
Improve naming and add helper methods. Signed-off-by: Klara Schlueter <[email protected]>
One of three cases is tested: given multiple data sources, compute the first activity per person and datasource. Tests for the other two cases are added in commit will be added in upcomming commits. Signed-off-by: Klara Schlueter <[email protected]>
Tests one of three cases: compute the first activity per person over all given data sources (one value per person). Tests for the other two cases are added in commit 92241506265c5d262946d71add5ed2d680217599 and one future commit. Signed-off-by: Klara Schlueter <[email protected]>
Signed-off-by: Klara Schlueter <[email protected]>
Tests one of three cases: given one data source, compute the first activity per person per data source. Tests for the other two cases were added in commit 92241506265c5d262946d71add5ed2d680217599 and commit d7ebb79dbb4a5b2dd8102255f1ebc7bea4513029. Signed-off-by: Klara Schlueter <[email protected]>
Rename function, rectify documentation structure, delete browser statement. Signed-off-by: Klara Schlueter <[email protected]>
Signed-off-by: Klara Schlueter <[email protected]>
Signed-off-by: Klara Schlueter <[email protected]>
Signed-off-by: Klara Schlueter <[email protected]>
1abdaee
to
8bc3236
Compare
Thank you very much for your work, @klaraschlueter! 👍 Sorry that this took so long. |
As pointed out in PR se-sic#135, the instructions on adding vertex attributes to constructed networks get more and more confusing due to the increasing number of helper functions in the file `util-networks-covariates.R`. To be more precise, the functions to add vertex attributes are now described with the pattern "add.vertex.attribute.*" in the README file. Additionally, some typos are fixed. Props to @bockthom for pointing this out. Signed-off-by: Claus Hunsen <[email protected]>
Addressing issue #92 , the function is now able to