NEWS.md (4 additions, 0 deletions)
@@ -2,6 +2,10 @@
2
2
3
3
## Unversioned
4
4
5
+
### Added
6
+
- Add functionality to read and process commit messages in order to merge them into the commit data (see issue #180). Three values are available for the new attribute `commit.messages` in `ProjectConf`: `none`, `title`, and `messages` (PR #193, 85b1d0572c0fb9f4c062bceb1363b0398f98b85f, fdc414ade1a640f533e809a25cfe012e42b3cffa, 43e1894998e18faff3a65114fa65ee54e1d2f66e)
7
+
- Add functions `cleanup.commit.message.data` and `cleanup.synchronicity.data` to remove commit hashes that are no longer present in the commit data from the commit message data or synchronicity data (PR #193, 98e83b037ecc88d9a29e8e4ca93598a9978e85a2)
8
+
5
9
### Changed/Improved
6
10
- Add `.drone.yml` to enable running our CI pipelines on drone.io (PR #191, 1c5804b59c582cf34af6970b435add51452fbd11)
@@ -123,6 +123,7 @@ Alternatively, you can run `Rscript install.R` to install the packages.
123
123
- `parallel`: For parallelization
124
124
- `logging`: Logging
125
125
- `sqldf`: For advanced aggregation of `data.frame` objects
126
+
- `data.table`: For faster data processing
126
127
- `testthat`: For the test suite
127
128
- `patrick`: For the test suite
128
129
- `ggplot2`: For plotting of data
@@ -179,11 +180,16 @@ There are two distinguishable types of data sources that are both handled by the
179
180
* Issue data (called `"issues"` internally)
180
181
181
182
- Additional (orthogonal) data sources (augmentable to main data sources, not splittable)
183
+
* Commit messages are available through the parameter `commit.messages` in the [`ProjectConf`](#configurable-data-retrieval-related-parameters) class. Three values can be used:
184
+
1. `none` is the default value; the commit data is left unchanged.
185
+
2. `title` merges the commit message titles (i.e., the first non-whitespace line of each commit message) into the commit data. This gives the data frame an additional column `title`.
186
+
3. `messages` merges both titles and message bodies into the commit data frame. This adds two new columns, `title` and `message`.
182
187
* [PaStA](https://github.com/lfd/PaStA/) data (patch-stack analysis, see also the parameter `pasta` in the [`ProjectConf`](#configurable-data-retrieval-related-parameters) class)
183
188
* Patch-stack analysis to link patches sent to mailing lists and upstream commits
184
189
* Synchronicity information on commits (see also the parameter `synchronicity` in the [`ProjectConf`](#configurable-data-retrieval-related-parameters) class)
185
190
* Synchronous commits are commits that change a source-code artifact that has also been changed by another author within a reasonable time-window.
186
-
191
+
192
+
187
193
The important difference is that the *main data sources* are used internally to construct artifact vertices in relevant types of networks. Additionally, these data sources can be used as a basis for splitting `ProjectData` in a time-based or activity-based manner – obtaining `RangeData` instances as a result (see file `split.R` and the contained functions). Thus, `RangeData` objects contain only data of a specific period of time.
188
194
189
195
The *additional data sources* are orthogonal to the main data sources, can augment them by additional information, and, thus, are not split at any time.
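To illustrate how an additional data source such as commit messages augments the commit data, the following is a minimal base-R sketch of the three `commit.messages` values. It is not the library's implementation: the toy data, the `text` column, and the helper names `get.title.and.body` and `add.commit.messages` are assumptions made for this example only.

```r
## Hypothetical toy data; only the column names `title` and `message` come from
## the description above, everything else is made up for illustration.
commits = data.frame(
    hash = c("a1", "b2", "c3"),
    author.name = c("Alice", "Bob", "Alice"),
    stringsAsFactors = FALSE
)
raw.messages = data.frame(
    hash = c("a1", "b2"),
    text = c("\nFix parser\n\nHandle empty input.", "Add tests"),
    stringsAsFactors = FALSE
)

## Split a raw commit message into its title (first non-whitespace line)
## and its body (all remaining lines, trimmed).
get.title.and.body = function(text) {
    lines = strsplit(text, "\n", fixed = TRUE)[[1]]
    non.empty = which(nzchar(trimws(lines)))
    if (length(non.empty) == 0) {
        return(list(title = "", message = ""))
    }
    first = non.empty[1]
    body = if (first < length(lines)) lines[(first + 1):length(lines)] else character(0)
    return(list(
        title = trimws(lines[first]),
        message = trimws(paste(body, collapse = "\n"))
    ))
}

## Merge titles (and, for "messages", also bodies) into the commit data.
add.commit.messages = function(commits, raw.messages, mode = c("none", "title", "messages")) {
    mode = match.arg(mode)
    if (mode == "none") {
        return(commits)
    }
    parts = lapply(raw.messages[["text"]], get.title.and.body)
    message.data = data.frame(
        hash = raw.messages[["hash"]],
        title = vapply(parts, function(p) p[["title"]], character(1)),
        stringsAsFactors = FALSE
    )
    if (mode == "messages") {
        message.data[["message"]] = vapply(parts, function(p) p[["message"]], character(1))
    }
    ## left join: commits without a message get NA in the new columns
    return(merge(commits, message.data, by = "hash", all.x = TRUE, sort = FALSE))
}
```

Calling `add.commit.messages(commits, raw.messages, "title")` adds only the `title` column, whereas `"messages"` adds both `title` and `message`. Commits without an entry in the message data keep `NA` in the new columns in this sketch.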
@@ -532,16 +538,23 @@ There is no way to update the entries, except for the revision-based parameters.
532
538
- `commits.filter.untracked.files`
533
539
* Remove all information concerning untracked files from the commit data. This effect becomes clear when retrieving commits using `get.commits.filtered`, because the result then does not contain any commits that solely changed untracked files. Networks built on top of this `ProjectData` also do not contain any information about untracked files.
534
540
* [*`TRUE`*, `FALSE`]
535
-
- `mails.filter.patchstack.mails`
536
-
* Filter patchstack mails from the mail data. In a thread, a patchstack spans the first sequence of mails where each mail has been authored by the thread creator and has been sent within a short time window after the preceding mail. The mails spanned by a patchstack are called
537
-
'patchstack mails' and for each patchstack, every patchstack mail but the first one is filtered when `mails.filter.patchstack.mails = TRUE`.
538
-
* [`TRUE`, *`FALSE`*]
541
+
- `commit.messages`
542
+
* Read and add commit messages to commits. The column `title` will contain the first line of the message and, if `messages` is selected, the column `message` will contain the rest.
543
+
* [*`none`*, `title`, `messages`]
539
544
- `issues.only.comments`
540
545
* Only use comments from the issue data on disk and no further events such as references and label changes
541
546
* [*`TRUE`*, `FALSE`]
542
547
- `issues.from.source`
543
548
* Choose from which sources the issue data on disk is read in. Multiple sources can be chosen.
544
549
* [*`github`, `jira`*]
550
+
- `mails.filter.patchstack.mails`
551
+
* Filter patchstack mails from the mail data. In a thread, a patchstack spans the first sequence of mails where each mail has been authored by the thread creator and has been sent within a short time window after the preceding mail. The mails spanned by a patchstack are called
552
+
'patchstack mails' and for each patchstack, every patchstack mail but the first one is filtered when `mails.filter.patchstack.mails = TRUE`.
553
+
* [`TRUE`, *`FALSE`*]
554
+
- `pasta`
555
+
* Read and integrate [PaStA](https://github.com/lfd/PaStA/) data with commit and mail data (columns `pasta` and `revision.set.id`)
556
+
* [`TRUE`, *`FALSE`*]
557
+
* **Note**: To include PaStA-based edge attributes, you need to give the `"pasta"` edge attribute for `edge.attributes`.
545
558
- `synchronicity`
546
559
* Read and add synchronicity data to commits (column `synchronicity`)
547
560
* [`TRUE`, *`FALSE`*]
@@ -550,10 +563,6 @@ There is no way to update the entries, except for the revision-based parameters.
550
563
* The time-window (in days) to use for synchronicity data if enabled by `synchronicity = TRUE`
551
564
* [1, *5*, 10, 15]
552
565
* **Note**: If at least one artifact in a commit has been edited by more than one developer within the configured time window, then the whole commit is considered to be synchronous.
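The rule in the note above can be illustrated with a small base-R sketch. It only demonstrates the rule on toy data and is not how the synchronicity data on disk is produced. The data layout, the helper name `flag.synchronous.commits`, and the choice to measure the window as the absolute distance between two commits are assumptions made for this example.

```r
## Hypothetical toy commit data: one row per (commit, artifact) pair.
commit.data = data.frame(
    hash = c("a1", "a1", "b2", "c3", "d4"),
    author.name = c("Alice", "Alice", "Bob", "Alice", "Bob"),
    artifact = c("foo", "bar", "foo", "baz", "baz"),
    date = as.POSIXct(c("2020-01-01", "2020-01-01", "2020-01-03",
                        "2020-02-01", "2020-03-01"), tz = "UTC"),
    stringsAsFactors = FALSE
)

## Flag a commit as synchronous if at least one of its artifacts was also
## changed by a different author at most `time.window` days apart.
flag.synchronous.commits = function(commit.data, time.window = 5) {
    row.is.synchronous = vapply(seq_len(nrow(commit.data)), function(i) {
        same.artifact = commit.data[["artifact"]] == commit.data[["artifact"]][i]
        other.author = commit.data[["author.name"]] != commit.data[["author.name"]][i]
        distance = abs(as.numeric(difftime(commit.data[["date"]],
                                           commit.data[["date"]][i],
                                           units = "days")))
        any(same.artifact & other.author & distance <= time.window)
    }, logical(1))
    ## one synchronous artifact is enough to mark the whole commit
    per.commit = tapply(row.is.synchronous, commit.data[["hash"]], any)
    return(data.frame(
        hash = names(per.commit),
        synchronicity = as.vector(per.commit),
        stringsAsFactors = FALSE
    ))
}
```

With the toy data and `time.window = 5`, commits `a1` and `b2` are flagged because Alice and Bob both changed `foo` two days apart, while `c3` and `d4` are not, since their changes to `baz` lie a month apart.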
553
-
- `pasta`
554
-
* Read and integrate [PaStA](https://github.com/lfd/PaStA/) data with commit and mail data (columns `pasta` and `revision.set.id`)
555
-
* [`TRUE`, *`FALSE`*]
556
-
* **Note**: To include PaStA-based edge attributes, you need to give the `"pasta"` edge attribute for `edge.attributes`.