Commit 44a005f

Highlight fixes
1 parent b361c8c commit 44a005f

20 files changed: +88 -945 lines changed

examples/evaluation/Gender.datadesc

+2-2
@@ -88,11 +88,11 @@ Dataset: Gender
 ofType: Categorical
 
 Statistics:
-Categorical Distribution:
+Categoric Distribution:
 ["": 0%]
 Quality Metrics:
 Sparsity: 8.9
-Noisy labels "Please considers some humans errors in the annotation process could be done"
+Noisy labels: "Please considers some humans errors in the annotation process could be done"
 Is sample:
 "It is a sample of all possible documents. It is not intended to be representative
 (in fact, it is known to be quite non-representative): it was specifically designed

examples/evaluation/Melanoma.datadesc

+7-8
@@ -123,7 +123,7 @@ Dataset: Melanoma_Classification_Dataset
 ofType: Categorical
 Statistics:
 Mode: "Famela"
-Categorical Distribution:
+Categoric Distribution:
 [
 "Male": 40%,
 "Famela": 55%,
@@ -161,9 +161,9 @@ Dataset: Melanoma_Classification_Dataset
 count: 33126
 ofType: Categorical
 Statistics:
-Categorical Distribution:
+Categoric Distribution:
 [
-"beningnant": 80%,
+"beningnant": 80%,
 "malignant": 20%
 ]
 Statistics:
@@ -208,7 +208,6 @@ Dataset: Melanoma_Classification_Dataset
 (https://www.kaggle.com/c/siim- isic-melanoma- classification/discussion/161943)."
 
 Related Instances: skinImages
-Social Issues: PrivacyIssue1
 How data is collected: Manual Human Curator
 When data was collected:
 "Images were originally collected by imaging centers during 1998 - 2019; this dataset was curated from those image
@@ -260,7 +259,6 @@ Dataset: Melanoma_Classification_Dataset
 Who collects the data: "Internal Medical staff"
 Type Internal
 Country/Region: "Australia"
-Social Issues: raceRepresentative
 Label Requirements:
 Requirement: "1) Images containing any potentially identifying features, such as jewelry or tattoos, or from patients without
 at least three qualifying images were excluded during quality assurance review."
@@ -303,12 +301,13 @@ Dataset: Melanoma_Classification_Dataset
 
 
 Social Concerns:
-Rationale: 'Rationale'
+Rationale: 'Dataset may not be representative of the real world data, and the cavenience sample is not representative of general incidence of melanoma'
 Social Issue: raceRepresentative
 IssueType: Bias
-Related Attributes: ImageId
+Related Attributes:
+attribute: ImageId
 Description: "Dataset is not representative with respect to darker skin types"
 Social Issue: generalIncidence
-IssueType: Social Impact
+IssueType: Social Impact
 Description: "Dataset is a convenience sample and is not representative of general incidence of melanoma"
 
examples/evaluation/Polarity.datadesc

+8-7
@@ -5,10 +5,11 @@ Dataset: Polarity
 Version: v0001
 Release Date: 13-12-2010
 Citation:
-Raw Citation:
-"Bo Pang and Lillian Lee. 2004. A sentimental education: Sentiment analysis using
-subjectivity summarization based on minimum cuts. In Proceedings of the 42nd Annual
-Meeting of the Association for Computational Linguistics. 271."
+doi: "10.5281/zenodo.569"
+authors: "Bo Pang and Lillian Lee"
+title: "A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts"
+publisher: "Springer, proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics"
+year: "2004"
 Description:
 Purposes:
 "The dataset was created to enable research on predicting sentiment polarity—i.e., given a piece of English text, predict whether
@@ -116,7 +117,7 @@ Dataset: Polarity
 description: "The label annotated by the reviewrs "
 ofType: Categorical
 Statistics:
-Categorical Distribution:
+Categoric Distribution:
 [
 "pos": 50%,
 "neg": 50%
@@ -126,7 +127,7 @@ Dataset: Polarity
 Quality Metrics:
 Sparsity: 00
 Completeness: 100
-Class Balance "attribute 'tag': 50% positive, 50% negative"
+Class Balance: "attribute 'tag': 50% positive, 50% negative"
 
 Dependencies:
 Description: "The dataset is entirely self-contained."
@@ -191,7 +192,7 @@ Dataset: Polarity
 Label: Tag
 Description: "The label is the positive/negative sentiment polarity rating derived
 from the star rating"
-Mapping: tag
+Mapping: ImageId
 Label Requirements:
 Requirement:
 "- In order to obtain more accurate rating decisions, the maximum

fileicons/Documentation.svg

-48
This file was deleted.
