Commit 5acb0ce

Update README
1 parent 2f82b73 commit 5acb0ce

1 file changed: +12 -13 lines changed

README.md

Lines changed: 12 additions & 13 deletions
@@ -2,13 +2,13 @@
22

33
## Publication
44

5-
* Xiang Ren\*, Ahmed El-Kishky, Chi Wang, Fangbo Tao, Clare R. Voss, Heng Ji, Jiawei Han, "**[ClusType: Effective Entity Recognition and Typing by Relation Phrase-Based Clustering](http://web.engr.illinois.edu/~xren7/fp611-ren.pdf)**”, Proc. of 2015 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD'15), Sydney, Australia, August 2015. ([Slides](http://web.engr.illinois.edu/~xren7/KDD15-ClusType_v3.pdf))
5+
* [Xiang Ren](http://web.engr.illinois.edu/~xren7/)\*, Ahmed El-Kishky, Chi Wang, Fangbo Tao, Clare R. Voss, Heng Ji, Jiawei Han, "**[ClusType: Effective Entity Recognition and Typing by Relation Phrase-Based Clustering](http://web.engr.illinois.edu/~xren7/fp611-ren.pdf)**”, Proc. of 2015 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD'15), Sydney, Australia, August 2015. ([Slides](http://web.engr.illinois.edu/~xren7/KDD15-ClusType_v3.pdf))
66

7-
* Xiang Ren\*, Ahmed El-Kishky, Chi Wang, Jiawei Han, "**[Automatic Entity Recognition and Typing from Massive Text Corpora: A Phrase and Network Mining Approach](http://research.microsoft.com/en-us/people/chiw/kdd15tutorial.aspx)**”, Proc. of 2015 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD'15 Conference Tutorial), Sydney, Australia, August 2015. ([Website](http://research.microsoft.com/en-us/people/chiw/kdd15tutorial.aspx)) ([Slides](http://hanj.cs.illinois.edu/kdd-15/UIUC-Tutorial.pdf))
7+
* [Xiang Ren](http://web.engr.illinois.edu/~xren7/)\*, Ahmed El-Kishky, Chi Wang, Jiawei Han, "**[Automatic Entity Recognition and Typing from Massive Text Corpora: A Phrase and Network Mining Approach](http://research.microsoft.com/en-us/people/chiw/kdd15tutorial.aspx)**”, Proc. of 2015 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD'15 Conference Tutorial), Sydney, Australia, August 2015. ([Website](http://research.microsoft.com/en-us/people/chiw/kdd15tutorial.aspx)) ([Slides](http://hanj.cs.illinois.edu/kdd-15/UIUC-Tutorial.pdf))
88

99
## Note
1010

11-
"./result" folder contains typed entity mentions on a sample of 50k Yelp reviews.
11+
"./result" folder contains results on a sample of 50k Yelp reviews.
1212

1313
## Requirements
1414

@@ -62,7 +62,7 @@ RawText='data/yelp/yelp_sample50k.txt'
6262
```
6363

6464
Input: type mapping file path.
65-
- Format: "type name \tab typeId". "NIL" means "Not-of-Interest".
65+
- Format: "type name \TAB typeId \n". "NIL" means "Not-of-Interest".
6666
```
6767
TypeFile='data/yelp/type_tid.txt'
6868
```
@@ -73,17 +73,18 @@ StopwordFile='data/stopwords.txt'
7373
```
7474

7575
Output: output file from candidate generation.
76-
- Format: "docId \TAB segmented sentence". Segments are separated by ",". Entity mention candidates are marked with ":EP". Relation phrases are marked with ":RP".
76+
- Format: "docId \TAB segmented sentence \n".
77+
- Segments are separated by ",". Entity mention candidates are marked with ":EP". Relation phrases are marked with ":RP".
7778
```
7879
SegmentOutFile='result/segment.txt'
7980
```
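The candidate-generation output format above ("docId \TAB segmented sentence \n", comma-separated segments, ":EP" and ":RP" markers) could be read with something like the following sketch. It assumes the markers appear as suffixes on a segment and that unmarked segments are plain text; the function name `parse_segment_line`, the "O" label, and the sample line are all hypothetical and not taken from the ClusType code or data.

```python
from typing import List, Tuple


def parse_segment_line(line: str) -> Tuple[str, List[Tuple[str, str]]]:
    """Split one line of the segment file into (docId, [(text, label), ...]).

    Assumption: ':EP' / ':RP' are suffixes on a segment; anything else is
    labelled 'O' (a label invented here for unmarked segments).
    """
    doc_id, _, sentence = line.rstrip("\n").partition("\t")
    segments = []
    for raw in sentence.split(","):
        seg = raw.strip()
        if not seg:
            continue  # skip empty segments produced by stray commas
        if seg.endswith(":EP"):
            segments.append((seg[:-3].strip(), "EP"))  # entity mention candidate
        elif seg.endswith(":RP"):
            segments.append((seg[:-3].strip(), "RP"))  # relation phrase
        else:
            segments.append((seg, "O"))
    return doc_id, segments


# Hypothetical line, written only to illustrate the format.
example = "12\tthe pizza:EP,was served with:RP,fresh basil:EP,today\n"
doc_id, segments = parse_segment_line(example)
```

Returning `(text, label)` pairs keeps the sentence order intact, which may matter if entity candidates need to be paired with the relation phrases next to them.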
8081

8182
Output: entity linking output file.
82-
- Format: "docId \TAB entity name \TAB Original Freebase Type \TAB Refined Type \TAB Freebase EntityID \TAB Similarity Score \TAB Relative Rank".
83-
- Seed file for Yelp dataset can be download from [here](https://www.dropbox.com/s/w628rwpb3kbmuea/seed_yelp.txt?dl=0).
84-
- Seed file for NYT dataset can be downloaded from [here](https://www.dropbox.com/s/k0qzsvbbpngptjt/seed_nyt.txt?dl=0).
83+
- Format: "docId \TAB entity name \TAB Original Freebase Type \TAB Refined Type \TAB Freebase EntityID \TAB Similarity Score \TAB Relative Rank \n".
84+
- Download [Seed file](https://www.dropbox.com/s/w628rwpb3kbmuea/seed_yelp.txt?dl=0) for Yelp dataset.
85+
- Download [Seed file](https://www.dropbox.com/s/k0qzsvbbpngptjt/seed_nyt.txt?dl=0) for NYT dataset.
8586

86-
NOTE: Our entity linking module calls [DBpediaSpotLight Web service](https://github.com/dbpedia-spotlight/dbpedia-spotlight/wiki/Web-service), which has limited querying speed. This process can be largely accelarated by installing the tool on your local machine. See [here](https://github.com/dbpedia-spotlight/dbpedia-spotlight/wiki/Installation) for details.
87+
NOTE: Our entity linking module calls [DBpediaSpotLight Web service](https://github.com/dbpedia-spotlight/dbpedia-spotlight/wiki/Web-service), which has limited querying speed. This process can be largely accelarated by installing the tool on your local machine [Link](https://github.com/dbpedia-spotlight/dbpedia-spotlight/wiki/Installation).
8788
```
8889
SeedFile='result/seed.txt'
8990
```
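As an illustration of the seven-column entity linking format listed above, a seed file row might be loaded as in this sketch. The record type `SeedRecord`, the function `parse_seed_line`, and every value in the sample row are hypothetical. Treating the similarity score as a float and leaving the relative rank as a string are assumptions, since the README does not state the column types.

```python
from typing import NamedTuple


class SeedRecord(NamedTuple):
    doc_id: str
    entity_name: str
    original_type: str   # "Original Freebase Type" column
    refined_type: str    # "Refined Type" column
    freebase_id: str     # "Freebase EntityID" column
    similarity: float    # assumed numeric; not confirmed by the README
    relative_rank: str   # kept as text because its type is not documented


def parse_seed_line(line: str) -> SeedRecord:
    """Parse one tab-separated row of the entity linking output."""
    fields = [f.strip() for f in line.rstrip("\n").split("\t")]
    if len(fields) != 7:
        raise ValueError(f"expected 7 tab-separated fields, got {len(fields)}")
    doc_id, name, orig_type, refined, fb_id, score, rank = fields
    return SeedRecord(doc_id, name, orig_type, refined, fb_id, float(score), rank)


# Made-up row: the type names and the entity ID are placeholders.
sample = "3\tChicago\tsome.freebase.type\tLOC\tm.0example\t0.98\t1\n"
record = parse_seed_line(sample)
```

Failing loudly on a wrong column count is a deliberate choice here: a missing tab would otherwise shift every later field by one.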
@@ -94,7 +95,7 @@ DataStatsFile='result/data_model_stats.txt'
9495
```
9596

9697
Output: Typed entity mentions.
97-
- Format: "docId \TAB entity mention \TAB entity type".
98+
- Format: "docId \TAB entity mention \TAB entity type \n".
9899
```
99100
ResultFile='result/results.txt'
100101
```
@@ -129,6 +130,4 @@ minSup='10'
129130
Number of relation phrase clusters.
130131
```
131132
NumRelationPhraseClusters='50'
132-
```
133-
134-
133+
```
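The type mapping format ("type name \TAB typeId \n") and the typed entity mention format ("docId \TAB entity mention \TAB entity type \n") touched by this commit can be combined in a short sketch that counts mentions per type and skips "NIL". It assumes the type column of the results file holds the type name rather than the typeId, which the README does not confirm. The helper names and the sample rows are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def load_type_map(lines: Iterable[str]) -> Dict[str, str]:
    """Map type name -> typeId from 'type name<TAB>typeId' lines."""
    mapping = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # ignore blank lines
        name, type_id = line.split("\t")
        mapping[name.strip()] = type_id.strip()
    return mapping


def parse_results(lines: Iterable[str]) -> List[Tuple[str, str, str]]:
    """Read 'docId<TAB>entity mention<TAB>entity type' rows."""
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue
        doc_id, mention, entity_type = (f.strip() for f in line.split("\t"))
        rows.append((doc_id, mention, entity_type))
    return rows


# Made-up inputs; real files would be opened with open(path, encoding="utf-8").
type_lines = ["FOOD\t0\n", "LOC\t1\n", "NIL\t2\n"]
result_lines = ["1\tpizza\tFOOD\n", "1\tthing\tNIL\n",
                "2\tChicago\tLOC\n", "2\tpasta\tFOOD\n"]

type_map = load_type_map(type_lines)
rows = parse_results(result_lines)
# "NIL" means "Not-of-Interest", so it is left out of the per-type counts.
counts = Counter(t for _, _, t in rows if t != "NIL")
```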
