<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
<head>
<title>Líonra Séimeantach na Gaeilge</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta http-equiv="Content-Language" content="en">
<meta name="description" content="Home page of the Irish language semantic network">
<meta name="keywords" content="thesaurus, wordnet, semantic network, Irish language, Irish Gaelic, Gaeilge">
<meta name="author" content="Kevin P. Scannell">
<link rel="stylesheet" href="../kps.css" type="text/css">
</head>
<body>
<div class="content">
<h1>Líonra Séimeantach na Gaeilge:<br>
Home</h1>
<h2>
<a href = "/index.html">Kevin P. Scannell</a>
</h2>
<hr>
<h2><a name="Summary">Summary</a></h2>
<p>
This is the home page for <i>Líonra Séimeantach na Gaeilge</i>
(the "LSG", or, in English, the
<i>Irish Language Semantic Network</i>), a database consisting of
Irish words and the semantic relationships among them.
A semantic network of this kind is sometimes called a <i>wordnet</i>,
after Princeton's
<a href="http://wordnet.princeton.edu/">English-language WordNet</a>,
which was the first-ever full-scale semantic network
(dating back to the mid-1980s).
Semantic networks are much richer
than traditional thesauri, which generally record only (near) synonyms and
sometimes antonyms. The LSG, like most other wordnets, also encodes
hypernyms and hyponyms (broader and narrower terms), meronyms and holonyms (part vs. whole), etc.
</p>
<p>
Semantic networks have many applications in Natural Language Processing.
They are used in systems for word sense disambiguation, document
summarization and indexing, and information retrieval. When a semantic
network in one language contains mappings to a second language (ours is
linked to the English WordNet), it can
be used in various ways to improve machine translation.
In general terms, from an artificial intelligence perspective,
a semantic network encodes some
of the "real-world knowledge" that is required for computers to understand and process
texts in a non-trivial way.
</p>
<p>
The image to the left is a depiction of the full LSG
(click it for a full-size version). In fact, this is a simplification
of the true picture — each node in the image represents a whole set
of synonymous words which could be added as additional branches.
Something like this network, but probably thousands of times more complex,
is contained in the brain of every Irish speaker — semantic connections like these are made instantly and intuitively.
The 3D graph browser (see below) allows you to "fly" through this network
and manipulate it in various ways.
</p>
<hr>
<h2><a name="Download">Download</a></h2>
<p>
Even if you are not interested in developing software for language
processing, the database
can still be quite useful, and for this reason I am offering access to
it in several different ways:
</p>
<ol>
<li>As an enriched "thesaurus" in PDF format. Note that
<em>each word</em> in the body of the text is a hyperlink cross-reference.
You can download the (nearly 50MB)
<a href="http://euler.slu.edu/~scannell/lsg-1.001.pdf" hreflang="ga">PDF</a> directly,
or else the
<a href="lsg-latex-1.001.tar.gz">LaTeX source</a> if you want to build the
PDF yourself (e.g. with different fonts). Please save the file locally on your
computer to conserve bandwidth.
<li>Integrated into the free <a href="http://ga.openoffice.org/">OpenOffice.org</a> office suite. You can install this directly using the "Install new dictionaries..." wizard ("Suiteáil foclóirí nua..." in the Irish version). Alternatively, you can
<a href="http://ftp.services.openoffice.org/pub/OpenOffice.org/contrib/dictionaries/thes_ga_IE_v2.zip">download the ZIP file</a> containing the thesaurus and install it manually. Here's a <a href="ooo.png">screenshot</a> of the thesaurus in action.
</ol>
<p>
If you are interested in the source code that I used to create the
network, visit the <a href="http://code.google.com/p/wordnet-gaeilge/">development site</a> at Google Code.
</p>
<hr>
<h2><a name="Features">Features</a></h2>
<ul>
<li><b>Comprehensive database</b>. There are 32,742 synsets,
36,262 headwords,
and 77,596 individual word senses, including a great deal of
modern terminology, as well as literary and dialect forms, slang, etc.
<li><b>Free license</b>. Like the Princeton WordNet
(but, unfortunately, unlike nearly
<a href="http://www.globalwordnet.org/gwa/wordnet_table.htm">all other wordnets in existence</a>), the LSG is free software.
Specifically, all data, including the PDF thesaurus, are released under
the terms of the <a href = "http://www.gnu.org/copyleft/fdl.html">GNU Free Documentation License</a>. This means, in short, that you have the freedom to copy and redistribute the data, with or without modification, as long as you do so under the same license.
<li><b>English mappings</b>. Entries in the LSG are linked to synsets in the
Princeton WordNet. This is a key element in my ongoing work on English-Irish
machine translation.
<li><b>Frequent updates</b>. I plan to provide regular updates
that incorporate corrections and refinements
and also reflect Irish as a living language through new terminology,
shifting usages, etc.
<li><b>Common lexicon</b>. The database used to generate the thesaurus
is the same one I use
to generate the <a href="http://www.gaelspell.com/">GaelSpell</a>
family of spellcheckers and the
<a href="http://borel.slu.edu/gramadoir/">Gramadóir</a> grammar checker.
Improvements to one project will be reflected in the others automatically.
</ul>
<hr>
<em>© Copyright 2007 Kevin P. Scannell</em><br>
</div>
<div class="navigation">
Home<br>
<a href = "details-en.html" hreflang = "en">Details</a><br>
<a href = "thanks-en.html" hreflang = "en">Thanks</a><br>
<a href = "/nlp.html" hreflang = "en">Other Projects</a><br>
<a href = "index.html" hreflang = "ga">As Gaeilge</a><br>
<p class="centered">
<a href="lsg-best.png"><img class="linked-image" src="lsg-thumb.png" alt="LSG graph image" height="184" width="184"></a>
</p>
</div>
</body>
</html>