---
layout: strapless
menu_item: home
---
<section class="section-alt pad-top">
<div class="container">
<h1 class="text-center headline">Glottobank</h1>
</div>
</section>
<section class="section-main">
<div class="container pad-top">
<div class="service-flexrow pad-top">
<div class="column-66">
<div class="gray-box">
<p class="lead">
Glottobank is an international research consortium established to
document and
understand the world’s linguistic diversity. Glottobank team
members are
pursuing this goal on two fronts. First, we have established five
global
databases documenting variation in language structure
(<a href="#grambank">Grambank</a>),
lexicon (<a href="#lexibank">Lexibank</a>), paradigm systems
(<a href="#parabank">Parabank</a>), numerals
(<a href="#numeralbank">Numeralbank</a>), and phonetic changes
(<a href="#phonobank">Phonobank</a>).
In doing so, we seek to develop new methods in language
documentation, compile
data on the world’s languages and make this data accessible and
useful. Second,
we are developing methods to use this data to make inferences
about human
prehistory, relationships between languages and processes of
language change. We anticipate that data will start to become available
in 2022.
</p>
</div>
</div>
<div class="column-33">
<div>
<img src="images/glottobank_all.jpg" alt="Glottobank logo"
class="img-responsive">
</div>
</div>
</div>
<div class="container pad-top">
<div class="service-flexrow pad-top">
<div class="column-100">
<h2 id="grambank">Grambank</h2>
<p>
Grambank is a database of structural (typological) features of
language. It
consists of 195 logically independent features (most of them
binary) spanning
all subdomains of morphosyntax. The Grambank feature questionnaire
has been
filled in, based on reference grammars, for more than 2,000 languages.
The aim is to
eventually reach as many as 3,000 languages. The database can be
used to
investigate language prehistory, the geographical distribution of
features, language universals and the functional interaction of
structural
features.
</p>
</div>
</div>
<!--
To find out more, visit the Grambank website.
-->
<div class="service-flexrow pad-top">
<div class="column-100">
<h2 id="lexibank">Lexibank</h2>
<p>
Lexibank is a <a href="https://github.com/lexibank/lexibank-analysed/">public database and repository</a> for lexical data from
the languages of the
world. Currently, Lexibank contains lexemes and cognate judgments
from about 2,500 languages
spanning Africa, Europe, Asia, the Pacific, and the Americas. The
database will be used to
refine cognate judgments, infer language relationships, construct
language phylogenies,
test hypotheses about language history, investigate factors that
affect the mode and
tempo of language evolution, model sound change, and facilitate
quantitative comparisons
with other types of linguistic data. The initial focus of Lexibank
will be on compiling
basic or core vocabulary, but ultimately the database will be
expanded to include the full
range of the lexicon from all the world’s languages.
<!--
For more information on Lexibank and how to use or submit data please see the project
website.
-->
</p>
</div>
</div>
<div class="service-flexrow pad-top">
<div class="column-100">
<h2 id="parabank">Parabank</h2>
<p>
Parabank is a large database of selected paradigmatic structures
found in the world’s
languages, focusing on the patterning of formal similarities and
identities (or
<i>syncretisms</i>) between cells in these paradigms (cf. <i>I</i> vs. <i>me</i>
but <i>you</i> vs. <i>you</i>). It is
motivated by the observation that different languages and language
families have
significantly different patterns in their syncretisms and that at
least some of these are
stable through time. In addition, information arranged in matrices
gains additional power
because of the large number of values that can be calculated by
comparing every cell with
every other cell.
</p>
<p>
Because the paradigms we explore are ubiquitous across the world’s
languages, our working
hypothesis is that paradigmatic syncretisms can provide a
significant signal of linguistic
relationships over time, and the database is designed to allow the
systematic
exploration of morphosyntactic features by linguistic typologists
and evolutionary
biologists. Additionally, Parabank will be an important resource
to assist in the
identification and quantification of some of the important
mechanisms by which the design
space of language evolves. Initially, the database will assemble
paradigms of free
pronouns, verb agreement, and a subset of kin terms, with
subsequent plans to incorporate
demonstratives, interrogatives, indefinite pronouns, negative
pronouns, numeral systems, and
other promising linguistic subsystems with paradigmatic structure.
</p>
<p>
Parabank will be led by Nick Evans, Simon Greenhill and Kyla
Quinn, all based at the
Australian Research Council Centre of Excellence for the Dynamics
of Language (CoEDL), at
the Australian National University (ANU), but welcomes the
participation of any interested
researcher. Funding will primarily come from the CoEDL.
<!--
To find out more, click here.
-->
</p>
</div>
</div>
<div class="service-flexrow pad-top">
<div class="column-100">
<h2 id="numeralbank">Numeralbank</h2>
<p>
Numeralbank is a public database and repository on numeral systems
in the world’s languages. It is motivated by the idea that number
words do not just form an important part of most languages, but
constitute systems that serve as essential tools at the
intersection of culture, language, and cognition. Numeralbank can
be used to classify numeral systems according to their properties,
to document the geographical distribution of system types, to
investigate commonalities and differences in system properties
across languages, to reconstruct the most likely ancestral states,
and to explore possible limits to and constraints on the striking
diversity in how people count. Initially, the database will allow
for analyses within and across systems, but the ultimate goal is
to support tests of hypotheses on linguistic, cognitive, and
cultural factors that may drive the emergence and evolution of
numeral systems.
</p>
<p>
Entries in Numeralbank are largely based on data collected by
Eugene Chan as part of the long-running project "Numeral Systems
of the World's Languages" that was hosted at the former Department
of Linguistics at the MPI for Evolutionary Anthropology in
Leipzig. The data is now hosted at the Department of Cultural and
Linguistic Evolution at the MPI for Evolutionary Anthropology
in Leipzig. The Numeralbank database is designed and maintained by
Hans-Jörg Bibiko. The Numeralbank team consists of (in
alphabetical order)
<a href="http://www.uib.no/en/persons/Andrea.Bender">Andrea Bender</a>,
<a href="http://www.shh.mpg.de/employees/42541/55811">Hans-Jörg Bibiko</a>,
<a href="https://www.eva.mpg.de/linguistic-and-cultural-evolution/staff/robert-forkel/">Robert Forkel</a>,
<a href="http://www.shh.mpg.de/employees/48696/55811">Simon Greenhill</a>,
<a href="http://www.shh.mpg.de/2923/russellgray">Russell Gray</a>,
<a href="http://www.shh.mpg.de/employees/48214/55811">Harald Hammarström</a>,
<a href="http://www.bristol.ac.uk/school-of-arts/people/fiona-m-jordan/">Fiona Jordan</a>,
and <a href="http://www.shh.mpg.de/employees/48689/25522">Annemarie Verkerk</a>.
</p>
</div>
</div>
<div class="service-flexrow pad-top">
<div class="column-100">
<h2 id="phonobank">Phonobank</h2>
<p>
Phonobank aims to establish a cross-linguistic comparative
database of sound patterns,
sound correspondences, and sound shifts. Our starting point is
collections of multiple
phonetic alignments of cognate sets in language families. All
sounds are linked to a
cross-linguistic phonetic alphabet that provides distinctive
features and segment
descriptions. The ultimate goals of the database are to support
the computational
linguistic comparison of word forms and to serve as a basis for
improving the methods of
computer-assisted cognate detection, sound reconstruction and
building linguistic
phylogenies from sound correspondences.
</p>
</div>
</div>
<div class="service-flexrow pad-top">
<div class="column-100">
<h2>Methods and Tools</h2>
<p>
The Glottobank team is developing a suite of methods and tools for
analysing comparative
linguistic data. For example, using the <a href="http://www.beast2.org">BEAST
2</a> software
platform, we have created a Bayesian framework for
<a href="http://language.cs.auckland.ac.nz/">phylogeographic inference of language expansion in space and
time</a>.
<a href="https://github.com/lmaurits/BEASTling">BEASTling</a> is a program
designed
to help linguists easily prepare Bayesian phylogenetic analyses of
linguistic data using the BEAST 2 platform. It automates many
tedious
data-preparation tasks, features close integration with the
<a href="http://glottolog.org">Glottolog language catalog</a>, and strives to
follow established best
practices for computational linguistic phylogenetics.
<a href="http://lingpy.org">LingPy</a> is a Python library for quantitative
tasks in historical
linguistics. It offers state-of-the-art algorithms for pairwise
and multiple phonetic
alignment analyses, automatic cognate detection, and various tools
to explore and curate
lexical data. Finally, <a href="http://cldf.clld.org">CLDF</a>
and associated standards
are aimed at providing an interface between databases and tools
which will enable easier
sharing of data and code.
</p>
</div>
</div>
<div class="service-flexrow pad-top">
<div class="column-100">
<h2>Funding</h2>
<p>
In addition to the time and energy of members of the consortium,
Glottobank is supported
by the Max Planck Institute for the Science of Human History,
a Royal Society of New Zealand Marsden Grant (grant #13-UOA-121)
and
the ARC Centre of Excellence for the Dynamics of Language.
</p>
</div>
</div>
</div>
</div>
</section>