Aligning Proto-Forms against their reflexes #12

LinguList · 2014-11-11T17:06:00Z

Now that I have added the proto-forms as "simple languages" (language *PT in the Edictor), all proto-forms should be aligned with their reflexes. This way, we can later check and model how the proto-forms changed into the reflexes. We can use this to test:

how well the proto-language can predict the daughter languages,
which sound changes frequently occur in the data, and
which of the sound changes needed to model the data are problematic.

For all of this, we'll need the alignments.

thiagochacon · 2014-11-12T18:35:55Z

Great. I think I can add here the suggestion I made by email. I suggested we should try to "hierarchize" the proto-form with the descendant forms. This could be helpful for the main problematic cases in the alignment: metathesis, and phonological splits and mergers.

If we work with some sort of hierarchy, we could link particular reflexes to a proto-form cell (i.e., a reconstructed sound). The normal/unmarked situation could be handled by the alignment proper. Otherwise, we could link a particular reflex to one or more proto-form cells.

Suppose we have the following scenario:
Proto-L XYZ
L1 XYW
L2 XZY
L3 XYAB
L4 XT

In this scenario, L1 W would be aligned with, and thus automatically linked to, PL Z.
L2 Z would be linked to PL Z.
L3 AB would be linked to PL Z.
L4 T would be linked to PL YZ.

Do you think this would be a good idea? How close are we to managing that with the current state of the alignment tech?

LinguList · 2014-11-12T19:00:26Z

The easiest and most straightforward approach here is to add another column containing the "linking". This would start from the proto-form in its tokenized representation (that is, the "TOKENS" column). We could then use some easy-to-define markup in which, for each reflex, the relation to the proto-form is defined. This would come close to the solution Paul presented.

A possible example for markup would be:

PROTO X/1 Y/2 Z/3
L1 X/1 Y/2 W/3
L2 X/1 Z/3 Y/2
L3 X/1 Y/2 A/3 B/3
L4 X/1 T/2,3

Here, numbers in reflexes refer to numbers defined in Proto-Forms.
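
As a minimal sketch of how this markup could be read programmatically (in Python; the function names `parse_linking` and `reflexes_of` are hypothetical and not part of the Edictor or any existing tool), each token can be split at the slash into a segment and the proto-form numbers it refers to:

```python
def parse_linking(line):
    """Parse a markup line such as 'X/1 T/2,3' into (segment, indices) pairs."""
    parsed = []
    for token in line.split():
        # 'T/2,3' -> segment 'T', references '2,3'
        segment, _, refs = token.partition("/")
        parsed.append((segment, [int(r) for r in refs.split(",")]))
    return parsed


def reflexes_of(proto, reflex):
    """Map each proto-form number to the reflex segments linked to it."""
    links = {idx: [] for _, idxs in parse_linking(proto) for idx in idxs}
    for segment, idxs in parse_linking(reflex):
        for idx in idxs:
            # setdefault tolerates references to numbers missing in the proto line
            links.setdefault(idx, []).append(segment)
    return links


print(reflexes_of("X/1 Y/2 Z/3", "X/1 Y/2 A/3 B/3"))
# {1: ['X'], 2: ['Y'], 3: ['A', 'B']}
```

With this representation, L4 `X/1 T/2,3` yields `T` as the reflex of both proto-segments 2 and 3, which matches the "T linked to PL YZ" case from the scenario above.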

I could also write a tool similar to the alignment editor which would display these internal formats nicely or allow for quick editing.

But before starting to work on technical solutions here, I suggest we use this issue to collect the cognate sets where such a representation is actually needed. If, in the end, it is only two cases or so, we might come up with an easier solution. If not, the examples will help us to identify which functionality we need in the end.
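
To help with collecting such cases, here is a rough sketch (Python again; `classify_linking` and its labels are made up for illustration, and it assumes the `segment/number` markup proposed above) that flags reflexes whose linking goes beyond a plain one-to-one, in-order correspondence:

```python
def classify_linking(reflex):
    """Return the reasons why a reflex line is not a plain in-order, one-to-one match."""
    # Each token looks like 'T/2,3': a segment plus the proto-form numbers it continues.
    index_lists = [
        [int(r) for r in token.partition("/")[2].split(",")]
        for token in reflex.split()
    ]
    reasons = set()
    # Reordering (e.g. metathesis): proto numbers do not appear in ascending order.
    first_indices = [idxs[0] for idxs in index_lists]
    if first_indices != sorted(first_indices):
        reasons.add("reordering")
    # One reflex segment continues several proto segments.
    if any(len(idxs) > 1 for idxs in index_lists):
        reasons.add("one segment, several proto segments")
    # Several reflex segments continue the same proto segment.
    flat = [i for idxs in index_lists for i in idxs]
    if len(flat) != len(set(flat)):
        reasons.add("several segments, one proto segment")
    return reasons


for name, line in [("L1", "X/1 Y/2 W/3"), ("L2", "X/1 Z/3 Y/2"),
                   ("L3", "X/1 Y/2 A/3 B/3"), ("L4", "X/1 T/2,3")]:
    print(name, sorted(classify_linking(line)))
```

An empty result (as for L1) would mean the ordinary alignment is sufficient; counting the non-empty results over all cognate sets would give a first idea of how often the extra column is really needed.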
