Culture List Malakula #446

Open
AvivaShimelman opened this issue Feb 24, 2017 · 47 comments

Comments

@AvivaShimelman

We now have a second list of items for the Malakula recordings. The list is by design unique to Malakula, with items like 'edible flesh of a sprouted coconut', 'tickle the belly of pig in a fashion that makes it lie down and sleep', 'hole in a woven bamboo wall through which the sun shines inside', 'spit-out fiber of sugarcane', and 'take the face of a man whom you have killed' (all elicit simple terms), along with the slightly more universal 'father-in-law', 'breadfruit' and 'wean'. List attached. It includes 300 items, although I am recommending that we use only a subset (decision yet to come from Russell on this). In any case, the work of marking it up is the same. Items are numbered, but that can be ignored. Renumbering wouldn't throw any wrenches in it for me. Elicitation is always in the order listed.

A second issue is that coverage is a bit different. We have all the major speech varieties (45), but each is represented, for now, with a single recording. In the modal case, the speaker has also made a BV (basic vocabulary) list, but that is just because I preferred to work with "known entities" in recording this one. I privileged those speakers from the first round who offered the clearest "central representations" of languages, and not, as I had done generally with the BV recordings, the oldest speakers of marginal varieties. So tagging the new items on to existing recordings is not really an option. Either the first set of items will come up empty with the culture list or the second will come up empty for the BV list. Maybe a second map?

Malakula "culture list" items 24 02 2017.xlsx

@AvivaShimelman
Author

I forgot to mention -
I have photos and short (5-20-second) videos for many items. Video shot in gorgeous, memory-gobbling 4K.
A.

@PaulHeggarty

I spoke to Russell about this yesterday, and he is happy for me to make some decisions on the practicalities of implementing this into SoundComparisons.

  • I can confirm that the culture list will not be merged into the same site as the BV list. It will be a separate section of the site, initially I presume just a separate entry in the drop-down studies list (the best way to do this longer-term is to be determined with @Bibiko).
  • The Culture list section will have its own 'word' list (meanings rather than cognates) but, preferably, not a new language list: just a sub-set of the same language varieties used for the BV list.
  • For me, it's no problem if the speakers are different for the same language variety, because there will be no page on which both the Basic and Cultural lists are mixed up together. Far more important is not to start getting conflicts and confusions between language varieties for the different lists, and not to add new language varieties.
  • What we could do, where a Culture list is available, is to include, in 1 Language table view only, a button to 'show also cultural vocabulary', as a second table under the main one. This could be useful, for example, for searching for more tokens of a given sound within a single language variety.
  • To include photos (and videos?), I think the best means would be one parallel to the one that we already have for speaker photos assigned to individual languages. That is, in the word selector column on the right, a mini-thumbnail would appear whenever there is a photo for that term. Clicking on it would show the full photo.
  • To that end, @AvivaShimelman, could you let us know whether you have photos for all items on the cultural list? They will also have to be given appropriate filenames, but hold off on that until we work out the list of exact names (no special characters) that we'll need.

@PaulHeggarty

PaulHeggarty commented Mar 1, 2017

Timetable for the Cultural list project.

  • Yes, preparation work, as far as possible, should be done while @AvivaShimelman and @LauraWae are still working on the project.
  • However, adding a new list is not trivial at all in how much time it will demand from @PaulHeggarty (a week or so) and @Bibiko (more). Neither of us will get any such free time for many weeks, if not months. So this system will not be up and running for months.
  • This also means that for @AvivaShimelman and @LauraWae too, more important is to finish off all work on the BV list, and any other general corrections and improvements to the site, as a higher priority than the Culture list.
  • I presume, by the way, by this stage, that we will not be getting a Bislama translation done at any time remotely soon.

@AvivaShimelman
Author

AvivaShimelman commented Mar 1, 2017 via email

@AvivaShimelman
Author

AvivaShimelman commented Mar 1, 2017 via email

@PaulHeggarty

What is TLC, please? And do you mean the transcriptions are done, and just need to be uploaded? Or that you will start on the transcriptions once you hear from Laura?

The full listen-through sounds like a very good idea. In an ideal world, you would do this in 'editor mode' #425, but that won’t be working in time, I fear.

Yes, you’re right: the work on transcribing and pre-editing the CL files for me is the same whether the site is ready to receive them or not. So please go ahead, but once the other things are done. The idea of a quickie freelance contract, as and when necessary, sounds good, too.

@AvivaShimelman
Author

AvivaShimelman commented Mar 2, 2017 via email

@AvivaShimelman
Author

@LauraWae
I've created files on my OC for the CL selected recordings and transcriptions and shared them with you. I leave it to you to create (or not) folders on "Malakula" as well. Please have a look at them and let me know what questions you have. It might be easier to Skype, depending. The prompts are written in English and Bislama. In general, I went through the list with consultants beforehand and transcribed their answers, so when we went to the recording, you'll hear me prompt not just the Bislama, but actually the responses, too.

A.

@LauraWae LauraWae changed the title from Malakula "culture list" to Culture List Malakula on Mar 29, 2017
@LauraWae

Hi @AvivaShimelman and @PaulHeggarty

I have created this Praat glossing-list for the basic vocabulary. I think it should work quite fine, and I am happy to hear feedback. Thanks!

uncle_mother_older
uncel_father_older
aunt_father_younger
aunt_mother_younger
cousin_boy_boy_younger
cousin_boy_boy_older
cousin_girl_boy_older
cousin_boy_girl_younger
grandson_son
granddaughter_daughter
nephew_brother
nephew_sister
niece_brother
niece_sister
sister_boy_older
brother_girl_older
brother_boy_younger
father_in_law
mother_in_law
daughte_in_law
not_nameable
boy_firstborn
pregnant_bride
adopted_child
wife_sister_first
wife_sister_all
levirate_wife
respect
share
peace
discussion
consensus
collective_work
namangi
lowest_grade_name
chief
fight
resolve
fine
revenge
help
barter
gift
inherit
part_of_landowner
slave
village
gate
dance_ground
toilet_w
toilet_m
ritual_house_w
nakamal
post
horizontal_beam
rope_beams
natangora_bamboo
sunbeam_wall
hole_in_wall
instrument_laplap
stone_in_laplap
laplap_in_bamboo
leftovers
green_banana_feeling
soot
knife_clam_shell
cup_coconut
cut_before_cooking
elephant_taro
taro_swamp
wild_yam
wild_pig
island_cabage
banana
breadfruit
sugar_cane
sugar_cane_fiber
prepare_yam_garden
mound_yam
yam_sprouted
yam_to_harvest
first_yam
stake_yam
wilkin
garden_last_year
pig_eats_garden
drag_fire
stick_for_garden_digging
coconut_flower
coconut_before_food
coconut_skin
navara
coconut_dry
coconut_leaf_midrib
coconut_leaf_frond
coconut_milk
strong_bamboo
bamboo_ring
bamboo_length
bamboo_stand
tree_fern
hibiscus
natangora_bark
nangalat
bangan_tree
broken_wave
tidal_wave
swell
deep_sea
deepest_place
driftwood
wind_towards_sea
wind_towards_bush
cyclone
sun_shower
earthquake
moon_waning
moon_waxing
time_before_dawn
first_month
wet_season
coral
dead_coral
reef
where_water_enters_sea
star_constelation_to_plant_yam
place_of_thick_vines
indicate_trail_on_tree
horizon
fish_scale
flyingfox_black
flyingfox_white
spider_web
turtle_shell
hawk
owl
maggot
butterfly
octopus
fire_ant
shark
dolphin
firefly
gecko
pig_teeth_twice
pig_tooth_normal
pig_castrated
sow
tickle_pig_belly
hunt_birds_at_night
sling
arrow
arrow_three_points
arrow_miss_target
bow
bostring
catch_shrimp_with_hand
gather_seafood
bambbo_trap
axe_handle
girl_growing_breasts
girl_falling_breasts
girl_to_be_married
boy_before_nakamal
circumcision_ceremony
circumsize
boy_circumcized
shave
mourning_non_core
mourning_core
dead_ceremony
dead_ceremony_100_days
land_of_the_dead
wail
bed_for_dead_body
bury_dead
unnatural_death
cly_effigy
grass_skirt
nambas
belt_made_of_bark
belt
nose_ring
armband
tattoo
make_cord_by_rolling_fibers
pay_bride_price
reserve_a_bride
pullout_teeth_of_bride
dance_around_tamtam
womens_dance
ankelt_for_dancing
stick_beating_tamtam
slit_a_slit_gong
handdrum
conch
panpipe
mask
tall_mask
lisepsep
stone_dancingground
line_of_stones
stone_with_face
holy_place
weave
mat
mat_for_baby
basket
weave_bambu
finish_mat
join_two_mats
dye_leaves
print_mat
mat_from_coconut
leaves_from_coconut_mat
sanddrawing
erase_sanddrawin
cats_cradle
hand_pinching_bird_naming
paddle
figure_head
outrigger
wood_connecting_canoe_and_outrigger
canoe_hollow
sail
God
deads_person_spirit
spirit
devil
Creator
crazy
nightmare
happy
lazy
skull
nipple
thumb
lond_head
defecate
disembowel
albino
medicine
poison
doctor
black_magic
take_the_face_of_another_peson
sore_on_the_bottom
pick_lice
crush_lice
elephantitis
midwife
abortion
adultress
infertile
womb
menstruation
pregnant
give_birth
wean
love
teach
scold
bounce_a_baby
war
warrior
deadly_part_of_head
carry_on_back
carry_between_two_people
show_teeth
sit_to_keep_warm
sit_legs_outstretched
reach_hill
walking_stick
walk_about_at_night
hit_shell
taboos_cross
shout_out_in_the_bush
wooden_headrest
day_after_tomorrow
posion_fish_in_pool
grate
forget
rollers_for_canoe
bambu_to_knock
its_ok
sorry
kava_tray
kava
broom
pantanas
lie
cluck_chicks
pick_all_fruit
limp

@AvivaShimelman
Author

AvivaShimelman commented Mar 30, 2017 via email

@LauraWae

Hi,
Thanks for replying so soon. This is only internal, for tagging in Praat (remember the words we filled in between boundaries? They can't have special characters and should be as simple and as specific as possible at the same time).
All the best,
Laura
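
A minimal sketch of how the "no special characters" rule for the Praat labels could be checked automatically. It assumes that a label is acceptable when it contains only ASCII letters, digits and underscores (which matches the labels in the list above, e.g. `dead_ceremony_100_days`), and that each label should occur only once; both rules and the function name `check_labels` are assumptions, not part of the project's tooling.

```python
import re
from collections import Counter

# Assumption: allowed characters are ASCII letters, digits and underscores only.
LABEL_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def check_labels(labels):
    """Return (invalid, duplicated): labels with disallowed characters, and repeated labels."""
    invalid = [label for label in labels if not LABEL_PATTERN.fullmatch(label)]
    counts = Counter(labels)
    duplicated = sorted(label for label, n in counts.items() if n > 1)
    return invalid, duplicated


# Sample input: the first three labels come from the list above;
# "it's_ok" and the repeated "wean" are deliberately bad, for illustration.
sample = ["father_in_law", "hole_in_wall", "dead_ceremony_100_days", "it's_ok", "wean", "wean"]
invalid, duplicated = check_labels(sample)
print("Invalid:", invalid)
print("Duplicated:", duplicated)
```

Running a check like this over the full list before tagging would flag any label that Praat-side scripts could later trip over.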

@AvivaShimelman
Author

AvivaShimelman commented Mar 31, 2017 via email

@AvivaShimelman
Author

@PaulHeggarty How do you want to deal with the metadata for the CL recordings? Do you want me to make a new "For pasting" sheet based on the one we currently use for the BV lists?

A.

@AvivaShimelman
Author

@LauraWae Please give me a heads-up when you want to start in on the CLs. I'm kind of piling them up here until we figure out exactly how we want to organize ourselves.

A.

@PaulHeggarty

@AvivaShimelman Aren’t the languages for the culture list just a subset of those we already have for the basic vocabulary list? We only need additions to the "For pasting" sheet if you want to add wholly new languages.

@PaulHeggarty

As for the timing on the CLs, bear in mind that I have no time before I start my parental leave to prepare the new culture list for the website. So all that Laura can do until the end of her contract at the end of April is to mark up the lists. There'll be no sound file extraction or uploading for now, until I can get back to this in the autumn, and until we have a new Sound Comparisons administrator to take over from Laura. We just need to leave the recordings and textgrids in the cleanest possible state so that the extraction and uploading can be completed later by somebody new to the process.

@AvivaShimelman
Author

AvivaShimelman commented Apr 11, 2017 via email

@AvivaShimelman
Author

AvivaShimelman commented Apr 11, 2017 via email

@PaulHeggarty

It would be better to change all the file names to the SndComp filenames right away. (But just in case, first make a backup copy of each file with its original name.) Then there is no more renaming to be done by anyone, especially people new to the project who might well be lost in all the filenames!

@PaulHeggarty

We still need some clarifications from @AvivaShimelman on points where we cannot clearly understand what is meant in some of your notes:

"We have CLs for all and only those languages for which we have BV lists."

Do you mean 'languages' in the sense you often use it in, i.e. not as any language variety (including your 'dialect')? Please put numbers on this, so it's clearer. Am I correct in paraphrasing this as:

We do not have CLs for all of the full 135 varieties for which we have BV lists. Rather, we only have CLs for about 40 of those 135 varieties, only for one variety ('dialect') of each of the 40 fully-fledged languages. In other words, we have a CL only for one representative 'dialect' (level 6) of each 'language' (level 5).

@PaulHeggarty

Trying to understand what you want us to do with the Tape: Tautu and Tape: Tautu 2 recordings.

Should we:
(a) Keep these as two separate varieties on the database, distinguished only as 1 and 2.
(b) Merge these recordings into a single variety.

Since (if we understand correctly) these are the same speaker (albeit the second time with two others also commenting), I can’t see why we should do (a), but from how you phrased it ("supplement") it is not clear whether you mean (a) or (b).

@PaulHeggarty

Please also advise us on what we are to do with the other cases where we have a 1 and a 2 recording and the rest of the variety name is identical.

@LauraWae

LauraWae commented Apr 11, 2017

Hi also from my part :)
@AvivaShimelman This is my heads-up to start on the Cultural List. Please start uploading the files to

ownCloud\SndComp\Malakula\2 - LW DA - To Mark up in PRAAT.

Thanks!

@AvivaShimelman
Author

AvivaShimelman commented Apr 11, 2017 via email

@PaulHeggarty

PaulHeggarty commented Apr 11, 2017

Worry: lect/speaker distinction will be lost.

When? By doing what?

@PaulHeggarty

I don’t think idiolects are a reasonable, practical objective for Sound Comparisons, though.

@AvivaShimelman
Author

AvivaShimelman commented Apr 11, 2017 via email

@AvivaShimelman
Author

AvivaShimelman commented Apr 11, 2017 via email

@AvivaShimelman
Author

AvivaShimelman commented Apr 11, 2017 via email

@AvivaShimelman
Author

AvivaShimelman commented Apr 11, 2017 via email

@AvivaShimelman
Author

AvivaShimelman commented Apr 11, 2017 via email

@PaulHeggarty

PaulHeggarty commented Apr 12, 2017

Obviously yes, you'll need to add a tag like _CL (for now), but that's not what I mean by renaming. What I mean is changing file names from your type to the Sound Comparisons type, for example:

  • not: changing Novol-Dixon Reef BV to Novol-Dixon Reef CL
  • rather, changing your wl_0014_Novol_04_DixonReef_CL (or however you name the CL files) to Sound Comparisons format Oce_Van_Mal_West_Ctr_Novol_DixonReef_Dl_CL

Laura will do all this, in any case. You can just leave your original file names as you had them, plus _CL
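
A rough sketch of how such a renaming could be scripted, including the backup copy of each file under its original name. The mapping has to come from the metadata sheet; the single entry below is just the example pair given above. The function name `rename_with_backup`, the backup folder name `original_names` and the assumption that files share a stem across extensions (e.g. `.wav` and `.TextGrid`) are all hypothetical.

```python
import shutil
from pathlib import Path

# Example mapping (original stem -> Sound Comparisons stem), taken from the pair above.
NAME_MAP = {
    "wl_0014_Novol_04_DixonReef_CL": "Oce_Van_Mal_West_Ctr_Novol_DixonReef_Dl_CL",
}


def rename_with_backup(folder, name_map, backup_dirname="original_names"):
    """Copy each mapped file to a backup folder, then rename it in place.

    Files whose stem is not in name_map are left untouched.
    Returns the list of new file names.
    """
    folder = Path(folder)
    backup = folder / backup_dirname
    backup.mkdir(exist_ok=True)
    renamed = []
    for path in sorted(folder.iterdir()):
        if not path.is_file():
            continue
        new_stem = name_map.get(path.stem)
        if new_stem is None:
            continue
        target = path.with_name(new_stem + path.suffix)
        if target.exists():
            # Never overwrite an existing file silently.
            raise FileExistsError(f"{target} already exists")
        shutil.copy2(path, backup / path.name)  # backup keeps the original name
        path.rename(target)
        renamed.append(target.name)
    return renamed
```

Calling `rename_with_backup("some/folder", NAME_MAP)` would rename only the files listed in the mapping and leave everything else as it was.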

@AvivaShimelman
Author

@LauraWae
I think you're linked into my CL folder. Everything is really just as it was for the BV lists. Right now, you'll find the Najit, which we can start with, along with the group it is a part of (Malua Bay, Espiegel's Bay, Tirax, Siviti, Batarxopu, Nese). The Tirax 162 recording does have a "sister" R recording. That, too, will work just like the BV-Rs did, with new entries trumping old ones (although I think I've removed any from the first for which there would be any overlap). I will probably yet revise the transcriptions when I do a final listen, of course. The Wowo/Alavas file will require a bit of explanation, but we'll deal with that when we come to it. That'll be the only one that's funky, I think. Some of the recordings have more bird/child noise than others. Those of course, will have to be clipped to the very edges (not as if you didn't know). Again, in general, I've done almost as much as I'm comfortable with in terms of noise reduction, and I fear that more will distort the sound, but you, of course, can play with it and see if there is magic still to be worked. As before - no sound without transcription or transcription without sound. If either is missing, its pair gets thrown out. Beware of gaps/jumps. Anything else, just ask. We can Skype at some point if that would be helpful.

Good luck!

A.
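
The "no sound without transcription or transcription without sound" rule could be sketched as below. This is only an illustration: it assumes the tagged intervals and the transcriptions have each been loaded into a dictionary keyed by gloss label, and the function name `pair_items`, the time values and the `<transcription>` placeholders are made up.

```python
def pair_items(sound_clips, transcriptions):
    """Keep only labels that have both a sound clip and a non-empty transcription.

    Returns (kept, dropped): kept maps label -> (clip, transcription),
    dropped is the sorted list of labels thrown out.
    """
    kept = {}
    dropped = []
    for label in sorted(set(sound_clips) | set(transcriptions)):
        clip = sound_clips.get(label)
        text = (transcriptions.get(label) or "").strip()
        if clip is not None and text:
            kept[label] = (clip, text)
        else:
            dropped.append(label)
    return kept, dropped


# Hypothetical data: (start, end) times in seconds and placeholder transcriptions.
sound_clips = {"breadfruit": (12.4, 13.1), "wean": (20.0, 20.8)}
transcriptions = {"breadfruit": "<transcription>", "father_in_law": "<transcription>"}

kept, dropped = pair_items(sound_clips, transcriptions)
print("Kept:", sorted(kept))
print("Dropped:", dropped)
```

Here `wean` has sound but no transcription and `father_in_law` has a transcription but no sound, so both are dropped.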

@AvivaShimelman
Author

@LauraWae
wrt file naming
In the future, I'll label files as Paul suggests in his last post. For the ones already uploaded, I'll depend on you. Metadata sheet (mini reference sheet on its way). The labeling system for the files already up is the same as it has always been for the BV lists, except for the prefix. BVs are prefixed with "wl" while CLs are prefixed with "wlc". As before, they all have unique numbers and are labeled for language and dialect (where they do indeed represent a unique dialect).
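
As a sketch, the prefix convention could be read off a file name like this. The pattern assumes names of the form prefix, underscore, number, underscore, rest (as in `wlc_1630064_Siviti_Gonwar_selected`); the function name `parse_recording_name` and the BV example name are hypothetical.

```python
import re
from typing import NamedTuple


class RecordingName(NamedTuple):
    list_type: str    # "BV" or "CL"
    number: str       # the unique file number, kept as a string to preserve leading zeros
    description: str  # everything after the number (language, dialect, etc.)


# "wlc" must be tried before "wl" so that the longer prefix wins.
NAME_PATTERN = re.compile(r"(wlc|wl)_(\d+)_(.+)")
PREFIX_TO_LIST = {"wl": "BV", "wlc": "CL"}


def parse_recording_name(stem):
    """Split a file name (without extension) into list type, number and description."""
    match = NAME_PATTERN.fullmatch(stem)
    if match is None:
        raise ValueError(f"Unrecognised file name: {stem!r}")
    prefix, number, description = match.groups()
    return RecordingName(PREFIX_TO_LIST[prefix], number, description)


print(parse_recording_name("wlc_1630064_Siviti_Gonwar_selected"))
print(parse_recording_name("wl_0014_Novol_04_DixonReef"))  # hypothetical BV file name
```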

@AvivaShimelman
Author

@LauraWae
Attached is a spreadsheet (also uploaded to "reference") listing the CL recordings along with their original file names (mine) and their future SC file names. In those instances (all but 8) where there is a BV recording that corresponds perfectly (i.e., in both language and dialect) to the CL recording, the name is the same, the only difference being that "CL" is appended to the CL file name. The index number is also the same. In the remaining 8 cases, the recording is of the same language but not necessarily the exact-exact same dialect (or, more frequently, not necessarily of a single dialect, being the work of a committee of speakers of different dialects). In these cases, I have followed out the names and indexes as far as is possible (i.e., to level 6) and left the rest to be filled in however Paul specifies (I imagine this will just mean assigning them their own level 5 codes). As all other information -- ISO/Glottolog codes, L&L ... -- will be the same as for the corresponding BVs, I didn't repeat this. I did, however, fill in recording place and date information as well as speaker names, as these may differ. Note that some recordings are appended with an "SC". This indicates a recording that supplements another. On the list, an "SC" recording will figure immediately underneath the base recording that it amends. There are 7 of these. They were made in those cases in which later consultations revealed either that the original speakers had erred or that items about which the original speakers had been unsure were ultimately retrieved. I hope I haven't been too opaque in this.
2017 04 12 CL metadata for SC.xlsx
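
One way to picture how a supplement recording combines with its base recording, with new entries trumping old ones, is a simple dictionary merge. This is only a sketch under the assumption that each recording has been reduced to a mapping from gloss label to transcribed form; the function name `merge_supplement` and the placeholder forms are hypothetical.

```python
def merge_supplement(base, supplement):
    """Return the base entries overlaid with the supplement's non-empty entries."""
    merged = dict(base)
    for label, form in supplement.items():
        if form.strip():  # an empty supplement entry does not erase the base entry
            merged[label] = form
    return merged


# Placeholder forms only; real entries would be transcriptions.
base = {"breadfruit": "base_form_1", "wean": "base_form_2"}
supplement = {"wean": "revised_form_2", "father_in_law": "new_form_3", "breadfruit": ""}

merged = merge_supplement(base, supplement)
print(merged)
```

In this example `wean` takes the revised form, `father_in_law` is added, and `breadfruit` keeps its base form because the supplement has nothing for it.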

@LauraWae

LauraWae commented Apr 19, 2017

Hi,
Thanks for the sheets and also for the clarifications to the gloss list from two weeks ago. I have finished the first two CL recordings now. This means that I have tagged them with Praat. You can have a look at them on

SndComp\Malakula\Culture List\02 Tagged Files and Textgrids.

You will notice that I have split the Malakula folder into two main subfolders - one for the BV, and the other for CL.

With the two files I have worked on so far, nothing out of the ordinary came up. I used your transcriptions to check the spellings. Occasionally, I deleted transcriptions where there was no sound in the pre-selected recording. That's all.

As you have asked in #460, this is my answer: It takes me about 40 minutes to tag a recording.

@AvivaShimelman
Author

AvivaShimelman commented Apr 19, 2017 via email

@LauraWae

LauraWae commented Apr 20, 2017

Hi on day # 3.

Two things: I have just finished wlc_1630064_Siviti_Gonwar_selected. There was a big break between "land_of_the_dead" and "doctor", where transcriptions went in italics. Do I have to take special care about something in that case?

And, secondly: It looks like the recording wlc_1620034_Tirax_02_Mae_selected does not match the transcriptions named "Tirax Lani". I am confused because of that and I wanted to know if those are the right transcriptions I am choosing or if I need to look for them somewhere else.

Thanks a lot in advance.

@LauraWae

Additionally:

Could you please also indicate which of those labels

[image]

is

wlc_1620046-47 Siviti Batarxopu
wlc_1620048 Wowo.

It's somehow not clear to me and I am very grateful for your assistance. Thanks.

@AvivaShimelman
Author

@LauraWae
Right. Don't worry about the italicized items in the Gonwar recording for now.
Tirax_Mae and Tirax_Lani are different recordings. The transcription for Tirax_Mae is labeled "Tirax 163"; the transcription for Tirax_Lani is labeled "Tirax_Lani". I apologize. That could have been clearer. Note that the supplement/revision to the Tirax Mae recording is labeled with the same file number appended with an "R" and file name appended with "Revised".
A.

@AvivaShimelman
Author

@LauraWae
I'll re-label the transcriptions to include the numbers (I hadn't because, in general, unlike with the BV lists, we only have one recording per language for the CLs; we just hit the exceptions first, it seems). I'll be uploading those in a minute or two, so it will be clear which is Siviti and which is Batarxopu. My bad. Wowo is Wowo. More explanation following upload. I'm so glad we're starting in on these! Are you going to be working over the weekend? Do you want a fresh set?
A.

@AvivaShimelman
Author

@LauraWae
So I've re-uploaded the north-center transcription sheet to number the transcriptions. I think it worked before when we went with the rule "Always go with the numbers." So I'll keep to that system. The authority is the metadata sheet. Whatever numbers it gives are the ones we're going with. I'm now re-uploading the Batarxopu and Siviti sound files, re-labeled. Those are unique. They were actually done by the same speaker (mother spoke one language, father spoke the other; the two villages were separated by a stream). So he gives each word two times, the first time in Siviti and then the second time in Batarxopu. It should always be clear. I prompt it. He gives the Siviti. Then I say, "Narasaid" ('other side') and he gives the Batarxopu. In any case, the responses are distinct enough that you should be able to read them off the transcripts.

@AvivaShimelman
Author

@LauraWae
wrt the Wowo/Alavas files. That one was a challenge because the language is really in decline. The principal speaker/first group got a bit more than half. There are three older speakers who, in different sessions, did manage to agree on about 20% more of the list. I recorded two of them. So our entire set for Wowo/Alavas is a bit "composite". It should be pretty transparent, though. There are three recordings uploaded and there are three different columns in the transcriptions. I've grouped the transcriptions in the new "color" sets. In the metadata, these are separated by spaces.

@AvivaShimelman
Author

@LauraWae
I've re-labeled and re-uploaded the Wowo/Alavas sound files. I don't know if things got better or worse! I've named the files as in the metadata sheet (for example wlc_1620050_Wowo/Alavas_Lesmarlas) but they appear on oc without the numerical prefix (example: Wowo_Lesmarlas_selected). They're still unique, so it shouldn't be hard to tell them apart (I hope). 48 is Wowo, 49 is Alavas and 50 is Lesmarlas

@LauraWae

Aviva's comments from the 23rd and 24th of April in #458

@LauraWae
Mornin'! I uploaded the northern set (Wowo-Alavas, V'ao). V'ao has got a lot of bird noise in the background (semi-outdoor recording at 17:00). The speaker does generally manage to beat the birds when he's talking, but the darn chirpers can be heard immediately at the edges. (Sorry!). There's a set of about 20 items that I've reserved in a separate column (it'll be obvious). There's nothing that has to be done with those right now.

A.


@LauraWae
Morning! Are you going to need more recordings to play with today? I've put Avava, Fifti and Tasmbol in the common folder. Please, please, please do them in that order and take them one by one, i.e., only as you do them. That way, if there have to be changes, those can be made before anything gets into the gears.
I forgot to say with regard, in particular to the V'ao recording:
Sometimes consultants give two different forms. Both will appear in the transcription and on the recording. In the transcriptions, I separate alternative forms with commas. This is the only purpose for which I use commas, so a quick, old-fashioned search will pick them out if you worry you didn't catch them. I think I remember Paul saying that there is a way to register two different responses. If there isn't, I've put my first pick, not counter-intuitively, first. There are never too many of these (anywhere from 0-6 in any recording?). It can be important to record both in cases, for example, where one form is cognate with those in neighboring languages to one side and the other with neighboring languages on the other side. I'll be going to sleep soon so I'm afraid I won't be able to respond to emails until all the hustle and bustle of "Jena Tuesday" has slowed.
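
Since commas are used only to separate alternative forms, picking them out can be sketched in a few lines. The function names `split_alternatives` and `find_alternatives` and the placeholder forms are hypothetical; the only thing taken from the note above is that the first form listed is the preferred one.

```python
def split_alternatives(cell):
    """Split a transcription cell on commas; the first form is the preferred one."""
    return [form.strip() for form in cell.split(",") if form.strip()]


def find_alternatives(transcriptions):
    """Return only the entries that contain more than one form."""
    result = {}
    for label, cell in transcriptions.items():
        forms = split_alternatives(cell)
        if len(forms) > 1:
            result[label] = forms
    return result


# Placeholder forms only; real cells would hold transcriptions.
transcriptions = {"breadfruit": "form_a", "wean": "form_b, form_c"}
alternatives = find_alternatives(transcriptions)
print(alternatives)
```

Whether the second form can also be registered on the site is a separate question; if not, only `forms[0]` would be used.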

@AvivaShimelman
Author

@LauraWae
Long story short: I'm adding a second Neverver recording. But -- don't be daunted! -- it's a partial file.

Right now we have the Limap dialect on our list. I was just reviewing it and I realized that I only recorded those items where Limap differed at all from the other dialect I recorded, Mindu. It was Mindu that I had recorded completely. With Limap, we had reviewed the whole set and then, when it came time to record, the man -- who was quite elderly -- just started to get very tired, so I picked out those where the two dialects differed in any regard -- maybe a quarter. That said, it's already pre-edited and transcribed, so there's no reason not to use it, I don't think.

@AvivaShimelman
Author

@LauraWae
Is today really the end of your contract? Do you know when you'll have a replacement? Whenever that will be, I guess, I'll just stockpile recordings and transcriptions for them. Good luck on your next stop! I'd be glad if you dropped a line every once in a while.
A.
