Amaravati: Abode of Amritas

15.9.5.23:59: A FIRST LOOK AT SUN 2004 AND SHI 2013

Two days ago, I had the opportunity to glance at Sun Bojun's 金代女真语 (Jin Dynasty Jurchen; 2004), the first monograph on the subject that I have ever seen, and Shi Jinbo's 西夏文教程 (A Tangut Course; 2013).

I was disappointed to not see any examples of Jurchen script in Sun's book. Perhaps I just missed them.

I did, however, see many formulae of the type

*Jin Jurchen > [standard written] Manchu

which I think should have been rewritten as

*Jin Jurchen : [standard written] Manchu

since it is clear that the Jin Jurchen forms cited (which are not homogeneous: e.g., *weike/oho/uyike 'tooth') are often not directly ancestral to the standard written Manchu forms, though there is no doubt of cognacy: e.g., there is no regular sound change *s > j to justify deriving Manchu bujan 'forest' from Jin Jurchen *busan (transcribed in the History of the Jin Dynasty in Chinese as 僕散 *pusan*).

The section of Shi's book that interested me the most was the list of Tangut verbs with stem alternations on pp. 333-338. Shi grouped those 44 verbs into four classes which end in the following rhyme types in my Tangut transcription:

I. -i/-o (11 verbs)

II. -u/-o (12 verbs)

III. -i/-y (6 verbs)

IV. -e/-i (11 verbs), -on/-en (1 verb), -i/-o (again!; 3 verbs), -u1/-u2 (1 verb; the only case of grade alternation)

I don't understand why he grouped four types of verbs in class IV, including -i/-o verbs that should have been in class I.

*9.6.1:28: 僕 'servant' had *b- (< *N-ph-?) in the Middle Chinese lexicographical tradition but has [pʰ] in modern standard Mandarin. However, its initial was transcribed as

<b>

in the Khitan small script, implying that its Liao Chinese initial may have been *p-, the regular reflex of Middle Chinese *b-. Jin Chinese is presumably the descendant of the northeastern dialect that I call Liao Chinese, so I assume 僕 also had *p- in Jin Chinese. Liao and Jin Chinese voiceless unaspirated stops correspond to stops transcribed as voiced in Khitan and Jurchen. (Those Khitan and Jurchen consonants may have been voiced unaspirated despite their transcription: e.g., <b> may have been [p].) The modern standard Mandarin reading [pʰu] for 僕 may descend from a bare Old Chinese root *phok without a nasal prefix.

15.8.29.21:55: PÓKKÎI RÓT CHƆ́KKOOLƐ́T

Today I saw boxes of chocolate-flavored Pocky (Japanese Pokkii) labeled in Thai as

ป๊อกกี้รสช็อกโกแลต

Pókkîi rót chɔ́kkoolɛ́t 'Pocky flavor chocolate'

กูลิโกะ

Kuulíkòʔ 'Glico'

I wonder

- how tones are assigned to Japanese loanwords in Thai

- were the tones of Pókkîi assigned by analogy with the English loanwords like ร็อคกี้ Rókkîi 'Rocky'?

- why chɔ́kkoolɛ́t has a long oo absent in English

- is the long ee of Japanese chokkoreeto due to the assumption that chocolate rhymes with late?

- why Kuulíkòʔ 'Glico' (a compromise between Japanese Guriko and its Anglicized form Glico) has a long first vowel

15.8.22.23:55: TANGUT 'TALLY MARKS' AND PAHAWH KHMU

For nearly a century, Tangutologists have devised various systems of 'radicals' to index Tangut characters. These modern radicals do not necessarily correspond to the components implied by the analyses in Tangraphic Sea. No explicit premodern list of Tangut character components was known prior to the discovery of the book known in Mandarin as 擇要常傳同名雜字 Zeyao changchuan tongming zazi, translated by Andrew West as 'Essential Selection of Often Transmitted Homonyms and Mixed Characters'. Andrew has written an article on the publicly available pages of this book.

The most interesting aspect I have seen so far is the list of radicals (?) which may be titled

3583 0706 5865 1084 2403 0092 1ta4? 2vi1? 1soq1 2ghaq1 2di4 1ma4 'TOPIC? rhyme? three ten character mother'

The first two characters are difficult to identify. I have followed Andrew's interpretation here.

The use of a topic suffix as the first part of a title is baffling. 3583 is a common fanqie speller for Tangut rhyme 1.20 1-a4. Could the title mean 'thirty letters for the rhyme 1-a4'? But why would there be thirty ways to represent the same rhyme? There is no obvious correlation between these thirty elements and the rhyme 1-a4, and as we will soon see, some of the thirty elements do not even occur in any Tangut characters. Another possibility is that 3583 is a phonetic symbol for a word ta, as the graph can transcribe the Sanskrit syllables ta and tā. But what would that ta mean? What if ta is a non-Tangut word in the Tangut script?

The first eleven elements look like tally marks:

1-4: one dot, vertical stacks of two to four dots

5-8: one horizontal line, vertical stacks of two to four horizontal lines of the same length

9-11: one vertical line, horizontal clusters of two to three vertical lines of the same length

Tangut characters do contain single dots (1), horizontal lines (5), and vertical lines (9), but the combinations of those elements (2-4, 6-8, 10-11) are unknown in Tangut. (When two horizontal lines are vertically stacked in Tangut, the top line is shorter than the bottom line, whereas 6 has lines of equal length.)

Why include these un-Tangut 'tally marks'? Andrew wrote (emphasis mine):

At present it is a mystery to me as to what these thirty "letters" are intended to represent, and whether they represent rhymes in Tangut or some other language.

What if 3583 1ta4 is the name of a language? A name related to the source of the exonym Tangut (which does not resemble any Tangut autonym)? What if the 'tally marks' were used to write 'Ta' but not Tangut? What if Ta and Tangut had scripts with partially overlapping components?

I am reminded of Shong Lue Yang who invented the Pahawh Hmong and Pahawh Khmu scripts for Hmong and Khmu. As far as I know, Pahawh Khmu has disappeared without a trace. William Smalley (1990: 195) wrote:

All we know about the form of the Pahawh Khmu' is that Chia Koua Vang has seen it, and that the letters were like the Pahawh Hmong letters, which leaves us no way of evaluating it as a writing system. It may not have been preserved.

Hmong and Khmu have very different phonologies, so I assume that Pahawh Khmu had graphemes absent from Pahawh Hmong and vice versa. Would a list of graphemes of both of Shong Lue Yang's scripts resemble the Tangut list of 'thirty letters' in the sense that it would be a mixture of familiar and alien components?

15.8.12.3:06: MONOMASTICS

Old Chinese, Written Tibetan, Old Burmese, and Tangut all have the same vowels in 'one' and 'name' (hence the title). Why is Pyu the odd man out?

Gloss	Old Chinese	Written Tibetan	Old Burmese	Tangut	Pyu
one	*Cɯ-tek	gcig	tac	1lew1 < *Cʌ-tek	taṃ
name	*Cɯ-meŋ	ming	mañ	2me'4 < *Cɯ-meXH	mi

The phonetic value of the Pyu grapheme that I transliterate as ṃ is uncertain.

Here are four possible explanations for why Pyu has different vowels in those two words.

1. The heights of a and i were conditioned by different presyllabic vowels. If pre-Pyu had *low and *high presyllabic vowels in 'one' and 'name', the main vowels might have harmonized with them: e.g.,

*Cʌ-tek > taṃ (*e lowered to *a after low *ʌ)

*Cɯ-meŋ > mi (*e raised to *i after high *ɯ)

2. Matisoff (2003) reconstructed Proto-Tibeto-Burman *tyak as well as *g-t(y)ik. Putting aside the problem of whether PTB even existed (I don't think it did), one might say that Pyu taṃ is from *tyak (which in Indo-European-like terms could be an a-grade form of *tik).

3. The different vowels in Pyu reflect the presence or absence of something corresponding to the mysterious pre-Tangut feature that I write as *X and conventionally place after the vowel, though I do not know its location.

4. Pyu had asymmetrical developments of vowels after homorganic codas. Such asymmetrical development later occurred in Burmese:

-ac > [iʔ]
-añ > [e] and [ɛ] as well as [ĩ] and [i]

Mandarin also has similar instances of asymmetrical development: e.g.,

*-ak > -e, -o, -uo (generally depending on initial, but note how 樂 *lak became le whereas 洛 *lak became luo; is this apparent split due to dialect mixture?)

*-aŋ > -ang

In these particular examples, vowels before *nasals are lower than those before *stops: cf. how French -in has a vowel lower than the vowel in French -it.

However, Burmese also has an example of the opposite phenomenon:

*-ak > [ɛʔ] (which has no nasal counterpart [ɛ̃])

*-aŋ > [ĩ]

Has anyone studied asymmetrical development across languages?

8.12.4:48: A fifth possibility is that Pyu preserves a vocalic distinction lost in the other four languages.
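The first explanation above (height harmony between the presyllabic vowel and the main vowel) can be sketched as a tiny function. This is only an illustration of that one hypothesis: the function name `harmonize` and the two height classes are my own labels for this sketch, and it covers nothing but the main vowel *e.

```python
# Height classes for presyllabic vowels (an assumption of this sketch,
# following explanation 1: low *ʌ in 'one', high *ɯ in 'name').
LOW_PRESYLLABIC = {"ʌ"}
HIGH_PRESYLLABIC = {"ɯ"}


def harmonize(presyllabic_vowel: str, main_vowel: str) -> str:
    """Return the main vowel after height harmony with the presyllable."""
    if main_vowel != "e":
        return main_vowel  # the sketch only models what happens to *e
    if presyllabic_vowel in LOW_PRESYLLABIC:
        return "a"  # *e lowered after low *ʌ, as in 'one'
    if presyllabic_vowel in HIGH_PRESYLLABIC:
        return "i"  # *e raised after high *ɯ, as in 'name'
    return main_vowel


print(harmonize("ʌ", "e"))  # 'one':  *Cʌ-tek
print(harmonize("ɯ", "e"))  # 'name': *Cɯ-meŋ
```

The sketch says nothing about the codas, which would need separate changes under any of the explanations.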

15.8.8.23:40: ORIENTALISTA ÉS MONGOLISZTIKA

I saw those words in the Hungarian Wikipedia entry on Kara György (a.k.a. George Kara) and wondered

- when is foreign s borrowed as Hungarian s [ʃ], and when is it borrowed as sz [s] - do the two correspondences indicate different strata?

- why is foreign s treated differently in orientalista 'Orientalist' and mongolisztika 'Mongol studies' - or mongolista 'Mongolist' and germanisztika 'German studies', etc.?

15.7.31.23:31: DID FINAL *-P CONDITION LABIAL FLIGHT IN TANGUT?

So far, I have been using 'labial flight' to refer to the loss of labiality in pre-Tangut syllables with labial onsets and codas such as *mbjvm 'to fly'. The labial codas in my examples - actual or hypothetical - have been either *-m or *-w (< *-k, *-ŋ). But pre-Tangut had one more labial coda: *-p. Did it also condition labial flight?

Here is the fate of pre-Tangut *-p according to Guillaume Jacques (2014: 206):

1. *-ap > -a (lost completely)

2. *-ip, *-up > (*-əp?) > -ə (= my -y; lost completely; I propose a merged intermediate stage *-əp)

3. *-op > -ew (lenited to -w; labial *o dissimilated to e before a labial - a vocalic variety of labial flight)

Both Guillaume and I reconstruct the same six vowels (*u *i *a *ə *e *o) for pre-Tangut. In theory, one might expect *-ep and an *-əp (that merged with *-ip and *-up?), but Guillaume only reconstructed three possible codas for *e (*-ej, *-en, *-eŋ; see p. 207) and no codas for *ə. The unbalanced distribution of vowels and codas in his pre-Tangut deserves further study. (I never worked out all the possible combinations of vowels and codas in my pre-Tangut.)

If labial flight - of the consonantal type - is real, I would expect *Pop to become Pe1 (cf. *mew > 1me1 'eye'). *Pjop might merge with *Pjaŋ and become Cwo3 (cf. *mbjvm > 1jwon3 'to fly').

Guillaume identified only one example of *-op:

3299 1lwew1 < *P-lop 'vapor'; cf. Japhug tɤ-jlɤβ, Situ ta-jlôp < *jlɔp

The pre-Tangut prefix and onset are mine; Guillaume only reconstructed the rhyme *-op. My pre-Tangut *P- conditions Tangut medial -w-.

Next: Did syllables like *P(j)op exist in pre-Tangut?

15.7.30.23:59: A CHRONOLOGY OF LABIAL FLIGHT IN TANGUT

In my previous post, I made a few references to the order of changes that Guillaume Jacques (2014: 199) and I proposed. Perhaps a table would be easier to understand:

Type of 'labial flight'	*Pj-m ('to fly')	*P-w < *P-k ('eye')	*Pjaŋ (examples?)	*PV-jvm (examples?)	*PV-jaŋ (examples?)
Stage 1: early pre-Tangut	*mbjvm	*mek	*Pjaŋ	*PV-jvm	*PV-jaŋ
Stage 2: velar coda lenition		*meɣ > *meɰ	*Pjaɰ		*PV-jaɰ
Stage 3: labialization of glide		*mew	*Pjaw		*PV-jaw
Stage 4: labial initial-coda dissimilation	*ǰwvm	*mej	*Cwaw		*PV-jaw
Stage 5: presyllable-initial fusion	*ǰwvm	*mej	*Cwaw	*Pjvm	*Pjaw
Stage 6: Tangut	1jwon3	1me1	Cwo3	Pon4	Po4

Notes:

The five types: So far I only know of one example each for the first two ('to fly' and 'eye'). The other three are theoretical. *P represents any labial consonant.

Stage 1: *v is Guillaume's notation for a non-*i-vowel. Japhug has o and Proto-Lolo-Burmese has *a in 'to fly', so I think the pre-Tangut word might have been *mbjom or *mbjam.

Stages 2-3: The weakening of velar codas may have also occurred in the northwestern Chinese dialect known to the Tangut.

The rare velar glide *ɰ (found in only 2.66% of UPSID's languages), which occurred only in coda position, shifted to the more common labial glide *w, which could also occur in other positions.

Stage 4: Dissimilation only occurred within the same syllable. Presyllabic labial onsets followed by syllables ending in labials remained intact.

I use the symbol *ǰ to represent a pre-Tangut affricate that could have been [dʑ], [dʒ], or [dʐ]. I think pre-Tangut palatals became retroflexes at some point before stage 6. *C represents the consonants *č, *čh, and *ǰ that became Class VII initials in Tangut.

The glide in *Cw- from *Pj- could have been phonetically [ɥ] if preceded by a palatal onset.

Stage 5: *PV-j- fused into *Pj-, filling the void left by *Pj- that dissimilated to *Cw-.

Stage 6: *-m nasalized the preceding vowel before being lost. *-m also conditioned the rounding of nonlabial vowels preceding it: e.g., *-am > *-om > -on [õ].

1me1 might have still ended in a glide [j].

The monophthongization of *-aw has partial parallels in the northwestern Chinese dialect known to the Tangut. See Gong (2002: 374-376) for details.
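Stages 2-4 of the table can also be sketched as ordered rewrite rules. This is a rough illustration under stated assumptions, not a full model: the function names and the `PALATALIZED` mapping are hypothetical, forms are plain strings without the asterisk, only same-syllable cases are handled, and presyllable fusion (stage 5) and the Tangut outcomes (stage 6) are left out.

```python
import re

# Outcomes of labial onsets before *-j- (assumed for this sketch;
# "č", "čh", "ǰ" stand for the sources of Class VII initials).
PALATALIZED = {"p": "č", "ph": "čh", "b": "ǰ", "mb": "ǰ"}


def lenite_velar_coda(form: str) -> str:
    """Stage 2: final *-k and *-ŋ weaken to the velar glide *-ɰ."""
    return re.sub(r"[kŋ]$", "ɰ", form)


def labialize_glide(form: str) -> str:
    """Stage 3: coda *-ɰ shifts to the labial glide *-w."""
    return re.sub(r"ɰ$", "w", form)


def dissimilate(form: str) -> str:
    """Stage 4: labial onset and labial coda in one syllable dissimilate."""
    m = re.fullmatch(r"(mb|ph|p|b|m)(j?)(.+?)([mw])", form)
    if m is None:
        return form  # no labial onset + labial coda: nothing to do
    onset, glide, nucleus, coda = m.groups()
    if glide and onset in PALATALIZED:
        # *Pj- > *Cw-: the labiality moves into the medial
        return PALATALIZED[onset] + "w" + nucleus + coda
    if not glide and coda == "w":
        # no *-j-: the coda loses its labiality instead (*mew > *mej)
        return onset + nucleus + "j"
    return form


STAGES = [lenite_velar_coda, labialize_glide, dissimilate]


def derive(form: str) -> list[str]:
    """Return the form at stage 1 followed by its shape after each stage."""
    history = [form]
    for stage in STAGES:
        history.append(stage(history[-1]))
    return history


for start in ("mek", "mbjvm", "pjaŋ"):
    print(" > ".join("*" + f for f in derive(start)))
```

Because the rules are applied in list order, swapping `labialize_glide` and `dissimilate` would leave 'eye' stuck at *meɰ, which is one way to see why dissimilation has to postdate the weakening of the velar codas.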

7.31.0:41: I forgot to explain the grades:

- Pre-Tangut syllables with *-j- generally became Tangut Grade IV syllables. Exceptions with Grade III had Class VII initials (either primary or secondary).

In the past I have reconstructed Grade IV with a medial -i-, and Gong reconstructed Grade III (equivalent to my Grades III and IV) with a medial -j-. However, Tibetan transcriptions of Tangut do not strongly support a palatal interpretation of Grades III and IV.

- Pre-Tangut syllables with *e developed Grade I unless followed by a high-vowel presyllable.

15.7.29.23:49: WAS DISSIMILATION THE MOTIVE FOR LABIAL FLIGHT IN TANGUT?

Last night, I forgot to mention why this unusual sound change proposed by Guillaume Jacques (2014: 199)

*mbj- > dʑ- (should this be dʑjw- = my jw-3?)

which could be formulated more generally as

*Pj- > Cw-3 or *Class I-j- > Class VII-w-3

might have occurred before any non-*i-vowel followed by *-m and *-aŋ.

With two exceptions of possible foreign origin below*, labials do not occur before -w in my Tangut reconstruction**:

2313 1pew4 'poor' (only in dictionaries) and 3412 2mew4 'the name Mew; transcription character for Sanskrit myak'

So I suspect that pre-Tangut had a constraint against *PVP syllables with labial onsets and codas.

Such a constraint also exists in modern Cantonese. Earlier *PVP sequences have become PVT: e.g.,

梵 Early Middle Chinese *buam > Cantonese faan

法 Early Middle Chinese *puap > Cantonese faat
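The Cantonese constraint can be sketched as a single coda rule. This is a minimal illustration, assuming a simplified romanization in which the onset is the first letter and the coda is the last; the names `LABIAL_ONSETS`, `CODA_SHIFT`, and `dissimilate_coda` are hypothetical. It models only the coda change, so its outputs are not attested forms (the onset and vowel changes that lead to faan and faat are ignored).

```python
LABIAL_ONSETS = ("p", "b", "m", "f")
CODA_SHIFT = {"m": "n", "p": "t"}  # labial coda -> coronal coda


def dissimilate_coda(syllable: str) -> str:
    """Turn a labial coda into a coronal one after a labial onset."""
    if (
        len(syllable) > 1
        and syllable.startswith(LABIAL_ONSETS)
        and syllable[-1] in CODA_SHIFT
    ):
        return syllable[:-1] + CODA_SHIFT[syllable[-1]]
    return syllable  # no labial onset or no labial coda: unchanged


print(dissimilate_coda("buam"))  # coda of 梵 *buam
print(dissimilate_coda("puap"))  # coda of 法 *puap
print(dissimilate_coda("kam"))   # hypothetical non-labial onset: coda kept
```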

In Cantonese, the coda became nonlabial, whereas in Tangut, the coda disappeared entirely (or at least became nonlabial) in

4684 1me1 ([mej]?) < *mew < *mek 'eye'

and the onset became a palatal-labial cluster in

2262 1jwon3 < *mbjvm 'bird/to fly'.

'Eye' indicates that dissimilation postdated the weakening of *-k to *-w.

Guillaume did not provide any examples of labials becoming palatals before *-aŋ. Given that *-aŋ became Tangut -o (Jacques 2014: 193), there might have been an intermediate *-aw phase that predated dissimilation:

Cwo3 < *Cwɔ < *Cwaw < *Cwaɰ < *Cwaŋ

The velar codas *-k (in 'eye') and *-ŋ may have merged into a velar glide *-ɰ that became a labial glide *-w conditioning dissimilation in labial-initial syllables.

*7.29.23:57: 2313 is a rare word without any known etymology. It may have been borrowed from my (hypothetical) substratum 'Tangut B' language after dissimilation (see above).

The name Mew written as 3412 may also be of Tangut B origin.

**7.30.0:18: Guillaume uses Gong's reconstruction which has far more -w than mine. Gong's -w corresponds to my -n (symbolizing nasalization and not a coda [n]) after o in his rhyme group XI (rhymes 56-60 and 97-98):

Rhyme	Gong	This site
56	-ow	-on1
57	-iow	-on2
58	-jow	-on3/-on4
59	-ioow	-on'2
60	-joow	-on'3/-on'4
97	-owr	-orn1
98	-jowr	-orn4

However, Gong and I agree that his rhyme group IX (rhymes 44-49 and 93-94) had -w:

Rhyme	Gong	This site
44	-ew	-ew1
45	-iew	-ew2
46	-jiw	-ew3, -ew4
47	-jiw	-iw3, -iw4
48	-eew	-ew'1
49	-jiiw	-iw'3, -iw'4
93	-ewr	-ewr1
94	-jiwr	-iwr4

2313 and 3412 are the only examples of labial-initial syllables in rhyme group IX.

15.7.28.23:57: SEVEN FROM ONE?: LABIAL FLIGHT IN TANGUT

In my last entry, I asked which meaning of

2262 1jwon3 'bird/to fly'

was older. It seems that 'to fly' might be older, since Guillaume Jacques (2014: 199) compared 2262 1dʑjow (sic; should be 1dʑjwow) = my 1jwon3 to

(the last syllable of?) Japhug nɯqambɯmbjom 'to fly'

Proto-Lolo-Burmese *(b)-yam 'to fly'

Written Burmese pyaṃ 'to fly'

and reconstructed pre-Tangut *mbjvm (in which *v could be any vowel other than *i). Although *mbj- would normally become bj- (= my b- + Grade IV), Guillaume proposed the sound change

*mbj- > dʑ- (should this be dʑjw- = my jw-3?)

before *-vm and *-aŋ. He noted there was no Tangut *bjow (= my *bon4) and few examples of -jow (= my -on3/-on4) after labials. I know of only two examples:

5954 2porn4 'luxuriant, exuberant' and 0421 2phon4 (mantra transcription character)

0421 is not for native words, so it has no pre-Tangut source.

On the one hand, if *mbj- became j-, wouldn't other *Class I (labial)-j-sequences also become Class VII initials*?

On the other hand, if Class I (labial)-j-sequences became Class VII initials, why does 5954 still have a labial initial?

I propose the following changes to solve that conundrum:

1. *pj- > chw-3 (2331 2chwon3 'to contribute'?)

2. *(m)bj- > jw-3 (2262 1jwon3 'to fly')

3. *pV-j- > *pj- > p-4 (5954 2porn4 'luxuriant, exuberant'?)

New *Pj-sequences from old *PV-j- sequences replaced old *Pj-sequences that became *Cw-sequences (*C = Class VII initial):

Stage 1	Stage 2	Stage 3
*Pj-	*Cw-	Cw-3
*PV-j-	*Pj-	P-4

*phj- and *mj- would hypothetically become chhw-3 and nw-4 (via *ɲw-), but Tangut has no *chhwon3 or *nwon4.

*(m)bV-j- would hypothetically become b-4, but Tangut has no *bon4 because presyllables probably did not have voiced or prenasalized initials.
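The relative order of the two changes matters, and a small sketch makes that concrete. It is restricted to the voiceless unaspirated labial, works on onsets only, and uses hypothetical function names; it is meant as an illustration of the ordering argument, not as a full account.

```python
def old_pj_to_cw(onset: str) -> str:
    """*Pj- > *Cw- (here only *pj- > *chw-)."""
    return "chw-" if onset == "pj-" else onset


def presyllable_fusion(onset: str) -> str:
    """*PV-j- > *Pj- (here only *pV-j- > *pj-)."""
    return "pj-" if onset == "pV-j-" else onset


def apply_in_order(onset: str, changes) -> str:
    """Apply each sound change once, in the order given."""
    for change in changes:
        onset = change(onset)
    return onset


proposed = [old_pj_to_cw, presyllable_fusion]        # order argued for above
reversed_order = [presyllable_fusion, old_pj_to_cw]  # hypothetical alternative

for onset in ("pj-", "pV-j-"):
    print(onset, apply_in_order(onset, proposed), apply_in_order(onset, reversed_order))
```

In the proposed order the two sources stay distinct (chw- versus pj-), so a labial initial like that of 5954 can survive. In the reversed order both end up as chw-, and no labial would be left before Grade IV.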

7.29.12:19: If 0421 were a native word, I could propose

4. *phV-j- > *phj- > ph-4

but I doubt that aspirates were permissible in presyllables. I expect presyllables to only have a subset of segments that are permissible in the syllables that follow them.

*Guillaume follows Gong and reconstructs Class VII as palatal, but I prefer to regard it as retroflex. My notation is not IPA and can accommodate either interpretation: e.g., j- may be palatal [dʑ] or retroflex [dʐ].

15.7.27.22:30: ENGLISH FLIES FLY; TANGUT BIRDS BIRD

While looking for examples of the Tangut directional perfective prefix 1a0- 'up-', I found this example from volume 10 of the Tangut translation of the Golden Light Sutra in Li Fanwen (2008: 942):

1364 1136 5981 2262 4342 2511 1nga1-2gu1 1a0-1jwon3 2da4-2ryr4

'void-in PERF^up-bird/fly PERF^away-go out/arise' = '... flew up and away into the air'

It corresponds to Chinese 空中飛騰而去, lit. 'void-in fly-rise and leave' (see the context here).
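Lines like the one above pair Li Fanwen numbers with readings written as tone digit + segments + optional prime (') + grade digit, with hyphens joining syllables within a word. A small parser can check that the two halves line up. This is a sketch under assumptions: the names `Syllable`, `parse_reading`, and `align` are hypothetical, segments are taken to be lowercase ASCII letters, and grade 0 is treated as 'unknown grade'.

```python
import re
from typing import NamedTuple


class Syllable(NamedTuple):
    tone: int       # 1 or 2
    segments: str   # consonants and vowel, e.g. "jwon"
    prime: bool     # True if the rhyme is written with -'
    grade: int      # 1-4, or 0 for an unknown grade


READING = re.compile(r"([12])([a-z]+)('?)([0-4])")


def parse_reading(reading: str) -> Syllable:
    """Split one reading such as 1jwon3 into its parts."""
    m = READING.fullmatch(reading)
    if m is None:
        raise ValueError(f"not a single well-formed reading: {reading!r}")
    tone, segments, prime, grade = m.groups()
    return Syllable(int(tone), segments, prime == "'", int(grade))


def align(numbers: str, readings: str) -> list[tuple[str, Syllable]]:
    """Pair each character number with its syllable, one to one."""
    ids = numbers.split()
    syllables = [parse_reading(r) for r in re.split(r"[\s-]+", readings.strip())]
    if len(ids) != len(syllables):
        raise ValueError(f"{len(ids)} numbers but {len(syllables)} syllables")
    return list(zip(ids, syllables))


for number, syllable in align(
    "1364 1136 5981 2262 4342 2511",
    "1nga1-2gu1 1a0-1jwon3 2da4-2ryr4",
):
    print(number, syllable)
```

A reading in which two syllables have run together, or a line with more numbers than syllables, raises a `ValueError` instead of being silently misaligned.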

2262 1jwon3 can be either a noun 'bird' or a verb 'to fly'. Which meaning is primary? Which meaning is older? (The answer to those two questions may not be the same; a newer usage can outnumber an older one.)

5981 1a0 can also mean 'one' before nouns. Can 5981 2262 1a0 1jwon3 ever mean 'one bird' instead of 'flew'? If we did not have the Chinese edition, would it be possible to translate that line as 'a bird rose into the air'?

I would like to see more examples of PERF-V PERF-V sequences.

15.7.26.23:56: A COLLECTION OF DESIRABLE DIRECTIONS

Thanks to Andrew West for drawing my attention to 3349 (last seen here) in this line from the preface to the Pearl in the Palm:

1319 3349 5981 4018 1326 0478 1tshi1 2rer4 1a0 2chhi3 1ky4-1sho'2

'? ? one root/basic/book PERF-collect'

Nishida (1964: 187):

必要な事柄を一根に集めた。

hitsuyō-na kotogara wo ikkon ni atsumeta.

'collected important matters into one root.' (my translation of his Japanese)

'All the important aspects have been gathered into this one basic text' (the English translation accompanying his Japanese translation)

The last four words are straightforward: 'collected a book'. 5981 1a0 here is 'one'* (or - by coincidence in English - 'a'!) and not the directional perfective prefix 'up'.

The first two words are more troublesome.

Nishida (and Kychanov and Arakawa 2006: 462) regarded 1319 1tshi1 as an adjective 'important' (though adjectives normally follow rather than precede nouns in Tangut!), whereas Li Fanwen (2008: 220) regarded it as a verb 'to desire, want'.

It doesn't make sense to interpret 3349 2rer4 as 'direction' after 'important' or 'to desire, want'. Nishida translated it as 'aspects' in English and treated it as the object of the verb in his Japanese translation. There is no Tangut postposition corresponding to the Japanese locative postposition ni in his translation.

I would like to see more examples of constructions like this.

*7.27.1:01: It is curious that Tangut shares a 'one' with the Qiang languages but not with Pumi which may be its closest living relative according to Jacques (2014).

15.7.25.23:05: MANUAL METAL ACTION?: TANGUT 2CHYR'3 'TO SHOOT'

When looking up 3468 in Li Fanwen (2008), I found his neighboring entry for

3471 2chyr'3 'to shoot' (= left of 3485 1laq1 'hand' + 'metal' (< top of 1shon3 'iron') + 5113 1vi3 'to do'?)

whose analysis is unknown. (Above is my guess. The combination of elements on the right side of 3471 is unique to that tangraph.)

Li listed 3471 as a Chinese loanword. But the closest word in the Chinese dialect known to the Tangut was 射 *3sha3 < *3zha3 < *m-lak-s 'to shoot', and it would have been borrowed as *sha3 or *zha3 (omitting unpredictable tones), not 2chyr'3 which I assume is a native word from pre-Tangut *RcəXH or *cərXH:

- *R- could be a dental stop or *l- that lenited to preinitial *r- as well as *r-; *r- and *-r conditioned retroflexion of the vowel before disappearing

- *c could have been *[c], *[tɕ], *[tʃ], or *[tʂ] (though I suspect retroflexion was a late phenomenon in Tangut)

- *-X symbolizes the source of the unknown phonetic quality that I transcribe with the prime symbol (-')

- *-H is the glottal source of the second ('rising') tone

I can't narrow down the possibilities because I can't find any strong candidate for an outside cognate. Pumi has khətʂhɑ (with tones depending on variety) 'to shoot' with aspiration absent in Tangut (my ch may have been unaspirated [tʂ]) and nothing corresponding to Tangut vowel retroflexion. Pumi may be Tangut's closest living relative (Jacques 2014); its sound correspondences with Tangut remain to be explored. (Unfortunately, there are no Pumi words ending in low back -ɑ with proposed Tangut cognates in Jacques' book.)

Does 3471 have any internal cognates in Tangut? Let's look at its (near-)homophones from Homophones A 35B43-35B54:

Homophones	Tangraph	Li Fanwen number	Reading	Gloss
35B43		1349	2chyr'3	first half of 2chyr'3-2lu1 'sage'
35B44		1783		'five' (in the 'ritual' language which I suspect was a non-Sino-Tibetan substratum language)
35B45		2803		the surname Chyr
35B46		3267		skill, artistry
35B47		3482	1chyr'3	to pare
35B51		3483	1chyr'3	to attack (only attested in Homophones?)
35B52		2321	2chyr'3	afraid, scared
35B53		3826		to twine, wind, tie up; < *RcəXH or *cərXH
35B54		5223		half of 2chyr'3 1geq4 ~ 1geq4 2chyr'3 'constellation'; first half of the name of the Tangut ancestor 2chyr'3 2jwa3

None have anything to do with shooting.

Near-homophones without 'prime' don't have any semantic similarity to 3471:

2176 1chyr3 'to tie' (< *Rcə or *cər; cognate to 3826 above)

1359 the second half of 2phy1 2chyr3 'conceited'

Ah, I think I found the root of 3471:

5245 1chy < *cə 'to draw a bow' (only attested in dictionaries?)

3471 may be 5245 plus a prefix *R- and an affix *X (I don't know if *X is a prefix or suffix, though I conventionally write it as a suffix since I have to put it somewhere). So I can reject my earlier *cərXH since *-r is not a suffix.

Lastly, how do we know 3471 means 'to shoot'? It is apparently only attested in Homophones, where it is preceded by the clarifier

5710 1liq4 'arrow' < *S-li (cognate to Old Chinese 矢 *l̥iʔ 'arrow'?)

so I suppose it's been assumed that 3471 is a verb since Tangut has object-verb order. However, I don't know how one can be certain that 3471 means 'to shoot' and not, say, 'to pull out of a quiver'. 3471 might even be a noun like 'quiver' modified by 5710.

I am skeptical of definitions of Tangut words known only from Homophones unless they have clarifiers like 'name' which leave no room for interpretation.

15.7.25.13:07: LOOKED AROUND AT DECORATIONS

I was wondering if the verb 2khu'4-2rer4 'watch-direction' from my last post was a hapax legomenon. Thanks to Andrew West for pointing out that it occurs on the last page of the last ode:

......

3468 3457 4342 2258 3349 1vir1 1siw4 2da4-2khu'4-2rer4 '(...decoration?) new PERF-watch-direction'

Unfortunately, the surrounding characters were lost due to damage.

I suspect the character before 3468 is 5371, as 3468 is the second half of

5371 3468 1taq4 1vir1 'decoration', 'to be decorated' (see Kychanov and Arakawa 2006: 634)

and I do not know if 3468 can occur by itself*.

In any case, 4342 2258 3349 looks like a perfective verb 'looked around', and 3457 'new' modifies its object (5371?) 3468 'decoration' (?).

Nishida (1986) interpreted 4342 as 'inward', but Gong's (2003) 'away from the speaker' fits 'looked around' better.

It occurred to me today that 3349 might be a verb ('to direct'), so 2258 3349 would be a verb-verb rather than a verb-object sequence. But both Kychanov and Arakawa (2006: 313) and Li Fanwen (2008: 543) list it only as a noun. Does 2258 3349 reflect an earlier period when 3349 could also be a verb?

*Li Fanwen 2008: 562 lists no examples of 3468 in isolation other than dictionary definitions. Li Fanwen 2008: 847 gives the impression that 5371 is almost always followed by 3468; the one exception is

0542 5371 2shwo3 1taq4

which he defined as 嚴飾, interpreted as a verb 'to decorate' by Kychanov and Arakawa (2006: 429).

15.7.24.23:54: LOOKING IN FOUR DIRECTIONS: THE TANGUT VERB-OBJECT COMPOUND 2KHU'4-2RER4

While preparing for part 3 of "Grokking Up", I saw this phrase in Li Fanwen (2008: 374)*:

4684 2205 3349 2258 3349 1me1 1lyr'3 2rer4 2khu'4-2rer4 'eye four direction watch-direction'

It caught my eye (pun unintended!) because 2rer4 'direction' appears twice, though there is only one 'direction' in Li Fanwen's Chinese translation 目視於四方, lit. 'eye look in four direction'.

Kychanov and Arakawa (2006: 316) regard 2khu'4-2rer4 'watch-direction' as a verb 'look from side to side; look around'. Those glosses make sense in this context. Does this verb occur in other texts?

Tangut is a verb-final language, so I am surprised that a compound verb would have a verb-object structure instead of an object-verb structure. Are there other verbs of that type? Could the first four words be modifying the noun 'direction': 'the direction from which the eye watches the four directions'?

*7.25.0:25: Li Fanwen gives the source of this phrase as Tangraphic Sea 67.113 (i.e., the third entry in column 1 of side 1 [= the right side] of page 67), but it's not there. I assume 67.113 is a typo.

15.7.23.23:53: GROKKING UP TANGUT PERFECTIVE PREFIXES (PART 2: WRITING 'UP')

Given that the Tangut script has a reputation for being largely semantically based, it is curious that the seven characters for directional perfective prefixes do not share a common graphic denominator. Nor do they incorporate parts of characters for directions. For instance,

5981 1a0- 'up-', 'one' = left of 5951 1a0, first half of 1a0 1chwa3 'boots worn in mud' + 3654 1a0, first half of 1a0 1shy2 'monk' / a surname / kinship term prefix

does not have any components in common with, say,

1890 2be4 'high', 2612 2phu4 'up, above, over', or 2750 1ghu2 'head' (i.e., something on top)

The Tangraphic Sea analysis of 5981 (above) is circular, as its supposed sources 3654 and 5951 are in turn derived from 5981:

3654 = left of 3119 1i4 'many' + all of 5981

5951 = left of 5981 + left of 1321 1ziq4 'boots'

Both 3654 and 5951 are phonosemantic compounds: 'person' (the left side of 'many') + a and 'boots' + a.

5981 in turn shares a phonetic left-hand component with 5951. Could that component (Boxenhorn alphacodes: cil/cur) be derived from the left-hand side of Chinese 阿 (1a1 in the northwestern dialect known to the Tangut)? That would make it a distant cousin of the Japanese katakana character ア which is also derived from the left-hand side of Chinese 阿.

7.24.13:27: Both 5981 and

4541 1a0 (Sanskrit a)

transcribed Sanskrit long ā (Arakawa 1997: 112). However, there was also a special character

4623 2a'2 = 4541 + 0443 'long'

for Sanskrit long ā, and 4541 normally represented Sanskrit short a. Moreover, 5981, 4541, and 4623 belonged to different homophone groups in Homophones and had different fanqie in Tangraphic Sea. I conclude that 4541 sounded most like Sanskrit a* and that 5981 differed somehow: e.g., it may have been 1a4 whereas 4541 may have been 1a1 or even 1a2 (if it had the same grade as 4623 2a'2). (-0 in the readings of 5981 and 4541 indicates an unknown grade. The grades of the fanqie final spellers of 5981 and 4541 are unknown:

0165 1ha0 [Sanskrit hi, he, hye - sic!] and 4475 1ha0 [Sanskrit ha and hā].

The use of 0165 for Sanskrit front-vowel syllables implies that its rhyme - and hence the rhyme of 5981 - was palatal: i.e., Grade IV.)

It is tempting to assume that some aspect of 2a'2 absent from 1a0 - the second tone, the 'prime' quality of the rhyme (transcribed as -'), and/or Grade II - was associated with length, but many Tangut transcriptions of Sanskrit syllables with long vowels lack most or all of those qualities: e.g.,

3948 and 3985 1ka'4 for Sanskrit kā and 5299 1ta1 for Sanskrit tā

Unlike Gong and Arakawa, I doubt that vowel length played a role in the complex Tangut vowel system. If Tangut had long vowels, they would have systematically corresponded to Sanskrit long vowels in transcriptions.

*Or to be more precise, the pronunciation of Sanskrit a known to the Tangut. In the Indian phonetic tradition, Sanskrit a was [ə], but the Tangut probably heard something like [a] because they would have borrowed [ə] as the central vowel that I transcribe as y, not a.

15.7.22.23:59: THE <ɃI⁝>-GINNING OF THE MYAZEDI INSCRIPTION

Writing about Tangut perfective prefixes made me wonder if Pyu had a perfective prefix. An obvious candidate for such a prefix could be transliterated as <ḅi⁝>.

The word is problematic even on the level of transliteration:

- does Pyu have a <b> : <ƀ> distinction?

- are three dots in Pyu equivalent to an overdot-'colon' sequence?

- is the overdot a nasal like anusvāra or something else? The fact that it also occurs in <tȧ> 'one' and <hrȧ> 'eight' corresponding to Old Burmese <tac> and <het> suggests that it might stand for a stop.

- is the 'colon' a fricative like visarga or something else?

Here are the first two occurrences of <ḅi⁝> in the Pyu A text of the Myazedi inscription (following Blagden's 1919 analysis):

Pyu: ||| siri dathagạda ƀa dọ ƀȧ: ƀi⁝ pdụ̄ sgu dạ: ƀa tva

Glosses: prosperity; Tathagata; HON?; achieve or enter [nirvana]? establish [a religion]?; thousand; six hundred

Pyu: hrȧ sni: ƀi⁝ tvạ: thada

Glosses: twenty; eight; GEN?; year; elapse; PAST?

The verb after the first <ƀi⁝> (if that <ƀi⁝> is a verbal prefix and not an unrelated homophone) does not seem to correspond to any of the verbs in the other three languages of the Myazedi inscription (see Appendices 1-3 below).

Maybe <pdụ̄ sgu dạ: ƀa tva> is a sequence of a verb followed by 'since'. Perhaps that verb was intransitive: e.g., 'die' (a multi-word honorific euphemism?) or 'rise'. I don't think that verb was transitive because I would expect its object to precede it, and I would not expect 'nirvana' or 'religion' to have an honorific suffix which is generally otherwise an honorific prefix (!) for people (or images of them) in this inscription. Given the large number of Indic loans in Pyu, I would be surprised if there was a native term for 'nirvana' or 'religion' (unless the latter were 'teaching'). I think <ḅȧ:> might have originally been a noun, and <ƀa dọ ḅȧ:> might be a native title for the Buddha corresponding to Old Burmese <purhā skhaṅ> (see Appendix 2) or Old Mon <kyek ... tirley> (see Appendix 3).

Could <pdụ̄ sgu dạ: ƀa tva> be a prefix-object-verb(-'since'?) sequence with 'nirvana' or 'religion' somewhere in it? (7.23.2:48: Cf. noun incorporation between directional perfective prefixes and verbs in Tangut. See Jacques 2014: 266. If Pyu did have incorporation, it is likely to have developed independently, as Pyu <ƀi⁝-> does not look like a cognate of any Tangut directional perfective prefix other than 2vy3- whose v- may or may not be from a lenited labial stop.)

Could <ƀi⁝-tvạ:-thada> be analogous in structure to Russian pro-sh-lo 'PERF-go-PAST' = 'passed'?

Does <ƀi⁝-> correspond to the Old Burmese indefinite past suffix <liy> (see Appendix 2)?

Does <-thada> end sentences, or is it a continuative suffix like Old Burmese <brī rakā> (see Appendix 2)?

APPENDIX 1: THE START OF THE PALI A TEXT (Duroiselle 1919; the glosses are mine; I don't know what anārikaṃ means or what vā [normally 'or'] is doing)

1	||	śrī	||	buddhādikaṃ	vatthuvaraṃ	namitvā	puññaṃ	kataṃ	yaṃ	jinasā-
1	||	prosperity	||	Buddha-beginning with	object-excellent	bowing	merit	work	REL	conquered-
2	-sanasmiṃ	anārikaṃ	rājakumāranāmadheyyena	akkhā-
2	-religion	dispensation?	in the name of Rājakumāra	relate
3	-mi	sunātha	me	taṃ	||	nibbānā	lokanāthassa	aṭṭhavī-
3	-I	hear	me	CORREL	||	nirvana	world-lord	eight-
4	-sādhike	gate	sahasse	pana	vassānaṃ	chasate	vā	pare	ta-
4	-twenty-and	gone	thousand	and	years	six-hundred	or?	before	thus
5	-thā	||

Duroiselle's translation: 'Prosperity! Having bowed to the Buddha and the other (two) Excellent Objects, I shall relate the noble work of merit performed, in the Conqueror's dispensation, by Rājakumāra. Hearken to me! When one thousand six hundred and twenty-eight years had elapsed after the Nirvāṇa of the Lord of the World [...]'

APPENDIX 2: THE START OF THE OLD BURMESE A TEXT (Duroiselle 1919)

1	||	śrī	||	namo	buddhāya	||	purhā	skhaṅ	sāsana	anhac	ta-
		prosperity		honor	Buddha		exalted personage	lord	religion	year	one
2	-c	thoṅ	khrok	ryā	nhac	chāy	het	nhac	lon
		thousand	six	hundred	two	ten	eight	year	elapse
3	liy	brī rakā	||
	INDEF-PAST	CONT

Duroiselle's translation: 'Prosperity! Honour to the Buddha! One thousand six hundred and twenty-eight years of the Buddha's religion having elapsed [...]'

APPENDIX 3: THE START OF THE OLD MON A TEXT (Blagden 1919)

śrī	[n]amo	b[u]ddhāya	śrī	sās	kyek	buddha	tirley
prosperity	honor	Buddha	prosperity	religion	worshipful person	Buddha	lord

kuli	ār	moy	lṅim	turow	k[l]aṃ	ḅār	cwas	diñcām	cnām
go on	go/AUX	one	thousand	six	hundred	two	ten	eight	year

tuy
PAST

Blagden's translation: 'Prosperity! Honour to Buddha! Prosperity! After the religion of the Lord Buddha had gone on for one thousand six hundred and twenty-eight years [...]'

15.7.21.23:57: GROKKING UP TANGUT PERFECTIVE PREFIXES (PART 1: OVERVIEW)

Tangut has a set of directional perfective prefixes that remind me a bit of perfective prefixes of prepositional origin in Slavic and adverbs of prepositional origin in English. Arakawa's Studies on the Tangut Version of the Vajracchedikā-prajñāpāramitā (2014: 149) reproduces Nishida's (1989) list of directional perfective prefixes with the addition of a seventh prefix in a footnote:

Direction (Arakawa 2014: 149, based on Nishida 1989)	Direction (Gong 2003)	Tangraph	Arakawa reading	This site	Arakawa's (2014) notes on usage in the Tangut version of the Vajracchedikā-prajñāpāramitā, summarized	Frequency in The Sea of Meaning (Arakawa 2015: 18)	Arakawa's (2015) notes, summarized
upward			1a?-	1a0-	high frequency (p. 149)	16	" 'upward' in many cases"
downward			1na:-	1na4-	low frequency (p. 149)	12	" 'downward' in most cases"
here, toward the speaker	here, inside		1kI:-	1ky4-	low frequency (p. 150)	40	"might be 'inside' in some cases"
there, away from the speaker	there, outside		2wI:-	2vy3-	used with adverbs indicating the past (e.g., 1pI: 2no: 'long ago' = my 1py4 2no4) and in the word 2wI: 2rar 'past' (= my 2vy3 2rar1; prefixed to a verb 'to pass'; cf. English past; p. 150)	29	"Probably [...] 'outside' "
upriver; inward	away from the speaker		2da:-	2da4-	used with various verbs without any common denominator (p. 150)	43	"Here, the tendency is 'away from the speaker or agent', 'not accessible', and 'to leave, not to return' [...] In some cases, the verb following 2da:- seems to be 'unhappy'."
downriver; outward	"direction not found"		2rI:r-	2ryr4-	often with verbs of speaking but also with 'to come' and 'to arrive' (p. 151)	26	"difficult to determine the direction"; "precedes some verbs related to vocal acts"
(not given)	towards the speaker		2dI:-	2dy4-	often with verbs of taking unlawfully by force; rare (p. 152)	4	"so rare"; direction "uncertain"

(Thanks to Mahādātṛ for Arakawa's 2015 article "On the Tangut verb phrase in The Sea of Meaning Established by the Saints".)

Arakawa and Gong agree on the functions of the first two prefixes. Their interpretations of the second two partially overlap, and their views on the last three are very different.
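The Sea of Meaning frequencies in the table above can be tallied with a few lines of Python. This is only a minimal sketch: the counts are copied from the table (Arakawa 2015: 18), the keys are my readings, and the names `SEA_OF_MEANING_COUNTS`, `relative_frequencies`, and `ranked` are made up for illustration.

```python
# Counts copied from the "Frequency in The Sea of Meaning" column of the
# table above (Arakawa 2015: 18). Keys are the readings used on this site.
SEA_OF_MEANING_COUNTS = {
    "1a0-": 16,
    "1na4-": 12,
    "1ky4-": 40,
    "2vy3-": 29,
    "2da4-": 43,
    "2ryr4-": 26,
    "2dy4-": 4,
}


def relative_frequencies(counts):
    """Return each prefix's share of all prefix tokens (values sum to 1)."""
    total = sum(counts.values())
    return {prefix: n / total for prefix, n in counts.items()}


def ranked(counts):
    """Return the prefixes ordered from most to least frequent."""
    return sorted(counts, key=counts.get, reverse=True)


if __name__ == "__main__":
    shares = relative_frequencies(SEA_OF_MEANING_COUNTS)
    for prefix in ranked(SEA_OF_MEANING_COUNTS):
        print(f"{prefix:8s}{SEA_OF_MEANING_COUNTS[prefix]:4d}{shares[prefix]:8.1%}")
```

The same two functions could be pointed at counts from another text once someone compiles them, which is the kind of text-specific comparison I would like to see.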

The -0 in my reading of the first prefix indicates that its grade is unknown. The other prefixes all belong to Grade IV (indicated by -4) except for the fourth prefix which has Grade III (indicated by -3; Grade IV cannot occur with v-). The significance of this skewing is unknown. (7.22.0:50: If Tangut and Chinese grades have similar origins, then Tangut Grades III/IV may have developed in unmarked syllables, just as Chinese Grade III and chongniu Grade IV developed in nonemphatic [i.e., unmarked] syllables. I expect affixes to tend to be phonologically unmarked. I believe Old Chinese grammatical morphemes tend to be nonemphatic.)

The prefixes either have a or y (phonetically a nonlow central vowel like schwa). They may have been unstressed and therefore only had the achromatic (i.e., neither palatal nor labial) subset of the Tangut vowel system.

Arakawa (2014) did not supply frequency statistics for perfective prefixes in the Vajracchedikā-prajñāpāramitā. Nonetheless, it is clear from his text that their frequencies do not match those in The Sea of Meaning: e.g., the most common prefix in the Vajracchedikā-prajñāpāramitā is 1a0-, whereas it is 2da4- in The Sea of Meaning. I do not know whether this difference is due to geography, chronology, and/or genre. It would be interesting to see how individual verbs are prefixed in those two texts and others. My dream is to have a Tangut verb dictionary containing all attested affixes with text-specific frequency data. We are still far from being able to say that we have

1a0-2tse4-2ni4 'understood' = lit. 'up-understand-PL' (= the "Grokked Up" of the post title*)

how Tangut verbs work. The outlines have been established; the details remain unclear.

*7.22.0:28: Based on Vajracchedikā-prajñāpāramitā 18.4

1a0-2tse4-2nga1 'I understood' = lit. 'up-understand-1S'

with the suffix changed.

15.7.20.23:55: WHAT IS THE ORIGIN OF ROHINGYA TONES?

Are the three tones described in this Unicode proposal for the Rohingya script due to Burmese influence? What conditioned each tone? The low frequency of the tonal signs suggests that tones may have arisen as compensation for lost low-frequency segments or segmental features. What is the pitch associated with the absence of a tonal sign?

Do Burmese loanwords have tones, and if so, do they retain their original tones, have Rohingya approximations of those tones, or have yet other tones?

Might Rohingya have pitch accent instead of Southeast Asian-style tones?

I just learned that Unicode has six characters called "ARABIC TONE" (08EA-08EF) corresponding to the Rohingya tone characters. Are those six characters used in Arabic-script Rohingya? (7.21.0:03: Yes. I was thinking they might have been invented for some African language.)

Wikipedia states that the Rohingya script has "a few borrowings from Roman and Burmese", but I can't find them.

15.7.19.23:57: WHY DOESN'T MON <ṄA> LOOK LIKE BURMESE <ṄA>?

Mon and Burmese are written in variants of the same script. Hence the Mon and Burmese spellings of 'Tenasserim' are similar:

Mon: တနၚ်သြဳ <tanaṅsrī>

Burmese: တနင်္သာရီ <tanaṅsārī> [tənɪ̀ɴθàjì]

One difference is not as great as it seems. Burmese <ṅ> is written as a superscript ɛ-like shape atop <sa>. When written on the line (with its inherent vowel restored), Burmese င <ṅa> looks like Mon ၚ <ṅa> except for the lack of a bottom stroke. (The top stroke of Mon ၚ် indicates that ၚ <ṅa> is to be read without its inherent vowel.) Burmese င <ṅa> is clearly a rounded descendant of Brahmi 𑀗 <ṅa>. The Mon character in the Myazedi inscription from nine centuries ago looks like modern Burmese င <ṅa>. When did the Mon add a stroke, and why? And does that bottom stroke have anything to do with the bottom parts attached to C-shapes in <ṅa> in other Indic scripts: e.g., Devanagari ङ?

7.20.1:36: IndoSkript shows ㄷ-shaped characters for <ṅa> from c. 100-200 AD. Then it displays a <ṅa> resembling Tibetan ང <ṅa> atop a ㅅ shape from c. 350-375 AD in Kuchar - in what is now northwestern China, quite far from the Mon. Is that ㅅ-shape relevant to the reversed S-shape at the bottom of Mon ၚ <ṅa>? That shape resembles the <ṅa> of the Manur inscription (c. 840-880 AD) in what is now Tamil Nadu. Could Mon ၚ <ṅa> originate from a stack of two <ṅa>?

15.7.18.23:45: WHY DOES TENASSERIM END IN M?

I wonder what the etymology of the name Tenasserim is. Seri looks like Sanskrit Śrī, but what is the first half? Is it Mon?

None of the major languages of the area have e or m in their versions of the name:

Burmese: တနင်္သာရီ <tanaṅsārī> [tənɪ̀ɴθàjì]

Mon: တနၚ်သြဳ <tanaṅsrī>

Thai: ตะนาวศรี <taḥnāvśrī> [tanaːwsǐː]

Malay: تانه ساري <tānh sāry> Tanah Sari

I think e may reflect a schwa and an epenthetic vowel between s and r. But why does the name have a final -m in Western languages?

Why does the second syllable have different codas (nasal, [w], or [h])?

15.7.10.19:40: VOWEL LENGTH AND INTRUSIVE NASALS IN SANSKRIT VS-STEMS

I was puzzled by Sanskrit Vs-stems when I first learned how to decline them in 1992, and I remain puzzled today. Most forms (singular and dual / plural) can be generated by adding endings to -Vs stems and applying the following rules which apply to Sanskrit in general:

1. s > ṣ after i or u and before a vowel: e.g., havis-ā > haviṣ-ā 'oblation' (inst. sg.)

2. -as > -o before voiced consonants: e.g., manas-bhyām > manobhyām 'minds' (inst./dat./abl. du.)

3. -s > -r after i or u and before voiced consonants: e.g., havis-bhyām > havirbhyām 'oblations' (inst./dat./abl. du.)

4. s > ḥ before s: e.g., manas-su > manaḥ-su 'minds' (loc. pl.)
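Rules 1-4 can be sketched as a small Python function that joins a Vs-stem to an ending. This is a toy under stated assumptions, not a full sandhi engine: the function name `join_vs_stem` and the simplified sets of vowels and voiced consonant letters are mine, and only the four rules listed above are implemented.

```python
import unicodedata

# Simplified inventories (assumptions for this sketch only).
VOWELS = set("aāiīuūeo")
VOICED_INITIALS = set("gjḍdbnmyrlvh")  # first letter of a voiced consonant


def join_vs_stem(stem, ending):
    """Join a stem in -as/-is/-us with an ending using rules 1-4 above."""
    stem = unicodedata.normalize("NFC", stem)
    ending = unicodedata.normalize("NFC", ending)
    if len(stem) < 2 or not stem.endswith("s") or not ending:
        raise ValueError("expected a stem ending in -Vs and a nonempty ending")
    body, stem_vowel = stem[:-1], stem[-2]
    first = ending[0]
    if first in VOWELS:
        # Rule 1: s > ṣ after i or u and before a vowel (otherwise s stays).
        result = body + ("ṣ" if stem_vowel in ("i", "u") else "s") + ending
    elif first in VOICED_INITIALS and stem_vowel == "a":
        # Rule 2: -as > -o before voiced consonants.
        result = body[:-1] + "o" + ending
    elif first in VOICED_INITIALS and stem_vowel in ("i", "u"):
        # Rule 3: -s > -r after i or u and before voiced consonants.
        result = body + "r" + ending
    elif first == "s":
        # Rule 4: s > ḥ before s.
        result = body + "ḥ" + ending
    else:
        result = stem + ending
    return unicodedata.normalize("NFC", result)


if __name__ == "__main__":
    for stem, ending in [("havis", "ā"), ("manas", "bhyām"),
                         ("havis", "bhyām"), ("manas", "su")]:
        print(stem, "+", ending, "=", join_vs_stem(stem, ending))
```

The function writes its output without hyphens and has no way to produce a long vowel or a nasal before the stem-final -s.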

But those rules cannot explain a few forms:

Why do the m./f. nom sgs. have long vowels before the stem-final -s?

5. sumanās 'favorably minded' (m./f. nom sg.)

cf. sumanas 'id.' (n. nom. sg.)

Why do the n. nom./acc./voc. pls. have long vowels and an anusvāra nasal (written here as ṃ and in Whitney's grammar as ṅ) before the stem-final -s?

6. manāṃs-i 'minds' (n. nom./acc./voc. pl.) instead of *manas-i

Is 5 by analogy with mant/vant-stems that also have lengthening in the m. nom. sg.? (The feminines of mant/vant-stems are ī-stems: paśumatī instead of *paśumān.)

sumanās : paśumān < *-ēn < *-en-s < *-ent-s? 'rich in cattle' (m. nom. sg.; is the long vowel of -mān due to Szemerényi's law?)
cf. the acc. sg.: sumanas-am : paśumant-am

Why do ant-participles and an-stems work somewhat differently?

bhavan < *-ont-s 'being' (m. nom. sg.); why isn't this *bhavā < *-ō < *-ōn < *-on-s < *-ont-s?; cf. rājā below

rājā < *-ō < *-ōn < *-on-s 'king' (m. nom. sg.); final *-n was lost after *ō but not *ē

the m. acc. sg. bhavant-am has a short vowel like sumanas-am and paśumant-am, but rājān-am has a long vowel even though Szemerényi's law can't apply to a word without *s or *laryngeals!

I think 6 and similar neuter plurals of vowel stems are by analogy with an-neuters; they all share the pattern long stem vowel + nasal + -i:

manas > manāṃs-i 'minds' (as-stem; -ns- > -ṃs-)

āsya-m > āsyān-i 'mouths' (a-stem)

vāri > vārīṇ-i 'waters' (i-stem; n > ṇ after r)

madhu > madhūn-i 'honeys' (u-stem)

nāma > nāmān-i 'names' (an-stem)

The stem-final nasal in nāmāni was restored by analogy and the regular neuter plural ending -i was added:

*ʕʷneʕʷmon-ʕ > *nōmō > (*)nāmā (attested in Vedic?) > nāmān-i (cf. nom./acc./voc. du. nāman-ī and voc. sg. nāman, but the second a in those forms isn't old; they go back to *ʕʷneʕʷmn-ʕi and *ʕʷneʕʷmn)

This restoration must have predated the split of Indo-Aryan from Iranian, since the restored nasal is present in Avestan nāmə̄n-i 'names'. However, the nasals in the Sanskrit Vs and vowel stem neuter plurals have no parallels in Avestan, so they must be Sanskrit innovations.

15.7.10.13:25: POSSESSING SIMILAR ENDINGS IN THE PRESENT

Last week I was puzzled by the stems of Hungarian van 'is' and megy 'goes'. Now I want to look at their present tense endings which partly overlap with possessive endings:

Person/number	Present indefinite verb endings	Present definite verb endings	Possessive endings for singular nouns	Possessive endings for plural nouns
1S	-ok/-ek/-ök: e.g., lát-ok egy 'I see a ...'; -om/-em/-öm (optional for -ik verbs): e.g., játsz-ok ~ játsz-om 'I play'	-om/-em/-öm: e.g., lát-om a(z) 'I see the ...'	-(V)m: e.g., órá-m 'my clock'	-(j)(a)im/-(j)(e)im: e.g., órá-im 'my clocks'
2S	-sz [s]; -ol/-el/-öl (-s, -sz, -z, -dz verbs): e.g., játsz-ol 'thou playest'	-od/-ed/-öd	-(V)d	-(j)(a)id/-(j)(e)id
3S	-Ø; -ik (-ik verbs): e.g., játsz-ik 'he/she/it plays'	-ja/-i	-(j)a/-(j)e	-(j)(a)i/-(j)(e)i
1P	-unk/-ünk	-juk/-jük	-(u)nk/-(ü)nk	-(j)(a)ink/-(j)(e)ink
2P	-(o)tok/-(e)tek/-(ö)tök	-játok/-itek	-(V)tok/-(V)tek/-(V)tök	-(j)(a)itok/-(j)(e)itek
3P	-(a)nak/-(e)nek	-ják/-ik	-(j)uk/-(j)ük	-(j)(a)ik/-(j)(e)ik

Questions:

1. Why is the pattern of overlap between verb and possessive endings so complex?

Person/number	Possessive endings for singular nouns	Possessive endings for plural nouns
1S	ends in -m like present definite
2S	ends in -d like present definite
3S	-ja looks like present definite but presumably linking -j- + 3S possessive suffix -a	-i looks like present definite but presumably an unrelated plural possessive suffix -i
1P	similar to present indefinite	ends in -nk like present indefinite
2P	definite, indefinite, and both types of possessives all of the tVk type
3P	unlike verb endings aside from plural -k	-ik looks like present definite but presumably an unrelated plural possessive suffix -i + plural -k

Is the unity of the 2P endings original or the result of a merger? Can a more consistent system be reconstructed for an earlier stage?

2. Why do -ik verbs have optional indefinite endings that look like definite endings only in 1S?

3. Did 2S present indefinite -sz and -Vl originally have different functions before being reinterpreted as allomorphs for different stem types?

4. Why do -ik verbs have a special 3S ending?

5. Why do possessive endings for consonant-final plural nouns have 'bridges' that look like the third singular possessive endings for singular nouns?

kert-je-im 'my gardens' (not *kert-im); cf. kert-je 'his/her/its garden'

6. Why do definite 3S, 2P, and 3P have a jA ~ i alternation instead of a jA ~ je alternation?

7. Why doesn't definite 1P end in -nk?

8. Why do definite 2P and 3P have long á instead of short a?

BONUS: Why isn't játsz- [jaːts] spelled jác [jaːts]? I think I can answer that one myself. The spelling is etymological; játsz- is from ját- plus -sz-.

15.6.29.17:39: VAN MEN

What is the story behind the irregular conjugations of Hungarian van 'is' and megy 'goes'?

Number/person	Ending(s)	'to be' < Proto-Finno-Ugric* *wole-	'to go' < Proto-Uralic *mene-
1st singular	-ok/-ek	vagy-ok	megy-ek
2nd singular	-sz [s]	vagy-Ø	mé-sz ~ mégy-Ø
3rd singular	-Ø	van-Ø	megy-Ø
1st plural	-unk/-ünk	vagy-unk	megy-ünk
2nd plural	-tok/-tek	vagy-tok	men-tek
3rd plural	-nak/-nek	van-nak	men-nek

The list of endings is not exhaustive and only includes endings that would normally be expected for these two verbs.

1. Why do the two verbs have -gy [ɟ] even though their roots lack palatal consonants?

2. Why does that gy have different distributions in the paradigms of the two verbs: e.g., van and vagytok (not *vagy and *vantok) but megy and mentek (not *men and *megytek)?

3. Why do 'thou art' and one form of 'thou goest' have a zero ending?

4. Why do the forms of 'thou goest' have long vowels? Is length in mész compensating for a root-final consonant lost before -sz?

5. Why does 'to be' have a instead of o which is still in other forms like volt 'he/she/it was'?

6. Why does 'to be' have n instead of l which is still in other forms like volt 'he/she/it was'?
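The mismatch behind question 2 is easier to see when the two paradigms are stored as data. Here is a minimal Python sketch; the forms are copied from the table above, while the names `PARADIGMS` and `gy_slots` and the person/number labels are only illustrative.

```python
# Present-tense forms copied from the table above, segmented as stem-ending.
PARADIGMS = {
    "van": {
        "1S": ["vagy-ok"],
        "2S": ["vagy-Ø"],
        "3S": ["van-Ø"],
        "1P": ["vagy-unk"],
        "2P": ["vagy-tok"],
        "3P": ["van-nak"],
    },
    "megy": {
        "1S": ["megy-ek"],
        "2S": ["mé-sz", "mégy-Ø"],
        "3S": ["megy-Ø"],
        "1P": ["megy-ünk"],
        "2P": ["men-tek"],
        "3P": ["men-nek"],
    },
}


def gy_slots(verb):
    """Return the person/number slots with at least one stem ending in gy."""
    return [
        slot
        for slot, forms in PARADIGMS[verb].items()
        if any(form.split("-")[0].endswith("gy") for form in forms)
    ]


if __name__ == "__main__":
    for verb in PARADIGMS:
        print(verb, gy_slots(verb))
```

Comparing the two lists shows which slots agree and which do not. The sketch only restates the table; it explains nothing about why the distributions differ.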

I could ask even more questions about the rest of the paradigms of those two verbs (e.g., why is the potential of 'to go' me-het with the stem reduced to an open syllable?), but I'll stop here.

*Although Proto-Finno-Ugric may not even exist (cf. Tibeto-Burman in Sino-Tibetan), I cite this form merely to indicate that the source of the Hungarian verb had *l which is still in some other forms of the verb (e.g., volt 'he/she/it was') as well as in related languages: Finnish olla and Estonian olema.

15.6.22.2:08: WHAT IS THE INDIC SOURCE OF THAI NATTA?

That question came to mind when I saw the name of this restaurant. I assume the Natta of Natta Thai is from the name [náttʰaː] which I've seen spelled ณัฏฐา <ṇaṭṭhā> and ณัฐฐา <ṇaṭhṭhā>. The letters ณ <ṇ>, ฏ <ṭ>, and ฐ <ṭh> are for retroflex consonants that were never in Thai and usually signal Indic origin. Yet I cannot find any Sanskrit or Pali words beginning with ṇa- other than Skt ṇakāra 'the sound ṇ' which is not relevant here. Is ณ <ṇ> a hypercorrection for น <n>? There is a Pali word naṭṭha ... but it means 'destroyed'!

Since I mentioned ณ <ṇ>, here is a question I've had for a long time: why is the Thai preposition [náʔ] spelled ณะ <ṇḥ>? Was that an attempt to dress up a native word in Indic-like guise? The use of a low-frequency letter also makes the word stand out. Was that intentional? How far back does the retroflex spelling go? Was the word ever spelled with dental น <n> as นะ <nḥ> like Lao ນະ <nḥ> [nāʔ]?

15.6.20.23:59: TONOGENETIC CLUES IN MIZO 'DECLENSION'?

I first heard of Mizo (as 'Lushai') back in the late 90s when I learned that Starostin had found a correlation between Mizo short vowels and Middle Chinese Grade III (going back to Old Chinese 'type B' syllables which he reconstructed with short vowels and which I reconstruct as nonemphatic). See Sagart's (1999: 42-43) summary of proposals concerning the origin of Grade III which is often reconstructed as a medial *-j-.

I didn't look at Mizo again until tonight when I took a good look at its Wikipedia entry. Normally, I expect Asian tonal languages to be 'isolating' like Chinese, but Mizo nouns decline! Or is 'declension' an artifact of looking at Mizo through an Indo-Aryan lens and/or Mizo orthography? Would it be better to analyze the suffixed case forms as noun-postposition sequences as in DeLancey (2004)? In any case, the ergative and instrumental both end in -in but have different tones. Does that tonal alternation reflect one or more lost final consonants? Are the two -in from a single original suffix (or postposition) with or without a following glottal suffix that conditioned a different tone?

*-in > -in + tone

*-in-H > -in + a different tone

6.21.23:36: Segmental affixes may also be the source of tone changes in derived verbs (though some derivations may postdate tonogenesis and be by analogy with existing pairs of verbs).

6.21.23:57: How many tones does Mizo have? Wikipedia lists eight. But Khoi Lam Thang (2001: 40) listed five, and Lorrain (1940) in Namkung (1996: 234) listed only three! How can these different descriptions be reconciled? And where did these tones come from? Wikipedia makes it sound as if Mizo had Chinese-style tonogenesis:

Tone systems have developed independently in many of the daughter languages [daughters of which language?] largely through simplifications in the set of possible syllable-final and syllable-initial consonants. Typically, a distinction between voiceless and voiced initial consonants is replaced by a distinction between high and low tone, while falling and rising tones developed from syllable-final h and glottal stop, which themselves often reflect earlier consonants.

I hoped to see the details in this process in Khoi Lam Thang's (2001: 98) dissertation. Unfortunately, his reconstruction of Proto-Chin, the ancestor of Mizo and its sisters, lacked a tonal component.

This analysis shows that there are comparatively clearer tonal correspondences between Tedim, Mizo and Hakha. However, tone in Mara, Khumi and Kaang are split within the Patterns [established by Gordon Luce for Chin languages such as Mizo], tremendously complicated and without predictable environments. Thus, while a reconstruction of proto Northern Chin may be proposed from this data, a reconstruction of Proto Chin tone is incomplete and cannot at present be proposed. Therefore this thesis will be limited to a segmental reconstruction for Proto Chin. A Chin tonal analysis is in progress by Dr. Fraser Bennett and Ajarn Noel Mann. Their initial findings seem much closer to Luce’s Tonal Patterns.

I wonder what their final findings were.

15.6.8.23:40: REPLICATING GRAINS OF GOLD

I like Andrew West's English title for the Tangut text that I have been calling the Golden Guide. His latest post is about manuscript copies of the Grains and practice pieces in which characters from the Grains were written repeatedly.

He used my notation to transcribe Tangut readings with a twist: he wrote tones as superscript numerals and grades as subscript numerals: e.g., he wrote the reading of

'moon, month'

as ²lhiq₄ = lhiq with tone 2 and Grade IV. I write it as 2lhiq4 because superscript and subscript numerals are difficult for me to type and to read.
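Converting between the two notations is mechanical. The following Python sketch assumes a reading always has the shape tone digit + syllable + grade digit; the function names `to_plain` and `to_marked` are hypothetical and come from neither Andrew's post nor any existing tool.

```python
SUPERSCRIPTS = "⁰¹²³⁴⁵⁶⁷⁸⁹"
SUBSCRIPTS = "₀₁₂₃₄₅₆₇₈₉"
ASCII_DIGITS = "0123456789"

# Translation tables in both directions.
_SUPER_TO_ASCII = str.maketrans(SUPERSCRIPTS, ASCII_DIGITS)
_SUB_TO_ASCII = str.maketrans(SUBSCRIPTS, ASCII_DIGITS)
_ASCII_TO_SUPER = str.maketrans(ASCII_DIGITS, SUPERSCRIPTS)
_ASCII_TO_SUB = str.maketrans(ASCII_DIGITS, SUBSCRIPTS)


def to_plain(reading):
    """Superscript tone + subscript grade -> plain digits on the line."""
    return reading.translate(_SUPER_TO_ASCII).translate(_SUB_TO_ASCII)


def to_marked(reading):
    """Plain digits -> superscript tone (first) and subscript grade (last)."""
    if len(reading) < 3:
        raise ValueError("expected tone digit + syllable + grade digit")
    tone, body, grade = reading[0], reading[1:-1], reading[-1]
    if tone not in ASCII_DIGITS or grade not in ASCII_DIGITS:
        raise ValueError("expected tone digit + syllable + grade digit")
    return tone.translate(_ASCII_TO_SUPER) + body + grade.translate(_ASCII_TO_SUB)


if __name__ == "__main__":
    print(to_plain("²lhiq₄"))
    print(to_marked("2lhiq4"))
```

Readings with a hyphen or with more than one syllable would need extra handling that this sketch does not attempt.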

He linked to my notes on the Grains whenever they were available. I still have 96 lines left to translate and annotate. (I stopped at line 104 in January.) Now I want to finish so Andrew can add more links to his entry.

15.6.8.1:58: DID KHITAN AND JURCHEN SHARE A WORD FOR 'GRANDSON' (PART 2)?

I forgot to make a few points about Khitan

191 'grandson'

in my last entry, and I've thought more about the topic since, so here's a follow-up I didn't plan.

Why does 191 mean 'grandson'?

I don't know. I haven't seen Liu Fengzhu and Chengel's (2003: 18) explanation for that gloss. If I can find it, I might write a part 3.

191 occurs four times in the epitaph of Field Marshal Yelü, but none of those four occurrences unambiguously mean 'grandson' (Wu and Janhunen 2010: 159, 161, 190).

191 also functions as a phonogram: e.g., in the female name

191-236-372-361 <191.ur.û.en> (Xiao Dilu 26.26; see Wu and Janhunen 2010: 106-107).

How was 191 pronounced?

Lu Yinghong & Zhou Feng (2000: 49) read it as [mu] because they regarded it as a transcription of Liao Chinese 睦 *muʔ. However, the rest of what they regarded as a transcription of a Chinese phrase is not a good match (Wu and Janhunen 2010: 107). Given that other Chinese final glottal stops may have been Khitanized as -ɣ (= -h in Kane 2009), perhaps 191 was <muɣ> (which resembles Written Mongolian omuɣ 'clan', though I am skeptical of apheresis; see below).

The fact that 191 is often followed by u-graphs (e.g., 236 <ur> above; see Qidan xiaozi yanjiu 312 and Wu and Janhunen 2010: 317 for others) suggests that its reading may have ended in -u. Perhaps the name above was something like Mu(u)ruen.

Kane (2009: 302) transcribed 191 as <mú>, but his entry for the character on p. 58 is blank, so I do not know his reasoning.

Wu and Janhunen (2010: 264) transcribed 191 as <mó>, presumably reflecting Wu Yingzhe's (2007: 46-47) proposal, which I haven't seen. Maybe by part 3 ...

Does the Khitan word written as 191 have external cognates?

If the reading of 191 began with an m-, I doubt it can be connected to Manchu omolo 'grandson', since I don't know of any cases of Khitan C- corresponding to VC- in other languages. Hence I don't think Khitan underwent apheresis. (Is there any language that lost all initial vowels?) A reading mu would make a link even more problematic since I would not expect Khitan u to correspond to Manchu o.

In part 1, I proposed that 191 may have been <om>. Such a short form - if valid - raises other issues. Manchu omolo has apparent cognates throughout Tungusic with the shape omol(g)V (Cincius 1975 2: 17-18). Therefore the word might be reconstructed at the Proto-Tungusic level. Is the word a loan from pre-Khitan (prior to monosyllabic reduction) into ((pre-)Proto-)Tungusic or vice versa? It cannot be a loan from Khitan into Jurchen or any other Tungusic language, since that scenario cannot account for final -l(g)V. Gorelova (2002: 114) analyzed Manchu omolo as omo-lo with a noun suffix -lo. That analysis seems to be synchronically correct since the plural of omolo is omosi with the plural -si replacing -lo after the root. But is it diachronically correct? Was the Proto-Tungusic root *omo- rather than *omol(g)V, or was the word reanalyzed within Manchu? The Jurchen plural

<omo.lo.shi> (Kyŏngwŏn inscription 3:2)

could either be analyzed as omo-lo-shi with double suffixes or as omolo-shi with a trisyllabic root that was later reanalyzed as a root-suffix sequence omo-lo by analogy with other -lo nouns in Manchu.

Starostin's online Altaic database treats Proto-Tungusic *omu- (sic) 'offspring, descendant, grandchild' and *umu- 'to lay eggs' as one and the same root. I reject that identity for three reasons. First, the supposed initial vowel alternation looks like an ad hoc device to tie the two roots together. Second, all evidence points to *o as the second vowel of the om-root; *u is another bridging device to make the child root look like 'to lay eggs'. Finally, *umu- was apparently reconstructed solely on the basis of Evenki umū-. A form in a single language cannot be projected back to the proto-language.

All that effort enables Starostin to connect the Tungusic omo- (not omu-!) words to various um-words elsewhere in 'Altaic':

Old Turkic umay 'name of a goddess' < 'placenta'?

If I am reading Clauson 1972: 164-165 correctly, the word is first attested in the 8th century AD as the name of a goddess "whose particular function was to look after women and children, possibly because this object [the placenta] was supposed to have magic qualities". The first attestation of the meaning 'placenta' that I can see in his entry was in the 11th century AD. I assume 'placenta' is the earlier meaning even though it is actually found later.

Written Mongolian umai 'womb'

Korean um 'sprout'

Japanese um- 'to give birth'

The Turkic and Mongolian words must share a common source; one language probably loaned the word to the other.

The semantics of the Korean word are distant from 'womb'. Um may be an -m-suffixed nominalization of an extinct verb 'to sprout'.

The Japanese word may be a chance lookalike like English womb [wum]. Is English 'Altaic'?

Starostin reconstructed Proto-Altaic *úmu 'to give birth'. According to the rules in Etymological Dictionary of the Altaic Languages (2003 1: 18), the first vowel of the reflexes of a Proto-Altaic word with the vowel sequence *u-u should be *U in Proto-Tungusic. The cover symbol *U enabled Starostin et al. to regard both the improbable *umu- and the incorrect *omu- as descendants of *úmu.

15.6.5.23:59: DID KHITAN AND JURCHEN SHARE A WORD FOR 'GRANDSON'?

Last weekend, I opened Wu and Janhunen (2010) at random and saw this passage about the Khitan small script character

191

on p. 107:

Even so, assuming that the value [mu], here romanized as mó, is approximately correct, the Khitan item for 'grandson' may perhaps be compared with Manchu omolo id., suggesting that the actual pronunciation might also have been [omo] (Wu Yingzhe 2007f: 46-47).

I wonder if 191 was [om] ~ [mo] with a reversible reading like other Khitan small script characters such as

222 [iń] ~ [ńi].

The Jurchen word for 'grandson' was omolo as in Manchu. I suspect that the character variously written as

was originally a logogram for omolo (though it is not attested alone) which later acquired a following <lo> (see my posts from 6.1 and 6.3) in the attested spellings:

or
(Kyŏngwŏn inscription 3:2, mid-12th century; the spelling on the left is from Jin 1984: 205 and the spelling on the right is from Jin and Jin 1980: 336)

(Deshengtuo inscription 14, 1185; Jin 1984: 205 also reports this in Yongning 12, but Jin and Jin 1980: only list the second of the next two spellings in Yongning 12.)

or
(Yongning temple inscription 12, 1413; the spelling on the left is from Jin 1984: 205 and the spelling on the right is from Jin and Jin 1980: 364)

(Hua-Yi yiyu Berlin ms. people section 14, before c. 1500?)

I have not seen any of the originals, so I am not certain about the details.

I have not yet been able to find an exact match for Jurchen <omo> in the Khitan large script. Characters 0170, 0204, and 0205 in N4631 are vaguely similar, but until their readings and/or meanings are known, I cannot regard them as prototypes for Jurchen <omo>.

15.6.3.23:45: KHITAN SMALL SCRIPT CHARACTER 346 IN QIDAN XIAOZI YANJIU

Qidan xiaozi yanjiu (1985), the foundation of current studies on the Khitan small script, only lists four instances of 346 in the texts it covers:

244-346-273 <s.?.un> (道 14.11, 24.16, 仲 17.37) and 251-346-273 <n.?.un> (許 57.33)

Are those genitives of nouns, or is <un> part of the stem? If <un> is a genitive suffix, the vowel of 346 should be u according to the present understanding of Khitan vowel harmony. So perhaps that is partly why Kane (2009) transliterated it as <uŋ> and Wu and Janhunen (2010) transliterated it as <ung₂>. The final nasal reflects the assumption that 346 is a variant of single-dotted 345 <ung> from my last post:

345 is much more common. Qidan xiaozi yanjiu lists 72 occurrences of 345 which can appear by itself (on the murals where characters are often not grouped into blocks) and in first, third, and fourth position: e.g.,

345-041 <ung.us> (興 25.3), 334-019-345 <g.iu.ung> for Liao Chinese 宮 *giung (or *güng?) (道 6.33), 048-092-261-345-341 <?.ud.l.ung.er> (許 61.2)
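The relation between character numbers and transliterations in blocks like these can be sketched as a lookup table. In the Python below, the readings are only the ones cited in this post, with unknown values written as "?"; the names `READINGS` and `transliterate` are illustrative and not taken from Qidan xiaozi yanjiu or any other source.

```python
# Transliterations of small script characters as cited in this post.
# "?" marks a character whose reading is unknown or left open here.
READINGS = {
    "019": "iu",
    "041": "us",
    "048": "?",
    "092": "ud",
    "244": "s",
    "251": "n",
    "261": "l",
    "273": "un",
    "334": "g",
    "341": "er",
    "345": "ung",
    "346": "?",
}


def transliterate(block):
    """Turn a block such as '244-346-273' into a string such as '<s.?.un>'."""
    parts = [READINGS.get(number, "?") for number in block.split("-")]
    return "<" + ".".join(parts) + ">"


if __name__ == "__main__":
    for block in ["244-346-273", "251-346-273", "345-041", "048-092-261-345-341"]:
        print(block, transliterate(block))
```

Changing the entry for 346 (say, to "ung" or "üng") and rerunning would show how each hypothesis affects every block at once.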

Is 346 simply a variant of 345 (Kane 2009: 77), or is it a distinct character? If it is the latter, was its reading similar to <ung> (e.g., <üng>) or was it something else with an u-vowel? 346 coexists with 345 in all three texts where it was found (道, 許, 仲). Was the number of dots on the bottom random like the dots in the three variants of Jurchen <lo>?

The fact that 346 only occurs in blocks of the type <C.346.un> suggests a deliberate choice, though it could also be an artifact of extremely limited data. Qidan xiaozi yanjiu does not list the blocks <s.ung.un> and <n.ung.un> with 345 instead of 346. Is this complementary distribution accidental or meaningful? Have any such blocks been found in the three decades following the publication of Qidan xiaozi yanjiu? The closest block with 345 is

244-345 <s.ung> 宋 'Song (dynasty)' (仁 8.13)

which might be the stem of

244-346-273 <s.?.un> (道 14.11, 24.16, 仲 17.37)

if 345 and 346 really are equivalent and if <un> is a genitive suffix.

If 244-346 is also 'Song', could 251-346 be a loan of a Liao Chinese word *nung?

15.6.2.23:59: AN 'ETERNAL' LINK BETWEEN THE KHITAN SMALL SCRIPT AND THE JURCHEN (LARGE) SCRIPT?

Tonight I noticed that the Jurchen (large) script character

<üng>

for the transcription of Ming Chinese 永 *yüng 'eternal' resembles a cross between the Khitan small script characters

106 ~ 345 ~ 346

which are slightly different ways to transcribe Liao Chinese *-ung. (I assume 106 is an abbreviation of 345. The function, if any, of the extra dot in 346 is unknown.)

Was the Jurchen character derived from 106/345/346, or is the similarity a coincidence? Normally Jurchen characters are thought to be derivatives of Khitan large script characters or 'sisters' if not descendants of those characters. So I would expect Jurchen <üng> to be somehow related to Khitan large script characters such as these two (1692 and 0555 in N4631):

N4631 glossed 1692 as 'first' and listed the reading [tʰur] (= <tur> in my Khitan transcription). There is no semantic or phonetic resemblance to 永 *yüng 'eternal' or its (near-)homophones.

Nothing is known about 0555. Was it pronounced üng?

The Khitan small script character for Chinese transcription

181

that Kane (2009) transcribed as <iúng> may have been pronounced üng. It of course does not look anything like Jurchen <üng> unless one is imaginative. I doubt the Jurchen - who were literate in Khitan - overlooked it and chose a small script character with a somewhat different reading (106/345/346) as the basis for their <üng>.

6.3.1:06: Maybe I am wrong about 181 being üng. The Liao Chinese rhyme that it transcribes was also transcribed in the small script as 019-345 <iu.ung>: e.g.,

334-019-345 <g.iu.ung> for 宮

So was 181 <iung>? (I see no reason to add an acute accent, as there is no <iung> distinct from <iúng> in Kane 2009.)

Another possibility is that the Liao Chinese rhyme was -üng, and the Khitan had two strategies for writing it: a spelling reflecting a partially nativized -iung (if Khitan had no ü) and a spelling with a character specifically designed for -üng. The degree of phonetic mismatch between Liao Chinese and Khitan must have been considerable, though it eludes precise measurement.