Followup on my previous overview of comparative Cushitic: a slightly more involved look at Ehret’s Proto-South-Cushitic from 1980, and some readily observable issues in it. To reiterate slightly, his view of South Cushitic includes four basic units:
- West Rift, a generally-accepted cluster comprising three or four languages (Iraqw–Gorowa, Alagwa, Burunge);
- East Rift = two recently extinct-or-moribund languages (Kwʼadza and Aasax / Asa);
- Ma’a, still treated by Ehret as a Bantuized Cushitic language;
- Dahalo.
There are a couple of easily observable typological features that all four share, maybe most prominently:
- the presence of labialized consonants; in most just dorsal consonants like /kʷ gʷ qʷ ŋʷ/, in Dahalo also a couple labialized coronals like /dʷ ɬʷ/;
- the presence of lateral obstruents; fricative /ɬ/ in all, also an ejective affricate /tɬʼ/ in most with the exception of Aasax and Ma’a, in some descriptions of Dahalo further also voiced /dɮ/ and/or palatals /cʎ̥ʼ ʎ̥/.
These are not a priori guaranteed to be innovative though — those reconstructed by Ehret for PSC he indeed goes on to later treat as Proto-Cushitic (and even Proto-Afrasian) archaisms, and his argument for the genealogical unity of South Cushitic is more involved, for one part of which see later. It might also seem premature for him to have focused on the “full” South Cushitic instead of just the clear Rift subgroup, but that’s what we do have and can therefore review.
Newer research has already put quite a bit more work into the comparison of the still more or less thriving West Rift languages (Kießling & Mous 2003, The Lexical Reconstruction of Proto-West Rift), and it seems like combining this with Ehret’s work could make a productive project. As noted in my previous post, there’s been also a fair bit of debate in the literature on what to do with Dahalo. Most of the discussion that I’ve seen, though, is unfortunately sort of typology-oriented and relies on methods like looking where cognates might be found and which of them look the most surface-similar to Dahalo. But even if we suppose the language is e.g. ultimately instead East Cushitic, there is no rule saying that sometimes e.g. a proper Proto-Cushitic etymon could not have survived just in, or mainly in, Dahalo + Rift, instead of being something like a Rift loanword in Dahalo. More detailed work on its historical phonology might be able to sometimes make this distinction, and Ehret’s proposals on this should not be wholly ignored. That Dahalo’s consonant system is one of the most “kitchensinky” on the planet (it has a little bit of almost anything you could ask for: clicks, ejectives, implosives, prenasalized consonants, a dental/alveolar distinction…) should surely also help: clearly it has been absorbing loanword phonemes for a while now, and then perhaps not only from unrelated Bantu or extinct Paleoafrican languages, but also from other branches of Cushitic? Ehret’s work already leaves a few clear openings for this kind of a hypothesis, I think. More on this further along.
The situation of research on Kwʼadza, Aasax and Ma’a looks much less satisfying. As for East Rift, R. Kießling, one of the more active West Rift researchers, seems to have deemed their closest eastern relatives not worth looking into, with claims appearing in overview works such as “The position of Qwadza and Asax is dubious, since there is not enough data (and probably never will be) to prove that they belong to a different subbranch within Southern Cushitic”. [1] I have not found substantiation for this claim of “not enough data” (how would one prove such a negative anyway?), and the fact that Ehret has already given reasons in his work to think that they form a distinct East Rift branch looks to be simply swept under the carpet. Any idea that a language could not be even classified if it is extinct and its available documentation is short of state-of-the-art modern linguistic methodology is, of course, absurd. Historical linguistics can demonstrate just fine that languages such as Oscan or Umbrian are not merely Indo-European, they’re indeed Italic and further form a separate subgroup of it in contrast to the well-attested Latin. This is no folly or privilege of just Indo-Europeanists either, the same has been done again just fine e.g. with plenty of extinct poorly attested Semitic languages (Ammonite, Edomite, Samalic, Ugaritic, all of Ṣayhadic / Old South Arabian…), further all sorts of extinct poorly attested Algonquian or Tupian or Uto-Aztecan languages, etc etc.
Even single wordlists can yield fair evidence for a detailed classification, if a language is not too far removed from well-attested relatives, and can be thereby linked with the framework of historical phonology, lexicon and, perhaps, morphology that they allow setting up. For a Uralic example, consider Yurats (and I could cite also plenty of cases discussing the detailed dialectological positions of Finnic, Samic, Mordvinic, Mari, Mansi etc. varieties known again only from single wordlists). This is more or less what Ehret does too, even if the work is kind of buried within his corpus of South Cushitic etymologies and his overarching South Cushitic phonological reconstruction. Clearly missing though is any synthesis of the different sources of Kwʼadza and Aasax; the former known from five or six primary collections, the latter from three, many of both unpublished. Work on this might be desirable also for helping with steering people like Kießling away from outright denying the studyability of East Rift. [2] And really not even any more humble general description of either variety seems to have been published at all! Ehret, too, is mostly content to simply assert overall phonological inventories, without commenting much on the primary sources. This might be fair in the case of Kwʼadza, since he has himself conducted fieldwork with its last speakers (and there seems to be an unpublished manuscript from this that’s cited in various later work; even on Wikipedia amusingly enough), but less so with Aasax. Ehret passingly notes e.g. having converted the last field records from 1974 from phonetic into phonological transcription, but one would like to know some details on this too.
It might be less clear if East Rift is really entirely a sister group of West Rift, or just a divergent member, or even, some kind of an areal within it. At least one of the more distinctive common West Rift innovations that Ehret proposes, the shift of the ejectives *kʼ, *kʼʷ to uvulars *q⁽ʼ⁾, *q⁽ʼ⁾ʷ, is alas trivial within Cushitic, appearing in several other groups or languages including Agaw, Somaloid, Konsoid (here as an implosive [ʛ]!); and maybe indicating that some stage of Common Cushitic did not quite have *kʼ, but rather something slightly different such as an also-pharyngealized *kˤʼ. [3] The same holds also for one of the more immediately obvious common East Rift innovations, the merger of pharyngeals *ħ *ʕ with glottals *h *ʔ respectively, which is again attested in a large number of other basic groups, e.g. Agaw, Oromoid, Highland East, and per Ehret indeed also in Ma’a. But also many minor conditional developments are posited for both West Rift and East Rift. These would require actual review rather than blanket dismissal.
Several yet further differences between WR and ER involve weakly attested sound correspondences that Ehret simply reconstructs as additional Proto-Rift segments. Some do involve specific attestable segments in ER (e.g. WR *d ~ ER prenasalized *nd < Ehret’s Proto-Rift dental *d̪, oddly enough); some others, just “crossed” correspondences between more basic segments. This is where we start getting into clearly dubious territory. These segments do not really flesh out any highly sensible phonological subsystem. Some of them could be fitted into “empty slots” seen in the West Rift and East Rift consonant systems, but only with further assumptions about their development: e.g. a correspondence WR *b ~ ER *p is reconstructed by Ehret as Proto-Rift *pʼ, [4] even though there are no other examples of either ejective voicing in WR or ejective devoicing in ER.
Still, before continuing this thread, a few words also on Ma’a. For this variety we already have clear criticism of Ehret’s position of treating it as a third basic branch of South Cushitic: Mous 1996, “Was there ever a Southern Cushitic Language (Pre-)Ma’a?” The most basic argument, which I take no issue with, is that attested Ma’a should not be itself even treated as a Cushitic language, but merely a largely-Cushitic lexical register of a Bantu language otherwise known as Mbugu. This then opens the option that Ma’a might not originate by language shift from some highly distinct Cushitic variety, but rather, from adoption of Cushitic vocabulary from at least two sources, in his view one of them probably a member of West Rift, the other closer resembling Oromo. So far, so good; Mous even admits that language shift from Cushitic still remains on the table (though he thinks this pre-Ma’a to have been more probably East Cushitic), which to me at least would still sound like a compelling reason for why modern Ma’a-as-a-register arose at all. However, other problems remain. As in research on Dahalo, Mous too seems to deem various lexemes to be either “West Rift” or “East Cushitic” mainly by their etymological distribution, without detailed attention to comparative phonology. For a simple example, West Rift /ħ ʕ/ correspond with Ma’a /h ʔ/. While this could arise as a sound substitution by Bantu speakers unfamiliar with pharyngeals (but would they not have been also unfamiliar with the glottal stop at least?), it could also indicate borrowing, instead, from East Rift, where as mentioned, loss of pharyngeals appears natively; at least if we admit that things about the East Rift languages are knowable. Even the geographically closest certainly-Cushitic language to Ma’a is indeed Aasax! (None are in direct contact with it.) 
For that matter, at least a few broad but specifically Ma’a–Aasax isoglosses seem to be proposed in Ehret’s work too: *x > h and *tsʼ > s in at least some cases, again simple enough to be plausibly just sound substitutions by foreign speakers, but plausibly also real common innovations, especially if we no longer require that these should be reflected everywhere in the Cushitic component of Ma’a.
Furthermore, all this is complicated also by the proposals in literature that Dahalo, too, has ended up with lexicon of both South (≈ Rift) and East Cushitic origin. Cushitic vocabulary in Ma’a may not force the existence of a single substratal pre-Ma’a as an independent South Cushitic branch, but it definitely forces the existence of at least a Cushitic language in contact with older Mbugu; if two different Cushitic lexical strata are accepted, then at least two such contact languages. We must then still ask: what was the internal history of this / these Cushitic varieties? Already the geographic separation of Ma’a and Rift has implications for this. Even if we supposed there was simply an originally Rift variety that wandered further towards the coast along the Pangani river, this does not rule out a possibility that the “East Cushitic” component surfacing today was not borrowed independently into Bantu Ma’a, but rather, already into this Rift variety — just as the proposed two-stratum theory of Dahalo would also require the existence of South / East Cushitic language contacts. (Trivial if Dahalo is really South Cushitic, it’s already right next to southern Somali and Oromo(id) varieties, but less so if it’s supposed to be “just another” East Cushitic branch.) The opposite scenario can be considered as well: a lost East Cushitic language in the area, which had at some earlier point in history absorbed also some Rift influence, before itself contributing a chunk of vocabulary into Ma’a.
Even moreover, the general problem that “East Cushitic” remains without a good definition by shared innovations (and might even contain all of the putatively South Cushitic languages in it) keeps on the table also the option that some of the Ma’a vocabulary is not narrower East Cushitic-isms, as much as archaisms, lost from the attested Rift languages. This holds even if Ma’a is indeed analyzed to contain vocabulary from two different Cushitic sources! After all, it is not very economical to assume two completely separate Cushitic spreads south into Tanzania, only for one of them to then disappear completely except for a few loanwords into Ma’a. Instead, it would make geographical sense for the “East Cushitic” component to be still really para-Rift, as per the family tree followed by Ehret. Something of this sort is also readily suggested by theories of East African prehistory which posit the Rift languages and maybe Dahalo (if it’s been in Kenya longer than its immediate “East Cushitic” neighbors) to not represent any kind of an outpost of Cushitic, as much as a remnant of a continuous Cushitic belt that would’ve once, before the newer expansions of South and East Nilotic and Northeast Bantu, stretched all across the areas of modern Kenya and northern Tanzania.
All this leaves a large number of “moving parts” available for any reanalysis of the historical phonology of Ma’a. As we will see, various reanalyses are probably required regardless; but it also seems to me Mous’ idea of mixture of basically modern West Rift and modern East Cushitic is too simplistic and, above all, geographically implausible in an environment where nothing West Rift nor classically East Cushitic has been attested. Prehistory hides many lost languages, and there is nothing a priori implausible in proposing one where evidence so suggests.
As I’ve mentioned recently on Twxttxr, any review of Ehret’s phonological scheme of South Cushitic should probably begin from the reconstruction of Proto-West Rift. There is fairly good overall agreement between Ehret’s reconstruction and the later work of Kießling & Mous (henceforth K&M), to be expected since also the modern languages retain the makeup of the system almost intact. PWR comes out with a reasonably distinctive system containing at least:
- all six basic stops *p *b *t *d *k *g, plus labiovelars *kʷ *gʷ and uvulars *q *qʷ;
- two ejective affricates *tsʼ *tɬʼ, interestingly without voiceless or voiced equivalents;
- an almost full system of voiceless fricatives, *f *s *ɬ *x *xʷ;
- the full original Cushitic (perhaps already original AA?) laryngeal system, *ħ *ʕ *h *ʔ;
- all six basic sonorants *m *n *r *l *w *y, plus palatal and velar nasals, *nʲ *ŋ;
- a bog standard Cushitic vowel system *a *e *i *o *u.
K&M add to this the labiovelar nasal *ŋʷ (actually mostly corresponding to Ehret’s *ŋ), vowel length, and, appearing mainly in clear loanwords, postalveolar affricates *č *ǰ. Ehret adds ejective *čʼ, supposedly distinguished from *tsʼ only in older Alagwa records (but see below). As it turns out, comparison already with Ehret’s own Proto-Cushitic reconstruction shows that most of these segments can be easily equated with identical predecessors also there; only *nʲ and K&M’s *č *ǰ seem to be entirely novel. A few call for other notes, but give no reason to doubt their PWR or Proto-Rift existence:
- PWR *q, *qʷ: as noted above, clearly from older velar ejectives *kʼ, *kʼʷ.
- PWR *tsʼ corresponds most prominently with Ehret’s PC *tʼ, suggesting spontaneous affrication as the original explanation of the phonological asymmetry of /t tsʼ/ without **tʼ **ts in West Rift (or East Rift). The same occurs in the neighboring Sandawe, and a partly similar /t tsʼ ts/ without **tʼ in the neighboring Hadza (both of them “Khoisan” candidate isolates, presumably ancient in the region), suggesting that this has been an areal innovation, arising on-site in the Rift Valley, i.e. at least not at any extremely early time during the Cushitic expansion southwards.
- PWR *tɬʼ: various distinctive correspondences identified by Ehret, enough of them that there was probably a distinct PC precedent, even if not necessarily a lateral obstruent (I’ve heard of some recent work suggesting secondary lateralization of earlier palatals).
- PWR *x, *xʷ supposedly correspond with both *x, *xʷ and *ɣ, *ɣʷ in Agaw, but also stops in various other parts of Cushitic (most consistently in Beja). Probably these are real inherited segments too, but I would wonder about options like reconstructing “old” PC uvulars instead, later then either fricativized or merged with velars.
- The labialization contrast usually goes with /o/, /u/ vocalism elsewhere in Cushitic and might be secondary, especially if East Cushitic does not hold up as a subgroup; Ehret posits its most prominent innovation to be *Cʷa > *Co, *Cʷ > *C elsewhere, but perhaps this is archaic rather than innovative. Labiovelars occur also in Beja and Agaw, but they might have been independently innovated, especially since neither is an especially old group by itself. Same might go for labiovelars in Dahalo and Ma’a (if we don’t think they simply get all their “South Cushitisms” thru loanwords from Rift proper). Clearly needs further research though.
- PWR *p: compared with /p/ also in East Rift, Ma’a and Dahalo, probably correctly. From the rest of Cushitic, Ehret however mostly finds comparanda with /b/. Already on general typological grounds I suspect these are mainly dubious, and that original Proto-Cushitic *p was instead shifted to *f early and just about everywhere, leaving a **p gap in most of the Cushitic languages. A new *p then would have arisen at some point in the development towards Proto-Rift. If this is per hypothesis mostly newer areal vocabulary, appearance of /p/ also in Ma’a and Dahalo probably won’t suffice as a defining PSC feature though.
After this things start getting worse. Already the bare numbers suggest bloat in Ehret’s deeper phonological reconstructions: his system of 29 consonants in Proto-West Rift expands to 33 in Proto-East Rift, 36 in Proto-Rift, and finally balloons to 49 in Proto-South Cushitic. A priori this might not be a completely terrible amount, when compared with an impressive 60+ in Dahalo, where many of these find unique reflexes… though then just 30-ish in any other South Cushitic variety. More alarming is that even Ehret himself finds no Proto-Cushitic source for most of these additional segments. This could be all still OK, maybe open to various kinds of reanalyses, if these reconstructions were based on good robust data. Alas, they are not. Most are based on few etymologies, often with semantic stretches or other irregularities. Often also with major distributional gaps, such that an asserted overall correspondence pattern really comprises e.g. individual Rift ~ Ma’a and Rift ~ Dahalo correspondences lumped together (or perhaps even weaker correspondences like Kwʼadza ~ Dahalo, West Rift ~ Ma’a, etc.), with no or almost no evidence of the implied Ma’a ~ Dahalo correspondences even existing. This strategy of “farming” or “lumping” rarer proto-segments from disjoint correspondences probably needs a name for it, I keep seeing it in many long-range or otherwise dubious reconstruction proposals; e.g. it’s all over the place in versions of Nostratic. (Cf. also footnote 4.)
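The “lumping” problem can be stated mechanically: a correspondence claimed across several branches is only as good as its pairwise links. Here is a minimal Python sketch of that check. The function name, the etymon labels and the branch sets are all hypothetical placeholders for illustration, not Ehret’s data.

```python
from itertools import combinations

def unsupported_pairs(claimed_branches, etymologies):
    """Return the branch pairs of a claimed correspondence pattern
    that are never attested together within any single etymology."""
    attested = set()
    for branches in etymologies.values():
        # Only branches that are part of the claimed pattern count.
        present = sorted(set(branches) & set(claimed_branches))
        attested.update(combinations(present, 2))
    all_pairs = set(combinations(sorted(claimed_branches), 2))
    return sorted(all_pairs - attested)

# Hypothetical toy data: every etymology links Rift to one other branch only.
toy = {
    "etymon-1": {"Rift", "Ma'a"},
    "etymon-2": {"Rift", "Dahalo"},
    "etymon-3": {"Rift", "Dahalo"},
}

gaps = unsupported_pairs({"Rift", "Ma'a", "Dahalo"}, toy)
print(gaps)  # [('Dahalo', "Ma'a")]
```

In this toy case the three-way pattern has no etymology behind its Ma’a ~ Dahalo leg, which is the kind of gap described above.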
The most distinctively poor set of Ehret’s extra segments are prenasalized stops and affricates in word-initial position. (Word-medial cases do not look distinguishable from plain old nasal + stop clusters, well-attested thruout Cushitic. I’ll also skip over *nɬ, which does not yield anything prenasalized and which Ehret in his later Proto-Cushitic work readjusts to *dɮ.) These are a regular but small part of the phonology of Dahalo, which should be probably assumed to mainly originate as intrusive vocabulary, maybe some also from irregular nasalization of former plain stops. And almost all of the South Cushitic etymologies Ehret finds for them are weaksauce. To roll out the data as he cites it (abbreviations: I = Iraqw, B = Burunge, A = Alagwa, Q = Kwʼadza, S = Aasax, M = Ma’a, D = Dahalo; transcription should be mostly obvious but I maintain Ehret’s ṯ ḏ for dental stops in D):
- *mpats- ‘to be strewn’: A pasit- ‘to scatter’, pisari ‘seed’, B pisagariya ‘seed’ ~ D mbàttsì ‘potsherd’. Poor semantics, and also *ts is a suspicious PSC segment.
- *mparoxʷ- ‘egret’: Q palaʔeto ‘crested crane’ (-l- < *-r- is regular) ~ D mbórogo ‘young egret’. Poor distribution, uncompelling semantics (mixing different species’ names is always an easy way of farming junk etymologies), apparently irregular *xʷ > ʔ in Q, and even supposedly regular *xʷ > D g sounds phonetically suspicious.
- *mpee- ‘little, mean, scanty, slight’: Q paʔali- ‘narrow’ ~ M -bí ‘to shorten’ ~ D mbííṯ- ‘to scorn’ (→ mbííṯe ‘bad’). Very short CV comparison, semantics at least in D too divergent to put any trust on.
- *mpuux- ‘sprout, shoot’: M -buká, -buxá ‘greens’ ~ D mbùùku ‘vine, tendril, creeper’. Poor distribution and semantics.
- *mpɨnde-: M -púnde ‘penis’ ~ D mbéne ‘vagina’. Poor distribution, uncompelling semantics (possible, but not strong evidence by itself to believe in the comparison).
- *ntaakʷ- ‘small carnivore’: I taweramo, B takoraymo, A tokoraymo (K&M: PWR *takʷerimo) ‘wild dog’ ~ D nḏááge ‘aardvark’. Poor semantics, irregular voicing and delabialization of *kʷ in D.
- *nteekʼʷ- ‘incisor’: I taqesamo ‘jaw’ ~ D nḏéégi ‘canine tooth’. Poor distribution, uncompelling semantics, irregular voicing of *kʼ⁽ʷ⁾ in D (and why is the labialization reconstructed at all?)
- *nʈarag- ‘Orthoptera species’: Q tsʼelemayo ‘cricket’ ~ D nḏàràgì ‘mantis’. Poor distribution and semantics, ad hoc metathesis of a presumed suffixal *-m- in Q.
- *nʈaŋa ‘beestings’: Q tsʼangayiko ‘fresh milk’ ~ M dáŋá ‘beestings’. Limited distribution; semantics fine; no direct evidence of prenasalization though.
- *nʈif- ‘food stirring stick’: I tsʼifraŋ, B čʼufara, A tsʼufara, S šeferank ‘tongue’ (K&M: PWR *tsʼufiraaŋʷ; *tsʼ- > š- in Aasax is regular) ~ D nḏufuro [‘food stirring stick’?]. Uncompelling semantics, even if this is likely an innovative lexeme in Rift compared to the rest of Cushitic.
- *nʈoh- ‘to clear the throat’, *nʈoh-aala ‘phlegm’: B čʼohod- ‘to cough’ ~ Q tsʼalahet- ‘to curse’ ~ D nḏwààlà ‘mucus’. Uncompelling semantics in Q (requires also metathesis); irregular contraction *-ohaa- > -waa- in D; plausibly simply a recent onomatopoetic verb in B.
- *nʈuu- ‘hawk’: S šuʔununu ~ D nḏúúma. Poor distribution; very short comparison, requires ad hoc morphology. Semantics apparently fine.
- *ntsaaw- ‘reeds’, *ntsoomari ‘straw’: I tsʼawo ‘reeds’ (K&M: also B A, PWR *tsʼaaboo ‘sisal, bushy end’) ~ Q tsʼemaliko ‘straw’ ~ M izumari ‘flute’. Vocalism problems in Q, uncompelling semantics in M, no direct evidence of prenasalization.
- *ntsew- ‘small bird sp.’: M -zewe ‘carmine bee-eater’ ~ D ndzòmò ‘barbet sp.’ Poor distribution, uncompelling semantics.
- *ntsi- ‘spleen’: I tsʼi-daʕa ‘heartburn’ ~ Q tsʼiyale ‘spleen’ ~ D ndzóne ‘spleen’. Very short comparison with ad hoc morphology, irregular vocalism in D, poor semantics in I.
- *ntsom- ‘to shout’: Q tsʼamaʔato ‘happiness, joy’ ~ M -zo ‘to cry’. Poor distribution and semantics.
- *ntsoom- ‘kind of bee’: Q tsʼamayituko ‘bee’ ~ D ndzóóme ‘honey of ḿpeele bee’. Poor distribution, Q morphology unexplained.
- *ntʲaduʕ- ‘bog’: M -darú ‘swamp’ ~ D ndodóʕo ‘mud’. Poor distribution, uncompelling semantics.
- *ntʲodi- ‘grasp, grip’: M -dóri ‘to take; marry (a wife)’, -doríwe ‘to be married’ ~ D ndódi ‘thumb’. Poor distribution and semantics.
- *ntʲooʕ- ‘gravelly soil’: B čʼiʕaramo ‘pebble’ ~ Q čʼaʔamuko ‘small streambed’ ~ D ndóóʕo ‘sand’. No immediate major flaws, still not an obvious etymology either though.
- *ŋkara-: Q kalaʔeto ‘stork’ ~ D ŋgára ‘crested crane’. Poor distribution, uncompelling semantics.
- *ŋkexine- ‘eyebrow’: I gine ~ D ŋgikine. Looks decent except for loss of *-x- (or indeed maybe of *-k-) in I. K&M have instead two different PWR etyma for ‘eyebrow, eyelid’, neither certain to be old inheritance though.
- *ŋko- ‘flea’: Q koyimaye ~ D ŋgúnewe ‘spirillum tick’. Poor distribution, uncompelling semantics, short comparison, ad hoc morphology. Maybe the worst etymology here, despite stiff competition!
- *ŋkol- ‘steer’: I B A karama ‘bull, steer’, Q kolawatu ‘bull’ (K&M: PWR *karaama) ~ D ŋgólome ‘bull buffalo’. Would look decent except for a l/r mismatch; also with parallels in East Cushitic that again have just plain *k- and also *-r- rather than *-l-, e.g. Borana (Oromoid) korma ‘bull’.
- *ŋkum- ‘fog’: M -gónónó ~ D ŋgúmine ‘raincloud’. Poor distribution, ad hoc *u > o and *m > n required in M; semantics fine.
- *ŋkʷaa- ‘rainbow’: B ilakʷekʷiya ~ D ŋgòòwi. Extensive ad hoc morphology (including reduplication) required in B; semantics fine.
- *ŋkʷaal- ‘to impoverish, leave poor’: I kʷalaʔo, B A kʷaʔalitoʔo, Q kalaʔay ‘widow’ (K&M: PWR *kʷaʔalaʔoo) ~ M -gwa ‘to steal’, -gwaló ‘thief’. Poor semantics, and Ehret’s assumption of original *l plus metathesis in B A might not hold. No direct evidence for prenasalization.
A few of these might still be at least related areal vocabulary, but it should be clear that the low number of comparanda, both overall and especially in the otherwise well-documented West Rift, the reliance on semantically off-field comparisons, and a common need for additional morphological or phonological assumptions do not add up to a corpus supporting the existence of an already Proto-South Cushitic consonant series. E.g. the example of ‘bull’ could instead suggest a borrowing that is ultimately of Cushitic origin but was passed through some other intermediates before getting to Dahalo. Ehret also has one other similar case just in Dahalo and so not formally reconstructible for PSC: D ŋgaasið- ~ Somali kas- ‘to explain’ (besides prenasalization, vowel length also not matching; also s ~ s cannot be native).
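The distributional side of this can be tallied directly from the list above. The following Python sketch records, for each of the 27 prenasalized reconstructions, which languages the comparison cites (using the abbreviations already introduced), and counts how many of the four basic units each one spans. The encoding is my own simplification: only the forms Ehret cites are counted (K&M’s additions are left out), and the variable names are just illustrative.

```python
from collections import Counter

# Language abbreviation -> basic unit (West Rift, East Rift, Ma'a, Dahalo).
BRANCH = {"I": "WR", "B": "WR", "A": "WR",
          "Q": "ER", "S": "ER", "M": "Ma'a", "D": "Dahalo"}

# Reconstruction -> languages cited in the comparison, as listed above.
ETYMA = {
    "*mpats-": "ABD", "*mparoxʷ-": "QD", "*mpee-": "QMD", "*mpuux-": "MD",
    "*mpɨnde-": "MD", "*ntaakʷ-": "IBAD", "*nteekʼʷ-": "ID",
    "*nʈarag-": "QD", "*nʈaŋa": "QM", "*nʈif-": "IBASD", "*nʈoh-": "BQD",
    "*nʈuu-": "SD", "*ntsaaw-": "IQM", "*ntsew-": "MD", "*ntsi-": "IQD",
    "*ntsom-": "QM", "*ntsoom-": "QD", "*ntʲaduʕ-": "MD", "*ntʲodi-": "MD",
    "*ntʲooʕ-": "BQD", "*ŋkara-": "QD", "*ŋkexine-": "ID", "*ŋko-": "QD",
    "*ŋkol-": "IBAQD", "*ŋkum-": "MD", "*ŋkʷaa-": "BD", "*ŋkʷaal-": "IBAQM",
}

# Collapse individual languages into the four basic units.
branch_sets = {root: {BRANCH[lang] for lang in langs}
               for root, langs in ETYMA.items()}

# How many etymologies involve each unit, and how many units each one spans.
per_branch = Counter(b for units in branch_sets.values() for b in units)
breadth = Counter(len(units) for units in branch_sets.values())

# Sets without Dahalo have no directly attested prenasalized reflex.
without_dahalo = [root for root, units in branch_sets.items()
                  if "Dahalo" not in units]

print(dict(per_branch))
print(dict(breadth))
print(without_dahalo)
```

On this encoding none of the 27 sets spans all four units: 19 link just two of them and 8 link three, and 4 have no Dahalo member at all.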
There should be also a general suspicion of anything with prenasalized stops more likely coming from Bantu. Looking over Mauro Tosco’s 1991 A Grammatical Sketch of Dahalo, however (which includes a glossary with some loanword notes), fairly few cases of that have been identified: from Swahili there’s mbona ‘why’, nḏuugo ‘kinsman’, ŋgúúfu ‘strong’; from Northern Swahili nḏani ‘inside’, nḏigad- ‘to bury’, nḏoo ‘come!’ and tentatively nḏupa ‘bottle’, ŋgúúko ‘cock’ (~ NSw thupa, khuku). [5] Nothing from other nearby Bantu languages like Pokomo, but maybe that simply has not been (was not?) studied yet. But note also Dahalo’s nasalized dental click nǀ, which might also count as “prenasalized” phonologically, but almost surely can’t originate as is from anything Bantu.
It would be possible to go over similar problems in base etymological data also for some of Ehret’s other more poorly attested segments. As my tally of prenasalized consonants already shows, he in particular adds a few additional place of articulation series for PSC, which aren’t really reflected as such anywhere: retroflexes and palatalized dentals, most of them scantily attested and probably rejectable entirely. He proposes PC origin for two of them though: the voiced retroflex *ɖ and the palatalized ejective *tʲʼ. These show different issues, maybe worth discussing in more detail.
If taken at face value, Ehret’s PSC *tʲʼ probably should be first of all reconstructed instead as a postalveolar affricate *čʼ, since that is both his alleged PC source and its Proto-Rift reflex. Also the asserted Ma’a reflex is č, but most data for that are again poor etymologies, e.g. M -čá ‘to be crafty’ is compared with Q salimuko ‘coward’, D tʼar- ‘to practice witchcraft’; M hečéri ‘yet, not yet’ is first analyzed to have a prefix he- continuing a fossilized demonstrative, then compared with Q sel- ‘to straighten’, to gain a PSC root supposedly having meant ‘to make ready, prepare, put in order’. A few comparisons between Rift and Dahalo look better, e.g. *tʲʼatʼ- ‘soil, earth’ > B čʼečʼeʔiya, A tsʼatsʼaʔi ‘dust’ (K&M: PWR *tsʼatsʼaiʔya) ~ Q saʔamuko ‘earth’ ~ D tʼattʼe ‘mud’. But as the Dahalo reflex is just the alveolar ejective tʼ, it also turns out that most of the data would fit simply as cases of Ehret’s plain *tʼ! He relies most often on Kwʼadza data in making the distinction, where supposedly *tʲʼ > s (as in all three examples above), but these cases generally leave again room for doubts about their validity, e.g. irregular *tʼ > ʔ in ‘earth’ above. A few also have Ehret’s PWR *čʼ on the grounds of čʼ in older Alagwa data, but these are firstly few, and secondly, they mostly occur in a palatal environment (e.g. čʼiraʔa ‘bird’; K&M’s PWR *tsʼiraʔa) where we might suspect this was actually just a dialectalism or a lost allophonic feature. Hard to tell though from the current presentation. Ehret still gives also words like PSC *tʼah- ‘to be pregnant’ >> A tsʼihay ‘pregnancy’, but does not state if this comes from older or newer Alagwa data. Either way, Ehret’s supposed distinction *tʼ | *tʲʼ looks like it has been really “farmed” together from disparate sources, most prominently
- older Alagwa tsʼ | čʼ;
- Kwʼadza tsʼ | s;
- Ma’a s | č;
that actually do not correlate well with each other. I’ve already tentatively suggested that the supposed Ma’a cognates with č are just wrong, and that older Alagwa čʼ might be secondary. For Kwʼadza I’m not sure if either of these approaches works entirely. Not all data with s looks easily dismissible, e.g. PWR (K&M) *tsʼaaʔas- ‘to shine, shed light on’ (probably with the common Cushitic causative *-as- suffix) ~ Q saʔ- ‘to burn’; PWR *tsʼitsʼaʕiya ~ Q sasaʔamo ‘star’. [6] A third option though might be internal loaning: in Aasax, the regular word-initial reflex of Proto-Rift *tsʼ is a sibilant š, and this would probably be reflected as s (Q has itself no š) if some words had been borrowed into Kwʼadza from an Aasax-like variety. Ehret’s later proposed distinct Proto-Cushitic sources also do not look very strongly established, but that would be more of a tangent than I want to get into in detail; though again, the same general types of problems recur as in Ehret’s PSC reconstruction.
As the last stretch for this blog post, Ehret’s PSC *ɖ proves to be relevant in several ways. Word-initially, this is proposed to be distinct from plain *d in that Dahalo would have an implosive ɗ for the former, a dental plosive ḏ for the latter; Ma’a would have mostly ɗ- from both, but sometimes also some *ɖ- > z-. A few do look plausible (e.g. I A deʔem-, B Q deʔ-, M -zéʔu ‘to herd’). All of Rift, however, shows just *d- (retained in I B A Q, > ɗ in S). Furthermore, comparison with Proto-Cushitic supposedly shows *ɖ- < *d- versus *d- < *z-. Both correspondences indeed have some decent etymologies for them, e.g. PC *dar- ‘to increase, add to’ > D ɗar- (Ehret only lists other cognates from Agaw and Somali; I’d consider adding also Highland East *darš- ‘to swell’); PC *zab- ‘to grasp’ (in Beja, Somaloid) > D ḏáβa ‘hand’. (Both of these also well represented in Rift languages.)
In his Proto-Cushitic book, Ehret claims that this chain shift of *d and *z would be evidence for the unity of South Cushitic. The fortition *z > *d is distinctive at first sight, but this sound change is again widespread in Cushitic, e.g. Beja, Saho–Afar, Oromoid, Konsoid; it may have started already at an early date, perhaps diffusing to Pre-Proto-Rift already from some neighboring East Cushitic dialect area. Thus, if Rift does not even show the distinction of Ehret’s *ɖ- and *d-, to me this supposed isogloss seems worthless. A real chain shift can be only really set up for Dahalo, and also without any shifts in place of articulation. The language probably has *d- >> ɗ- simply as a part of an areal innovation of implosion of voiced stops, appearing already also in Aasax (only ɓ- ɗ-) and Ma’a (all of ɓ- ɗ- ɠ- ɠʷ-), as well as in the local Bantu languages, including Swahili. Initial P(S)C *b-, too, gives Dahalo ɓ-. [7] The most likely reason for why Proto-Cushitic *z was not affected would be that it still remained a fricative at this point, if maybe already [ð] (which still exists in Dahalo as the intervocalic allophone of /d̪/), and only fortited to a stop ḏ- later: thus, in particular, independently of the similar change in Rift.
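The relative chronology argued for here can be illustrated with a toy model of ordered sound changes, sketched in Python. It makes simplifying assumptions: only word-initial segments are modelled, the fortition output is written as plain "d" (standing in for Dahalo’s dental ḏ), and the rule tables and function names are hypothetical.

```python
# Each sound change is a mapping applied to a word-initial segment.
IMPLOSION = {"b": "ɓ", "d": "ɗ"}   # areal implosion of voiced stops
FORTITION = {"z": "d"}             # fricative > stop (via [ð] in the text)

def apply_rules(segment, rules):
    """Apply the sound changes in the given chronological order."""
    for rule in rules:
        segment = rule.get(segment, segment)
    return segment

def reflexes(order):
    """Reflexes of initial *b, *d, *z under a given ordering of changes."""
    return {proto: apply_rules(proto, order) for proto in ("b", "d", "z")}

# Implosion first, fortition later: *d and *z stay distinct.
implosion_first = reflexes([IMPLOSION, FORTITION])

# Fortition first: the new stop from *z is caught by implosion too,
# and *d and *z merge.
fortition_first = reflexes([FORTITION, IMPLOSION])

print(implosion_first)
print(fortition_first)
```

Only the first ordering keeps *d- and *z- apart as an implosive versus a plain stop, matching the Dahalo pattern described above. With fortition first, both would end up as ɗ-.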
Besides implications on classification, another corollary is that if the distinction between Ehret’s *ɖ and *d was in the last common ancestor of Rift and Dahalo rather in manner and not place of articulation, there is little reason to expect the existence of Ehret’s other, more poorly evidenced retroflexes *ʈ, *ʈʼ (or *nʈ, which I believe I’ve already demonstrated above to be spurious).
The Ma’a initial correspondences, then, may have been assigned the wrong way around. It seems to me that we should suspect z- reflexes to be archaisms continuing PC *z- and not *d-. Most data could in fact be swapped around with ease, since Ehret has very little evidence for the correspondence M z- ~ D ɗ-; the only decent-looking case is I daʔ- ‘to penetrate’, A daʕ- ‘to thrust into’, D ɗaʕ- ‘to insert’ ~ M zaʔá ‘inside’ (apparently a native Cushitic root further cognate with Beja da- ‘to enter’). Another example, I daqaw- ‘to go’, D ɗakʷ- ‘to be going’ ~ M -zuxu ‘sandal’, is semantically divergent enough to not be immediately reliable. And if we allow for the existence of a few Rift or para-Rift loanwords in Dahalo, these could be chalked up as examples of that. I can even note from Dahalo the preposition ḏa ‘in’, which could be taken as the real native reflex of PC *za(ʕ)- ‘inside’! There also might be other explicit evidence that Ma’a z- originates from PC *z-: the above-mentioned -zéʔu ‘to herd’ looks comparable with Highland East Cushitic *zoh- or *zoʔ- ‘to roam, wander’ (whence Hadiyya doʔ-, Kambaata zoh-, Sidaamo do-). Ehret’s proposed correspondence *d- > M ɗ- ~ D ḏ- can also be dealt with similarly. Most data is weak. At least Q daʔas- ‘to scoop into fingers (e.g. porridge)’ ~ M -ɗaʔá ‘to pick, pluck’ ~ D ḏaʕaað- ‘to catch hold of’ looks like a decent comparison, though we could suggest that this is simply a Rift-type word in Ma’a. The absence of data showing the opposite correspondences, M z- ~ D ḏ- and M ɗ- ~ D ɗ-, will have to remain a weakness; but not a major one, if we will not claim the two to be relatively close relatives within a South Cushitic group.
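To keep track of which Ma’a ~ Dahalo initial pairings are actually attested, a quick tally helps. The sketch below uses only the cognate sets quoted in this post, not Ehret’s full corpus, so it illustrates the bookkeeping rather than settling anything. The set labels and helper names are my own.

```python
import unicodedata
from collections import Counter

# Cognate sets as quoted in this post (I, A, B, Q = Rift; M = Ma'a; D = Dahalo).
SETS = {
    "penetrate/inside": {"I": "daʔ-", "A": "daʕ-", "D": "ɗaʕ-", "M": "zaʔá"},
    "go/sandal": {"I": "daqaw-", "D": "ɗakʷ-", "M": "-zuxu"},
    "scoop/pluck": {"Q": "daʔas-", "M": "-ɗaʔá", "D": "ḏaʕaað-"},
    "herd": {"I": "deʔem-", "A": "deʔem-", "B": "deʔ-", "Q": "deʔ-",
             "M": "-zéʔu"},
}

def initial(form):
    """Return the first segment, ignoring a leading stem hyphen."""
    form = unicodedata.normalize("NFC", form).lstrip("-")
    return form[0]

def pair_counts(sets, lang_a, lang_b):
    """Count initial correspondences in sets that attest both languages."""
    counts = Counter()
    for forms in sets.values():
        if lang_a in forms and lang_b in forms:
            counts[(initial(forms[lang_a]), initial(forms[lang_b]))] += 1
    return counts

ma_da = pair_counts(SETS, "M", "D")
for (m_seg, d_seg), n in sorted(ma_da.items()):
    print(f"M {m_seg}- ~ D {d_seg}-: {n}")
```

On this small sample only M z- ~ D ɗ- and M ɗ- ~ D ḏ- turn up. The ‘herd’ set drops out because it has no Dahalo member, which is itself a reminder of how thin the direct Ma’a ~ Dahalo evidence is.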
Word-medially, Ehret’s *-ɖ- behaves differently. It is now supposed to reflect both of PC *-d- and *-z-; and to yield in Ma’a -ɗ- or -r-, in Dahalo -ɗ- or -ṯṯ- (yes, dental!). In just one environment, at the end of noun stems, *-z- and now also *-s- are supposed to instead give *-d-, yielding Ma’a -r-, Dahalo -ḏ- or -r-. Most of Rift has *-r- for both, with the exception of Burunge, where *-ɖ- > -r-, but *-d- > -d-. Looking up the full data behind any of this would be more work than any survey of word-initial correspondences (though it would not need to be done from scratch: Ehret’s PSC lexicon helpfully lists also non-initial occurrences of each consonant). Given from earlier examples that Ehret is maybe particularly prone to bad etymologizing with Ma’a, and that variable reflexes there could have a complex background involving Cushitic-internal loaning, I will simply ignore it for now to save my efforts. Exclusive Rift ~ Dahalo vocabulary will also not be the most interesting here. However, if Ehret’s argument for the unity of South Cushitic from alleged common development of *d- and *z- does not hold, will word-medial evidence work any better? Again there is no evidence for a common shift in place of articulation at least. Cases of Dahalo -ḏ- [-ð-] from earlier *-z- would not need to have ever gone thru a stop at all; cases of -ɗ- might be again independent development from plain *-d-. The supposed development of *-s- to *-z- is a bit less trivial: maybe not in a general typological light, but any medial voicing of fricatives seems rare across Cushitic. And there’s again indeed evidence decent enough to think that this is a real correspondence, e.g. Somali gus ‘penis’ ~ PWR (K&M) *gu(d)doo ‘testicles’ ~ D giḏḏa ‘semen’; Somali ħaas ‘wife; family’ ~ PWR (K&M) *hadee ‘wife’ (for which Ehret reports instead reflexes with ħ-).
This is complicated, however, by Ehret finding that in Dahalo also verb-stem-final *-s- is voiced to [-ð-] (and *-f- to [-β-]), which changes do not appear in Rift (cf. e.g. the example of Q daʔas- ~ D ḏaʕaað- above). If the conditions of fricative voicing in fact differ, this innovation thus seems to be at most areal in nature, not inherited from common PSC. Also, the cases in noun stems could be even interpreted differently: as really devoicing of PC *z in original word-final position in a few languages like Somali — i.e. not even an innovation at all! Thus, still no particularly clear evidence here to set up a South Cushitic (Rift–Dahalo) group independent of wider East Cushitic or general Cushitic.
Many other issues remain that I’ve not even touched so far (e.g. Ehret’s PSC central vowels *ɨ, *ə which have no individually distinct reflexes in any of the languages). But I hope to have demonstrated that a better understanding of the South Cushitic hypothesis, and the history of its constituent languages, requires 1. actually engaging with Ehret’s PSC and PC reconstructions, preferably to some extent also with the spottily documented Kwʼadza, Aasax and Ma’a; 2. being regardless prepared to throw out plenty of weak etymologies in the process; 3. not taking older literature’s assumptions about the history, reconstruction or existence of “East Cushitic” for granted either. A big wide playing field… but probably not insurmountable.
Postscript. I’ve recalled that I also have around one further work from Ehret on overall Cushitic reconstruction: a 2008 article “The primary branches of Cushitic: Seriating the diagnostic sound change rules”. [8] In this he has come to accept a few conclusions on SC reconstruction similar to mine here: he retracts the reconstruction of the distinct prenasalized stops (though mostly hangs on to the etymologies, claiming that they instead arise from the reduction of some semantically unspecified prefix *(h)in-), as well as the weaker retroflex and palatalized stops (*ʈ, *ʈʼ, *tʲ, *dʲ); former *ɖ is now adjusted to *ɗ, and *tʲʼ indeed to *čʼ. He still insists that what I see here as a just-Dahalo chain shift of *d *z constitutes a defining feature of South Cushitic though, with no word on the Rift merger. Also, if the most obvious junk phoneme issues have already been admitted, it might have been a good idea to move to further issues, such as Ehret’s non-ejective affricates *ts *dz *dɮ. All three are attested phonemes in Dahalo, but the Rift and Ma’a correspondences look more problematic.
His delineation there of East Cushitic looks dubious as well. The proposed defining features include e.g. the rise of a substantial implosive series: *pʼ > *ɓ, *dɮ > *ɗ, *tɬʼ > *ʄ, *ɣ⁽ʷ⁾ > *ɠ⁽ʷ⁾… clearly nonsense, when several putative EC languages / basic subgroups actually have implosive reflexes for at most one or two of these. But this was not a post for reviewing EC reconstruction; that would be a different topic entirely, a bigger one that has had several more people working on it too.
[1] “Some salient features of Southern Cushitic (Common West Rift)”. Unclear to me from the academia.edu page where, or even if, this is published.
[2] Ultimately I suspect Kießling’s position to have been influenced by the occasionally seen overblown claim that “morphology is the only way to classify languages”, some kind of a broken-telephone exaggeration of the fact that morphology can often provide very strong evidence for classification (but is in no way the sole possible type of evidence).
[3] Or even already a uvular *qʼ, which could have later on reverted to a more neutral / less marked *kʼ in languages in closer contact with Omotic or Ethio-Semitic. But this all surely still requires a good areal survey of the whole of Cushitic and environs. I do suspect this is not entirely Proto-Cushitic anyway, but has ultimately spread from Semitic somehow, though this is complicated by the *ḳ > q shift being in modern Semitic limited to Central Semitic (Arabic etc.), absent from both Ethio-Semitic & Modern South Arabian as far as I know. Would it be a plausible or in any way investigable hypothesis that *kʼ ~ *q variation existed earlier in Ethio-Semitic too, and was just later levelled out in favor of the ejective? If contact with EthSem. is hypothesized to have brought about *q > kʼ in modern Bilin (in Agaw), then it is at least conceivable that the same could have happened also in some of the currently–smaller Ethio-Semitic languages; and perhaps not before them having passed uvularization “on to” various more southern Cushitic languages.
— A second, still more speculative idea that I could consider is that perhaps spontaneous kʼ > qʼ is in fact a natural sound change, and is merely blocked in most of the world’s language stocks with ejectives due to the fact that they happen to already possess also distinctive uvulars? This is after all the case almost universally in the Caucasus, quite widely also in western North America, and common even for several more isolated lineages with ejectives, e.g. Aymara, Itelmen, Mayan, Tuu.
[4] Projected also further to PSC and PC, and finally, in his PAA reconstruction, proposed to correspond with Proto-Omotic *pʼ, itself projected from the North Omotic branch. In fact, no cases of a correspondence of North Omotic *pʼ ~ West Rift *b ~ East Rift *p seem to exist: Ehret’s book on PAA lists 18 cognate sets with initial *pʼ-, but only five are attested in both Omotic and Cushitic, of these none in South Cushitic. His *pʼ thus really breaks down into disjoint sets of cases rooted in a South Cushitic etymon vs. cases rooted in a North Omotic etymon (and several reconstructed from still more indirect considerations like an Egyptian p ~ Semitic *b correspondence: six cases, two of them with a South Cushitic *pʼ cognate and one with a North Omotic *pʼ cognate).
[5] I don’t know OTTOMH if these comparisons are supposed to suggest earlier *nt, *ŋk in Bantu or the development of prenasalization within Dahalo itself.
[6] Per Ehret a derivative from the same root as the previous; per K&M instead derived from *tsʼaʕ- ‘to appear’, looking more likely since they document -ʕ- and not -ʔ-. Both seem to agree on reduplication.
[7] Ehret fails to recognize even this change as areal, and instead operates with allophonic implosion of word-initial *b- *d- *ɖ- already in PSC + its later reversion in most Rift languages.
[8] From the collection In Hot Pursuit of Language in Prehistory: Essays in the four fields of anthropology, ed. John D. Bengtson.
Junk phonemes in Proto-South-Cushitic, and some possible fixes
The situation of research on Kwʼadza, Aasax and Ma’a looks much less satisfying. As for East Rift, R. Kießling, one of the more active West Rift researchers, seems to have deemed their closest eastern relatives not worth looking into, with claims appearing in overview works such as “The position of Qwadza and Asax is dubious, since there is not enough data (and probably never will be) to prove that they belong to a different subbranch within Southern Cushitic”. [1] I have not found substantiation for this claim of “not enough data” (how would one prove such a negative anyway?), and that Ehret already has given in his work reasons to think that they form a distinct East Rift branch looks to be simply swept under the carpet. Any idea that a language could not be even classified if it is extinct and its available documentation is short of state-of-the-art modern linguistic methodology is, of course, absurd. Historical linguistics can demonstrate just fine that languages such as Oscan or Umbrian are not merely Indo-European, they’re indeed Italic and further form a separate subgroup of it in contrast to the well-attested Latin. This is no folly or privilege of just Indo-Europeanists either, the same has been done again just fine e.g. with plenty of extinct poorly attested Semitic languages (Ammonite, Edomite, Samalic, Ugaritic, all of Ṣayhadic / Old South Arabian…), further all sorts of extinct poorly attested Algonquian or Tupian or Uto-Aztecan languages, etc etc.
Even single wordlists can yield fair evidence for a detailed classification, if a language is not too far removed from well-attested relatives, and can be thereby linked with the framework of historical phonology, lexicon and, perhaps, morphology that they allow setting up. For a Uralic example, consider Yurats (and I could cite also plenty of cases discussing the detailed dialectological positions of Finnic, Samic, Mordvinic, Mari, Mansi etc. varieties known again only from single wordlists). This is more or less what Ehret does too, even if the work is kind of buried within his corpus of South Cushitic etymologies and his overarching South Cushitic phonological reconstruction. Clearly missing though is any synthesis of the different sources of Kwʼadza and Aasax; the former known from five or six primary collections, the latter from three, many of both unpublished. Work on this might be desirable also for helping with steering people like Kießling away from outright denying the studyability of East Rift. [2] And really not even any more humble general description of either variety seems to have been published at all! Ehret, too, is mostly content to simply assert overall phonological inventories, without commenting much on the primary sources. This might be fair in the case of Kwʼadza, since he has himself conducted fieldwork with its last speakers (and there seems to be an unpublished manuscript from this that’s cited in various later work; even on Wikipedia amusingly enough), but less so with Aasax. Ehret passingly notes e.g. having converted the last field records from 1974 from phonetic into phonological transcription, but one would like to know some details on this too.
It might be less clear if East Rift is really entirely a sister group of West Rift, or just a divergent member, or even, some kind of an areal within it. At least one of the more distinctive common West Rift innovations that Ehret proposes, the shift of the ejectives *kʼ, *kʼʷ to uvulars *q⁽ʼ⁾, *q⁽ʼ⁾ʷ, is alas trivial within Cushitic, appearing in several other groups or languages including Agaw, Somaloid, Konsoid (here as an implosive [ʛ]!); and maybe indicating that some stage of Common Cushitic did not quite have *kʼ, but rather something slightly different such as an also-pharyngealized *kˤʼ. [3] The same holds also for one of the more immediately obvious common East Rift innovations, the merger of pharyngeals *ħ *ʕ with glottals *h *ʔ respectively, which is again attested in a large number of other basic groups, e.g. Agaw, Oromoid, Highland East, and per Ehret indeed also in Ma’a. But also many minor conditional developments are posited for both West Rift and East Rift. These would require actual review rather than blanket dismissal.
Several yet further differences between WR and ER involve weakly attested sound correspondences that Ehret simply reconstructs as additional Proto-Rift segments. Some do involve specific attestable segments in ER (e.g. WR *d ~ ER prenasalized *nd < Ehret’s Proto-Rift dental *d̪, oddly enough); some others, just “crossed” correspondences between more basic segments. This is where we start getting into clearly dubious territory. These segments do not really flesh out any highly sensible phonological subsystem. Some of them could be fitted into “empty slots” seen in the West Rift and East Rift consonant systems, but only with further assumptions about their development: e.g. a correspondence WR *b ~ ER *p is reconstructed by Ehret as Proto-Rift *pʼ, [4] even though there are no other examples of either ejective voicing in WR or ejective devoicing in ER.
Still, before continuing this thread, a few words also on Ma’a. For this variety we already have clear criticism of Ehret’s position of treating it as a third basic branch of South Cushitic: Mous 1996, “Was there ever a Southern Cushitic Language (Pre-)Ma’a?” The most basic argument, which I take no issue with, is that attested Ma’a should not be itself even treated as a Cushitic language, but merely a largely-Cushitic lexical register of a Bantu language otherwise known as Mbugu. This then opens the option that Ma’a might not originate by language shift from some highly distinct Cushitic variety, but rather, from adoption of Cushitic vocabulary from at least two sources, in his view one of them probably a member of West Rift, the other closer resembling Oromo. So far, so good; Mous even admits that language shift from Cushitic still remains on the table (though he thinks this pre-Ma’a to have been more probably East Cushitic), which to me at least would still sound like a compelling reason for why modern Ma’a-as-a-register arose at all. However, other problems remain. As in research on Dahalo, Mous too seems to deem various lexemes to be either “West Rift” or “East Cushitic” mainly by their etymological distribution, without detailed attention to comparative phonology. For a simple example, West Rift /ħ ʕ/ correspond with Ma’a /h ʔ/. While this could arise as a sound substitution by Bantu speakers unfamiliar with pharyngeals (but would they not have been also unfamiliar with the glottal stop at least?), it could also indicate borrowing, instead, from East Rift, where as mentioned, loss of pharyngeals appears natively; at least if we admit that things about the East Rift languages are knowable. Even the geographically closest certainly-Cushitic language to Ma’a is indeed Aasax! (None are in direct contact with it.) 
For that matter, at least a few broad but specifically Ma’a–Aasax isoglosses seem to be proposed in Ehret’s work too: *x > h and *tsʼ > s in at least some cases, again simple enough to be plausibly just sound substitutions by foreign speakers, but plausibly also real common innovations, especially if we no longer require that these should be reflected everywhere in the Cushitic component of Ma’a.
Furthermore, all this is complicated also by the proposals in literature that Dahalo, too, has ended up with lexicon of both South (≈ Rift) and East Cushitic origin. Cushitic vocabulary in Ma’a may not force the existence of a single substratal pre-Ma’a as an independent South Cushitic branch, but it definitely forces the existence of at least a Cushitic language in contact with older Mbugu; if two different Cushitic lexical strata are accepted, then at least two such contact languages. We must then still ask: what was the internal history of this / these Cushitic varieties? Already the geographic separation of Ma’a and Rift has implications for this. Even if we supposed there was simply an originally Rift variety that wandered further towards the coast along the Pangani river, this does not rule out a possibility that the “East Cushitic” component surfacing today was not borrowed independently into Bantu Ma’a, but rather, already into this Rift variety — just as the proposed two-stratum theory of Dahalo would also require the existence of South / East Cushitic language contacts. (Trivial if Dahalo is really South Cushitic, it’s already right next to southern Somali and Oromo(id) varieties, but less so if it’s supposed to be “just another” East Cushitic branch.) The opposite scenario can be considered as well: a lost East Cushitic language in the area, which had at some earlier point in history absorbed also some Rift influence, before itself contributing a chunk of vocabulary into Ma’a.
Even moreover, the general problem that “East Cushitic” remains without a good definition by shared innovations (and might even contain all of the putatively South Cushitic languages in it) keeps on the table also the option that some of the Ma’a vocabulary is not narrower East Cushitic-isms, as much as archaisms, lost from the attested Rift languages. This holds even if Ma’a is indeed analyzed to contain vocabulary from two different Cushitic sources! After all, it is not very economical to assume two completely separate Cushitic spreads south into Tanzania, only for one of them to then disappear completely except for a few loanwords into Ma’a. Instead, it would make geographical sense for the “East Cushitic” component to be still really para-Rift, as per the family tree followed by Ehret. Something of this sort is also readily suggested by theories of East African prehistory which posit the Rift languages and maybe Dahalo (if it’s been in Kenya longer than its immediate “East Cushitic” neighbors) to not represent any kind of an outpost of Cushitic, as much as a remnant of a continuous Cushitic belt that would’ve once, before the newer expansions of South and East Nilotic and Northeast Bantu, stretched all across the areas of modern Kenya and northern Tanzania.
All this leaves a large number of “moving parts” available for any reanalysis of the historical phonology of Ma’a. As we will see, various reanalyses are probably required regardless; but it also seems to me Mous’ idea of mixture of basically modern West Rift and modern East Cushitic is too simplistic and, above all, geographically implausible in an environment where nothing West Rift nor classically East Cushitic has been attested. Prehistory hides many lost languages, and there is nothing a priori implausible in proposing one where evidence so suggests.
As I’ve mentioned recently on Twxttxr, any review of Ehret’s phonological scheme of South Cushitic should probably begin from the reconstruction of Proto-West Rift. There is fairly good overall agreement between Ehret’s reconstruction and the later work of Kießling & Mous (henceforth K&M), to be expected since also the modern languages retain the makeup of the system almost intact. PWR comes out with a reasonably distinctive system containing at least:
K&M add to this the labiovelar nasal *ŋʷ (actually mostly corresponding to Ehret’s *ŋ), vowel length, and, appearing mainly in clear loanwords, postalveolar affricates *č *ǰ. Ehret adds ejective *čʼ, supposedly distinguished from *tsʼ only in older Alagwa records (but see below). As it turns out, comparison already with Ehret’s own Proto-Cushitic reconstruction shows that most of these segments can be easily equated with identical predecessors also there; only *nʲ and K&M’s *č *ǰ seem to be entirely novel. A few call for other notes, but give no reason to doubt their PWR or Proto-Rift existence:
After this things start getting worse. Already the bare numbers suggest bloat in Ehret’s deeper phonological reconstructions: his system of 29 consonants in Proto-West Rift expands to 33 in Proto-East Rift, 36 in Proto-Rift, and finally balloons to 49 in Proto-South Cushitic. A priori this might not be a completely terrible amount, when compared with an impressive 60+ in Dahalo, where many of these find unique reflexes… though then just 30-ish in any other South Cushitic variety. More alarming is that even Ehret himself finds no Proto-Cushitic source for most of these additional segments. This could be all still OK, maybe open to various kinds of reanalyses, if these reconstructions were based on good robust data. Alas, they are not. Most are based on few etymologies, often with semantic stretches or other irregularities. Often also with major distributional gaps, such that an asserted overall correspondence pattern really comprises e.g. individual Rift ~ Ma’a and Rift ~ Dahalo correspondences lumped together (or perhaps even weaker correspondences like Kwʼadza ~ Dahalo, West Rift ~ Ma’a, etc.), with no or almost no evidence of the implied Ma’a ~ Dahalo correspondences even existing. This strategy of “farming” or “lumping” rarer proto-segments from disjoint correspondences probably needs a name for it, since I keep seeing it in many long-range or otherwise dubious reconstruction proposals; e.g. it’s all over the place in versions of Nostratic. (Cf. also footnote 4.)
The most distinctively poor set of Ehret’s extra segments are prenasalized stops and affricates in word-initial position. (Word-medial cases do not look distinguishable from plain old nasal + stop clusters, well-attested thruout Cushitic. I’ll also skip over *nɬ, which does not yield anything prenasalized and which Ehret in his later Proto-Cushitic work readjusts to *dɮ.) These are a regular but small part of the phonology of Dahalo, which should be probably assumed to mainly originate as intrusive vocabulary, maybe some also from irregular nasalization of former plain stops. And almost all of the South Cushitic etymologies Ehret finds for them are weaksauce. To roll out the data as he cites it (abbreviations: I = Iraqw, B = Burunge, A = Alagwa, Q = Kwʼadza, S = Aasax, M = Ma’a, D = Dahalo; transcription should be mostly obvious but I maintain Ehret’s ṯ ḏ for dental stops in D):
A few of these maybe might still be at least related areal vocabulary, but it should be clear that the low number of comparanda, both overall and especially in the otherwise well-documented West Rift, reliance on semantically off-field comparisons, and a common need for additional morphological or phonological assumptions, does not add up to a corpus supporting the existence of an already Proto-South Cushitic consonant series. E.g. the example of ‘bull’ could instead suggest a borrowing that is ultimately of Cushitic origin but was passed through some other intermediates before getting to Dahalo. Ehret also has one other similar case just in Dahalo and so not formally reconstructible for PSC: D ŋgaasið- ~ Somali kas- ‘to explain’ (besides prenasalization, vowel length also not matching; also s ~ s cannot be native).
There should be also a general suspicion of anything with prenasalized stops more likely coming from Bantu. Looking over Mauro Tosco’s 1991 A Grammatical Sketch of Dahalo, however (which includes a glossary with some loanword notes), fairly few cases of that have been identified: from Swahili there’s mbona ‘why’, nḏuugo ‘kinsman’, ŋgúúfu ‘strong’; from Northern Swahili nḏani ‘inside’, nḏigad- ‘to bury’, nḏoo ‘come!’ and tentatively nḏupa ‘bottle’, ŋgúúko ‘cock’ (~ NSw thupa, khuku). [5] Nothing from other nearby Bantu languages like Pokomo, but maybe that simply has not been (was not?) studied yet. But note also Dahalo’s nasalized dental click nǀ, which might also count as “prenasalized” phonologically, but almost surely can’t originate as is from anything Bantu.
It would be possible to go over similar problems in base etymological data also for some of Ehret’s other more poorly attested segments. As my tally of prenasalized consonants already shows, he in particular adds a few additional place of articulation series for PSC, which aren’t really reflected as such anywhere: retroflexes and palatalized dentals, most of them scantily attested and probably rejectable entirely. He proposes PC origin for two of them though: the voiced retroflex *ɖ and the palatalized ejective *tʲʼ. These show different issues, maybe worth discussing in more detail.
If taken at face value, Ehret’s PSC *tʲʼ probably should be first of all reconstructed instead as a postalveolar affricate *čʼ, since that is both his alleged PC source and its Proto-Rift reflex. Also the asserted Ma’a reflex is č, but most data for that are again poor etymologies, e.g. M -čá ‘to be crafty’ is compared with Q salimuko ‘coward’, D tʼar- ‘to practice witchcraft’; M hečéri ‘yet, not yet’ is first analyzed to have a prefix he- continuing a fossilized demonstrative, then compared with Q sel- ‘to straighten’, to gain a PSC root supposedly having meant ‘to make ready, prepare, put in order’. A few comparisons between Rift and Dahalo look better, e.g. *tʲʼatʼ- ‘soil, earth’ > B čʼečʼeʔiya, A tsʼatsʼaʔi ‘dust’ (K&M: PWR *tsʼatsʼaiʔya) ~ Q saʔamuko ‘earth’ ~ D tʼattʼe ‘mud’. But as the Dahalo reflex is just the alveolar ejective tʼ, it also turns out that most of the data would fit simply as cases of Ehret’s plain *tʼ! He relies most often on Kwʼadza data in making the distinction, where supposedly *tʲʼ > s (as in all three examples above); but these cases generally leave again room for doubts about their validity, e.g. irregular *tʼ > ʔ in ‘earth’ above. A few also have Ehret’s PWR *čʼ on the grounds of čʼ in older Alagwa data, but these are firstly few, and secondly, they mostly occur in a palatal environment (e.g. čʼiraʔa ‘bird’; K&M’s PWR *tsʼiraʔa) where we might suspect this was actually just a dialectalism or a lost allophonic feature. Hard to tell though from the current presentation. Ehret still gives also words like PSC *tʼah- ‘to be pregnant’ >> A tsʼihay ‘pregnancy’, but does not state if this comes from older or newer Alagwa data. Either way, Ehret’s supposed distinction *tʼ | *tʲʼ looks like it has been really “farmed” together from disparate sources, most prominently
that actually do not correlate well with each other. I’ve already tentatively suggested that the supposed Ma’a cognates with č are just wrong, and that older Alagwa čʼ might be secondary. For Kwʼadza I’m not sure if either of these approaches works entirely. Not all data with s looks easily dismissable, e.g. PWR (K&M) *tsʼaaʔas- ‘to shine, shed light on’ (probably with the common Cushitic causative *-as- suffix) ~ Q saʔ- ‘to burn’; PWR *tsʼitsʼaʕiya ~ Q sasaʔamo ‘star’. [6] A third option though might be internal loaning: in Aasax, the regular word-initial reflex of Proto-Rift *tsʼ is a sibilant š, and this would probably be reflected as s (Q has itself no š) if some words had been borrowed into Kwʼadza from an Aasax-like variety. Ehret’s later proposed distinct Proto-Cushitic sources also do not look very strongly established, but that would be more of a tangent than I want to get into in detail; though again, the same general types of problems recur as in Ehret’s PSC reconstruction.
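The “farming” problem can be checked mechanically: for any proposed proto-segment, count how many supporting sets actually attest each pair of branches. Here is a rough sketch using only the three *tʲʼ comparisons quoted above, so the sample is tiny and partly made of sets I consider poor anyway. Grouping I, B and A as one West Rift unit and treating Q, M and D as units of their own is my simplification for this illustration.

```python
from collections import Counter
from itertools import combinations

# Languages attested in each *tʲʼ comparison quoted above.
TJ_SETS = {
    "crafty/witchcraft": ["M", "Q", "D"],
    "yet/straighten": ["M", "Q"],
    "soil/earth": ["B", "A", "Q", "D"],
}

# Simplifying assumption: West Rift languages count as a single unit.
UNIT = {"I": "WR", "B": "WR", "A": "WR", "Q": "Q", "M": "M", "D": "D"}

def pair_coverage(sets):
    """Count, per pair of units, how many sets attest both members."""
    coverage = Counter()
    for langs in sets.values():
        units = sorted({UNIT[lang] for lang in langs})
        for a, b in combinations(units, 2):
            coverage[frozenset((a, b))] += 1
    return coverage

def unattested_pairs(coverage, units=("WR", "Q", "M", "D")):
    """List unit pairs that no set supports directly."""
    return [(a, b) for a, b in combinations(units, 2)
            if coverage[frozenset((a, b))] == 0]

coverage = pair_coverage(TJ_SETS)
print(unattested_pairs(coverage))
```

On these three sets the West Rift ~ Ma’a pairing has no direct support at all, and Ma’a ~ Dahalo rests on a single comparison. Run over Ehret’s full lexicon, a table like this would show at a glance which proto-segments rest on overlapping evidence and which are stitched together from disjoint correspondences.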
As the last stretch for this blog post, Ehret’s PSC *ɖ proves to be relevant in several ways. Word-initially, this is proposed to be distinct from plain *d in that Dahalo would have an implosive ɗ for the former, a dental plosive ḏ for the latter; Ma’a would have mostly ɗ- from both, but sometimes also some *ɖ- > z-. A few do look plausible (e.g. I A deʔem-, B Q deʔ-, M -zéʔu ‘to herd’). All of Rift, however, shows just *d- (retained in I B A Q, > ɗ in S). Furthermore, comparison with Proto-Cushitic supposedly shows *ɖ- < *d- versus *d- < *z-. Both correspondences indeed have some decent etymologies for them, e.g. PC *dar- ‘to increase, add to’ > D ɗar- (Ehret only lists other cognates from Agaw and Somali; I’d consider adding also Highland East *darš- ‘to swell’); PC *zab- ‘to grasp’ (in Beja, Somaloid) > D ḏáβa ‘hand’. (Both of these also well represented in Rift languages.)
In his Proto-Cushitic book, Ehret claims that this chain shift of *d and *z would be evidence for the unity of South Cushitic. The fortition *z > *d is distinctive at first sight, but this sound change is again widespread in Cushitic, e.g. Beja, Saho–Afar, Oromoid, Konsoid; it may have started already at an early date, perhaps diffusing to Pre-Proto-Rift already from some neighboring East Cushitic dialect area. Thus, if Rift does not even show the distinction of Ehret’s *ɖ- and *d-, to me this supposed isogloss seems worthless. A real chain shift can be only really set up for Dahalo, and also without any shifts in place of articulation. The language probably has *d- >> ɗ- simply as a part of an areal innovation of implosion of voiced stops, appearing also in Aasax (only ɓ- ɗ-) and Ma’a (all of ɓ- ɗ- ɠ- ɠʷ-), as well as in the local Bantu languages, including Swahili. Initial P(S)C *b-, too, gives Dahalo ɓ-. [7] The most likely reason for why Proto-Cushitic *z was not affected would be that it still remained a fricative at this point, if maybe already [ð] (which still exists in Dahalo as the intervocalic allophone of /d̪/), and only fortited to a stop ḏ- later: thus, in particular, independently of the similar change in Rift.
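The relative chronology argued for above can be illustrated with a toy Python sketch. This is only a minimal model under loud assumptions: each etymon is reduced to its word-initial consonant, the two rule tables `IMPLOSION` and `FORTITION` are my own simplifications, and letting the dental stop undergo implosion in the counterfactual ordering is a hypothetical choice, not something claimed by Ehret or attested anywhere.

```python
# Toy model: each etymon is reduced to its word-initial consonant.
# Symbols follow the post: "ɗ" is the implosive, "ḏ" the dental stop.

# Areal implosion of word-initial voiced stops. Including the dental stop
# here is an assumption made only for the counterfactual ordering below.
IMPLOSION = {"b": "ɓ", "d": "ɗ", "ḏ": "ɗ"}

# Fortition of the fricative *z to a dental stop.
FORTITION = {"z": "ḏ"}


def derive(initial, ordered_rules):
    """Apply each rule table once, in the given chronological order."""
    for rules in ordered_rules:
        initial = rules.get(initial, initial)
    return initial


# Ordering suggested in the text: implosion first, fortition of *z later.
IMPLOSION_FIRST = [IMPLOSION, FORTITION]
# Counterfactual: *z has already become a stop when implosion spreads.
FORTITION_FIRST = [FORTITION, IMPLOSION]

for label, order in [("implosion first", IMPLOSION_FIRST),
                     ("fortition first", FORTITION_FIRST)]:
    reflexes = {proto: derive(proto, order) for proto in ("b", "d", "z")}
    print(label, reflexes)
```

In this toy, the first ordering keeps the reflexes of *d- and *z- apart, while the second collapses them into a single implosive. That is the sense in which the Dahalo contrast suggests *z was still a fricative when implosion spread.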
Besides implications for classification, another corollary is that if the distinction between Ehret’s *ɖ and *d was in the last common ancestor of Rift and Dahalo rather in manner and not place of articulation, there is little reason to expect the existence of Ehret’s other, more poorly evidenced retroflexes *ʈ, *ʈʼ (or *nʈ, which I believe I’ve already demonstrated above to be spurious).
The Ma’a initial correspondences, then, may have been assigned the wrong way around. It seems to me that we should suspect z- reflexes to be archaisms continuing PC *z- and not *d-. Most data could be in fact swapped around with ease, since Ehret has very little evidence for the correspondence M z- ~ D ɗ-; the only decent-looking case is I daʔ- ‘to penetrate’, A daʕ- ‘to thrust into’, D ɗaʕ- ‘to insert’ ~ M zaʔá ‘inside’ (apparently a native Cushitic root further cognate with Beja da- ‘to enter’). Another example, I daqaw- ‘to go’, D ɗakʷ- ‘to be going’ ~ M -zuxu ‘sandal’, is semantically divergent enough to not be immediately reliable. And if we allow for the existence of a few Rift or para-Rift loanwords in Dahalo, these could be chalked up as examples of that. I can even note from Dahalo the preposition ḏa ‘in’, which could be taken as the real native reflex of PC *za(ʕ)- ‘inside’! There also might be other explicit evidence that Ma’a z- originates from PC *z-: the above-mentioned -zéʔu ‘to herd’ looks comparable with Highland East Cushitic *zoh- or *zoʔ- ‘to roam, wander’ (whence Hadiyya doʔ-, Kambaata zoh-, Sidaamo do-). Ehret’s proposed correspondence *d- > M ɗ- ~ D ḏ- can be also dealt with similarly. Most data is weak; at least Q daʔas- ‘to scoop into fingers (e.g. porridge)’ ~ M -ɗaʔá ‘to pick, pluck’ ~ D ḏaʕaað- ‘to catch hold of’ looks like a decent comparison, but we could suggest that this is simply a Rift-type word in Ma’a. The absence of data showing the opposite correspondences, M z- ~ D ḏ- and M D ɗ-, will have to remain a weakness; but not a major one, if we do not claim the two to be relatively close relatives within a South Cushitic group.
Word-medially, Ehret’s *-ɖ- behaves differently. It is now supposed to reflect both of PC *-d- and *-z-; and to yield in Ma’a -ɗ- or -r-, in Dahalo -ɗ- or -ṯṯ- (yes, dental!). In just one environment, at the end of noun stems, *-z- and now also *-s- are supposed to instead give *-d-, yielding Ma’a -r-, Dahalo -ḏ- or -r-. Most of Rift has *-r- for both, with the exception of Burunge, where *-ɖ- > -r-, but *-d- > -d-. Looking up the full data behind any of this would be more work than any survey of word-initial correspondences (though would not need to be done from scratch: Ehret’s PSC lexicon helpfully lists also non-initial occurrences of each consonant). Given from earlier examples that Ehret is maybe particularly prone to bad etymologizing with Ma’a, and that variable reflexes there could have a complex background involving Cushitic-internal loaning, I will simply ignore it for now to save my efforts. Exclusive Rift ~ Dahalo vocabulary will also not be the most interesting here. However, if Ehret’s argument for the unity of South Cushitic from alleged common development of *d- and *z- does not hold, will word-medial evidence work any better? Again there is no evidence for a common shift in place of articulation at least. Cases of Dahalo -ḏ- [-ð-] from earlier *-z- would not need to have ever gone thru a stop at all; cases of -ɗ- might be again independent development from plain *-d-. The supposed development of *-s- to *-z- is a bit less trivial: maybe not in a general typological light, but any medial voicing of fricatives seems rare across Cushitic. And there’s again indeed evidence decent enough to think that this is a real correspondence, e.g. Somali gus ‘penis’ ~ PWR (K&M) *gu(d)doo ‘testicles’ ~ D giḏḏa ‘semen’; Somali ħaas ‘wife; family’ ~ PWR (K&M) *hadee ‘wife’ (for which Ehret reports instead reflexes with ħ-).
This is complicated, however, by Ehret finding that in Dahalo also verb-stem-final *-s- is voiced to [-ð-] (and *-f- to [-β-]), which changes do not appear in Rift (cf. e.g. the example of Q daʔas- ~ D ḏaʕaað- above). If the conditions of fricative voicing in fact differ, this innovation thus seems to be at most areal in nature, not inherited from common PSC. Also, the cases in noun stems could be even interpreted differently: as really devoicing of PC *z in original word-final position in a few languages like Somali — i.e. not even an innovation at all! Thus, still no particularly clear evidence here to set up a South Cushitic (Rift–Dahalo) group independent of wider East Cushitic or general Cushitic.
Many other issues remain that I’ve not even touched so far (e.g. Ehret’s PSC central vowels *ɨ, *ə which have no individually distinct reflexes in any of the languages). But I hope to have demonstrated that a better understanding of the South Cushitic hypothesis, and the history of its constituent languages, requires 1. actually engaging with Ehret’s PSC and PC reconstructions, preferably to some extent also with the spottily documented Kwʼadza, Aasax and Ma’a; 2. being regardless prepared to throw out plenty of weak etymologies in the process; 3. not taking older literature’s assumptions about the history, reconstruction or existence of “East Cushitic” for granted either. A big wide playing field… but probably not insurmountable.
Postscript. I’ve recalled I have around also one further work from Ehret on overall Cushitic reconstruction: a 2008 article “The primary branches of Cushitic: Seriating the diagnostic sound change rules”. [8] He has in this come to accept a few similar conclusions on SC reconstruction as I do here — he retracts the reconstruction of the distinct prenasalized stops (though mostly hangs on to the etymologies, claiming that they instead arise from the reduction of some semantically unspecified prefix *(h)in-), as well as the weaker retroflex and palatalized stops (*ʈ, *ʈʼ, *tʲ, *dʲ); former *ɖ adjusted now to *ɗ and *tʲʼ indeed to *čʼ. He still insists on what I see here as a just-Dahalo chainshift of *d *z to constitute a defining feature of South Cushitic though, with no word on the Rift merger. Also, if the most obvious junk phoneme issues have already been admitted, it might have been a good idea to move to further issues, such as Ehret’s non-ejective affricates *ts *dz *dɮ. All three attested phonemes in Dahalo, but Rift and Ma’a correspondences look more problematic.
His delineation there of East Cushitic looks dubious as well. The proposed defining features include e.g. the rise of a substantial implosive series: *pʼ > *ɓ, *dɮ > *ɗ, *tɬʼ > *ʄ, *ɣ⁽ʷ⁾ > *ɠ⁽ʷ⁾… clearly nonsense, when several putative EC languages / basic subgroups actually have implosive reflexes for at most one or two of these. But this was not a post for reviewing EC reconstruction; that would be a different topic entirely, a bigger one that has had several more people working on it too.
[1] “Some salient features of Southern Cushitic (Common West Rift)”. Unclear to me from the academia.edu page where, or even if, this is published.
[2] Ultimately I suspect Kießling’s position to have been influenced by the occasionally seen overblown claim that “morphology is the only way to classify languages”, some kind of a broken-telephone exaggeration of the fact that morphology can often provide very strong evidence for classification (but is in no way the sole possible type of evidence).
[3] Or even already a uvular *qʼ, which could have later on reverted to a more neutral / less marked *kʼ in languages in closer contact with Omotic or Ethio-Semitic. But this all surely still requires a good areal survey of the whole of Cushitic and environs. I do suspect this is not entirely Proto-Cushitic anyway, but has ultimately spread from Semitic somehow, though this is complicated by the *ḳ > q shift being in modern Semitic limited to Central Semitic (Arabic etc.), absent from both Ethio-Semitic & Modern South Arabian as far as I know. Would it be a plausible or in any way investigable hypothesis that *kʼ ~ *q variation existed earlier in Ethio-Semitic too, and was just later levelled out in favor of the ejective? If contact with EthSem. is hypothesized to have brought about *q > kʼ in modern Bilin (in Agaw), then it is at least conceivable that the same could have happened also in some of the currently smaller Ethio-Semitic languages; and perhaps not before them having passed uvularization “on to” various more southern Cushitic languages.
— A second, still more speculative idea that I could consider is that perhaps spontaneous kʼ > qʼ is in fact a natural sound change, and is merely blocked in most of the world’s language stocks with ejectives due to the fact that they happen to already possess also distinctive uvulars? This is after all the case almost universally in the Caucasus, quite widely also in western North America, and common even for several more isolated lineages with ejectives, e.g. Aymara, Itelmen, Mayan, Tuu.
[4] Projected also further to PSC and PC, and finally, in his PAA reconstruction, proposed to correspond with Proto-Omotic *pʼ, itself projected from the North Omotic branch. In fact, no actual cases of a correspondence of North Omotic *pʼ ~ West Rift *b ~ East Rift *p seem to exist: Ehret’s book on PAA lists 18 cognate sets with initial *pʼ-, but only five are attested in both Omotic and Cushitic, of these none in South Cushitic. His *pʼ thus really breaks down into disjoint sets of cases rooted in a South Cushitic etymon vs. cases rooted in a North Omotic etymon (and several reconstructed from still more indirect considerations like an Egyptian p ~ Semitic *b correspondence: six cases, two of them with a South Cushitic *pʼ cognate and one with a North Omotic *pʼ cognate).
[5] I don’t know OTTOMH if these comparisons are supposed to suggest earlier *nt, *ŋk in Bantu or the development of prenasalization within Dahalo itself.
[6] Per Ehret a derivative from the same root as the previous; per K&M instead derived from *tsʼaʕ- ‘to appear’, looking more likely since they document -ʕ- and not -ʔ-. Both seem to agree on reduplication.
[7] Ehret fails to recognize even this change as areal, and instead operates with allophonic implosion of word-initial *b- *d- *ɖ- already in PSC + its later reversion in most Rift languages.
[8] From the collection In Hot Pursuit of Language in Prehistory: Essays in the four fields of anthropology, ed. John D. Bengtson.