A substratum (plural: substrata) or substrate is a language that influences an intrusive language that supplants it. The term is also used of substrate interference; i.e. the influence the substratum language exerts on the replacing language. According to some classifications, this is one of three main types of linguistic interference: substratum interference differs from both adstratum, which involves no language replacement but rather mutual borrowing between languages of equal "value", and superstratum, which refers to the influence a socially dominating language has on another, receding language that might eventually be relegated to the status of a substratum language.
In a typical case of substrate interference, Language A occupies a given territory and Language B arrives in the same territory (brought, for example, by population migrations). Language B then begins to supplant Language A: the speakers of Language A abandon their own language in favor of the other language, generally because they believe that it will help them achieve certain goals in government, the workplace, and social settings. During the language shift, however, the receding Language A still influences Language B (for example, through the transfer of loanwords, place names, or grammatical patterns from A to B).
In most cases, the ability to identify substrate influence in a language requires knowledge of the structure of the substrate language. This can be acquired in numerous ways:
- The substrate language, or some later descendant of it, still survives in a part of its former range;
- Written records of the substrate language may exist to varying degrees;
- The substrate language itself may be unknown entirely, but it may have surviving close relatives that can be used as a basis for comparison.
One of the first-identified cases of substrate influence involves a substrate language of the second type: Gaulish, the language of the Gauls, an ancient Celtic people. The Gauls lived in what is now French-speaking territory before the arrival of the Romans with the invasion of Julius Caesar's army. Given the cultural, economic and political advantages that came with being a Latin speaker, the Gauls eventually abandoned their language in favor of the language brought to them by the Romans, which evolved in this region until it eventually took the form of the French language known today. Gaulish speech disappeared in the late Roman era, but remnants of its vocabulary survive in some French words (approximately 200) as well as in place names of Gaulish origin. It is also posited that some structural changes in French were shaped at least in part by Gaulish influence. These include diachronic sound changes and sandhi phenomena due to the retention of Gaulish phonetic patterns after the adoption of Latin, as well as calques such as aveugle ("blind", literally "without eyes", from Latin ab oculis, itself a calque on the Gaulish word exsops, which has the same semantic construction as the modern French word). Other possible Celtic calques include oui, the word for "yes". Syntactic and morphological effects are also posited.
Another example of substrate influence is that of the now-extinct North Germanic Norn language on the Scots dialects of the Shetland and Orkney islands. In the Arab Middle East and North Africa, colloquial Arabic dialects, most especially Levantine, Egyptian, and Maghreb dialects, often exhibit significant substrata from other regional Semitic (especially Aramaic), Iranian, and Berber languages. Yemeni Arabic has Modern South Arabian, Old South Arabian and Himyaritic substrata.
Typically, creole languages have multiple substrata, with the actual influence of each such language being indeterminate.
In the absence of all three lines of evidence mentioned above, linguistic substrata may be difficult to detect. Substantial indirect evidence is needed to infer the former existence of a substrate. The nonexistence of a substrate is difficult to show, and to avoid digressing into speculation, the burden of proof must lie with the scholar claiming the influence of a substrate. The principle of uniformitarianism and results from the study of human genetics suggest that many languages formerly existed that have since been replaced by expansive language families, such as Indo-European, Afro-Asiatic, Uralic or Bantu. However, it is not a given that such expansive languages would have acquired substratum influence from the languages they replaced.
Several examples of this type of substratum have nevertheless been claimed. For example, the earliest form of the Germanic languages may have been influenced by a non-Indo-European language, purportedly the source of about one quarter of the most ancient Germanic vocabulary. There are similar arguments for a Sanskrit substrate, a Greek one, and a substrate underlying the Sami languages. Relatively clear examples are the Chude, Merya, Muromian and Meshcheran languages: while unattested, their existence has been noted in medieval chronicles, and one or more of them have left substantial influence in the Northern Russian dialects. By contrast, more contentious cases are the Vasconic substratum theory and Old European hydronymy, which hypothesize large families of substrate languages across western Europe. Some smaller-scale unattested substrates that remain under debate involve alleged extinct branches of the Indo-European family, such as a "Nordwestblock" substrate in the Germanic languages and a "Temematic" substrate in Balto-Slavic (proposed by Georg Holzer).
When a substrate language or its close relatives cannot be directly studied, their investigation is rooted in the study of etymology and linguistic typology. The study of unattested substrata often begins from the study of substrate words, which lack a clear etymology. Such words can in principle still be native inheritances, lost everywhere else in the language family; but they might in principle also originate from a substrate. The sound structure of words of unknown origin (their phonology and morphology) can often provide hints in either direction. So can their meaning: words referring to the natural landscape, in particular indigenous fauna and flora, have often been found especially likely to derive from substrate languages. None of these conditions, however, is sufficient by itself to claim any one word as originating from an unknown substratum. Occasionally, words that have been proposed to be of substrate origin turn out to have cognates in more distantly related languages after all, and are therefore likely native: an example is Proto-Indo-European *mori 'sea', found widely in the northern and western Indo-European languages, but in more eastern Indo-European languages only in Ossetic.