Chinese can be traced back to a hypothetical proto-language. The first written records appeared over 3,000 years ago during the Shang dynasty. As the language evolved over this period, the various local varieties became mutually unintelligible. In reaction, central governments have repeatedly sought to promulgate a unified standard.
Most linguists classify all varieties of Chinese as part of the Sino-Tibetan language family, together with Tibetan and many other languages spoken in the Himalayas and the Southeast Asian Massif. Although the relationship was first proposed in the early 19th century and is now broadly accepted, reconstruction of Sino-Tibetan is much less developed than that of families such as Austroasiatic. Difficulties have included the great diversity of the languages, the lack of inflection in many of them, and the effects of language contact. In addition, many of the smaller languages are spoken in mountainous areas that are difficult to reach and are often also sensitive border zones. Without a secure reconstruction of proto-Sino-Tibetan, the higher-level structure of the family remains unclear. A top-level branching into Chinese and Tibeto-Burman languages is often assumed, but has not been convincingly demonstrated.
Old and Middle Chinese
The earliest examples of Chinese are divinatory inscriptions on oracle bones from around 1250 BCE in the late Shang dynasty. Old Chinese was the language of the Western Zhou period (1046–771 BCE), recorded in inscriptions on bronze artifacts, the Classic of Poetry, and portions of the Book of Documents and I Ching. Scholars have attempted to reconstruct the phonology of Old Chinese by comparing later varieties of Chinese with the rhyming practice of the Classic of Poetry and the phonetic elements found in the majority of Chinese characters. Although many of the finer details remain unclear, most scholars agree that Old Chinese differed from Middle Chinese in lacking retroflex and palatal obstruents but having initial consonant clusters of some sort, and in having voiceless nasals and liquids. Most recent reconstructions also describe an atonal language with consonant clusters at the end of the syllable, developing into tone distinctions in Middle Chinese. Several derivational affixes have also been identified, but the language lacked inflection, and indicated grammatical relationships using word order and grammatical particles.
Middle Chinese was the language used during the Northern and Southern dynasties and the Sui, Tang, and Song dynasties (6th through 10th centuries CE). It can be divided into an early period, reflected by the Qieyun rime book (601 CE), and a late period in the 10th century, reflected by rhyme tables such as the Yunjing, constructed by ancient Chinese philologists as a guide to the Qieyun system. These works define phonological categories, but with little hint of what sounds they represent. Linguists have identified these sounds by comparing the categories with pronunciations in modern varieties of Chinese, borrowed Chinese words in Japanese, Vietnamese, and Korean, and transcription evidence. The resulting system is very complex, with a large number of consonants and vowels, but they were probably not all distinguished in any single dialect. Most linguists now believe it represents a diasystem encompassing 6th-century northern and southern standards for reading the classics.
Rise of northern dialects
After the fall of the Northern Song dynasty, and during the reign of the Jin (Jurchen) and Yuan (Mongol) dynasties in northern China, a common speech (now called Old Mandarin) developed based on the dialects of the North China Plain around the capital. The Zhongyuan Yinyun (1324) was a dictionary that codified the rhyming conventions of the new sanqu verse form in this language. Together with the slightly later Menggu Ziyun, this dictionary describes a language with many of the features characteristic of modern Mandarin dialects.
Up to the early 20th century, most of the people in China spoke only their local variety. As a practical measure, officials of the Ming and Qing dynasties carried out the administration of the empire using a common language based on Mandarin varieties, known as Guānhuà (官话/官話, literally "language of officials"). For most of this period, this language was a koiné based on dialects spoken in the Nanjing area, though not identical to any single dialect. By the middle of the 19th century, the Beijing dialect had become dominant and was essential for any business with the imperial court.
In the 1930s, a standard national language, Guóyǔ (国语/國語 "national language"), was adopted. After much dispute between proponents of northern and southern dialects and an abortive attempt at an artificial pronunciation, the National Language Unification Commission finally settled on the Beijing dialect in 1932. The People's Republic founded in 1949 retained this standard, calling it pǔtōnghuà (普通话/普通話 "common speech"). The national language is now used in education, the media, and formal situations in both Mainland China and Taiwan. In Hong Kong and Macau, because of their colonial and linguistic history, the language of education, the media, formal speech, and everyday life remains the local Cantonese, although the standard language is now very influential and taught in schools.