Rethinking How We Learn Mandarin Tones

Mandarin tones are a common source of confusion. On more than one occasion I have heard fellow learners complain that native speakers do not use tones properly when speaking the language at a natural pace. This may sound like a bizarre assertion and, taken at face value, it is wrong. However, I believe this misperception is motivated by an underlying truth which all Chinese learners encounter sooner or later: Chinese spoken naturally sounds very different from the canned textbook words and phrases most beginner learners are exposed to.

As soon as a learner moves from hearing perfectly enunciated, slowed phrases to watching Chinese movies or listening to native podcasts, they will notice this difference. At the centre of this confusion is a misunderstanding about the function of the Chinese five-tone system (four tones + one neutral tone) which we all study. The five tones we learn in pinyin (flat[1], rising[2], falling-rising[3], falling[4] and neutral[5]) are not intended to serve as a perfect or complete description of the way Chinese tones work. Instead they are a very useful and general guide.

A system to help learners acquire tones is absolutely necessary, especially when starting out. Native speakers of non-tonal languages such as English require training to focus their attention on tones. We must become accustomed to the fact that there are four main tones and that the meaning of individual syllables varies depending on which one is deployed. In this regard pinyin provides an essential service for all beginner learners. But it should come with a hazard warning: it doesn’t tell the full story. Entire PhD theses have been written on the complex ways in which tones work, such as how tones vary depending on where they appear in a sentence, how one character’s tone can be influenced by the preceding character and how some tones overlap with others in different contexts. Pinyin cannot capture all these complexities, nor is it intended to.

The misperception that pinyin can account for everything sometimes leads learners to falsely assume that native speakers are not pronouncing tones correctly or that there is something wrong with their own ear. Typically learners will wonder why a word like 什么 is often pronounced without the clear rising tone on the first character that their textbooks have taught them to expect. One reason for this is that, as with many words spoken fast in the middle of a sentence, the vowel is swallowed, and with it goes the opportunity to pronounce the tone.

Another source of confusion is the overlapping nature of different tone pairs, for example 4,4 and 4,5. When two fourth tones appear together it is common for the second syllable to be under-stressed, leading it to sound much like a neutral tone. As a result, in natural speech the tonal differences between these two combinations can be minuscule, and a learner with perfectly adequate hearing may find it impossible to discern whether a tone pair is 4,4 (e.g. 刻意) or 4,5 (e.g. 谢谢). This probably tells us more about the bluntness of the five-tone system as a descriptive tool than about a learner’s comprehension skills or ability to pronounce tones well. But students who are not aware of this may lose confidence when they leave the textbook and enter the real world of natural speech.

Since pinyin doesn’t tell the full story, an additional tool is required to properly acquire tones: listening. A reliance on pinyin memorisation without sufficient listening leads to poor fluency and speakers sounding like robots, pronouncing each tone with eerie regularity whilst failing to reproduce any of the complex nuances of normal, native speech. For this reason a solid understanding of pinyin must be combined with huge amounts of focussed listening to native content as the primary study tool for tones, enabling the brain to gradually develop a more refined intuition of how tones actually sound in everyday speech. This is often taken for granted by students who are studying in China and have the advantage of being surrounded by the language. But it is sometimes neglected by those learning outside China who need to create an immersive environment for themselves.

To be clear, all beginner learners should develop a solid grasp of how pinyin and the five-tone system work right from the start so that they can train themselves to listen carefully for tones. But many teachers and online influencers place undue pressure on beginners to memorise every tone for each new character and accurately recall them when speaking, regardless of how little listening material they have consumed. The reasoning behind this philosophy typically invokes the “bad habit” hypothesis, which holds that if learners are permitted to get away with mistakes early on, these errors will crystallise and become hard to correct later.

Since most classrooms demand that beginners practise speaking from day one (a questionable assumption) it follows that failure to immediately memorise and accurately reproduce tones could lead to irreversible damage. Although I cannot disprove this hypothesis, I note that it is usually asserted on the basis of no evidence and is highly contentious. I have also observed that a premature demand for tonal accuracy before learners have acquired any sense of familiarity with Mandarin can lead to feelings of shame when errors are inevitably committed. Worse still, some learners consequently develop a fear of speaking which can be long-lasting.

I am certainly not opposed to memorisation as a language learning tool, but the order of this approach strikes me as back to front. As a beginner I found attempting to recall the tones for every word when speaking a very tall order. This was not for want of trying; rather, as a learner based in the UK, I wasn’t immersed in the target language environment and, like most of my classmates, didn’t realise the importance of combining flashcards with large amounts of listening outside of class. Much of the advice I followed online made me feel my failure to remember tones accurately when speaking was down to a lack of willpower or failure of memory rather than the inevitable consequence of studying without a significant amount of listening input.

Conversely, in the past two years I have changed my learning style towards a mass input approach, spending many hours reading and listening to native content. The more I have listened to Mandarin, the easier I have found it to commit individual tones and tone pairs to memory using flashcards and other memorisation techniques. After hearing a word 1000 times in natural speech, the task of reproducing it with the correct tone when speaking is far easier than when you have only just come across it. When I use pinyin flashcards now they often serve as an affirmation or reminder of information my brain has already acquired through many hours of listening.

I suspect unrealistic attitudes towards tones have contributed to many beginners simply giving up, leading to famously high dropout rates on Mandarin courses. A saner approach than demanding all new learners immediately commit every tone to memory would be to manage their expectations. Students should be continually reminded that language acquisition is a gradual process. Before acquiring a strong familiarity with Mandarin it is entirely natural to forget tones for individual words when speaking but this inevitable fact should never be a source of shame.  

