Opinion: No, Professor Krashen, the Perfect Accent Is NOT Already Within You

Just over a year ago I attended an online networking conference for Mandarin enthusiasts. The event was run by the Confucius Institute and provided an opportunity for students and teachers to meet each other and discuss Chinese learning.

I entered one of the breakout rooms and raised a topic which had been on my mind: Is it really necessary to memorise every tone for every character or every tone pair for every vocabulary item? Surely that’s far too painstaking! A better option, I suggested, is to expose yourself to lots of listening content and mimic as best you can. After all, that’s what I’d been doing and my tones had been gradually improving…

When I’d finished speaking everyone fell silent. The handful of learners in the room were all significantly more experienced than me and, though they were very polite, made it clear they weren’t about to be lectured to by some novice. One by one they rebutted my suggestion that tones can be acquired naturally. Memorisation and repetition drills are a necessity, they asserted. The message was stark: Fail to take deliberate tone practice seriously and you will regret it.

I left the session feeling deeply frustrated. At the time I was a committed – bordering on religious – Krashenite. That meant I believed wholeheartedly in the Input Hypothesis and the importance of Comprehensible Input as expounded by the influential linguist Stephen Krashen. According to him, the more of it you get, the better your output or spoken language abilities will be. For many members of immersion language learning communities that is basically a law of physics. 

Krashen makes a distinction between acquisition (internalising languages unconsciously through mass exposure to comprehensible messages) and learning (formal instruction leading to conscious knowledge about the language, for example grammar rules and the four-tone system). His message is that acquisition is generally more efficient than learning.

He stops short of claiming learning is completely redundant but, when pressed, is reluctant to concede any specific cases where it’s necessary (more on this below). In short, if language learners want to improve they should spend more time reading and listening to compelling content in their target language rather than pursue traditional methods. I found it an attractive idea.

Equally appealing was Krashen’s claim that “the language acquisition device never shuts off.” Some linguists believe our ability to acquire language fades after puberty, forcing adults to rely on learning methods that make use of our analytical capacities. Krashen doesn’t think these differences are significant. The most extreme interpretation of his position – which to my knowledge he has never stated but many of his online followers seem to believe – is that there are no differences at all between infant and adult brains in their capacity to acquire language.

A Twitter user expressing a widely held belief within immersive language learning communities

All of these ideas chimed with my experience. Over a four-year period I found that the more Mandarin I read and listened to, the more I generally improved. It was thrilling to have finally discovered a method that worked. There was just one small, inconvenient problem: my tones weren’t great. But that, I reassured myself, was either because I hadn’t had enough input yet or because I hadn’t adopted the ‘in group’ mentality Krashen argues is a necessary condition for good method acting and accurate mimicry of foreign accents. In any case, it didn’t create any communication barriers. Native speakers could understand me perfectly. Or so I thought…

Krashen, himself a Mandarin learner, reflected on his own difficulties with Chinese tones in an interview with Olly Richards. “When it comes to Mandarin”, he said, “I sometimes forget [the input hypothesis] and I go to the old classic thing where I want to make sure what I’m saying is absolutely correct…is this tone number two or tone number three? The cure for this is compelling input. The cure are interesting stories and interesting conversations. Then all that fades away.”

During moments of self-doubt I sought reassurance in these words as well as several of Krashen’s other catchphrases: “keep getting input, relax, keep getting input, relax” and “the perfect accent is within you.” Krashen often repeats the latter quote in reference to his belief that we all acquire near-perfect accents in our target language through input but tend not to perform them because it clashes with our identity as outsiders and we fear appearing silly. He admits there is no research to support this.

But the experienced learners in the breakout room weren’t Krashenites, at least not dogmatic ones. I began to wonder: why were they so quick to reject the notion that comprehensible and compelling input is the key to acquiring Chinese tones?

Loss of Faith

In the year since that conference, two experiences have caused me to alter my perspective and rethink my dogmatic commitment to Krashenism. The first is that for the first time I met several native Chinese speakers who were willing to be brutally honest with me about my tonal problems. It turned out these problems had been affecting my ability to communicate much more than I thought.

Whenever a sentence I uttered contained one or more key words I didn’t know the tone for, listeners would be forced to guess what I meant. Sometimes they guessed right from context but sometimes they didn’t. More often than not they were too polite to tell me they had misunderstood or were themselves unaware they had guessed incorrectly. I have never been a perfectionist and having a foreign accent didn’t bother me. But the more I became aware of my problem with tones the more frustrating it felt not to be able to communicate clearly.

The second experience that altered my perspective was that I started the I’m Learning Mandarin podcast and began interviewing experienced learners about the secrets of their success. I searched far and wide for case studies of adults who had picked up tones through input alone but couldn’t find a single one. Sure, there were plenty of people who claimed they’d done this, but as my ear for tones got sharper I could discern that they clearly hadn’t. Their tones were all over the place; they just weren’t able to hear it, and their Chinese friends were too polite to break the bad news.

Without exception, the most successful Chinese speakers I interviewed had one thing in common: In addition to mass input, a large component of their pronunciation practice consisted in the most unkrashenite of activities: learning how the tonal system works in theory (including knowledge of tone sandhi), conscious memorisation of tone pairs, repetition drills and – crucially – ‘monitoring.’

Monitoring, which Krashen frequently refers to, is a mental process involving deliberately screening each sentence you’re about to say for errors before self-editing prior to speaking out loud. Krashen warns that this is a highly laborious and inefficient mechanism. There simply isn’t enough time in a spontaneous conversation to stop and think before uttering each word: “what tone is 班 again?” 

Instead, he argues if you get enough of the right kind of input, you’ll internalise the target language unconsciously and be able to speak clearly without a need to excessively monitor your speech – just like we do when we speak our first language.

I agree that monitoring is inefficient, dull and painstaking. But in the absence of case studies supporting the hypothesis that it’s possible to internalise tones through mass comprehensible input alone (and with mountains of evidence to the contrary) we are forced to ask: what are the alternatives? 

I struggled to find answers from Krashen or his supporters so I turned to the masters of Chinese I interviewed on my podcast. Following their advice I drilled tone pairs and flashcards, memorised the tones for all my vocabulary, got a tutor to correct all my mistakes and shadowed native speakers as often as I could. I aimed for 100% accuracy, accepting I’d always fall short. And I mentally screened every single word in every sentence for mistakes when speaking out loud. I documented this whole process on my blog.

Initially I sounded like a robot and it took ages to splutter out my words – just as Krashen warned. People told me my Chinese had regressed because I no longer spoke in the carefree way I used to when I didn’t care about tones. But within a few weeks of daily practice my monitor increased in speed and efficiency until eventually it began functioning on autopilot, largely fading from conscious view altogether. After six months I could produce whole sentences with ease, memorise tones for new vocabulary with minimal effort and mimic native speakers. After a year I was able to comfortably host a podcast discussion in Mandarin Chinese. I am reliably told that this process is far swifter and less painful for learners who, unlike me, take these steps from the start.

To be clear, this was not a vanity project and was in no way motivated by perfectionism. I have never aspired to sound perfectly native nor have I achieved that. My goal was always to speak Chinese fluently and clearly so that native speakers – including those who are not used to speaking with foreigners – could easily comprehend my speech. Only after I started working on tones did I regularly get the compliment from native speakers: “It’s astonishing! I can totally understand you!” Were I studying a European language that probably wouldn’t be considered particularly ambitious.

The two clips below were taken roughly 12 months apart. The first was recorded immediately prior to starting work on my tones. The second is a discussion I recently recorded for my podcast. The difference in quality between these clips may not be immediately apparent to inexperienced learners but to the trained or native ear they are poles apart.

Recording 1, 2021
Recording 2, 2022

This experience taught me that mass comprehensible input is a necessary but insufficient condition for developing good Chinese pronunciation. Listening to the first clip it’s obvious that my tones had stagnated. The perfect accent was not inside me. No amount of further input or method acting could overcome the fact that, despite thousands of hours of listening, I had failed to acquire a crucial component of Chinese pronunciation.

Contrary to what Krashen says, it is unwise to place all your faith in comprehensible input in the hope that your tone worries will “fade away”. My tones wouldn’t have improved without a major learning intervention and judging by all the other Chinese learners I’ve met and interviewed the overwhelming likelihood is that yours won’t either.

Late Acquired or Never Acquired?

When Steve Kaufmann interviewed Krashen for his YouTube channel the polyglot lamented that he hadn’t yet managed to acquire some of the finer points of Russian grammar despite reading over a million words and listening to hundreds of hours of the language. In response Krashen suggested the grammar Kaufmann had issues with was probably “late acquired,” the implication being that more input would eventually solve the problem. 

Krashen seems reluctant to admit that certain features of language which infants have no difficulty acquiring are, for adults, not late acquired but simply never acquired – or at best only ever partially acquired through comprehensible input alone. Yet how else can one explain native Chinese speakers who still miss out articles despite being immersed in an English environment for decades and boasting impeccable comprehension skills? Or native English speakers who still make gender mistakes in Spanish even after thousands of hours of comprehensible listening input? Perhaps the fear is that by conceding this point he risks strengthening his opponents who overstate the extent to which the language acquisition device fades with age.

Language YouTuber Matt Vs Japan, a vocal Krashen supporter, pressed the linguist on this point during a wide-ranging interview. As a Japanese learner Matt followed Krashen’s advice almost to the letter and achieved stunning results. Only one thing was missing: he failed to acquire pitch accent, a core feature of Japanese pronunciation. Krashen doubled down, responding “you’ve probably acquired more of it than you think,” and once again suggested the solution might lie in method acting. Matt seemed unconvinced and, when I interviewed him, revealed he’d spent years correcting his wayward pitch using various learning techniques (including the dreaded monitor).

Perhaps a more persuasive argument than denying comprehensible input has any limits at all would be to point out that these limits are largely inconsequential. Many linguistic features which tend not to be fully acquired by adults are inessential luxuries. After all, the only consequence of having bad pitch accent is that Japanese people can tell you sound a little foreign. But that argument won’t wash with tones. Unlike learners with poor pitch accent in Japanese, learners of Chinese cannot hope to be understood clearly without a reasonably high degree of tonal accuracy.

When I raise this issue with Krashen supporters they usually insist that under the right conditions tones can in fact be acquired through comprehensible input. They admit they have no evidence on which to base this assertion but point to environmental differences between adults and infants. For example, adult learners tend to speak from early on leading us to develop bad pronunciation habits whereas infants have a long ‘silent period’ during which they only listen. The argument runs that if only adults could replicate the conditions under which infants acquire language – including the long silent period – the results would be the same. 

Perhaps. It’s impossible to prove a negative and I remain sceptical but open-minded. But to anyone reading this who still firmly believes tones can, under the right conditions, be acquired through comprehensible input alone I issue the following challenge: Find me a single case of someone who speaks fluent Chinese with accurate tones (warning: relying on self-assessment won’t do here) and acquired them through comprehensible input without a heavy learning component.

I will happily interview them on my podcast and gladly admit I was wrong.

  This doesn't make much sense to me. It seems like you need to just immerse more. I studied Chinese for the past 5 years, immersing every day for multiple hours. When I spoke for the first time just a few weeks ago, natives just thought I was born in China. Amazing results. The reason people have to take so long to acquire tones is because they output too early and don't get enough input. If Matt had delayed output, he probably would have had perfect pitch accent. You should read Krashen's actual papers.


    Which part doesn't make sense? Was it the part where I said I was open to being proven wrong if somebody made me aware of a case contradicting my beliefs? It sounds like you're putting yourself forward as such a case. Have you documented your journey and do you have recordings of yourself speaking Chinese? The reason I ask is not to cast doubt on your lived experience but because relying on this kind of feedback from native speakers tends to be very unreliable for various reasons, as I've written elsewhere. Anyway, if you are putting yourself forward as a case of someone who acquired tones through comprehensible input alone without a heavy study component I'd be interested in interviewing you on my podcast. In Chinese.

