
How To Calculate Type Token Ratio

Complexity: Activity 4

Activity 4: Complexity in oral vs. written language

Heritage learners are generally known to have a higher oral proficiency than their foreign-language-learning peers. But what is the relationship between their oral skills and their written skills? Is this the same for heritage and foreign language learners? In this activity we examine complexity in the oral and written language of our learners.

Would you expect written language or oral language to have more lexical richness? Why or why not?

When you have finished typing your answer, click to compare your response with the Learner Language staff response.

In general, we expect a written version of a story to have more lexical and syntactic complexity than the oral version of the same story. This is because a writer has more time to think while creating the storyline. Writers have more time to elaborate on motivation, reasons for story developments, or more detailed descriptions.

Part One: Type Token Ratio (TTR)

One useful measure of complexity, the type-token ratio (TTR), documents lexical richness, or variety in vocabulary. Does the learner use the same words over and over, or does s/he use a variety of different words to communicate? A type-token ratio (TTR) is the total number of UNIQUE words (types) divided by the total number of words (tokens) in a given segment of language. For example, that last sentence contains 26 words in total (tokens), but several of those words (like 'a', 'the', 'words') occur more than once, so there are only 19 UNIQUE words, or types. The TTR of that sentence is 19/26, or .73. The closer the TTR is to 1, the greater the lexical richness of the segment.
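The calculation described above can be sketched in a few lines of Python. This is only an illustration, not part of the original activity: the function names `tokenize` and `type_token_ratio` are hypothetical, and the sketch assumes that counting is case-insensitive ("A" and "a" are one type) and that a hyphenated word such as "type-token" counts as a single word. Under those assumptions it reproduces the 19/26 count for the definition sentence.

```python
import re

# Assumption: a "word" is a run of letters, optionally joined by hyphens or
# apostrophes. [^\W\d_] matches any Unicode letter, so accented Spanish
# letters are handled too.
WORD_PATTERN = re.compile(r"[^\W\d_]+(?:[-'][^\W\d_]+)*")


def tokenize(text):
    """Return the lowercased word tokens found in text."""
    return WORD_PATTERN.findall(text.lower())


def type_token_ratio(text):
    """Return (number of types, number of tokens, TTR) for a segment of text."""
    tokens = tokenize(text)
    if not tokens:
        raise ValueError("text contains no words")
    types = set(tokens)  # each distinct word is counted once
    return len(types), len(tokens), len(types) / len(tokens)


definition = (
    "A type-token ratio (TTR) is the total number of UNIQUE words (types) "
    "divided by the total number of words (tokens) in a given segment of "
    "language."
)
n_types, n_tokens, ttr = type_token_ratio(definition)
print(n_types, n_tokens, round(ttr, 2))  # 19 26 0.73
```

Different tokenizing choices (for example, splitting hyphenated words) would change the counts slightly, so the same choices should be applied to every sample being compared.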

Analysis 1:

Try performing a TTR analysis on some of the language produced by Henry in his oral and written versions of the narration task. Use the guidelines below and start with the written sample of his 'grocery store' narrative task:

Written Sample

Henry Written Sample Image

  1. Tokens: Count how many total words are used. Enter this number in Table 1 below.
  2. Types: How much variety in word choice is there in the written sample? Count the number of UNIQUE words, counting each repeated word only once. For example, "el" might be used 5 times, but you should only count it 1 time to get this number. Enter this into Table 1 below.
  3. The Type Token Ratio (TTR). The TTR is the # of Types divided by the # of Tokens. The closer the TTR is to 1, the more lexical variety there is. Enter Henry's TTR for his written sample in Table 1 below.

Before performing a TTR analysis on Henry's spoken version of the task, NOTE: for purposes of comparison, this text should contain the same number of total words as were used in the written version. Therefore, your analysis should stop with the word una, and you should not count "OK," "uh," partial words, or false starts.
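The length-matching step in the note above can be sketched as follows. Everything here is an assumption made for illustration: the helper names `countable_words` and `match_length` are hypothetical, the transcript line is invented (it is not Henry's actual speech), and the sketch assumes that cut-off words are marked in the transcript with a trailing hyphen. It drops only the cut-off word itself, so a longer false start would still need to be removed by hand.

```python
import string

# Fillers named in the note; extend this set for other transcripts.
FILLERS = {"ok", "uh"}

# Strip surrounding punctuation, but keep "-" so cut-off words stay marked.
STRIP_CHARS = string.punctuation.replace("-", "") + "¿¡"


def countable_words(transcript):
    """Return the words that count toward the token total."""
    words = []
    for raw in transcript.split():
        word = raw.strip(STRIP_CHARS).lower()
        if not word or word in FILLERS:
            continue
        if word.endswith("-"):  # assumed marking of a partial word
            continue
        words.append(word)
    return words


def match_length(transcript, n_tokens):
    """Keep only the first n_tokens countable words of the transcript."""
    return countable_words(transcript)[:n_tokens]


# Invented example line, NOT taken from the learner transcripts.
spoken = "OK, uh, el hom- el hombre va a la tienda y compra una manzana."
trimmed = match_length(spoken, 10)
print(len(trimmed), trimmed[-1])  # 10 una
```

The trimmed word list can then be counted for types and tokens in the same way as the written sample.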

Spoken Sample

Henry Spoken Sample Image

See the full transcript (PDF): lines 10-13.

Figure out the Type Token Ratio for the spoken sample above, using the same procedure. Then enter Henry's TTR for his spoken sample in Table 1 below.

Table 1: Henry TTR Analysis

Henry Tokens Types TTR (= types ÷ tokens)

Analysis 2:

Now perform the same TTR analysis on the written (first) and spoken samples for Raúl provided below. Then enter each TTR value in Table 2.


Written Sample

Spoken Sample

Raul Spoken TTR image

See the full transcript (PDF): lines 6-12.

Table 2: Raúl TTR Analysis

Raúl Types Tokens TTR

Table 3: TTR comparison between Henry and Raúl's oral and written samples

Henry Raúl
TTR Written %
Spoken %

Comparing the lexical complexity in written vs. spoken language for each learner, what patterns do you see from Henry and Raúl? Are they the same or different? Discuss possible reasons for the patterns you find in these two learners' vocabulary usage.

When you have finished typing your answer, click to compare your response with the Learner Language staff response.

              Henry            Raúl
TTR Written   34 / 53 = .64    29 / 55 = .52
    Spoken    34 / 53 = .64    35 / 55 = .64

Neither Henry nor Raúl seems to follow our prediction of greater lexical richness in writing. Henry has an equal amount of lexical variety in his oral and written versions. However, Raúl actually has LESS lexical complexity in his written version than in his oral version. His lexical diversity decreased from .64 in his oral version to .52 in his written version.
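The comparison can be checked with a short calculation. The sketch below simply recomputes the ratios from the type and token counts in the staff table; the variable names are hypothetical. Note that 29/55 is about 0.527, which the table reports as .52.

```python
# (types, tokens) for each learner and modality, copied from the staff table.
counts = {
    "Henry": {"written": (34, 53), "spoken": (34, 53)},
    "Raúl": {"written": (29, 55), "spoken": (35, 55)},
}

results = {}
for learner, samples in counts.items():
    written = samples["written"][0] / samples["written"][1]
    spoken = samples["spoken"][0] / samples["spoken"][1]
    results[learner] = {
        "written": written,
        "spoken": spoken,
        "difference": written - spoken,  # negative: writing is less varied
    }
    print(f"{learner}: written {written:.3f}, spoken {spoken:.3f}, "
          f"difference {written - spoken:+.3f}")

# Henry: written 0.642, spoken 0.642, difference +0.000
# Raúl: written 0.527, spoken 0.636, difference -0.109
```

A difference of zero for Henry and a negative difference for Raúl match the pattern described above.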

What could explain the results of Raúl's complexity analysis? It is in fact fairly consistent with what is reported for heritage language learners: their oral proficiency in the heritage language typically outpaces their written proficiency. Heritage learners typically use the heritage language primarily, if not solely, for social purposes, which tend to be oral. And typically they have not used their heritage language for academic purposes in school, where they would have developed proficiency in written skills. Raúl's written version is very simple and even repeats the same word order: SVO. Raúl explained in his interview that he felt as though his vocabulary was limited and something he could potentially improve upon in a formal Spanish class.

Henry, on the other hand, likely learned all of his L2 vocabulary in both oral and written modalities. As a traditional foreign language learner, he likely has been assessed on his ability to produce new vocabulary words in both written and oral tasks throughout his language learning experience.

Part Two: Syntactic complexity

Now that we've gotten an idea of our learners' lexical complexity, we can take a look at their syntactic complexity. One basic method for doing this is to calculate the percentage of complex sentences in a given sample of language. A simple sentence has a subject and one predicate verb. Two simple sentences may be joined by and. These are still counted as two simple sentences. For our purposes, a complex sentence is two simple sentences combined with a subordinating conjunction (since, because, after, although, if, until, etc.) OR a relative pronoun (who, which, whose, whom, that, how, what, etc.).

For instance:


Complex sentence:
He remembered the girl who gave him the book in the station.
Él recordó la chica que le dio el libro en la estación.

Simple sentence:
He remembered the girl.
Él recordó la chica.
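As a rough first pass, the working definition of a complex sentence can be approximated by looking for the marker words it lists. This is a heuristic sketch under stated assumptions, not the activity's method: the function name `looks_complex` is hypothetical, the marker sets contain only the English words named in the definition (the "etc." means they are incomplete), and words such as "that" or "after" also have other uses, so every flagged sentence still needs to be checked by hand. A Spanish sample would need its own marker list.

```python
import re

# Marker words named in the definition of a complex sentence.
SUBORDINATORS = {"since", "because", "after", "although", "if", "until"}
RELATIVES = {"who", "which", "whose", "whom", "that", "how", "what"}
MARKERS = SUBORDINATORS | RELATIVES


def looks_complex(sentence):
    """Return True if the sentence contains any listed marker word."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return any(word in MARKERS for word in words)


print(looks_complex("He remembered the girl who gave him the book in the station."))  # True
print(looks_complex("He remembered the girl."))  # False
```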


Using the written and spoken samples from Part One above:
For each learner, use the tables below to find the % of complex sentences:

  1. Count the total number of sentences (T).
  2. Count the number of complex sentences used (C).
  3. Divide C by T. What percentage of the total sentences are complex?
Henry C=Complex T=Sentences %

Raúl C=Complex T=Sentences %
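The three steps above amount to one division. A minimal sketch, with a hypothetical function name and made-up counts used only to show the arithmetic:

```python
def percent_complex(complex_count, total_count):
    """Return C / T as a percentage of complex sentences."""
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    if not 0 <= complex_count <= total_count:
        raise ValueError("complex_count must be between 0 and total_count")
    return 100 * complex_count / total_count


# Made-up counts for illustration: 1 complex sentence out of 4 in total.
print(f"{percent_complex(1, 4):.0f}%")  # 25%
```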

This box shows all of your calculations in determining the complexity of our learners' language:



TTR Oral
% Complex Sentences Oral

Now that you've seen syntactic complexity alongside the lexical complexity of our learners, what further commentary can you provide about the complexity in these learners' language?

When you have finished typing your answer, click to compare your response with the Learner Language staff response.

                               Henry            Raúl
TTR                  Oral      34 / 53 = .64    35 / 55 = .64
                     Written   34 / 53 = .64    29 / 55 = .52
% Complex Sentences  Oral      2 / 5 = 40%      1 / 5 = 20%
                     Written   2 / 4 = 50%      0 / 8 = 0%

While their TTR analyses were somewhat comparable, the percentages of complex sentences produced by Henry and Raúl are strikingly different. First, Henry did follow what we originally expected by producing (slightly) more complex language in his written sample (50%) than in his spoken sample (40%). Meanwhile, Raúl showed much lower syntactic complexity in both modalities, with 0% complex sentences in his written sample.

Henry is likely to have been explicitly taught how to write more complex sentences in his formal Spanish classes, and encouraged to do so. Raúl, however, likely never had this level of academic support in Spanish, as he began school in English at such an early age. Therefore, the difference in complexity between speaking and writing, and between our FL and heritage learner, largely has to do with the purpose and assessment of their Spanish. At some point, Henry's language learning probably focused on his ability to construct more complex sentences, and this was probably something assessed in his Spanish classes. His purposes for learning Spanish throughout his life were therefore vastly different from those of Raúl, for whom the purpose and assessment have always been purely intelligibility and communication.

