LINGUIST List 27.4571

Wed Nov 09 2016

Software: Text/Corpus Linguistics: TAALES 2.2

Editor for this issue: Amanda Foster

Date: 09-Nov-2016
From: Scott Crossley
Subject: Text/Corpus Linguistics: TAALES 2.2

Tool for the Automatic Analysis of Lexical Sophistication (TAALES) 2.2:

We are excited to announce the release of TAALES 2.2. TAALES is a freely available tool to automatically analyze over 400 lexical features in a text. TAALES 2.2 represents a major upgrade to TAALES with regard to both index coverage and usability.

Index Coverage:

TAALES now includes the following index categories:

- Academic Language

Academic Formulas List (AFL)
Academic Word List (AWL)
Academic Word List (AWL) Sublists

- COCA Indices

COCA Word Frequency and Range for five registers
COCA Bigram Frequency, Range, and Association Strength for five registers
COCA Trigram Frequency, Range, and Association Strength for five registers

- Frequency and Range Indices (from sources other than COCA)

BNC Word Frequencies
BNC Ngram Frequencies
MRC Frequencies (includes range for some lists)
SUBTLEXus Frequencies (includes range)

- Other Index Types

Age of Exposure (AOE)
Contextual Distinctiveness
ELP Word Information
ELP Response Time Norms
Hypernymy and Polysemy
Psycholinguistic Word Information Norms


TAALES now includes index coverage diagnostics at both the text level and the word/bigram/trigram level.

Text-level diagnostics report the percentage of the words/bigrams/trigrams in a text that are covered by a particular index.
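
To illustrate what a text-level coverage figure means, here is a minimal Python sketch. It is not TAALES code: the function names (ngrams, coverage_percent) and the toy index entries are hypothetical, and it assumes coverage is counted over running tokens rather than unique types, which the announcement does not specify.

```python
def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def coverage_percent(tokens, index_items, n=1):
    """Percent of the text's n-gram tokens that appear in index_items."""
    items = ngrams(tokens, n)
    if not items:
        return 0.0  # an empty text (or one shorter than n) has nothing to cover
    covered = sum(1 for item in items if item in index_items)
    return 100.0 * covered / len(items)


# Hypothetical toy index entries, not taken from any real word list.
toy_word_index = {("the",), ("cat",), ("sat",)}
toy_bigram_index = {("the", "cat")}

tokens = "the cat sat on the mat".split()
word_coverage = coverage_percent(tokens, toy_word_index, n=1)      # 4 of 6 words
bigram_coverage = coverage_percent(tokens, toy_bigram_index, n=2)  # 1 of 5 bigrams
```

A low coverage figure for an index suggests that scores from that index rest on only a small part of the text and should be interpreted with care.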

Word/bigram/trigram-level diagnostics report the index score for each word/bigram/trigram in a text. This feature allows users to conduct fine-grained post-hoc analyses.
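
As a rough picture of the kind of post-hoc analysis that item-level scores make possible, the Python sketch below pairs each word with a score and separates out the words an index does not cover. It is an illustration only: the function names and the score values are made up, and the actual TAALES output format is not described in this announcement.

```python
def item_scores(tokens, index_scores):
    """Pair each token with its index score, or None if the index lacks it."""
    return [(tok, index_scores.get(tok)) for tok in tokens]


def mean_covered_score(scored):
    """Mean score over covered items only; None if nothing is covered."""
    values = [score for _, score in scored if score is not None]
    return sum(values) / len(values) if values else None


# Made-up scores for illustration; not values from any real norm or corpus.
toy_scores = {"the": 6.0, "cat": 3.5, "sat": 3.0}

scored = item_scores("the cat sat on the mat".split(), toy_scores)
uncovered = [tok for tok, score in scored if score is None]
text_mean = mean_covered_score(scored)
```

Listing the uncovered items alongside the per-item scores shows which words drive a text-level average and which ones the index simply has no information about.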

TAALES 2.2 and other text analysis tools are freely available here:

Linguistic Field(s): Text/Corpus Linguistics

Page Updated: 09-Nov-2016