It's really easy: once you open the site, choose the file you wish to evaluate, or copy and paste the text into the empty text box. Then press “START.” The text is then presented on the screen, and the results are displayed both by color and by percentage. Words in black are common (high-frequency) words, words in orange are mid-frequency words, and words in red are jargon. The table on the right presents the total number of words in the text and the results: the number and percentage of words at each frequency level (high, mid and jargon).
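The classification and percentage table described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the program's actual implementation: the two word sets are tiny hypothetical stand-ins for real frequency lists, and the names `classify_word` and `frequency_profile` are invented for this example.

```python
import re
from collections import Counter

# Hypothetical, tiny word lists for illustration only; the real tool's
# frequency data is far larger and is not reproduced here.
HIGH_FREQUENCY = {"the", "a", "is", "in", "of", "eye", "animal", "weak"}
MID_FREQUENCY = {"laser", "inject", "protein"}


def classify_word(word):
    """Return 'high', 'mid', or 'jargon' for a single word."""
    w = word.lower()
    if w in HIGH_FREQUENCY:
        return "high"
    if w in MID_FREQUENCY:
        return "mid"
    # Anything not found in either list is treated as jargon (an assumption).
    return "jargon"


def frequency_profile(text):
    """Count the words at each frequency level and compute percentages."""
    words = re.findall(r"[A-Za-z']+", text)
    counts = Counter(classify_word(w) for w in words)
    total = len(words)
    return {
        level: {
            "count": counts[level],
            "percent": round(100 * counts[level] / total, 1) if total else 0.0,
        }
        for level in ("high", "mid", "jargon")
    }


profile = frequency_profile("The protein is in the eye of the zebrafish")
for level, result in profile.items():
    print(level, result["count"], result["percent"])
```

In this toy sentence, "protein" falls in the mid-frequency list and "zebrafish" is in neither list, so it is counted as jargon.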
The program works with texts that are pasted manually into the website or uploaded as Text (.txt) or Word (.docx) files. It can evaluate texts of unlimited length; however, very long texts may take more time to evaluate than shorter texts.
The results can be used to evaluate a text or speech that communicates science before it is published or presented to non-experts. Results can also be used in a pre-post design, comparing a text written before a training workshop or course with one written after it. Naturally, we would expect the percentage of jargon in a text written following a science communication workshop to be lower than in a text written before the workshop (Baram-Tsabari & Lewenstein, 2013; Rakedzon, Segev, & Baram-Tsabari, 2016; Rakedzon & Baram-Tsabari, 2016; Sharon & Baram-Tsabari, 2013).
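A pre-post comparison of this kind comes down to simple arithmetic on the numbers in the results table. The sketch below assumes you have copied the jargon count and total word count for each text by hand; the function names and the example figures are hypothetical.

```python
def jargon_percent(jargon_words, total_words):
    """Percentage of jargon words in a text (0.0 for an empty text)."""
    if total_words == 0:
        return 0.0
    return 100 * jargon_words / total_words


def pre_post_change(pre, post):
    """Change in jargon percentage, in percentage points.

    `pre` and `post` are (jargon_words, total_words) tuples.
    A negative result means the later text contains less jargon.
    """
    return jargon_percent(*post) - jargon_percent(*pre)


# Made-up example: 12 jargon words out of 200 before the workshop,
# 5 jargon words out of 250 after it.
change = pre_post_change(pre=(12, 200), post=(5, 250))
print(f"Change in jargon: {change:+.1f} percentage points")
```

Comparing percentages rather than raw counts keeps the comparison fair when the two texts differ in length.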
Studies have shown that readers, including second language readers, need to understand 98% of the vocabulary in a text to adequately comprehend its content (Hu & Nation, 2000). Therefore, the percentage of rare words should not exceed 2%. However, some researchers discuss the option of a less stringent 95% (Laufer & Ravenhorst-Kalovski, 2010). According to the literature, the top 2000 high frequency word families cover on average about 85% of general spoken or written texts (Schmitt & Schmitt, 2014). A word family includes a headword (e.g., develop) together with all of its related forms, such as undeveloped, underdeveloped, development, developments, developer, and developers; high frequency words include words such as weak, eye, and animal. Wherever jargon appears, the writer might consider replacing it with more familiar words or adding an explanation.
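The coverage thresholds can be turned into a quick check on a text's results. The sketch below is a simplification: it assumes that only the words flagged as jargon are unknown to the reader, which may understate the problem for readers who also struggle with mid-frequency words. The function name and the example numbers are hypothetical.

```python
def coverage_check(jargon_words, total_words, target_coverage=98.0):
    """Check whether a text's jargon share fits within a coverage target.

    With a 98% target, at most 2% of the words may be jargon;
    with the less stringent 95% target, at most 5%.
    """
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    jargon_pct = 100 * jargon_words / total_words
    allowed_pct = 100 - target_coverage
    return {
        "jargon_percent": jargon_pct,
        "allowed_percent": allowed_pct,
        "meets_target": jargon_pct <= allowed_pct,
    }


# Made-up example: 6 jargon words in a 200-word text (3% jargon).
strict = coverage_check(6, 200)                         # 98% target
lenient = coverage_check(6, 200, target_coverage=95.0)  # 95% target
print(strict["meets_target"], lenient["meets_target"])
```

A text with 3% jargon would miss the 98% target but meet the 95% one, which is where replacing or explaining a few terms can make the difference.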
Researchers have found that mid-frequency words (e.g., laser, inject, protein) make up approximately 10% of texts; these words fall between high and low (rare) frequency and should be familiar to intermediate and advanced readers (Schmitt & Schmitt, 2014). In academic texts, research has shown different percentages: 5% technical vocabulary, 8-10% academic vocabulary (with some overlap with mid-frequency words), and 80% high frequency vocabulary (Nation, 2001).
Baram-Tsabari, A., & Lewenstein, B. V. (2013). An Instrument for Assessing Scientists’ Written Skills in Public Communication of Science. Science Communication, 35(1), 56–85.
Hu, M., & Nation, I. S. P. (2000). Vocabulary density and reading comprehension. Reading in a Foreign Language, 23, 403–430.
Laufer, B., & Ravenhorst-Kalovski, G. C. (2010). Lexical threshold revisited: Lexical text coverage, learners’ vocabulary size and reading comprehension. Reading in a Foreign Language, 22(1), 15–30.
Nation, I. (2006). How Large a Vocabulary is Needed For Reading and Listening? Canadian Modern Language Review, 63(1), 59–82.
Nation, I. S. (2001). Learning Vocabulary in Another Language. New York: Cambridge University Press.
Rakedzon, T., & Baram-Tsabari, A. (2016). Assessing and improving L2 graduate students’ popular science and academic writing in an academic writing course. Educational Psychology.
Rakedzon, T., Segev, E., & Baram-Tsabari, A. (2016). An automatic jargon identifier for scientists engaging with the public and for science communication educators. Manuscript in Preparation.
Schmitt, N., & Schmitt, D. (2014). A reassessment of frequency and vocabulary size in L2 vocabulary teaching. Language Teaching, 47(4), 484–503.
Sharon, A. J., & Baram-Tsabari, A. (2013). Measuring mumbo jumbo: A preliminary quantification of the use of jargon in science communication. Public Understanding of Science, 23(5), 528–546.