How do I use the De-Jargonizer?

It's really easy: Once you open the site, choose the file you wish to evaluate, or copy and paste your text into the empty text box. Then press “START.” The text will then be presented on the screen, and results are displayed both by color and by percentage. Words in black are common words, words in orange are mid-frequency words, and words in red are jargon. The table on the right presents the total number of words in the text and the results: the number and percentage of words at each frequency level (high, mid and jargon).
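The color coding described above can be pictured as a lookup against frequency lists. The following is only a minimal sketch under stated assumptions: the word sets, `classify`, and `summarize` are hypothetical names invented for illustration, the tiny lists stand in for the tool's real (much larger) frequency data, and the tokenization rule is a simplification rather than the tool's actual behavior.

```python
import re
from collections import Counter

# Hypothetical, tiny word lists for illustration only; the real tool's
# frequency lists are far larger and are not reproduced here.
HIGH_FREQUENCY = {"the", "is", "a", "of", "in", "heart", "animal", "eye", "weak"}
MID_FREQUENCY = {"laser", "inject", "protein"}

# Display colors as described on this page.
COLORS = {"high": "black", "mid": "orange", "jargon": "red"}


def classify(word):
    """Return 'high', 'mid', or 'jargon' for a single word."""
    w = word.lower()
    if w in HIGH_FREQUENCY:
        return "high"
    if w in MID_FREQUENCY:
        return "mid"
    return "jargon"  # anything not on either list is treated as rare


def summarize(text):
    """Count words per level and return (counts with rounded percentages, total)."""
    words = re.findall(r"[A-Za-z']+", text)  # simplified tokenization (assumption)
    counts = Counter(classify(w) for w in words)
    total = len(words)
    table = {
        level: (counts[level], round(100 * counts[level] / total) if total else 0)
        for level in ("high", "mid", "jargon")
    }
    return table, total


table, total = summarize("The protein is in the heart of a zebrafish")
for level, (count, percent) in table.items():
    print(f"{level} ({COLORS[level]}): {percent}%, {count}")
print("Number of words:", total)
```

In this toy run, "zebrafish" is flagged as jargon only because it is missing from both hypothetical lists, which is the same logic by which an unlisted name can show up in red.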

What kind of files can the De-Jargonizer check?

The program works with text that is pasted directly into the website or uploaded as a text (.txt) or Word (.docx) file. It can evaluate texts of unlimited length; however, very long texts may take longer to evaluate than shorter ones.

How can I judge the results of my text?

The results can be used to evaluate a text or speech that communicates science before it is published or presented to non-experts. Results can also be used in a pre-post design, comparing texts written before and after a training workshop or course. Naturally, we would expect the percentage of jargon in a text written after a science communication workshop to be lower than in a text written before the workshop (Baram-Tsabari & Lewenstein, 2013; Rakedzon, Segev, & Baram-Tsabari, 2016; Rakedzon & Baram-Tsabari, 2016; Sharon & Baram-Tsabari, 2013).

Studies have shown that a reader needs to understand 98% of the vocabulary in a text to adequately comprehend its content (Hu & Nation, 2000). According to the literature, the top 2000 high-frequency word families cover on average about 85% of general spoken or written texts (Schmitt & Schmitt, 2014). A word family includes a headword, e.g. develop, together with all of its related forms, such as undeveloped, underdeveloped, development, developments, developer, and developers; high-frequency words include words such as weak, eye, and animal. Because readers, including second language readers, should ideally be able to understand 98% of the words in a text, the percentage of rare words should not exceed 2%. However, some researchers discuss the option of a less stringent 95% (Laufer & Ravenhorst-Kalovski, 2010). Wherever jargon appears, the writer might consider replacing it with other words or adding an explanation.
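The two coverage thresholds can be checked with simple arithmetic. The sketch below is illustrative only: `rare_share_ok` is a hypothetical helper, not part of the De-Jargonizer, and the word counts are taken from the results of the two example texts on this page (1 rare word out of 271, and 4 rare words out of 118).

```python
def rare_share_ok(rare_words, total_words, coverage_percent=98):
    """Return True if rare words leave at least `coverage_percent` of the text covered."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    allowed_rare_percent = 100 - coverage_percent
    # Integer comparison avoids floating-point rounding at the boundary.
    return rare_words * 100 <= allowed_rare_percent * total_words


# (rare words, total words) from the two example results on this page
examples = {
    "well-written popular text": (1, 271),
    "text requiring some adaptation": (4, 118),
}

for name, (rare, total) in examples.items():
    strict = rare_share_ok(rare, total, coverage_percent=98)
    lenient = rare_share_ok(rare, total, coverage_percent=95)
    print(f"{name}: meets 98% = {strict}, meets 95% = {lenient}")
```

Under these counts, the first text satisfies both thresholds, while the second exceeds the 2% limit but stays within the less stringent 5% limit.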

Researchers have found that mid-frequency words (e.g., laser, inject, protein) make up approximately 10% of a text; these are the words that fall between the high-frequency and low-frequency (rare) bands and should be familiar to intermediate and advanced readers (Schmitt & Schmitt, 2014). In academic texts, research has shown different percentages: 5% technical vocabulary, 8-10% academic vocabulary (with some overlap with mid-frequency words), and 80% high-frequency vocabulary (Nation, 2001).

Baram-Tsabari, A., & Lewenstein, B. V. (2013). An Instrument for Assessing Scientists’ Written Skills in Public Communication of Science. Science Communication, 35(1), 56–85.
Hu, M., & Nation, I. S. P. (2000). Vocabulary density and reading comprehension. Reading in a Foreign Language, 23, 403–430.
Laufer, B., & Ravenhorst-Kalovski, G. C. (2010). Lexical threshold revisited: Lexical text coverage, learners’ vocabulary size and reading comprehension, 22(1).
Nation, I. (2006). How Large a Vocabulary is Needed For Reading and Listening? Canadian Modern Language Review, 63(1), 59–82.
Nation, I. S. (2001). Learning Vocabulary in Another Language. New York: Cambridge University Press.
Rakedzon, T., & Baram-Tsabari, A. (2016). Assessing and improving L2 graduate students’ popular science and academic writing in an academic writing course. Educational Psychology.
Rakedzon, T., Segev, E., & Baram-Tsabari, A. (2016). An automatic jargon identifier for scientists engaging with the public and for science communication educators. Manuscript in Preparation.
Schmitt, N., & Schmitt, D. (2014). A reassessment of frequency and vocabulary size in L2 vocabulary teaching. Language Teaching, 47(4), 484–503.
Sharon, A. J., & Baram-Tsabari, A. (2013). Measuring mumbo jumbo: A preliminary quantification of the use of jargon in science communication. Public Understanding of Science (Bristol, England), 23(5), 528–546.

Can you give an example of how to read the results?

Example texts:
1. A well-written popular text:
Below is a well-written text processed by the De-Jargonizer, together with its results. The text is intended for general intermediate to advanced readers. The results show common words (high frequency, black), normal words (mid-frequency, orange), and rare words (jargon, red).

Common: 90%, 243
Mid-Frequency: 10%, 27
Rare: 0%, 1
Score: 95
Number Of Words: 271
Interrupted a trillion times a day? Thats may make you more creative!
Trying not to sink in a flood of emails, text messages and phone calls, I tried to work on my masters thesis in behavioral sciences and management at the Technion. If the following research I managed to perform under this interruptions attack sounds surprising to you, it proves its own argument: interruptions can make us more creative. Numerous studies in psychology have been busy in recent years trying to answer whether interruptions are bad to our performance and mood. Actually, most of these studies answered a different question: how much are interruptions bad. But these studies overlooked a certain type of tasks, in which forgetting of what you have done a second ago is helpful: creative tasks! When trying to come up with a new idea, one is often stuck on too predictable ways of thought. An interruption may help one to turn over to a new leaf.
We tested this hypothesis in a lab study. We gave 61 students two creative tasks. One third of the participants worked on the tasks continuously, without any interruption. The rest were interrupted during the tasks by simple arithmetic exercises that popped -up on their screen, either once for a longer period, or multiple times. We found that those participants who were interrupted were more creative, and specifically their creative performance improved after each interruption. So next time you are thinking about turning off your phone in order to concentrate on a task, think again. If its a creative task, keep it on and let yourself be inspired by interruptions.

For each level, the percentage of words (left) and the number of words (right) in the text are presented. This text uses only one rare word, which readers could recognize as a name, along with 10% normal (mid-frequency) and 90% common (high-frequency) words.

2. A text requiring some adaptation for non-experts:

Common: 84%, 99
Mid-Frequency: 13%, 15
Rare: 3%, 4
Score: 90
Number Of Words: 118
Rhythm disorder of the heart is a proximal cause for heart failure. Nowadays treatment for rhythmic disorder relay on electronic pacemakers. Although an excellent solution, it withholds disadvantages such as the need for battery change, and the risk of contamination. Thus, a biological alternative may be ideal. Stem cells that are generated from the person's own hair, and can differentiate in to heart-like cells, hold a therapeutic promise.
To assess compatibility of these heart-like cells to human physiology, their functionality should be investigated. Hence, our major goal is to characterize their electrical behavior and examine whether they functionally recapitulate adult human heart cells. Although much research is still due, this novel biological solution allows optimistic hopes.

Author: Meital Ben-Ari

This text uses 3% jargon words, 13% normal (mid-frequency) words, and 84% common (high-frequency) words. In this case, the writer should review the words in red and decide whether to leave them, delete them, or provide an explanation. For example, the writer may leave “withholds”; delete “proximal,” which is not necessary to understand the text; and simplify “functionally recapitulate” to “repeat the functions [of adult heart cells].”
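The percentages in both results tables above follow directly from the word counts, as the sketch below shows. Note that this page does not state how the Score is derived: the weighting used here (1 for common words, 0.5 for mid-frequency words, 0 for rare words) is an assumption that happens to reproduce both scores shown in these examples, and `summarize_counts` is a hypothetical helper, not part of the tool.

```python
def summarize_counts(common, mid, rare):
    """Return (total words, rounded percentages per level, assumed score)."""
    total = common + mid + rare
    percentages = {
        name: round(100 * count / total)
        for name, count in (("common", common), ("mid", mid), ("rare", rare))
    }
    # ASSUMPTION: this weighting is not documented on this page; it is
    # chosen only because it matches the two example scores (95 and 90).
    score = round(100 * (common * 1.0 + mid * 0.5 + rare * 0.0) / total)
    return total, percentages, score


# Word counts copied from the two example results
print(summarize_counts(243, 27, 1))
print(summarize_counts(99, 15, 4))
```

With the first example's counts this yields 271 words, 90% / 10% / 0%, and a score of 95; with the second it yields 118 words, 84% / 13% / 3%, and a score of 90, matching the tables.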