Using the Professional Lexile Analyzer®
This document describes the Professional Lexile Analyzer®. MetaMetrics® has granted you access to this tool so that you can express the reading demands of books, articles, passages and other texts as a Lexile® measure. The Lexile Analyzer is a web-based tool that determines the Lexile measure of professionally edited, complete, conventional prose text. Because of the way the Lexile Analyzer works, its accuracy depends on your following the text-preparation procedures and formatting conventions detailed in this document.
What Texts Can Receive a Lexile Measure?
Certain categories of text should not be given a Lexile measure. Because The Lexile Framework® for Reading was built upon the measurement of professionally edited, complete, conventional prose text, the Lexile Analyzer will return an inaccurate Lexile measure for other kinds of text. Follow these guidelines as you choose texts to measure:
You should measure...
- Newspaper and magazine articles
- Short stories and reading selections
- Passages, interviews
You should not measure...
- Student writing
- Multiple-choice questions
- Fill-in-the-blank questions
- Recipe lists, song lyrics
- Instant messages, text language
Please also note that texts in the measurable category will still require text editing, as detailed in "Step 2: Preparing your text for the Lexile Analyzer."
The Lexile Book Database
The Lexile Book Database contains certified Lexile measures for over 115,000 English fiction and nonfiction trade book titles from over 150 publishers. It is freely searchable using "Find a Book" on the Lexile website. All books are processed with whole-text measurement using the Lexile Analyzer, and all Lexile measures are certified by MetaMetrics.
How Do Books Get a Lexile Measure?
MetaMetrics measures a book at a publisher's request. Books are always measured in their entirety. Publishers pay for this service, as well as the right to use the Lexile measure in their marketing materials.
In order to ensure the most accurate Lexile measure, MetaMetrics' text measurement process includes the following steps, with quality checks at each stage:
Lexile Certification Process for Conditioned Texts
Several publishing partners use the Professional Lexile Analyzer to measure their texts' developmental level before submitting them for a certified Lexile measure. When these files are submitted to MetaMetrics, they have already been prepared using our text-preparation guidelines. Our resource measurement coordinators review the files to ensure that the editing guidelines have been met. The files are then submitted for Lexile code review, and Lexile measures are returned to the publisher. Only after review by MetaMetrics' resource measurement team are these measures deemed "certified" and made available for distribution via marketing materials and websites and for searching in "Find a Book."
How the Lexile Analyzer Works
The Lexile Analyzer evaluates two characteristics of a text: the frequency of its words and the lengths of its sentences. Research has shown that word frequency can be used as a proxy for vocabulary difficulty, and sentence length can be used as a proxy for sentence complexity. Sentence length carries more weight in the Lexile equation than word frequency does. When you submit a text to the Lexile Analyzer, it divides the text into "slices" of roughly 125 words, retaining complete sentences. The Lexile Analyzer aggregates the Lexile measures of all of these text slices to arrive at an overall Lexile measure of the text.
The Lexile Analyzer cannot read text. It counts words in sentences by automatically recognizing sentence beginnings and endings; it determines word frequency values by matching words to those in the 650-million-word MetaMetrics research corpus. Consequently, when using the Lexile Analyzer to measure text, you should keep in mind two keys to getting an accurate Lexile measure:
- All sentences must be automatically recognizable (capital letter at the beginning; end-punctuation at the end; no unconventional spacing or punctuation)
End-punctuation recognized by the Lexile Analyzer includes the following: period (.), question mark (?), exclamation point (!), colon (:), semi-colon (;), and ellipses (...)
- All words must be automatically recognizable (correct spellings, spacing and punctuation)
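The sentence recognition and slicing described above can be sketched roughly as follows. This is a minimal illustration, not the Lexile Analyzer's actual implementation: the function names are hypothetical, the splitting rule (end punctuation followed by whitespace) is a simplifying assumption that does not check for a capital letter at the start of a sentence, and the Lexile equation itself is not given in this document and is not reproduced here.

```python
import re

# End punctuation listed in this guide: . ? ! : ; and ellipses (which end in ".").
# Assumption: a sentence ends wherever one of these marks is followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.?!:;])\s+")


def split_sentences(text):
    """Split text into sentences after each end mark followed by whitespace."""
    return [s for s in _SENTENCE_END.split(text.strip()) if s]


def slice_text(text, target_words=125):
    """Group whole sentences into slices of roughly target_words words."""
    slices, current, count = [], [], 0
    for sentence in split_sentences(text):
        current.append(sentence)
        count += len(sentence.split())
        if count >= target_words:
            slices.append(" ".join(current))
            current, count = [], 0
    if current:  # keep the final, shorter slice
        slices.append(" ".join(current))
    return slices


def mean_sentence_length(text):
    """Average number of words per recognized sentence."""
    sentences = split_sentences(text)
    words = sum(len(s.split()) for s in sentences)
    return words / len(sentences) if sentences else 0.0
```

Note that a rule this simple also splits after abbreviations such as "Mr." and treats every colon and semi-colon as a sentence ending, which is one reason careful text preparation matters.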
Step 1: Converting Text from Image to Document
Instead of typing many pages of text, you can scan the pages, save them as a PDF file, and load them into an optical character recognition (OCR) program. Newer versions of Adobe Acrobat include this option:
- Open your PDF scan file in Adobe Acrobat.
- Under the Document menu, select Recognize Text Using OCR.
- When the OCR process is finished running, select Save As... from the File menu.
- In the Save as type drop-down box, choose Rich Text Document (.rtf)
- Open the resulting file and check for conversion errors. Adhere to text-preparation guidelines for the entire text.
The OCR results from Acrobat can be inconsistent, particularly with punctuation; for example, periods may not be recognized at all (see below). These inconsistencies will affect the accuracy of the Lexile measure. Unfortunately, repairing a poorly OCRed file can take as long as typing the text.
A better OCR option is ABBYY FineReader, for which a free trial version is available online. Go to www.abbyy.com and select the "Downloads" tab. This program is not simple to use, but it does enable you to convert a complete book and quickly save it as a rich text document. Files can be saved in batches to a drive you specify and revisited for editing review once the OCR is complete. This is typically the process we follow in-house when processing text.
If a text is converted from hard copy to electronic format using an OCR application, some problems may occur in the conversion process. These tend to relate to the specific software used, and special care should be taken to ensure the accuracy of the electronic facsimile. Some examples of common OCR errors are as follows:
- A letter "m" might convert as "rn."
- A comma followed by a quotation mark (,") might be interpreted as a slash with an apostrophe (/').
- Verify that all the intended punctuation is in place: no periods missing, no semicolons omitted, etc.
- If a polysyllabic word is split between two lines with a hyphen, the hyphen should be removed and the word made whole.
Given the near limitless possibilities of language usage and layout, these examples should not be considered exhaustive. Rather, they should be seen as representative of the kinds of things that should be recognized when preparing a text for measurement.
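Some of these checks can be partly automated before you proofread the file by hand. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the hyphen rule assumes that a hyphen at the end of a line followed by a lowercase letter marks a split word, and only the slash-apostrophe error is flagged. Errors such as "rn" for "m" cannot be detected reliably this way and still require a manual read.

```python
import re


def rejoin_hyphenated(text):
    """Remove a line-end hyphen and line break that split a word in two.

    Assumption: a letter, a hyphen, a newline, then a lowercase letter
    indicates a split word. A genuine compound that happens to break at
    its hyphen (for example "well-" / "known") would lose the hyphen.
    """
    return re.sub(r"(?<=[A-Za-z])-\n(?=[a-z])", "", text)


def flag_slash_apostrophe(text):
    """Return 1-based line numbers containing /' (possibly a misread ,")."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if "/'" in line
    ]
```

Treat the flagged line numbers as places to check against the printed page, not as automatic corrections.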
Step 2: Preparing your text for the Lexile Analyzer
The Lexile Analyzer is designed to measure professionally edited, complete, conventional prose text. It should not be used on non-prose, unpunctuated or unconventional text. The Lexile Analyzer determines sentence length through recognition of sentence endings, so sentences must be conventionally punctuated to be recognized (refer to "How the Lexile Analyzer Works" for the list of recognized end-punctuation). Likewise, the Lexile Analyzer determines word frequency by recognizing correctly spelled, well-formed words. If the text does not meet these conditions, the Lexile Analyzer will not return a useful estimated Lexile measure.
In preparing a file for measurement, your two basic objectives are to:
- Preserve the prose sentences and the words within them in your text
- Remove non-prose content from your text before you analyze it
You should keep in mind that the usefulness of an estimated Lexile measure depends on the proper preparation of a text for analysis. Seemingly minor errors can result in significant variation in Lexile measures. See "Appendix A-Editorial Errors and their Effect on Lexile Measures" for an illustration of the influence that mistakes or improper editing can have on a Lexile measure.
Remember, the Lexile measure you receive from the Lexile Analyzer is only as good as the text file you put into it.
Please observe the following text-preparation guidelines before you submit your sample file to the Lexile Analyzer.
Text preparation guidelines
Here are some guidelines for removing non-prose text:
You should measure...
- Paragraphs of prose
- Captions that are complete sentences
- Bulleted/numbered lists in which the list items are complete sentences
- Dialogue, sentences within quotation marks
- Numbers and dates
- Foreign words
- Parenthetical phrases or clauses within sentences
- Informational text boxes containing complete sentences
You should not measure...
- Incomplete sentences
- Sentences with unconventional punctuation
- Page headers and footers, page numbers
- Frontmatter (forewords, prologues, prefaces, tables of contents)
- Backmatter (afterwords, epilogues, glossaries, indexes, bibliographies)
- Chapter and section titles
- Captions that are incomplete sentences
- Headings and sub-headings
- Bulleted/numbered lists in which the list items are incomplete sentences
- The leading name and colon conventionally used in interview notation
- The greeting and closing from letters
- Footnotes and endnotes
- Poetry/song extracts
- Tables and graphs
- Abbreviations, especially those used in instant messages and text messages
- Phonetic pronunciation guides
- Parentheses which contain complete sentences (remove parentheses)
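A rough first pass over a file can help you locate lines that are likely to be non-prose, such as titles, headings and page numbers. The sketch below is a heuristic only, with hypothetical function names. It assumes the file holds one paragraph, heading or caption per line and that a measurable line starts with a capital letter (or digit) and ends with one of the end-punctuation marks listed in "How the Lexile Analyzer Works". It cannot judge content, so every flagged and kept line still needs editorial review.

```python
END_MARKS = (".", "?", "!", ":", ";")


def looks_like_sentence(line):
    """Heuristic: starts with a capital letter or digit, ends with an end mark."""
    # Ignore surrounding straight quotes and parentheses when checking.
    core = line.strip().strip("\"'()")
    if not core:
        return False
    starts_ok = core[0].isupper() or core[0].isdigit()
    return starts_ok and core.endswith(END_MARKS)


def review_lines(text):
    """Split non-blank lines into (kept, flagged) lists for manual review."""
    kept, flagged = [], []
    for line in text.splitlines():
        if not line.strip():
            continue
        target = kept if looks_like_sentence(line) else flagged
        target.append(line.strip())
    return kept, flagged
```

For example, a line reading "Chapter 1" or a bare page number would be flagged, while a line of dialogue that ends in a period would be kept.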
Additional considerations should be made when editing a text for measurement. Historical notes, introductions, "About the author" pieces, and previews of the next book in a series should typically be removed. Such text is often written separately from the main text and thus contains unique textual characteristics that can influence the Lexile measure. However, such decisions should be carefully considered while preparing your text for analysis. Some frontmatter and backmatter may be a legitimate part of the larger text and should be included. As a general guideline, if text appears to be written by the same author for the same audience, then it should be included in the Lexile analysis.
In the layout of children's picture books, single sentences are sometimes distributed across multiple pages of a book. In the activity of reading, these page breaks function as sentence endings, so a semi-colon should be inserted at each page break in your file. The Lexile Analyzer interprets semi-colons as sentence-ending punctuation. In the example below from Ludwig Bemelmans' Madeline (Puffin Books), semi-colons would be placed in the plain text file after the words "good" and "bad" to emulate the effect that page breaks have on reading.
Page-break punctuation example
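If you key in a picture book one page at a time, the page-break rule can be applied mechanically. The following is a minimal sketch with a hypothetical function name and made-up sample text. It assumes each page's text is held as a separate string, and it replaces a trailing comma with the semi-colon, which is an assumption not stated in this guide.

```python
END_MARKS = (".", "?", "!", ":", ";")


def join_pages(pages):
    """Join page texts, adding a semi-colon where a page ends mid-sentence."""
    pieces = []
    for page in pages:
        text = " ".join(page.split())  # collapse line breaks within a page
        if not text:
            continue  # skip pages with no text (illustrations only)
        # Look past a closing quote or parenthesis when checking the ending.
        if not text.rstrip("\"')").endswith(END_MARKS):
            text = text.rstrip(",") + ";"
        pieces.append(text)
    return " ".join(pieces)


# Hypothetical three-page sentence, with one illustration-only page.
pages = [
    "The dog ran up the hill",
    "",
    "and down the other side,",
    "until it was home.",
]
print(join_pages(pages))
```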
When using resources downloaded from websites, be sure to remove the non-prose and web page-specific elements, as indicated in the example below from a CNN.com article:
Only the main body of the article (G) and the complete-sentence figure caption (C) should be measured. The article title (A), date line (F), and image (B), as well as web site-specific elements such as article highlights (E) and margin advertisement text (D), should not be measured.
Also be careful to eliminate all HTML code and URLs from your sample file when measuring web resources.
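Removing markup and URLs can be started automatically and finished by hand. The sketch below uses Python's standard html.parser module; the class and function names are hypothetical, and the URL pattern is a simple assumption that will also swallow punctuation attached directly to a URL. It does not identify titles, date lines, highlights or advertisement text, which you must still remove yourself.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text content, skipping <script> and <style> blocks."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


# Assumption: a URL starts with http://, https:// or www. and runs to whitespace.
_URL = re.compile(r"(?:https?://|www\.)\S+")


def strip_html_and_urls(markup):
    """Return the text of an HTML fragment with tags and URLs removed."""
    parser = _TextExtractor()
    parser.feed(markup)
    parser.close()
    text = _URL.sub("", "".join(parser.parts))
    return " ".join(text.split())  # normalize whitespace
```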
Tests and Assessments
Reading comprehension tests
All complete sentences in all reading passages should be measured all together as one document. You should not measure sample items, directions, or the test items themselves. Depending on the format of the test item, you may also measure the complete sentences in the items in order to compare their difficulties to that of the reading passage, but do not measure the item text together with the passage text.
All complete sentences in the passages and the items should be measured (except directions within the items, e.g., "Make no change."). Passages that have embedded blanks should be measured with the correct answer in place of the blank. The writing prompt and any associated directions should be measured. You should not measure sample items or any other directions.
All complete sentences in the items should be measured. You should not measure images, diagrams, tables, sample items, or directions. Some examples of content you should not measure include maps or captions in social studies tests and equations or formulae in mathematics tests.
See the example mathematics items below for a representative text-preparation sample for content-area tests. Example taken from Grade 5, 2006 Texas Assessment of Knowledge and Skills (TAKS), © 2006 Texas Education Agency.
Please contact MetaMetrics directly at email@example.com if you have further questions about measuring tests.
Step 3: Convert your Text for Lexile Analysis
The Lexile Analyzer requires an ASCII plain text document (*.txt file) for proper processing and Lexile measurement. A plain text file is one which uses only the basic ASCII character set and contains no special formatting. If you submit files of an incorrect format to the Lexile Analyzer, an incorrect Lexile measure will be returned.
Note: The Lexile Analyzer cannot measure Microsoft Word, PDF, HTML, or scanned image files such as JPGs.
If the source text to be measured is in an electronic document format, such as a word-processing document or a rich text document, the file usually can be converted into the plain text format using the settings in the application's Save As... menu.
For instance, if your document is in Microsoft Word for Windows, then follow this procedure to save to plain text:
- With your document open, select Save As... from the File menu.
- In the Save as type drop-down box, select Plain Text (*.txt).
- Click the Save button and a File Conversion window opens:
Microsoft Word (Windows) file conversion dialog box.
- Click the Other encoding radio button and select US-ASCII from the list of formats to the right.
- Also check Allow character substitution.
- Click the OK button.
You have saved your document in the plain text format for the Lexile Analyzer.
Note: If you are working on a Macintosh or Apple computer, follow the same procedure but save plain text documents in the "MS-DOS text" format. If you edit your text file in TextEdit on the Mac operating system, change the "Plain Text Encoding" to Western (Windows Latin 1) when saving the file.
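If you prepare files with a script rather than a word processor, you can check for characters outside the basic ASCII set before submitting. The sketch below is an illustration only: the function name and the substitution table are assumptions (they are not the substitutions Microsoft Word applies), and you should adjust the table to suit your text.

```python
# Assumed substitutions for common typographic characters.
REPLACEMENTS = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark / apostrophe
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "-",    # em dash
    "\u2026": "...",  # horizontal ellipsis
    "\u00a0": " ",    # non-breaking space
}


def to_ascii(text):
    """Substitute known characters; return (text, remaining non-ASCII chars)."""
    converted = text.translate(str.maketrans(REPLACEMENTS))
    leftovers = sorted({ch for ch in converted if ord(ch) > 127})
    return converted, leftovers


converted, leftovers = to_ascii("\u201cHi,\u201d she said\u2026")
print(converted)   # text with straight quotes and three periods
print(leftovers)   # empty list when nothing else needs attention
```

Any characters reported in the second value need a manual decision. Once that list is empty, the text can be written with `open(path, "w", encoding="ascii")`, which raises an error rather than silently saving a non-ASCII character.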
Step 4: Analyze your file!
You should have received log-in information for access to the Professional Lexile Analyzer. It is accessible via the Lexile website. Select the "Tools" tab, and then the "Lexile Analyzer" link.
When you are ready to measure your text, log in to the Lexile Analyzer with your username and password. The file submission dialog box appears.
Professional Lexile Analyzer file submission
Next, select Browse to find the plain text file of your text on your hard drive.
Step 5: View your results
Select the "Analyze" button and your Lexile Analyzer results appear on a new screen.
Lexile Analyzer Results
You should print the results screen and note your filename or the name of your text because these results are not saved in any retrievable way. If you do not print or record the results, you will have to re-analyze your text.
Lexile Analyzer results
Lexile Analyzer results are provided in four categories:
- Lexile measure - This value indicates the reading demand of the text in terms of the semantic difficulty and syntactic complexity. The Lexile scale generally ranges from 200L to 1700L, although actual Lexile measures can range from below zero to above 2000L.
- Word Count - This value reflects the total number of words in the text that was analyzed.
- Mean Sentence Length - This value is the average length of a sentence in the text, based on the sentences that were analyzed.
- Mean Log Word Frequency - For each word, this value is based on the logarithm of the number of times the word appears per 5 million words of the MetaMetrics research corpus of 650 million words. The mean log word frequency is the average of these values for all words that appear in the text being analyzed.
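The mean log word frequency described above can be illustrated with a short calculation. This is a sketch under stated assumptions, not the Lexile Analyzer's code: the function name and the corpus counts are invented for illustration, the logarithm is assumed to be base 10 (this document does not state the base), and words missing from the corpus are simply skipped.

```python
import math


def mean_log_word_frequency(words, corpus_counts, corpus_size):
    """Average log10 of each word's occurrences per 5 million corpus words."""
    logs = []
    for word in words:
        count = corpus_counts.get(word, 0)
        if count > 0:  # assumption: skip words not found in the corpus
            per_5_million = count * 5_000_000 / corpus_size
            logs.append(math.log10(per_5_million))
    return sum(logs) / len(logs) if logs else None


# Hypothetical counts in a 650-million-word corpus.
counts = {"the": 13_000_000, "cardinal": 1_300}
print(mean_log_word_frequency(["the", "cardinal"], counts, 650_000_000))
```

With these invented counts, "the" occurs 100,000 times per 5 million words and "cardinal" 10 times, so a text of common words yields a higher mean than a text of rare words.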
When you use the Professional Lexile Analyzer to get an estimated Lexile measure of a text, please note that:
- You may not publish or distribute the Lexile measure
- You may not enter the Lexile measure into a library or media center database or catalog
- Your Lexile measure is not a certified Lexile measure of that book or text
Appendix A-Editorial Errors and their Effect on Lexile Measures
Proper file preparation, as detailed in the earlier section "Step 2: Preparing your text for the Lexile Analyzer," is the crucial step for ensuring Lexile measurement accuracy. File preparation errors or oversights, such as missing or incorrect punctuation or sections of unconventional prose or non-prose, may compromise your Lexile Analyzer results and return an estimated Lexile measure too far from the actual Lexile measure to be of use to you.
The measurement impact of editing errors and oversights is more severe the shorter the length of the input file. For this reason, special attention is encouraged when preparing a short passage, article or children's text for analysis.
The effects of improper editing when using the Lexile Analyzer can be illustrated with Cardinals by Lynn M. Stone (Rourke Corp.). When the text is properly edited, the Lexile Analyzer results are as follows:
Lexile measure: 800L
Word Count: 800
Mean Sentence Length: 10.81
However, when the text is analyzed with five "end of sentence" punctuation marks removed, the results are:
Lexile measure: 910L
Word Count: 800
Mean Sentence Length: 11.89
Because Cardinals has a word count of only 800 words, the mere omission of five punctuation marks within the text affected the Lexile measure by 110L, and increased the mean sentence length significantly.
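The reason short texts are more sensitive can be seen with simple arithmetic: each missing end-punctuation mark merges two sentences into one. The sketch below uses a hypothetical function name and invented numbers, and it assumes a plain whole-text average of words per sentence. The Lexile Analyzer's own processing is not described in enough detail here to reproduce the values reported above, so this shows only the direction and relative size of the effect.

```python
def sentence_length_shift(word_count, sentence_count, missing_marks):
    """Mean sentence length before and after end marks are lost."""
    before = word_count / sentence_count
    # Each lost end mark merges two sentences, so one fewer sentence is counted.
    after = word_count / (sentence_count - missing_marks)
    return before, after


# Hypothetical short text: 120 words in 12 sentences, 2 end marks missing.
print(sentence_length_shift(120, 12, 2))

# Hypothetical longer text with the same starting average and the same 2 errors.
print(sentence_length_shift(1200, 120, 2))
```

In the short text the average rises from 10 to 12 words per sentence, while in the longer text the same two errors move it only slightly above 10.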
Appendix B-Lexile Usage Conventions
When you use the Professional Lexile Analyzer, please note that:
- You may not publish or distribute the Lexile measure
- You may not enter the Lexile measure into a library or media center database or catalog
- Your Lexile measure is not a certified Lexile measure of that book or text
Follow these terminology conventions when you refer to Lexile measures:
- "Lexile measure" should always have a capital "L."
- "Lexile measure" should always have a lower case "m."
- We refer to "The Lexile Framework for Reading" (with the registered trademark symbol) the first time that it is mentioned, and then "the Lexile Framework" henceforth.
- Lexile measures are reported as a number followed by a capital "L" (for Lexile measure). There is no space between the number and the "L" and Lexile measures of 1000 or greater are reported without a comma (e.g., 1050L). All Lexile measures are rounded to the nearest 10L to avoid over-interpretation of the measures.
- We refer to a Lexile "zone" or "level" as representing the bands on the Lexile map (e.g., the "700L Zone" goes from 700L to 790L).
- We refer to a "Lexile range" as the suggested range of Lexile measures that a reader should read. The Lexile range for a reader is 50L above to 100L below his or her Lexile measure. This takes into account measurement error found in the tests administered to students and in the automated measurement of the books. If a student attempts material above his or her Lexile range, the level of challenge may be too great for the student to be able to independently construct very much meaning from the text. Likewise, material below the reader's Lexile range will provide that student with little comprehension challenge. Material above or below a reader's Lexile range can be used for specific instructional purposes.
- All Lexile measures of zero and below are reported simply as "BR" for "Beginning Reader."
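The reporting conventions above are mechanical enough to express in a few lines. The following sketch uses hypothetical function names. It assumes that ties are rounded upward (this document says only "nearest 10L") and that the "BR" rule is applied after rounding.

```python
import math


def format_lexile(value):
    """Format a raw value as a Lexile measure string, e.g. 1053 -> '1050L'."""
    # Round to the nearest 10; rounding halves upward is an assumption.
    rounded = math.floor(value / 10 + 0.5) * 10
    if rounded <= 0:
        return "BR"  # zero and below are reported as Beginning Reader
    return f"{rounded}L"  # no space before "L", no comma in the number


def lexile_range(reader_measure):
    """Return (low, high): 100L below to 50L above the reader's measure."""
    return reader_measure - 100, reader_measure + 50


print(format_lexile(1053))
print(format_lexile(-20))
print(lexile_range(800))
```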