QuickScore Validity Q & A

WHAT IS THE QUICKSCORE?

The QuickScore is an online multi-level test battery that provides a quick and effective way to identify the reading comprehension level of students, see student strengths and weaknesses, and determine the strategies students use to construct meaning from text. It consists of multiple-choice response, written response, and oral response tests and provides comprehensive teacher reports that are easy to use and interpret for making instructional decisions.

WHAT IS THE PURPOSE OF THE QUICKSCORE?

The QuickScore is designed to make it easy for a teacher to assign tests and track student performance at eight different reading proficiency levels in order to determine the level at which a student can successfully construct meaning from text, and to provide a quick and comprehensive means of identifying students who might need special help. In short, the QuickScore is designed to give teachers the information they need to make informed instructional decisions.

WHAT IS THE ASSESSMENT INTENT OF THE QUICKSCORE?

The intent of the QuickScore is to engage the student in the reading process through the use of high-interest stories with questions that are structured to engage higher-level thinking. The psycholinguistic definition of reading underlying the test is that reading is the process by which the reader brings to the text background experience (schema) and uses context clues and graphophonic, syntactic, and semantic cues to sample the text, make predictions, and seek confirmation of those predictions as a means of constructing meaning. The Michigan Department of Education puts it this way: "Reading is the process of constructing meaning through the dynamic interaction among the reader's existing knowledge, the information suggested by the written language, and the context of the reading situation."

HOW IS COMPREHENSIBILITY DEFINED FOR THE QUICKSCORE?

The QuickScore defines comprehensibility as the extent to which the difficulty of the test material matches the ability of the reader to comprehend it. Many factors are involved in determining comprehensibility, including content, context, vocabulary, test score analysis, student interviews, and schema. QuickScore gives special consideration to schema, since what a reader brings to the text in the form of background experience is important to the reader's ability to construct meaning from the text. Story content and questions were designed with background experience in mind as a comprehension factor, and questions were designed to minimize the chance that a student could answer correctly without reading the story.

WHAT IS THE STRUCTURAL DESIGN OF THE TEST STORIES?

QuickScore stories (story narrative genre) are designed to encourage the student to put effort into constructing meaning from text. Vocabulary difficulty is controlled. All stories were analyzed with computer programs to determine the percentage of words that appear on a list of the 2,450 most common words versus the percentage of words not on the list. Words not on the basic word list were considered to be "difficult" words. A ratio of 20% "difficult" words to 80% "basic" words was selected, and stories were written to accommodate this ratio. Synonyms were considered as replacements for words identified as particularly "difficult."
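As a rough illustration of this kind of vocabulary check (not the actual QuickScore tooling), the Python sketch below computes the percentage of "difficult" words in a story against a basic word list. The word list and story here are hypothetical; in practice the list would hold the 2,450 most common words.

```python
import re

def difficult_word_ratio(story_text, basic_words):
    """Return the fraction of word tokens not on the basic word list."""
    tokens = re.findall(r"[a-z']+", story_text.lower())
    difficult = [w for w in tokens if w not in basic_words]
    return len(difficult) / len(tokens) if tokens else 0.0

# Hypothetical example: basic_words would hold the 2,450 most common words.
basic_words = {"the", "a", "dog", "ran", "to", "his", "house"}
story = "The dog ran swiftly to his enormous house."
ratio = difficult_word_ratio(story, basic_words)
print(f"{ratio:.0%} difficult words")  # target: at most 20%
```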

All stories were run through computer grammar-check programs. Word lengths, sentence lengths, difficult-word runs, and sentence variation were also checked. Readability computer programs were used to determine the approximate grade level for the stories: Flesch-Kincaid, Flesch Reading Ease, Gunning Fog Index, and Dale-Chall readability indices were calculated to determine story difficulty based on surface structure. The QuickScore recognizes that comprehensibility cannot be determined from such surface-structure analysis alone, so statistical item analysis methods were also used to address item difficulty, comprehensibility, and validity.
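For reference, the two Flesch measures mentioned above can be computed directly from word, sentence, and syllable counts. The sketch below uses the standard published formulas with made-up counts; it is illustrative only, not the QuickScore analysis program.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Standard Flesch Reading Ease formula (higher = easier)."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def flesch_kincaid_grade(words, sentences, syllables):
    """Standard Flesch-Kincaid grade-level formula."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

# Hypothetical counts for a short lower-level story.
words, sentences, syllables = 150, 15, 190
print(f"Reading ease: {flesch_reading_ease(words, sentences, syllables):.1f}")
print(f"Grade level:  {flesch_kincaid_grade(words, sentences, syllables):.1f}")
```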

All of the stories were field tested to determine grade-level interest and engagement. The intent was to use only high-interest stories to keep the student on task and thus maximize the measurement of the reading process. Only the story narrative genre was used so there would be more consistency and less measurement error across stories and subtests.

Although the stories were constructed to accommodate the eight categories used for the questions, the stories were not so contrived that the originality, authenticity, and integrity of the story were lost. If, in the author's judgment, a story had lost these characteristics, it was either not used or was revised, always with the intent of preserving story integrity rather than contriving the story to fit the questions.

WHAT TYPES OF QUESTIONS ARE USED FOR LEVELS 1-2?

There are six parts each to Levels 1 and 2. Multiple-choice questions are used for the first five parts, which include: word recognition, in which three answer choices are presented; picture-word association, in which a picture is presented with three answer choices; short questions, with three short answer choices for each question; and a short story with four questions, each with three answer choices. Part 6 requires a written response in which the student is asked to describe in writing what is happening in different pictures.

WHAT TYPES OF QUESTIONS ARE USED FOR LEVELS 3-8?

Multiple choice items are used for the stories in Levels 3-8, followed by one written response. There are eight questions for each story, and each question has three answer choices. The eight questions measure: 1) main idea, theme, and concept; 2) character traits and attitudes; 3) conflict and problem; 4) important details; 5) sequence of events; 6) cause and effect relationships; 7) problem resolution, results, and conclusion; 8) implications, inferences, and predictions. Because this item ordering is used consistently for each story across all levels, a profile of the student’s responses in relation to the eight measures can be computed. The written response item requires the student to retell in writing one of the stories, and the response is scored by the teacher with an online rubric.
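To illustrate how such a profile can be computed, the hypothetical sketch below tallies percent correct for each of the eight question categories across several stories. The data layout and category labels are assumptions for illustration, not the QuickScore report format.

```python
# Each story contributes one item per category, in the fixed order above.
CATEGORIES = [
    "main idea/theme", "character traits", "conflict/problem",
    "important details", "sequence", "cause/effect",
    "resolution/conclusion", "inference/prediction",
]

def profile(story_results):
    """story_results: list of 8-element lists of 0/1 scores, one per story.
    Returns percent correct for each of the eight question categories."""
    n_stories = len(story_results)
    return {
        cat: sum(story[i] for story in story_results) / n_stories
        for i, cat in enumerate(CATEGORIES)
    }

# Example: three stories, eight items each (1 = correct).
results = [[1, 1, 0, 1, 1, 0, 1, 1],
           [1, 0, 0, 1, 1, 1, 1, 0],
           [1, 1, 0, 1, 0, 0, 1, 1]]
for cat, pct in profile(results).items():
    print(f"{cat:25s} {pct:.0%}")
```

A profile like this makes a consistent weakness visible at a glance (here, "conflict/problem" at 0%), which is the point of keeping the item ordering identical across stories and levels.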

WHAT ARE THE OUTCOMES BEING MEASURED BY LEVELS 3-8 QUESTIONS?

Question 1: The student is expected to use knowledge and meaning gained from the reading to synthesize a response about the main idea, theme, or concept presented in the story.

Question 2: The student is expected to use knowledge and meaning gained to respond to a question about the traits or attitude of a character. Generally, the student will have to gather evidence of the trait or attitude from more than one place in the story.

Question 3: The student is expected to apply specific information from the story to identify what (a, the, one, or the main) problem is in the story.

Question 4: The student is expected to identify a detail that is important to the meaning of the story.

Question 5: The student is expected to recognize the sequential ordering of events. The events may be present in one sentence or more than one sentence, but the question addresses only what event immediately follows another, without skipping over an intermediate event.

Question 6: The student is expected to recognize the causal relationship between two or more events, with event A preceding event B. The mention of event A may come after the mention of event B in the story.

Question 7: The student is expected to make a decision about how the story ended.

Question 8: The student is expected to make a prediction of some consequence or event, or give a probable reason for a past action or event, based on explicit and implicit evidence in the story.

WHY IS THE ITEM STEM ALWAYS A QUESTION?

In constructing the QuickScore, it was determined, based on published research, that the question format, rather than a sentence-completion format, would be the fairest approach and would produce the highest score reliability and validity. The reasoning is that it is easier to hold the stem in memory while answer choices are being considered when the stem is a complete question. Checking and rechecking the relationship between the stem and the choices is also easier with a question stem. In her article "Ask a Clear Question and Get a Clear Answer," Stella Statman (1988) presents a strong argument for the question stem format as a "clearer, fairer, and more authentic way of testing reading comprehension." Haladyna and others (2004) also recommend the question format. Asking a question that expresses a complete thought is typical of how we communicate, so it is only logical to use a complete question format when testing in order to fairly obtain as much information as possible about the meaning the student constructs from the text.

WHY ONLY THREE ANSWER CHOICES FOR MULTIPLE-CHOICE ITEMS?

The tradition of using four or five options for multiple-choice items is strong, despite research evidence suggesting that it is nearly impossible to create selected-response test items with more than three functional answer choices (Downing, 2006). A study by Haladyna and Downing (1993) shows that three choices are typically sufficient: even in very well-developed tests it is rare that more than three choices are statistically functional. The fourth choice is typically difficult to construct as a plausible option because the best options have already been used, and the fourth option (regardless of the order in which the options appear) is typically the poorest-performing answer choice. Thus, the fourth choice provides less information. Also, the guessing factor for a four-option item is not truly 25%, because the examinee rarely makes purely random guesses.
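A small piece of arithmetic illustrates the last point: the nominal guessing rate is 1/k for k options, but if one distractor is implausible enough to be eliminated on sight, the effective rate on a four-option item rises to 1/3 anyway. The sketch below is illustrative only, not from the QuickScore materials.

```python
# Effective guessing rate when some distractors can be ruled out on sight.
def guess_rate(options, implausible=0):
    return 1 / (options - implausible)

print(f"4 options, pure random guess:        {guess_rate(4):.0%}")     # 25%
print(f"4 options, 1 implausible eliminated: {guess_rate(4, 1):.0%}")  # 33%
print(f"3 options, pure random guess:        {guess_rate(3):.0%}")     # 33%
```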

A meta-analysis of 80 years of published empirical studies on the appropriate number of options for multiple-choice items concludes: "...MC items should consist of three options, one correct option and two plausible distractors. Using more options does little to improve item and test score statistics and typically results in implausible distractors" (Rodriguez, 2005).

WHY USE STORIES OF SIMILAR READABILITY AND COMPREHENSIBILITY FOR EACH LEVEL?

The QuickScore program uses stories at each level whose readability and comprehensibility demands are low enough, and whose interest level is high enough, that most students at the designated level will not become discouraged and stop reading because the stories are too difficult. Typically, stories are at the lower (easier) end of the readability range for a grade level.

The stories for each particular level are of similar readability and comprehensibility so that similar measures of reading comprehension can be obtained for each story. This multiple-measures-similar-difficulty format avoids the problems associated with reading comprehension tests that provide only one passage per level. The problem with a single passage as the measure of comprehension is that a student may score low because the student does not like the story or has no background knowledge to bring to it, or, on the other hand, may score high because the student has a special interest in the story or is very familiar with its content from background experience. With the QuickScore multiple-measures-similar-difficulty model, the reliability, meaning, interpretability, and use of the scores are increased. Additionally, multiple measures provide more information for making accurate instructional decisions when there is an extremely high or extremely low story score at a particular level.
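As a hypothetical illustration of how multiple measures support such decisions, the sketch below averages a student's story scores at one level and flags any single score that deviates sharply from the rest. The flagging margin is an arbitrary assumption for illustration, not a QuickScore rule.

```python
# Average several similar-difficulty story scores and flag outliers.
def summarize_level(story_scores, flag_margin=0.25):
    """story_scores: percent-correct (0-1) for each story at one level."""
    mean = sum(story_scores) / len(story_scores)
    flagged = [i for i, s in enumerate(story_scores)
               if abs(s - mean) > flag_margin]
    return mean, flagged

scores = [0.75, 0.875, 0.25, 0.75]  # third story is unusually low
mean, flagged = summarize_level(scores)
print(f"Level mean: {mean:.0%}; follow up on story indices: {flagged}")
```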

HOW SHOULD QUICKSCORE TEST RESULTS BE USED AND NOT BE USED?

  • Use as a screening test or as additional information when school or district personnel need to know if a student needs to be assigned to a special reading program such as Title I, a Limited English Proficiency program, or a bilingual program.
  • Use to determine the status or gains of an individual student or a group of students in reading comprehension.
  • Use to evaluate a curriculum or an instructional program if it has been established that the goal of the program is to develop reading comprehension.
  • Use as a diagnostic instrument, but be careful not to use QuickScore results in ways not recommended here.
  • Don’t use QuickScore results as the sole evidence for placing students in special programs or for writing Individualized Education Programs (IEPs). QuickScore scores can be used as supporting evidence.
  • Don’t use QuickScore results alone for class-to-class, school-to-school, or district-to-district comparisons, because many factors play a role in determining differences.
  • Don’t use QuickScore results for evaluating teachers, because other influential factors affect student performance, and the QuickScore does not tease out these other influences.
  • Don’t use QuickScore results for evaluating schools or districts if a baseline is not established and other factors have not been considered.

WHAT TYPES OF QUICKSCORE TESTS ARE USED?

Multiple Choice. Five parts of Levels 1 and 2 consist of multiple-choice items. For Levels 3-8, a set of eight multiple-choice items per story is used so that the kinds of cognitive behaviors required of the student to answer each question are consistent from story to story.

Written Retelling. After the multiple-choice part, the student completes one written response test for each test level. For Levels 1 and 2, the student describes what is happening in a set of pictures. For Levels 3-8, the student is asked to pick a favorite story and retell it in writing. The written responses are scored by the teacher with an online scoring rubric that is provided.

Miscue Analysis. For Levels 3-8, the student reads a story aloud. When the teacher clicks on an incorrectly read word (a miscue), a floating scoring rubric immediately appears for categorizing it. The teacher selects on the rubric the strategy the student used; the miscue word and rubric scores are automatically recorded so the teacher can return later to analyze the results and make instructional decisions.

Oral Retelling. For Levels 3-8, immediately following the oral reading of the story used for the miscue analysis, the student is asked to retell the story orally. A scoring guide showing all the chunks of information from each story appears so the teacher can easily score the Oral Retelling. The percentage of chunks of information addressed by the student is calculated and immediately recorded. This percentage can be compared to the Written Retelling score and to the multiple-choice scores.
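The retelling score itself is simple arithmetic: the share of a story's information chunks that the student addressed. A minimal sketch, with invented chunks for illustration:

```python
# Hypothetical oral-retelling score: percentage of story chunks mentioned.
story_chunks = ["girl finds dog", "dog is lost", "girl posts signs",
                "owner calls", "dog returned", "girl feels proud"]
mentioned = {"girl finds dog", "girl posts signs", "dog returned"}

score = len(mentioned) / len(story_chunks)
print(f"Oral retelling: {score:.0%} of chunks addressed")  # 50%
```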

WHAT WAS THE FIELD TEST METHODOLOGY USED FOR THE QUICKSCORE?

The Level 3-8 tests were developed over four field test administrations. The first administration was a developmental field test on approximately thirty students from each grade level, using stories that had been selected from the story pool based on student interest. During the test, students were allowed to raise their hands, and the examiner would answer a student's question and help the student with word pronunciation, story understanding, and question interpretation. During this developmental field test procedure, notes were taken on all student comments, and tallies of difficult words or problem areas were recorded. At the end of the test, the stories were discussed with the students, and students were encouraged to give comments regarding which stories they liked or did not like, and why. Following this first field test, twenty students were individually interviewed regarding the information gathered from the field test.

After revising the test based on information from the developmental field test, the second field test was administered to approximately sixty students from each grade level. Students were allowed to raise their hands during the test, but the examiner would only tell a student the pronunciation of words the student did not know and would not give help with understanding the story or answering the questions. All questions asked by the students were recorded, however, and a tally was kept of which specific words were asked about. At the end of the test, students were asked to vote on how well they liked the stories. Any story not receiving at least eighty percent acceptance was marked for possible revision. Students were interviewed regarding vocabulary, story content, readability, comprehensibility, ethnicity-specific words that might be a problem, and student schema. An item analysis computer program was used to analyze the test results, and two students from each grade level who had obtained moderately high scores were individually interviewed and asked to respond to any questions and options that the computer program identified as problem items.

After revisions were made to both stories and test items, the test was field tested on approximately ninety-five students (three classrooms) from each grade level. The students were asked to complete the Written Retelling at the end of the multiple-choice test. Students were allowed to raise their hands for help during the testing session if they found anything in the test that they did not understand, and notes were taken, but no help was given unless there was a typographical error in the test itself. All the student questions were recorded, and two students from each grade level were later interviewed regarding the questions raised.

An item analysis with full statistical results was performed, and twenty-four items out of the total of one hundred eighty-four different items (based on repeated items for the anchor stories) were marked for revision. The goal was to obtain proportion-correct values of .50, with distractor values of .25 or higher, and to obtain an item-to-total (point-biserial) correlation value for correct answers of at least .25 after correction for spuriousness (taking the item being correlated out of the total). Correcting for spuriousness reduces the point-biserial value but gives more accurate information for decision making.
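To make these statistics concrete, the sketch below computes proportion correct and a spuriousness-corrected point-biserial for one item, removing the item's own score from each student's total before correlating. The data are invented for illustration, and the code assumes Python 3.10+ for statistics.correlation (Pearson's r, which equals the point-biserial when one variable is scored 0/1).

```python
import statistics

def corrected_point_biserial(item_scores, total_scores):
    """item_scores: 0/1 per student; total_scores: test totals per student.
    Correlates the item with the rest of the test (total minus the item)."""
    rest = [t - i for i, t in zip(item_scores, total_scores)]
    return statistics.correlation(item_scores, rest)

# Hypothetical scores for eight students on one item.
item = [1, 0, 1, 1, 0, 1, 0, 1]
totals = [20, 9, 18, 22, 11, 17, 8, 19]
p = sum(item) / len(item)
r = corrected_point_biserial(item, totals)
print(f"proportion correct = {p:.2f}, corrected r = {r:.2f}")
```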

A final pilot test was administered to approximately two hundred students from each grade level, using standardized instructions. The results were analyzed, and fourteen of the total of one hundred seventy-eight different items were marked for revision. Following several live test administrations, norms and score scales were developed using the results from approximately 35,000 students.

DOES THE QUICKSCORE CONFORM TO TESTING INDUSTRY STANDARDS?

Yes. The developer of this test believes that all tests developed for use with children should adhere to professional testing standards. The Standards for Educational and Psychological Testing, the Code of Fair Testing Practices in Education, and the Code of Professional Responsibilities in Educational Measurement were used for guidance in the development and recommended uses of this test. For more technical information, such as descriptive statistics, item analysis, factor analysis, and validity studies, contact Ronald Carriveau at rcarriveau@verizon.net.

© All rights reserved


