06 – Scoring

Overview

Scoring

  • Idea
  • Scoring Debug Window
  • Scoring Syntax
  • Missing Coding

Scoring – Introduction

Motivation

  • CBA allows automatic scoring (scoring definition part of item authoring)

  • Scoring at runtime can be used for various purposes

    • Feedback (e.g., notification when questions / items are omitted)
    • Test assembly (e.g., multi-stage testing or adaptive testing)
    • Data-sets (e.g., monitoring data instantly after completing a case)

Structure

  • Raw scoring is part of the Task (this allows, for instance, incorporating innovative parts like page visits, attempts, etc.)

  • Other parts (e.g., the assignment of IRT parameters for difficulty, discrimination etc.) might be population specific, hence, not part of the CBA ItemBuilder project file.

Scoring Debug Window

For developing and testing Scoring rules, the Preview provides the Scoring Debug Window. It lists the Name of the active Hit (= categorical value) for each Class (= variable) and, if provided, the Result Text (= non-categorical value).

During Preview

  • Current scoring (at runtime) can be debugged in the Preview

  • Ctrl + S (Strg + S on German keyboards) is the default hotkey for the Scoring Debug Window

  • The hotkey can be configured in the preferences of the CBA Preview (main menu Utilities > Open preferences)

  • Find a short introduction here and a full description here.

Scoring Debug Window in Action

What for?

  • Select a possible answer and verify the correct scoring.

With the Scoring Debug Window item authors can define and test the scoring for all possible ways to interact with the Task.

Tip

  • The input focus must be within the item, before the hotkey can work.

How to Read the Scoring Debug Window

  • Old legacy scoring (can be ignored)
  • Process indicators (i.e., information about the Task, e.g., Execution Time)

  • Hits assigned to Classes (for all variables in column Class the value is in column Name or Result Text), e.g.,
    • Variable1 = Question1_Missing
    • Variable3 = D

  • Complete list of defined hits (Result either True or False)
  • Each Hit refers to syntax that translates selections and user-interactions

Basic Building Blocks for Scoring

Conditions (aka Hit-Conditions / Hits)

  • The basic building blocks of the scoring definition are Conditions.
  • Conditions combine possible responses (e.g., selected RadioButtons, Checkboxes, text entered in InputFields) into Logical Expressions
  • Examples (description: Condition):
    • RadioButton with UserDefinedId rbA is selected: rbA
    • Checkboxes cbOne and cbTwo are selected: (cbOne and cbTwo)
    • Text in inpEx1 matches a regular expression: matches(inpEx1,"...")
    • The number of interactions with the item is smaller than 10: [user_interaction()<10]
    • FSM variable has a particular value: variable_in(V1,5)
  • Operators (like matches()) and automatically created variables (like [user_interaction()]) allow the definition of very detailed and fine-granular conditions.
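
The following Python sketch mimics how Conditions like the examples listed here evaluate against a snapshot of the item state. It is not CBA ItemBuilder syntax: the `state` dictionary, the helper functions and the hit names are assumptions made for illustration, and whole-string matching in `matches` is assumed as well.

```python
import re

# Hypothetical snapshot of the item state (layout assumed for illustration,
# not the CBA ItemBuilder's internal representation).
state = {
    "selected": {"rbA", "cbOne"},      # UserDefinedIds of selected components
    "text": {"inpEx1": "42"},          # current text of input fields
    "user_interactions": 7,            # number of interactions so far
    "fsm_variables": {"V1": 5},        # values of FSM variables
}

def is_selected(component_id):
    """True if the component with this UserDefinedId is selected."""
    return component_id in state["selected"]

def matches(component_id, pattern):
    """True if the whole text of the input field matches the pattern."""
    text = state["text"].get(component_id, "")
    return re.fullmatch(pattern, text) is not None

def variable_in(name, value):
    """True if the FSM variable currently has the given value."""
    return state["fsm_variables"].get(name) == value

# Each Condition is a logical expression that is either True or False.
hits = {
    "rbA": is_selected("rbA"),
    "cbOne_and_cbTwo": is_selected("cbOne") and is_selected("cbTwo"),
    "inpEx1_is_number": matches("inpEx1", r"[0-9]+"),
    "few_interactions": state["user_interactions"] < 10,
    "V1_is_5": variable_in("V1", 5),
}
print(hits)
```

In this snapshot only cbOne is selected, so the Condition that requires both checkboxes is False while the other four are True.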

Scoring – Syntax Editor

Scoring is defined in the Tasks-view:

  1. Task must be defined first.

  2. Click button: Add Hits

  3. Enter Name for the Hit.

  4. Click button: Open to define the Hit-Condition

Hit-Conditions are Assigned to Classes

  • Classes need to be defined first.
  • Assignment either in the Task-viewer or in the dialog Edit Hit/Miss Classes

Scoring – Editor Classes

Classes group Hit-Conditions:

  • Variables in the expected data set are Classes in the CBA ItemBuilder

Edit Classes (i.e., edit expected data set variables):

  • Add new class: adds a variable
  • Delete selected class: removes the selected variable

  • Class name needs to be valid (only letters, digits, underscore; first character not a digit)
  • Comment is optional (used for documentation)
  • A Codebook is typically used to map Class names to final variable names in the data set.
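
The naming rule for Classes can be checked with a regular expression. The Python sketch below assumes that "letters" means ASCII letters and that a leading underscore is allowed; neither detail is confirmed by the rule as stated, and the function name is hypothetical.

```python
import re

# Letters, digits and underscore only; the first character must not be a digit.
# Assumption: only ASCII letters count as letters.
CLASS_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def is_valid_class_name(name):
    """Return True if name satisfies the naming rule sketched above."""
    return CLASS_NAME.fullmatch(name) is not None

for candidate in ["Variable1", "Q1_score", "1stItem", "my-class", ""]:
    print(repr(candidate), is_valid_class_name(candidate))
```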

Logical Expressions in Scoring Conditions

Example

  • User Defined IDs of components: A, B, C and D

Hit Definitions

  • A_and_B: and operation
  • A_or_B: or operation
  • notB: negation not

Bracketing

  • all: Pairs need to be created using brackets
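
The four hit definitions can be mirrored in Python to see how the operators and the bracketing behave. This is a sketch only: A, B, C and D stand for the selection state of the four components, and reading the hit `all` as "all four components selected" is an assumption.

```python
def evaluate_hits(A, B, C, D):
    """Evaluate the example hits for one combination of component states."""
    return {
        "A_and_B": A and B,
        "A_or_B": A or B,
        "notB": not B,
        # Brackets group the operands into pairs before the pairs are combined.
        "all": ((A and B) and (C and D)),
    }

# Example: A, C and D are selected, B is not.
print(evaluate_hits(True, False, True, True))
```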

Condition Dependencies in Scoring

Option A (default): Conditions are mutually exclusive

  • Sequence of hit conditions is irrelevant.
  • Item authors need to make sure that, for each Class, only one hit is active at a time.

Option B: Use first active hit per class

  • Simplifies the definition of Conditions.

The option Use first active hit per class is suggested to simplify scoring definition.

  • Order of Hits matters (i.e., order Hits in the Task-editor)
  • Value of a variable is the first hit (i.e., first active Hit for a Class).
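
The difference between the two options can be sketched in Python. The data layout (an ordered list of hit name, Class name and active flag) and both function names are assumptions for illustration, not the CBA ItemBuilder's implementation.

```python
from collections import Counter

def score_first_active(hits):
    """Option B: the value of each Class is its first active hit, in list order."""
    result = {}
    for hit_name, class_name, active in hits:
        if active and class_name not in result:
            result[class_name] = hit_name
    return result

def ambiguous_classes(hits):
    """Option A check: Classes for which more than one hit is active."""
    counts = Counter(class_name for _, class_name, active in hits if active)
    return sorted(cls for cls, n in counts.items() if n > 1)

# Hypothetical hits in the order in which they are defined.
hits = [
    ("Q1_Correct", "Q1", True),
    ("Q1_AnyAnswer", "Q1", True),   # also active: not mutually exclusive
    ("Q2_Missing", "Q2", False),
]
print(score_first_active(hits))
print(ambiguous_classes(hits))
```

With Option B the overlap between Q1_Correct and Q1_AnyAnswer is harmless because the earlier hit wins; under Option A the same overlap would have to be removed by refining the Conditions.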

Scoring of Text Responses

  • Short responses can be scored within the CBA ItemBuilder using the matches()-operator

Note: Regular Expressions can be combined with Logical Expressions

  • More complex automated scoring (e.g., based on Natural Language Processing, NLP) needs to be done outside the CBA ItemBuilder.

Combination of Conditions with Text Responses

  • The value of each variable is the active Hit condition.

  • Missing responses identified using matches(ID,"").

  • Regular expressions can be combined to check for values.
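
A short-answer item scored along these lines might look as follows in Python. The hit names, the patterns and the whole-string matching are assumptions for illustration; the hits are checked in order, so the last pattern acts as a catch-all.

```python
import re

# Hypothetical hits for one variable, checked in this order.
CONDITIONS = [
    ("Q1_Missing", r""),                      # empty input field
    ("Q1_Correct", r"\s*(12|twelve)\s*"),     # accepted spellings of the answer
    ("Q1_Incorrect", r".+"),                  # any other non-empty text
]

def active_hit(response):
    """Return the name of the first hit whose pattern matches the whole response."""
    for hit_name, pattern in CONDITIONS:
        if re.fullmatch(pattern, response, flags=re.IGNORECASE | re.DOTALL):
            return hit_name

for response in ["", " Twelve ", "13"]:
    print(repr(response), active_hit(response))
```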

Copy Non-Categorical Values (Result Text)

Hits can also provide non-categorical values (Result Text), which allows the use of text responses, numeric answers, or combinations of both.

  • Condition and Result-Text are combined

    • Condition = When
    • Result-Text = Which
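
The split between "when" and "which" can be sketched in Python. The function, the hit names and the age example are hypothetical; the sketch only illustrates that the Condition selects the hit while the Result-Text carries the raw response.

```python
import re

def score_age(response):
    """Return (hit_name, result_text) for a hypothetical numeric input field."""
    if response == "":
        # No response: categorical value only, nothing to copy.
        return ("Age_Missing", None)
    if re.fullmatch(r"[0-9]{1,3}", response):
        # Condition (when): a number with one to three digits was entered.
        # Result-Text (which): the entered value itself.
        return ("Age_Given", response)
    return ("Age_Invalid", response)

for response in ["", "37", "abc"]:
    print(repr(response), score_age(response))
```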

Use of Variables in Scoring

  • Drag-and-Drop can be scored by using the FSM Variables

  • The variable_in()-operator makes Variables available in Conditions

    • First argument: Variable Name
    • Second argument: Value

Summary of Operators in Scoring

UserDefinedIds of Components with Component State

  • For Checkboxes, RadioButtons, Lists, … \(\rightarrow\) UserDefinedId within scoring syntax evaluates to True, if the component is selected (otherwise False)

Operators (see appendix for a list)

  • Most operators evaluate to True or False and take one or multiple arguments
  • Operators for special purposes (e.g., Text Highlighting)

Result-Text-Operator

  • Always evaluates to True (i.e., can be easily added to Conditions)
  • Can be used to copy the Component State of InputFields to the string value of variables

Terminology for Missing Coding

Different Missing Codes, but typically at least:

  • Omitted
  • Not Reached
  • Missing by design

Test-Assembly

  • Missing by design is typically classified by the deployment software.

Task-Level

  • Omitted responses (and reset answers) can be classified as part of the scoring definition.

Not-reached items within a unit must be classified in the scoring, whereas not-reached items at the task level must be classified by the deployment software.

Missing Coding: Idea

Example

  • 3 units with 3 items each:
    • Unit 1: Items 1-1, 1-2, 1-3
    • Unit 2: Items 2-1, 2-2, 2-3
    • Unit 3: Items 3-1, 3-2, 3-3
  • 2 Booklets
    • A (Unit 1 and 3) \(\rightarrow\) Items 2-1, 2-2 and 2-3 are missing by design
    • B (Unit 2 and 1) \(\rightarrow\) Items 3-1, 3-2 and 3-3 are missing by design
  • Not-reached coding (e.g., due to timeout)

    • Within unit: Coded as part of the CBA ItemBuilder scoring
    • Between units: Must be added by the deployment software
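
How a deployment software could complete the missing codes for this example is sketched below in Python. The code labels and the function are assumptions for illustration; `scored` stands for the values delivered by the CBA ItemBuilder scoring for the units that were actually started.

```python
UNITS = {
    1: ["1-1", "1-2", "1-3"],
    2: ["2-1", "2-2", "2-3"],
    3: ["3-1", "3-2", "3-3"],
}
BOOKLETS = {"A": [1, 3], "B": [2, 1]}

def add_missing_codes(booklet, scored):
    """Complete the data set of one test-taker with between-unit missing codes."""
    result = {}
    for unit, items in UNITS.items():
        for item in items:
            if unit not in BOOKLETS[booklet]:
                result[item] = "MISSING_BY_DESIGN"   # unit is not in the booklet
            elif item in scored:
                result[item] = scored[item]          # coded within the unit
            else:
                result[item] = "NOT_REACHED"         # unit was never started
    return result

# Booklet A, timeout during unit 1: unit 3 was never started.
scored = {"1-1": "Correct", "1-2": "OMITTED", "1-3": "NOT_REACHED"}
print(add_missing_codes("A", scored))
```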

Example for Missing Coding within Tasks

  • Single-Choice: One variable

  • Multiple-Choice: A variable for each choice

Missing Coding

  • Analyze the visited pages
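
One way to read these points is sketched below in Python: a response is omitted if the page was visited without an answer, and not reached if the page was never visited. This reading, the code labels and the function names are assumptions for illustration.

```python
def code_single_choice(selected_option, page_visited):
    """Single-Choice: one variable holds the selected option or a missing code."""
    if selected_option is not None:
        return selected_option
    return "OMITTED" if page_visited else "NOT_REACHED"

def code_multiple_choice(checked, options, page_visited):
    """Multiple-Choice: one variable per choice."""
    if not page_visited:
        return {option: "NOT_REACHED" for option in options}
    return {option: ("Checked" if option in checked else "Unchecked")
            for option in options}

print(code_single_choice(None, page_visited=True))
print(code_multiple_choice({"B"}, ["A", "B", "C"], page_visited=True))
```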

Scoring Tips

  • CBA ItemBuilder scoring is very flexible (allows multiple variables).
  • Consider activating Use first active hit per class; otherwise, item authors must make sure that all hits are mutually exclusive (i.e., that only one hit per Class is active for every possible response).
  • Each Class (=variable) can provide nominal / categorical values (Hits)
    • Correct / Full Credit
    • Incorrect / No Credit
    • Additional Hits for Partial Credits, Missing Code, …
  • Text and numbers (i.e., non-categorical values) can be scored with Regular Expressions within items, or copied to the data set using the result_text()-operator.
  • Missing Value-Coding within Tasks can be implemented.

Summary

Important terminology

  • Variable = Class
  • Hit = Categorical value
  • result_text() = Non-categorical value

Helpful to remember

  • If Use first active hit per class is…
    • not activated: Make sure the Hit-conditions are mutually exclusive.
    • activated: The order of the Hit-conditions matters.
  • Regular expressions can be used to score simple text responses.

History

  • The CBA ItemBuilder also contains an older scoring approach for tasks that have only one result.
  • To maintain compatibility with old items, Miss-Conditions and Weights are still part of the CBA ItemBuilder’s user interface.