2.10 Item and Test Security

Computer-based testing also raises questions of item and test security, in two senses: protecting item content and keeping the test session itself free of disruption.

2.10.1 Protection of Items and Tests

Test security first of all concerns the protection of item content, which is particularly important for high-stakes assessments and for low-stakes assessments that measure trends using so-called link items. The core idea is that test instruments should be accessible only to a limited group of users, so that item content does not become known.

2.10.2 Secure Test Deployment

Whether and how item content can become known depends mainly on the type of test delivery. If, for example, a proctor is present on-site during a test session and the test is carried out in a so-called kiosk mode (see section 7.2.3), then the proctor and the software can together limit the uncontrolled dissemination of item content.

From a diagnostic perspective, test security also has a second meaning: an assessment platform should restrict test-taker interaction to a certain degree so that the assessment can be carried out with as little disruption as possible. Since "a certain degree" is difficult to verify from an IT perspective, we work with the following operational definition of test security in terms of usability for diagnostic purposes:

No unintentional mishandling of the assessment platform or any unintended test-taker behavior should lead to an interruption of the assessment.

The usability of computer-based assessments and the protection of items and tests are related, as both aim at restricting how test-takers can interact with the test delivery while answering items and working on assigned tasks. The following examples illustrate, for web-based deployments, how the usability (and with it the test security) of a computer-based assessment can be challenged:

Accidental closing of the entire browser can interrupt the assessment or result in losing data.
Navigation using the browser's Back and Forward buttons (instead of the buttons within the assessment content) might interrupt the assessment.
Drag-and-drop operations on content that requires scrolling to be completely visible can be triggered unintentionally, for instance, if the test-taker can reduce the size of the browser window.
The attempt to open an assessment in a second window or tab can lead to unwanted interruptions or, in the worst case, inconsistencies in the data.
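
The last example, a second window or tab, can in principle be guarded against on the server side. The following is a minimal sketch under stated assumptions, not a description of any particular platform: the class SessionRegistry, its methods claim and release, and the identifiers for test-takers and tabs are all hypothetical names introduced for illustration. The idea is that the first tab to request the assessment becomes the owner of the session, and any further tab is refused until the owner releases it.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SessionRegistry:
    """Hypothetical registry: one active browser tab per test-taker."""

    # Maps a test-taker id to the id of the tab that currently owns the session.
    owners: Dict[str, str] = field(default_factory=dict)

    def claim(self, test_taker_id: str, tab_id: str) -> bool:
        """Return True if this tab may deliver the assessment.

        The first tab to claim a session becomes its owner; the same tab
        may claim again (e.g. after a reload), any other tab is refused.
        """
        owner = self.owners.setdefault(test_taker_id, tab_id)
        return owner == tab_id

    def release(self, test_taker_id: str, tab_id: str) -> None:
        """Free the session, but only if the releasing tab is the owner."""
        if self.owners.get(test_taker_id) == tab_id:
            del self.owners[test_taker_id]


registry = SessionRegistry()
first = registry.claim("test-taker-1", "tab-a")   # first tab is accepted
second = registry.claim("test-taker-1", "tab-b")  # second tab is refused
```

Refusing the second tab, rather than silently moving the session to it, keeps a single stream of responses per test-taker, which is one way to avoid the data inconsistencies mentioned above. A real deployment would also need to decide how a session is released when the owning tab crashes.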

Thus, the way computer-based assessments are delivered can have a significant effect on the validity of measurements. Collecting log data, for instance on so-called off-task behavior, can help identify and ultimately quantify the problem. However, interpreting scores at the individual level may be challenging or, as with disengaged response behavior and rapid guessing, impossible.
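
To make the idea of quantifying off-task behavior from log data concrete, the following sketch computes how long the assessment window was out of focus. It rests on assumptions that are not part of the text above: the log is taken to be a list of (timestamp in seconds, event name) pairs, and the event names "blur" and "focus" are illustrative labels for the test-taker leaving and returning to the assessment window. The function names are likewise hypothetical.

```python
from typing import Iterable, List, Tuple

LogEvent = Tuple[float, str]  # (timestamp in seconds, event name); assumed format


def off_task_intervals(events: Iterable[LogEvent]) -> List[Tuple[float, float]]:
    """Pair each "blur" with the next "focus" and return (start, end) intervals.

    Repeated "blur" events before a "focus" are ignored, and a trailing
    "blur" without a matching "focus" is dropped because its end is unknown.
    """
    intervals: List[Tuple[float, float]] = []
    left_at = None
    for timestamp, kind in sorted(events):
        if kind == "blur" and left_at is None:
            left_at = timestamp
        elif kind == "focus" and left_at is not None:
            intervals.append((left_at, timestamp))
            left_at = None
    return intervals


def off_task_seconds(events: Iterable[LogEvent]) -> float:
    """Total time, in seconds, spent outside the assessment window."""
    return sum(end - start for start, end in off_task_intervals(events))


log = [
    (0.0, "focus"),
    (12.5, "blur"),   # test-taker leaves the assessment window
    (20.0, "focus"),  # ... and returns 7.5 seconds later
    (31.0, "blur"),
    (33.5, "focus"),
]
total = off_task_seconds(log)  # 7.5 + 2.5 = 10.0 seconds
```

Such a total describes how often and for how long the problem occurred. It does not by itself say whether an individual score can still be interpreted.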