8.7 Documentation and Archiving of Computer-Based Assessments

The assessment cycle introduced at the beginning of this chapter (see Figure 8.1) includes Documentation & Dissemination as its last component. In general, documentation can be understood with respect to the items (i.e., the instrument) and to the data (see Table 8.9).

TABLE 8.9: Steps for Documentation

  • Item and Instrument Documentation
  • Data and Log-Data Documentation

Archiving and documentation of computer-based assessments can have different objectives. The first central question is whether there is a link to research data that have already been collected. If so, the software is often archived in the context of archiving the data, so that questions regarding the interpretation or understanding of the existing data can be answered with reference to the software used. In this case, the software should be provided together with the assessment content (i.e., tasks, instruction pages, etc.) as closely as possible to how they were used to collect the research data. However, since the software might come with specific requirements, archiving the computer-based assessment must take these requirements into account so that the software can (hopefully) also be executed in the future.

Archiving of computer-based assessments can also serve the purpose of allowing other researchers or stakeholders to use the developed assessment instruments (sharing). The two goals need not be mutually exclusive, but it should be made clear what the goal of archiving a computer-based assessment is:

  • Goal to archive assessment content to document an existing data set
  • Goal to allow the use of developed content in future data collections

A second key issue concerns the separation of assessment software and assessment content. Such a separation exists, for example, if the software allows the export of the assessment content created with it, as is the case, for instance, with TAO, which allows exporting items in QTI format (Question and Test Interoperability97). In the case of QTI, different software components can be used to administer assessments that use the QTI content. A similar separation also applies to the CBA ItemBuilder, which allows the assessment components created with it to be archived independently of the software (i.e., the specific version used to author the CBA ItemBuilder project files and the software used for test deployment). Since the CBA ItemBuilder project files contain the runtime configuration (see section 8.3.3), they are sufficient for use with deployment software (including TAO, see section 7.4) or the TaskPlayer API (see section 7.7).

  • Requirements to run / use the software (operating system / frameworks / browsers)
  • Requirements to run / use the content (compatibility of content, e.g., QTI version)

A third question concerns the anticipated, expected, and allowed use of, and possible modifications to, the archived computer-based assessment, for instance, for future data collections on new samples. This third question includes licensing issues regarding the content (i.e., the items and possible embedded resources such as images), licensing of the software, and the technical aspects required for using (i.e., executing and running) the software securely:

  • Right to use the software / the content for specific purposes (e.g., new data collection)
  • Right to store the software / the content (for instance, for archiving)
  • Right to distribute the software / the content for further use (e.g., for other projects)
  • Right to change the software / the content (for instance, to adjust for further needs)

8.7.1 Archiving CBA Software to Document Datasets

If the goal is to archive a digitally-based assessment to interpret existing data, a first idea could be to archive the complete software as used for the data collection.98 The underlying rationale is similar to paper-based assessments and the practice of archiving the assessment materials (i.e., booklets), for instance, as PDF files. However, acknowledging that the assessment was digitally based, more than static representations of items or screens (e.g., screenshots) might be required, and archiving the assessment as an interactive system might be considered the natural choice.

Documentation of Requirements: Whether the archiving of the software used in data collection is useful depends, first of all, on whether the requirements needed to run the software can be fulfilled. Accordingly, a prerequisite for investigating the viability of this approach is a documentation of all runtime requirements from a technical perspective. Assessments used in offline deployments (see section 7.2.1) might require a particular operating system, require a minimum screen resolution, and might be tested only for particular pointing devices (e.g., not tested for touch input). Beyond these apparent requirements, dependencies (i.e., specific browser versions, installed frameworks or components, such as Java or .NET), user privileges (e.g., whether admin access is required), and network requirements (e.g., free ports) need to be documented and considered. If assessments were performed with dedicated hardware (i.e., computers that were deployed to the assessment sites), additional settings and configurations (e.g., at the operating system level) might also be necessary in order to be able to reproduce the data collection with the archived software. In particular for mobile deployments using apps, the distribution of the assessment software to the mobile devices needs special attention. For online deployments, two perspectives need to be distinguished: At the client side, supported (i.e., tested, see section 8.4.1) browsers need to be documented, while at the server side, documentation of runtime requirements and the server configuration might be relevant to run the assessment software.
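Such a technical documentation can also be archived in machine-readable form next to the software. The following minimal sketch (in Python) writes a simple requirements manifest; the file name and all field names are illustrative assumptions, not a prescribed format:

```python
import json
import platform

# Hypothetical manifest summarizing the runtime requirements of an archived
# assessment; all fields are illustrative and project-specific.
requirements = {
    "deployment": "offline",                    # see section 7.2.1
    "operating_system": "Windows 10 (64-bit)",  # as used during data collection
    "minimum_screen_resolution": "1024x768",
    "pointing_devices_tested": ["mouse"],       # e.g., not tested for touch input
    "dependencies": [".NET Framework 4.8", "Java 8"],
    "user_privileges": "standard user (no admin access required)",
    "network": {"free_ports": [8080], "internet_access": False},
    "documented_on": platform.platform(),       # machine used to write this record
}

with open("runtime-requirements.json", "w", encoding="utf-8") as f:
    json.dump(requirements, f, indent=2)
```

A manifest of this kind can later be compared against a candidate environment before attempting to execute the archived software.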

Software Virtualization: Techniques such as Virtual Machines (used, for instance, for desktop virtualization, with products such as VMware, VirtualBox, or Parallels) and Containers (used, for instance, for server virtualization, with products such as Docker or LXC) might help to make software (in specific environments) available for a more extended period. However, in particular for desktop virtualization, licensing of the operating system needs to be considered.
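To illustrate the container approach, the following sketch exports a containerized assessment server to a self-contained archive file and restores it later using the Docker command line (invoked from Python); the image name and port mapping are hypothetical:

```python
import subprocess

IMAGE = "assessment-archive:study-2021"  # hypothetical image name

# Export the container image (including all dependencies) to a single file
# that can be stored together with the research data.
subprocess.run(["docker", "save", "-o", "assessment-archive.tar", IMAGE], check=True)

# Later (possibly years after the data collection): restore the image and
# run the archived assessment server on a local port.
subprocess.run(["docker", "load", "-i", "assessment-archive.tar"], check=True)
subprocess.run(["docker", "run", "--rm", "-p", "8080:80", IMAGE], check=True)
```

Note that this only preserves the server environment; client-side requirements (e.g., supported browsers) still need to be documented separately.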

Intended Use of Archived Assessment Software: The critical question regarding the usefulness of this type of archiving is what researchers can do with assessment software archived in this way. If no further precautions have been taken in the assessment software itself, then items can be replayed and answered in the combinations used in the field (e.g., within a booklet design). This option can be helpful, for example, to learn about the items (i.e., the assessment content) in context, to inspect the behavior of items and the assessment platform, and to investigate how prompts or feedback were displayed. If the archived assessment software also provides access to the generated (raw) data, this approach also allows checking how a particular test-taker or response behavior is stored or represented in the data set.

8.7.2 Dedicated Approaches for Documenting CBA Data

As described in the previous section, archiving the assessment software itself, while an obvious idea, is of limited benefit for documenting data from computer-based assessments unless special provisions are made within the assessment software.

Documentation of Result Data and Process Indicators: In terms of documentation of outcome data (i.e., raw responses and input as well as scored responses), data sets with result data of computer-based surveys are standard. Hence, codebook documents can be used to describe the result data (in terms of metadata). Result Data, available as variable values per person, can be supplemented by additional Process Indicators (i.e., information describing the test-taking process), for which a value (including NA) is also expected for each person.
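As a simple illustration, a codebook can be represented as one row per variable, covering result data and process indicators alike. The variable names and entries below are invented for illustration:

```python
import csv

# Hypothetical codebook entries: one row per variable in the result data set.
codebook = [
    # (variable,      type,      values / unit,             description)
    ("item01_raw",    "string",  "free text",               "Raw response to item 1"),
    ("item01_score",  "integer", "0=incorrect, 1=correct",  "Scored response to item 1"),
    ("item01_time",   "numeric", "seconds (NA allowed)",    "Process indicator: time on task, item 1"),
    ("item01_visits", "integer", ">= 0 (NA allowed)",       "Process indicator: number of visits, item 1"),
]

with open("codebook.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["variable", "type", "values", "description"])
    writer.writerows(codebook)
```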

If knowledge of the specific item content is necessary for interpreting the result data or the process indicators, insight into the tasks provided by an archived assessment software may be sufficient. However, additional information about the log data generated when interacting with the assessment content can be necessary to document Raw Log Events and Contextualized Log Events (see section 2.8.1 for the terminology).

Documentation of Raw Log Events and Contextualized Log Events: Which interactions are generally stored by a digitally-based assessment can often be documented and described even without the specific assessment content. In the case of the CBA ItemBuilder, the log events provided by the items are described for the different components used to implement the content (see appendix B.7 for a documentation of log events), and additional log events might be defined by the item author (described as Contextualized Log Events). Moreover, the deployment software is expected to add additional log events at the platform level.
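The difference between the two levels can be illustrated with two invented event records; the exact structure of log events depends on the deployment software (see appendix B.7 for the events documented for the CBA ItemBuilder):

```python
import json

# A raw log event as a platform might emit it: a low-level interaction
# with a component, without assessment-specific meaning (structure invented).
raw_event = {
    "timestamp": "2021-03-15T10:12:33.120Z",
    "type": "Button",
    "target": "button_next",  # id of the component within the item
}

# A contextualized log event as an item author might define it: the same
# interaction enriched with assessment-specific meaning (structure invented).
contextualized_event = {
    "timestamp": "2021-03-15T10:12:33.120Z",
    "type": "NavigationRequested",
    "detail": "test taker requested the next task before responding",
}

print(json.dumps([raw_event, contextualized_event], indent=2))
```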

The more challenging part of the documentation is to relate the assessment content and the collected log data so that the data can be meaningfully interpreted in the context of test-takers' interactions and the assessment content.

Real Items and Live Access to Log Events: The obvious option to allow researchers to inspect interactive assessments is to give them the computerized items in a form where the events stored in the log data become visible after one has demonstrated a particular behavior or interacted with the item. This can be achieved by different approaches, either by modifying the deployment software (see, for instance, section 7.3.7) or by using the authoring software (see the Trace Debug Window in section 1.6.2).

Documenting Instruments Using Mock-Items: Given that assessments are often translated (e.g., in the context of international large-scale assessments), there is another way of documenting interactive items to facilitate the interpretation of log data. For that purpose, we define Mock-Items as items in which the sensitive item content (i.e., everything that should not become public in order to keep the items secure) is replaced by placeholders. Such a replacement is required for all texts, images, video, and audio files that could provide hints about the item's content. However, it is assumed that replacing the content is possible without altering or destroying the structure and functioning of the interactive items.
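How such a replacement could look is sketched below for a generic item archive. The sketch assumes items packaged as ZIP files whose XML resources carry the visible texts; this is an assumption for illustration, not the documented structure of CBA ItemBuilder Project Files:

```python
import re
import zipfile

TEXT_NODE = re.compile(r">([^<>]+)<")  # text between XML tags (simplified)

def make_mock_item(src: str, dst: str) -> None:
    """Copy an item archive, replacing sensitive content with placeholders."""
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        for entry in zin.infolist():
            data = zin.read(entry.filename)
            if entry.filename.endswith(".xml"):
                # Replace visible texts while keeping the XML structure
                # (and thus the interactive behavior) intact.
                data = TEXT_NODE.sub(">placeholder<", data.decode("utf-8")).encode("utf-8")
            elif entry.filename.endswith((".png", ".jpg", ".mp3", ".mp4")):
                data = b""  # strip media files that could reveal item content
            zout.writestr(entry, data)
```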

How to do this with the CBA ItemBuilder? The recommended strategy to document the log data of assessment content created with the CBA ItemBuilder is to provide access directly to the CBA ItemBuilder Project Files. If this is not possible due to item content protection, mock items might help.

Screen Casts or Annotated Screenshots: Documenting log events can also be done using screen casts (i.e., screen recordings showing a particular behavior and the generated log events), for instance using released items. Alternatively, annotated screenshots of computer-based instruments, or even specifically created web pages that show how specific interactions are logged, can be used (e.g., the PIAAC R1 Log Data Documentation).

8.7.3 Approaches to Archive or Share Assessments for Re-Use

Beyond documenting existing data, an important goal of archiving can be to share developed assessment content for use in future data collections.

Sharing Software as is: Although similar to the idea described above (see section 8.7.1), sharing assessment content bundled with an assessment software as-is for re-use adds additional challenges. The following aspects require special attention: First, it must be considered that the redistribution of the software is different from the use of the software, so it may be a question of licensing whether the right to redistribute exists for the software and for the included content. A second aspect concerns IT security. For archiving accompanying a data set, the assessment software is used under controlled conditions. However, if sharing assessments for re-use aims to facilitate future data collections with a digitally-based assessment using the existing software as is, it must also be possible to do so safely. For online deliveries, in particular, this requires patching and applying security updates sooner or later, meaning that it must be possible to maintain the software.

Sharing of Software with Sources: Many maintenance and customization issues of assessment software can be solved if the runtime components (i.e., compiled or built code) and the source code are archived. In particular, if assessment content and assessment software are not separate, making them available, for example, via a public source code repository (e.g., GitHub.com) may allow other researchers to reuse the resources developed. While the open-source provision of assessment software naturally presupposes the right to disseminate the sources, it also presupposes the human resources (i.e., appropriate IT know-how) to be able to use them.

Sharing of Content (Only): An obvious alternative for sharing created assessment content for further use arises when the Content can be separated from the Software. Sharing created items as Content is, at first glance, analogous to paper-based assessments: As soon as a PDF or Word document of a test booklet is shared, it can be used to prepare future assessments.

Question & Test Interoperability (QTI) is a standard for sharing assessment content that is supported, for instance, by TAO. Using the converter fastib2pci, content generated with the CBA ItemBuilder can be packaged as PCI components, which can be embedded and used in QTI items (see section 7.4).

Two examples will be examined in more detail here. If a standard exists (as is the case, for example, with Office Open XML99 for text documents), then different programs can use documents that follow that standard. The Question & Test Interoperability (QTI) specification can be understood as a similar standard for computer-based assessments. If, for example, items created in TAO are exported in QTI format, then these can be stored and used in later assessments as long as an assessment software can read and process the QTI format. The apparent prerequisite for this model to be applicable is that the assessment content can be implemented as QTI items. As the field of computer-based assessment continues to evolve, the QTI standard is also being expanded and adapted100. Hence, it might be necessary to document the exact version of the QTI standard, and only the particular version of the software used to author the QTI items (e.g., a specific TAO version) might interpret the assessment content precisely (i.e., the rendering and behavior of the interactive content might differ across QTI players). Moreover, if the software used for QTI editing adopts a new version of the standard, a migration process might be required.
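Since the exact QTI version matters for later interpretation, it can be worthwhile to record it explicitly when archiving. A minimal sketch, assuming QTI 2.x items whose root element carries the version-specific XML namespace:

```python
import xml.etree.ElementTree as ET

# Namespace URIs of common QTI 2.x releases (illustrative, not exhaustive).
KNOWN_NAMESPACES = {
    "http://www.imsglobal.org/xsd/imsqti_v2p1": "QTI 2.1",
    "http://www.imsglobal.org/xsd/imsqti_v2p2": "QTI 2.2",
}

def qti_version(item_file: str) -> str:
    """Infer the QTI version of an exported item from its XML namespace."""
    root = ET.parse(item_file).getroot()
    namespace = root.tag.split("}")[0].lstrip("{")  # tag looks like '{ns}assessmentItem'
    return KNOWN_NAMESPACES.get(namespace, f"unknown ({namespace})")
```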

How to do this with the CBA ItemBuilder? Assessment content created with CBA ItemBuilder in one version can be archived as CBA ItemBuilder Project Files and shared for later use. To use the assessment content, delivery software containing the runtime of this CBA ItemBuilder version is required.

If the standard is not sufficient, sharing the content independently from the software used to create it can still be possible. This is illustrated with the CBA ItemBuilder, which does not follow the QTI standard. However, as long as a deployment software is available that supports the particular version of the CBA ItemBuilder, the generated content can be used for future data collections.

Migration Strategy: Project Files of recent CBA ItemBuilder versions can be used for assessment projects as long as sufficient browser support is provided and no technical or security-related issues prohibit the use of old versions. If archived Project Files of an older version can no longer be used in current delivery software, they can be migrated using the CBA ItemBuilder. Migrating an outdated Project File means opening it in a newer CBA ItemBuilder version and then saving it as a Project File in this new version. Doing so will update the generated code or the runtime configuration required to use the Project File with a particular deployment software.

The update of Project Files is possible because the implementation of the CBA ItemBuilder ensures that a newer version can read the content of the previous version and convert it if necessary. Accordingly, it may be necessary to perform the migration in multiple steps (using intermediate versions of the CBA ItemBuilder). The release notes of the CBA ItemBuilder (see Table B.5) provide information on points to be considered regarding backward compatibility.

How to do this with the CBA ItemBuilder? The recommended strategy for sharing and archiving assessment content created with CBA ItemBuilder is to provide Project Files. As long as suitable deployment software supports the version, the Project Files can be used directly. The Project Files can be migrated with the CBA ItemBuilder if no deployment software supports the (outdated) version.

8.7.4 Assessment Content as Open Educational Resources (OER)

Archiving assessment content that was created and implemented digitally for applications in educational science can be understood as a particular form of providing Open Educational Resources (OER). This is particularly true if the goal is to enable content sharing, where the developed items constitute the shared resource.

How to do this with the CBA ItemBuilder? To support the sharing and provision of assessment content created with the CBA ItemBuilder, a suitable license should be defined and metadata stored within the CBA ItemBuilder Project Files (see section 6.3.4).
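If Project Files are shared outside the CBA ItemBuilder ecosystem, an accompanying metadata record can make the license and provenance explicit. The field names below mirror common OER metadata and are illustrative, not a format prescribed by the CBA ItemBuilder:

```python
import json

# Hypothetical metadata record accompanying a shared Project File.
item_metadata = {
    "title": "Sample Reading Task",       # invented example
    "creator": "Assessment Project XYZ",  # invented example
    "license": "CC BY 4.0",               # license chosen for sharing
    "version": "1.0",
    "authoring_tool": "CBA ItemBuilder",  # tool used to create the Project File
}

with open("item-metadata.json", "w", encoding="utf-8") as f:
    json.dump(item_metadata, f, indent=2)
```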

Before making extensive adjustments to items, it must be considered whether this will change psychometric properties and item parameters that have been empirically determined or verified, for example, with the help of a scaling study (see section 2.5.4).

References

“Question and Test Interoperability (QTI): Implementation Guide.” 2022. http://www.imsglobal.org/question/qtiv2p2/imsqti_v2p2_impl.html.

  97. “Question and Test Interoperability (QTI): Implementation Guide” (2022).

  98. Sometimes deciding which version of a computer-based assessment should be archived might also be relevant. Suppose changes in the assessment software and content are tracked, for instance, using version control tools (see section 8.3.2). In that case, the datasets might reference that particular version, and archiving should contain all versions used during data collection.

  99. ECMA-376, ISO/IEC 29500.

  100. See, for instance, 1EdTech Question & Test Interoperability (QTI).