2.1 Advantages & Benefits of CBA

Using CBA to measure individual or group differences in personal characteristics such as skills, competencies, personality dimensions, motivation, or attitudes is often motivated by potential benefits, which are briefly outlined in this section (see, for instance, Haigh 2010).

Standardization: One apparent advantage of CBA is that the comparability of assessments can be increased through a higher degree of standardization (e.g., Jurecka 2008). In a computer-based assessment, item authors can design diagnostic situations, defining which information is visible and accessible at which point and how test-takers can interact with the content to answer questions or construct responses. Carefully designed items make it possible to standardize, for instance, the sequence in which tasks can be answered or whether questions linked to an audio stimulus become accessible only after the stimulus has been played (navigation restrictions). Item authors can also foster a common understanding of response formats by providing animated and interactive tutorial tasks (instructions). Audio and video components can be used to reduce the reading load of items for all test-takers or as a test accommodation provided for selected test-takers only. Even simple answer formats can be designed in a computer-based assessment in such a way that invalid responses are impossible and test-takers are informed about possible handling errors, e.g., the selection of several answers in a single-choice task.

How to do this with the CBA ItemBuilder? Details about the use of audio and video elements can be found in section 3.10.3, animated instructions are described in section 6.4.1, and an example illustrating the idea of navigation restrictions is given in section 6.4.6.

Scoring: Various additional advantages arise from the possibility of instantly scoring closed response formats and selected text responses (e.g., number inputs and text inputs that can be scored using simple rules such as regular expressions, or open text responses that can be scored using advanced NLP techniques). Automatically scored information derived from already administered items can be used for various purposes, either during the assessment or to simplify the post-processing of assessment data.

How to do this with the CBA ItemBuilder? The key to using the results of answered items directly is the so-called scoring definition, which can be specified for CBA ItemBuilder items within tasks (see chapter 5).
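
To make the idea of rule-based scoring concrete, the following minimal Python sketch (a generic illustration, not CBA ItemBuilder scoring syntax; the pattern, expected value, and item content are invented) scores a short text response with a regular expression and a number input with a simple tolerance rule:

```python
import re

# Illustrative scoring rules (hypothetical item content, not CBA ItemBuilder syntax):
# a text input asking for a chemical formula and a number input asking for a value.
FORMULA_PATTERN = re.compile(r"^\s*h2o\s*$", re.IGNORECASE)  # accepts "H2O", " h2o ", ...

def score_text_response(response: str) -> int:
    """Return 1 if the text response matches the regular expression, else 0."""
    return 1 if FORMULA_PATTERN.match(response) else 0

def score_number_response(response: str, expected: float, tolerance: float = 0.0) -> int:
    """Return 1 if the input parses as a number within the tolerance, else 0."""
    try:
        value = float(response.replace(",", "."))  # tolerate decimal comma
    except ValueError:
        return 0  # invalid input is scored as incorrect (or could be flagged as missing)
    return 1 if abs(value - expected) <= tolerance else 0

print(score_text_response("  H2O "))        # -> 1
print(score_number_response("3,14", 3.14))  # -> 1
```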

Instant Feedback: During a computer-based assessment, instant feedback on results, processes, and time is possible (see section 2.9.1), and the presentation of prompts and the combination of tasks as scaffolding can improve data quality or implement assessment for learning (i.e., formative assessment). Immediately following the completion of an assessment, result reports can be generated and made available. Feedback can also refer to missing values in the assessment, for instance, to reduce the accidental overlooking of individual subtasks.

How to do this with the CBA ItemBuilder? Instant feedback and sequences of questions within tasks can be implemented using the conditional link feature (see section 4.3) or with the help of the so-called Finite-State Machine (see section 4.4). Feedback across tasks can be provided as part of the deployment software (see section 7.3.5 for an example).
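
The general idea of controlling within-task sequences and feedback with a finite-state machine can be sketched generically in Python (the states, events, and transition table below are invented for illustration and do not reflect the CBA ItemBuilder's finite-state machine syntax):

```python
# Minimal finite-state machine sketch for within-task sequencing with feedback.
# States and events are hypothetical; the CBA ItemBuilder defines its own syntax.
TRANSITIONS = {
    ("question", "correct_response"):   "feedback_correct",
    ("question", "incorrect_response"): "feedback_incorrect",
    ("feedback_correct", "continue"):   "next_question",
    ("feedback_incorrect", "continue"): "next_question",
}

class TaskFlow:
    def __init__(self, start: str = "question"):
        self.state = start

    def trigger(self, event: str) -> str:
        """Move to the next state if a transition is defined for (state, event)."""
        self.state = TRANSITIONS.get((self.state, event), self.state)
        return self.state

flow = TaskFlow()
print(flow.trigger("incorrect_response"))  # -> feedback_incorrect
print(flow.trigger("continue"))            # -> next_question
```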

Adaptivity & Measurement Efficiency: If scoring is already available for at least some of the items or questions during test administration, various forms of adaptivity can be realized. The spectrum of possibilities ranges from hiding individual non-relevant items, simple skip rules, and filters to psychometric approaches such as multi-stage testing and one- or multidimensional adaptive testing as strategies of Automated Test Assembly, which can result in increased Measurement Efficiency (see section 2.7).

How to do this with the CBA ItemBuilder? Adaptivity within instruments can be implemented directly in the CBA ItemBuilder (see section 6.7 for an example), while strategies of Automated Test Assembly require the use of specific deployment software (see section 7.2.7).
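
As a generic illustration of one-dimensional adaptive testing (a sketch under the 2PL model with an invented item pool, not the algorithm of any particular deployment software), the next item can be selected by maximizing Fisher information at the current ability estimate:

```python
import math

def p_2pl(theta: float, a: float, b: float) -> float:
    """Probability of a correct response under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta: float, a: float, b: float) -> float:
    """Fisher information of a 2PL item at ability theta: a^2 * p * (1 - p)."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

def select_next_item(theta_hat, item_pool, administered):
    """Pick the not-yet-administered item with maximum information at theta_hat."""
    candidates = [i for i in item_pool if i["id"] not in administered]
    return max(candidates, key=lambda i: item_information(theta_hat, i["a"], i["b"]))

# Hypothetical item pool with discrimination (a) and difficulty (b) parameters:
pool = [{"id": 1, "a": 1.2, "b": -0.5}, {"id": 2, "a": 0.8, "b": 0.0},
        {"id": 3, "a": 1.5, "b": 0.7}]
print(select_next_item(theta_hat=0.4, item_pool=pool, administered={1}))
```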

Innovative Items / Technology-Enhanced Items (TEI): Computer-based assessment promises benefits for the validity and construct representation of assessments through innovative item formats (e.g., Sireci and Zenisky 2015; Wools, Molenaar, and Hopster-den Otter 2019) and technology-enhanced items (TEI; e.g., Bryant 2017), using capabilities provided by the digital environments used for assessment (e.g., Parshall 2002). Item formats that were not possible in paper-based assessment range from drag-and-drop response formats and digital hypertext environments (Hahnel et al. 2022) to performance-based assessment in simulated environments, authentic assessment (e.g., B. B. Frey, Schmitt, and Allen 2012), game-based assessment, and stealth assessment (e.g., Shute et al. 2016).

How to do this with the CBA ItemBuilder? Starting from freely designable pages, items can be contextualized in the CBA ItemBuilder and enhanced with a variety of interactive components, including hypertexts (see section 3.13.2) and drag-and-drop (see section 4.2.6). A large number of innovative items can be implemented directly with the components provided by the CBA ItemBuilder. In addition, the CBA ItemBuilder allows the integration of external content, e.g., to enable interactive response formats that cannot yet be created with the components in the CBA ItemBuilder (see section 3.14). Furthermore, items designed with the CBA ItemBuilder can also be integrated into other web-based content (see section 7.7).

Log Data & Process Indicators: Computer-based assessment as a method also provides insight into test-taking processes through behavioral data (Goldhammer and Zehner 2017), i.e., log data (gathered in the form of events) from which process indicators can be derived (Goldhammer et al. 2021). While log data can be collected using technical tools even with paper-based assessments (see, e.g., Dirk et al. 2017; Kroehne, Hahnel, and Goldhammer 2019), the availability and use of log data from computer-based assessment has developed into a unique area of research (e.g., Zumbo and Hubley 2017).

How to do this with the CBA ItemBuilder? Items created with the CBA ItemBuilder collect log events by default (see section 1.6.2 for the live preview in the Trace Debugger), and custom log events can be added if required. The CBA ItemBuilder aims at replay-completeness (as defined in Kroehne and Goldhammer 2018), and the analysis of log data gathered with the CBA ItemBuilder is possible, for instance, with the LogFSM package (see section 2.8).
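
How process indicators can be derived from log events can be sketched generically in Python (the event format below is invented for illustration and does not correspond to the CBA ItemBuilder's log data format or the LogFSM data structures):

```python
# Hypothetical timestamped log events (real event formats differ by platform):
events = [
    {"t": 0.00, "type": "taskStart"},
    {"t": 3.25, "type": "click", "target": "optionA"},
    {"t": 5.10, "type": "responseChanged", "value": "A"},
    {"t": 9.80, "type": "taskEnd"},
]

def time_to_first_response(events) -> float:
    """Process indicator: seconds between task start and first response change."""
    start = next(e["t"] for e in events if e["type"] == "taskStart")
    first = next(e["t"] for e in events if e["type"] == "responseChanged")
    return first - start

print(time_to_first_response(events))  # -> 5.1
```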

Response Times: A typical kind of information, which can also be understood as a specific type of process indicator, is the Response Time. Provided that the task design meets specific requirements (e.g., the one item one screen design, OIOS; Reips 2010), response times can be identified easily and may already be part of the programming of computer-based assessments. Response times can be used for various purposes, including improving the precision of ability estimates (e.g., Time on Task as used in Reis Costa et al. 2021). However, time information is also available when multiple tasks are administered within a unit. In that case, item-level response times can either be derived using methods for analyzing log data (Kroehne and Goldhammer 2018), or at least the total time for the entire unit or screen can be extracted from computer-based assessments. Specific process indicators that can be derived from response times and that allow the identification of disengaged test-taking and careless, insufficient responding are Rapid Guessing and Rapid Responding (see section 2.5.3), a threat to validity, in particular, for low-stakes assessments. Response times also allow monitoring test-taking engagement and can be used to investigate differences in test-taking motivation (e.g., Kroehne, Gnambs, and Goldhammer 2019).

How to do this with the CBA ItemBuilder? The CBA ItemBuilder runtime automatically provides the total time for each task, and user-defined time measures can be computed during the assessment using an operator that measures elapsed time (see section 4.4) or extracted from the log data (see section 2.8).
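
A common approach to identifying rapid guessing is a threshold rule on response times; the following generic Python sketch uses invented item-specific thresholds (in practice, thresholds are typically derived, e.g., from the response time distribution):

```python
# Flag rapid guesses with item-specific time thresholds (values are illustrative;
# established methods derive thresholds from the empirical response time distribution).
THRESHOLDS = {"item1": 2.0, "item2": 3.5}  # seconds

def flag_rapid_guessing(response_times: dict) -> dict:
    """Return {item: True/False}, True if the response was faster than the threshold."""
    return {item: rt < THRESHOLDS[item] for item, rt in response_times.items()}

rts = {"item1": 1.4, "item2": 12.3}
flags = flag_rapid_guessing(rts)
print(flags)                             # -> {'item1': True, 'item2': False}
# Share of rapid responses as a simple engagement indicator:
print(sum(flags.values()) / len(flags))  # -> 0.5
```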

Online & Mobile Deployment: The manifold possibilities of Internet-based assessment were recognized early on (e.g., Buchanan 2002; Bartram 2005). Since those early years, the technical possibilities for conducting online assessments under comparable conditions have improved. For example, it is now possible to carry out assessments in full-screen mode and to register and record exits or interruptions in the log data, if not prevent them entirely. At the same time, however, the heterogeneity of Internet-enabled devices, tablets, and especially mobile phones has increased. Reliable and secure online and mobile assessments therefore remain a topic of current research and (further) development.

How to do this with the CBA ItemBuilder? The CBA ItemBuilder runtime can be used to deliver tasks under a variety of online, mobile, and offline scenarios (see chapter 7 for more details on test delivery options).

CBA also changes the costs of assessments: the effort to create and test computer-based assessments can be higher (in particular for testing, see section 8.4), but the costs of distributing computer-administered instruments and of scoring closed response formats, in particular, can be lower. Most importantly, however, content created for computer-based assessments can be shared and duplicated without additional costs. While these options obviously do not change the requirements for item protection and confidentiality (see section 2.10), especially concerning assessment content from large-scale assessments, they change how assessment instruments developed in research projects can be distributed and applied in practice (see section 8.7.4). All the potential benefits of CBA come with practical challenges (e.g., Mills 2002; Parshall 2002), some of which are discussed in chapter 6.

References

Bartram, Dave. 2005. “Testing on the Internet: Issues, Challenges and Opportunities in the Field of Occupational Assessment.” In Computer-Based Testing and the Internet, edited by Dave Bartram and Ronald K. Hambleton, 13–37. West Sussex, England: John Wiley & Sons, Ltd. https://doi.org/10.1002/9780470712993.ch1.
Bryant, William. 2017. “Developing a Strategy for Using Technology-Enhanced Items in Large-Scale Standardized Tests.” https://doi.org/10.7275/70YB-DJ34.
Buchanan, Tom. 2002. “Online Assessment: Desirable or Dangerous?” Professional Psychology: Research and Practice 33 (2): 148–54. https://doi.org/10.1037/0735-7028.33.2.148.
Dirk, Judith, Gesa Katharina Kratzsch, John P. Prindle, Ulf Kroehne, Frank Goldhammer, and Florian Schmiedek. 2017. “Paper-Based Assessment of the Effects of Aging on Response Time: A Diffusion Model Analysis.” Journal of Intelligence 5 (2): 12. https://doi.org/10.3390/jintelligence5020012.
Frey, Bruce B., Vicki L. Schmitt, and Justin P. Allen. 2012. “Defining Authentic Classroom Assessment.” https://doi.org/10.7275/SXBS-0829.
Goldhammer, Frank, Carolin Hahnel, Ulf Kroehne, and Fabian Zehner. 2021. “From Byproduct to Design Factor: On Validating the Interpretation of Process Indicators Based on Log Data.” Large-Scale Assessments in Education 9 (1): 20. https://doi.org/10.1186/s40536-021-00113-5.
Goldhammer, Frank, and Fabian Zehner. 2017. “What to Make Of and How to Interpret Process Data.” Measurement: Interdisciplinary Research and Perspectives 15 (3-4): 128–32. https://doi.org/10.1080/15366367.2017.1411651.
Hahnel, Carolin, Dara Ramalingam, Ulf Kroehne, and Frank Goldhammer. 2022. “Patterns of Reading Behaviour in Digital Hypertext Environments.” Journal of Computer Assisted Learning, July, jcal.12709. https://doi.org/10.1111/jcal.12709.
Haigh, Matt. 2010. “Why Use Computer-Based Assessment in Education? A Literature Review,” no. 10: 8.
Jurecka, Astrid. 2008. “Introduction to the Computer-Based Assessment of Competencies.” Assessment of Competencies in Educational Contexts, 193–214.
Kroehne, Ulf, Timo Gnambs, and Frank Goldhammer. 2019. “Disentangling Setting and Mode Effects for Online Competence Assessment.” In Education as a Lifelong Process, 171–93. Edition ZfE. Wiesbaden: Springer VS. https://doi.org/10.1007/978-3-658-23162-0_10.
Kroehne, Ulf, and Frank Goldhammer. 2018. “How to Conceptualize, Represent, and Analyze Log Data from Technology-Based Assessments? A Generic Framework and an Application to Questionnaire Items.” Behaviormetrika. https://doi.org/10.1007/s41237-018-0063-y.
Kroehne, Ulf, Carolin Hahnel, and Frank Goldhammer. 2019. “Invariance of the Response Processes Between Gender and Modes in an Assessment of Reading.” Frontiers in Applied Mathematics and Statistics 5: 2. https://doi.org/10.3389/fams.2019.00002.
Mills, Craig N., ed. 2002. Computer-Based Testing: Building the Foundation for Future Assessments. Mahwah, NJ: L. Erlbaum Associates.
Parshall, Cynthia G., ed. 2002. Practical Considerations in Computer-Based Testing. New York: Springer.
Reips, Ulf-Dietrich. 2010. “Design and Formatting in Internet-Based Research.” In Advanced Methods for Conducting Online Behavioral Research, edited by S. Gosling and J. Johnson, 29–43. Washington, DC: American Psychological Association.
Reis Costa, Denise, Maria Bolsinova, Jesper Tijmstra, and Björn Andersson. 2021. “Improving the Precision of Ability Estimates Using Time-On-Task Variables: Insights From the PISA 2012 Computer-Based Assessment of Mathematics.” Frontiers in Psychology 12 (March): 579128. https://doi.org/10.3389/fpsyg.2021.579128.
Shute, Valerie J., Lubin Wang, Samuel Greiff, Weinan Zhao, and Gregory Moore. 2016. “Measuring Problem Solving Skills via Stealth Assessment in an Engaging Video Game.” Computers in Human Behavior 63 (October): 106–17. https://doi.org/10.1016/j.chb.2016.05.047.
Sireci, Stephen G., and April L. Zenisky. 2015. “Innovative Item Formats in Computer-Based Testing: In Pursuit of Improved Construct Representation.” In Handbook of Test Development, 313–34. Routledge.
Wools, Saskia, Mark Molenaar, and Dorien Hopster-den Otter. 2019. “The Validity of Technology Enhanced Assessments and Opportunities.” In Theoretical and Practical Advances in Computer-based Educational Measurement, edited by Bernard P. Veldkamp and Cor Sluijter, 3–19. Cham: Springer International Publishing. https://doi.org/10.1007/978-3-030-18480-3_1.
Zumbo, Bruno D., and Anita M. Hubley, eds. 2017. Understanding and Investigating Response Processes in Validation Research. Vol. 69. Social Indicators Research Series. Cham: Springer International Publishing. https://doi.org/10.1007/978-3-319-56129-5.