Ulf Kroehne
Paris, 2022/11/07 - 2022/11/08
Log data are typically investigated after assessments, but they can also be used during the assessment if this is already considered during item development.
Starting Point
Raw Log Events: Basic information logged by a specific assessment software
Promise
The terms log data / log file data are not initially specific to assessments. Logging of information (e.g., server logs, system event logs) is used by software developers in all kinds of software.
Single events can inform about WHAT happened (e.g., a page was changed), WHEN (e.g., at time \(t\)) and WHY (because of a timer).
Raw Log Events are platform specific (i.e., what is logged when using, for instance, the CBA ItemBuilder is defined by the programmers of the CBA ItemBuilder).
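To make this concrete, the following sketch shows what a single raw log event could look like; the field names are purely illustrative and not the actual CBA ItemBuilder log schema:

```python
# Illustrative only: hypothetical field names, not the real CBA
# ItemBuilder log schema. A raw log event typically answers WHAT
# happened, WHEN it happened, and (sometimes) WHY.
raw_event = {
    "type": "PageSwitchTopLevel",              # WHAT: a page was changed
    "timestamp": "2022-11-07T10:15:23.481Z",   # WHEN
    "trigger": "timer",                        # WHY: e.g., caused by a timer
    "details": {"oldPage": "page1", "newPage": "page2"},
}
```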
Application
Contextualized Log-Events: Actions that indicate specific behavior within Tasks
What is frequently used in empirical log-data analyses, for instance with large-scale assessment data, are contextualized log events.
Contextualized Log-Events are task-specific (i.e., a log event for a specific behavior, such as using-the-calculator, is only relevant in very specific tasks, in which a digital calculator is provided).
Platform-specific Raw Log Events need to be translated into task-specific Contextualized Log-Events.
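A minimal sketch of this translation step, assuming a hypothetical export format and a hypothetical task "MATH-03" that contains a calculator button named "calcButton":

```python
# Sketch under assumptions: event structure, task ID, and component
# name are hypothetical, not an official CBA ItemBuilder format.
def contextualize(raw_event, task_id):
    """Map a platform-specific raw event to a task-specific
    contextualized event, or None if it is irrelevant for this task."""
    # Hypothetical rule: in task 'MATH-03', a click on the button named
    # 'calcButton' indicates the behavior 'using-the-calculator'.
    if (task_id == "MATH-03"
            and raw_event["type"] == "Button"
            and raw_event.get("name") == "calcButton"):
        return {"action": "using-the-calculator",
                "timestamp": raw_event["timestamp"]}
    return None
```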
Two Problems
Low-Level Features are identified with the help of technical, platform-specific raw log events, but are defined with reference to the assessment framework. Contextualized Log Events are Actions, i.e., events within states with a specific interpretation. Process indicators (variables representing aggregates at the person or person x item level) are the theoretically and empirically meaningful information derived from log data (see the sketch below).
Problem 1 (Comparability): The same task can be implemented using different software (e.g., the CBA ItemBuilder and a custom implementation as a PCI component). Comparability (and reproducibility in the sense of Open Science) of process indicators is a prerequisite for building empirical evidence (i.e., validation).
Problem 2 (Domain Knowledge): Content knowledge (i.e., expertise) is required to describe meaningful Low-Level Features (and to link them with raw log events).
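As a simple illustration (with an assumed event structure, not a real export format), a process indicator such as the number of calculator uses per person x task combination could be aggregated like this:

```python
# Minimal sketch: a process indicator as an aggregate at the
# person x item level, here the count of calculator uses per task.
# The event keys ('action', 'person', 'task') are assumptions.
from collections import Counter

def calculator_use_count(contextualized_events):
    """Count 'using-the-calculator' actions per (person, task)."""
    counts = Counter()
    for ev in contextualized_events:
        if ev["action"] == "using-the-calculator":
            counts[(ev["person"], ev["task"])] += 1
    return counts
```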
The naming and semantics of raw log events are platform- and software-specific (i.e., arbitrarily defined to some extent by the programmers of the CBA ItemBuilder).
The CBA ItemBuilder automatically collects raw log events:
Test-Level:
UserLogin (or something related to authentication)
TaskSwitch / ItemSwitch / TaskViewVisible (events notifying that the Task was loaded and rendered)
Task-Level:
PageSwitchTopLevel / PageSwitchEmbedded / … (events notifying about page changes)
Panel / ImageField / Button / … (events notifying about user interactions with any of the components)
…
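As an illustration of working with such raw events (assuming they were exported as a list of records with "type" and "task" fields, which is not necessarily the official output structure), one could, for instance, count page changes per task:

```python
# Sketch under assumptions: raw events as a list of dicts with 'type'
# and 'task' keys; the export format is assumed, not official.
def count_page_changes(raw_events):
    """Count page-change events (PageSwitchTopLevel /
    PageSwitchEmbedded) per task."""
    page_events = {"PageSwitchTopLevel", "PageSwitchEmbedded"}
    counts = {}
    for ev in raw_events:
        if ev["type"] in page_events:
            counts[ev["task"]] = counts.get(ev["task"], 0) + 1
    return counts
```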
For learning about log events, the Preview provides the Tracing Debug Window, which lists all recently collected log events of the currently running item.
During Preview
Recent events (at runtime) can be debugged in the Preview
Ctrl + Y is the (default) hotkey for the Tracing Debug Window
The hotkey can be configured in the Preferences of the CBA Preview (main menu Utilities > Open preferences)
Find a short introduction here.
Response completeness
Result data (i.e., the raw results in terms of final answers and the entered text) can be derived from log data. This criterion means that the result dataset can be reconstructed from the log data (and the items). Note that additional steps might be necessary to apply scoring rules to obtain a dataset with scored responses.
Progress completeness
Result data can be derived from the log data as described for response completeness, but at any possible point in time. If this requirement is fulfilled, multiple result datasets can be created, each containing the final responses at the moment the assessment was hypothetically interrupted. This criterion requires that all events informing about an answer change are available (see the sketch after these definitions).
Replay completeness
Visual replay (i.e., a screen-cast that shows what was visible on screen and when) can be reconstructed from the collected log events. This requires that everything that was visible on screen, and every change in the visibility of content, is logged.
State completeness
With respect to a concrete set of actions and states, all log data are available to identify the transitions between states (or the occurrence of actions). If this criterion is used to describe the completeness of the log data gathered in a particular assessment, please list the relevant actions and states that can be identified using the log data.
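A minimal sketch of the progress-completeness idea, assuming a hypothetical event type "answer_changed" and numeric timestamps: if all answer changes are logged, the final responses can be reconstructed for any hypothetical interruption time.

```python
# Sketch under assumptions: events are dicts with numeric 'timestamp',
# a hypothetical 'answer_changed' type, and 'item'/'value' keys.
def responses_at(events, cutoff):
    """Final answer per item if the assessment had stopped at 'cutoff':
    replay events up to the cutoff and keep the last value per item."""
    responses = {}
    for ev in sorted(events, key=lambda e: e["timestamp"]):
        if ev["timestamp"] > cutoff:
            break
        if ev["type"] == "answer_changed":
            responses[ev["item"]] = ev["value"]
    return responses
```

Note that, as described for response completeness, additional scoring steps may still be needed to turn these reconstructed raw responses into scored responses.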
Illustration of Replay Completeness
The item shown on this page, with the embedded video, illustrates the experimental preview of the replay feature, which is still under development.
Goals
Response times for questionnaires with multiple questions per screen (i.e., Item-Batteries) are not naturally defined, but they can be constructed from log events (see the sketch below).
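One possible construction (of several), sketched here with a hypothetical "answer_changed" event type and numeric timestamps: attribute to each question the time between the previous logged answer (or the screen entry) and the first answer to that question.

```python
# Sketch under assumptions: events carry numeric 'timestamp', 'type',
# and 'item' keys; this is one illustrative construction of response
# times for item batteries, not a canonical definition.
def battery_response_times(events, screen_start):
    """Per-question response times on one screen, defined as the gap
    between the preceding logged answer (or screen entry) and the
    first answer to each question."""
    times = {}
    last = screen_start
    for ev in sorted(events, key=lambda e: e["timestamp"]):
        if ev["type"] == "answer_changed":
            times.setdefault(ev["item"], ev["timestamp"] - last)
            last = ev["timestamp"]
    return times
```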
Item Design
Whether a test-taker notices a specific part of the stimulus or item can be made traceable for content that becomes visible only after specific (logged) user interactions.
Test-takers should be able to apply strategies and knowledge when solving the tasks as required for authentic assessments (face validity). When possible, allow choosing between different ways to solve the task.
Test Design
Example: “Free” navigation between tasks, i.e., leaving and re-entering tasks, complicates the interpretation of editing processes and is not necessarily required if test-wiseness is not to be measured.
Consider Unit-Level time limits.
Provide feedback about potential navigation restrictions and missing values.
Helpful to remember
Task for item authors
Think about possibilities to use log events for diagnostic purposes while designing the items.
Low-level features with duration (States)
Low-level features with timestamp only (Actions / Contextualized Events)
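The difference can be illustrated with a small sketch (assumed input: state-entry events with numeric timestamps, sorted by time): actions carry only a timestamp, while states get a duration derived from consecutive events.

```python
# Sketch under assumptions: 'events' is a time-sorted list of dicts
# with 'state' and numeric 'timestamp' keys, each entry starting a
# new state; 'end_of_task' closes the last state.
def states_from_actions(events, end_of_task):
    """Turn a sequence of state-entry events (timestamp only) into
    (state, start, duration) tuples, i.e., features with duration."""
    states = []
    for ev, nxt in zip(events, events[1:] + [{"timestamp": end_of_task}]):
        states.append((ev["state"], ev["timestamp"],
                       nxt["timestamp"] - ev["timestamp"]))
    return states
```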