Careless Responding Detection Improved: The Detection Accuracy of Direct and Indirect Indices Examined Under Contextual Factors
In the last decade, researchers have proposed several indirect, unobtrusive indices for detecting careless responding in questionnaires. However, only a few experimental studies have so far compared how different indirect careless response indices perform relative to each other (for selected comparisons between indices, see Goldammer et al., 2020; Huang et al., 2012; Niessen et al., 2016) and how they perform, individually and jointly, compared to popular direct careless responding measures (see Niessen et al., 2016). In addition, for many indices it is unclear whether their detection performance depends on contextual factors, such as the severity of the careless response pattern (i.e., full vs. partial careless responding), the type of item-keying (i.e., unidirectionally vs. bidirectionally keyed items), and the type of item presentation (one item per page vs. several items per page [as a matrix]).
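To illustrate the class of indices under discussion, the following minimal sketch computes two widely used indirect careless responding indices, the longstring index and intra-individual response variability (IRV). These two are chosen only as familiar examples of unobtrusive indices; the paper does not specify here which 12 indices it examines, so this is not a reproduction of the study's index set.

```python
# Illustrative sketch of two common indirect careless-responding indices.
# Assumption: responses are a single respondent's answers on a Likert scale.
# These examples are NOT necessarily among the 12 indices examined in the paper.
from statistics import pstdev

def longstring(responses):
    """Length of the longest run of identical consecutive responses.
    High values suggest straightlining (invariant careless responding)."""
    longest = run = 1
    for prev, curr in zip(responses, responses[1:]):
        run = run + 1 if curr == prev else 1
        longest = max(longest, run)
    return longest

def irv(responses):
    """Intra-individual response variability: the standard deviation of a
    respondent's own item responses. Very low values can indicate
    invariant responding; very high values can indicate random responding."""
    return pstdev(responses)

# A respondent who straightlines the scale midpoint vs. an attentive one:
careless = [3, 3, 3, 3, 3, 3, 2, 3, 3, 3]
attentive = [1, 4, 2, 5, 3, 2, 4, 1, 5, 3]
print(longstring(careless), longstring(attentive))   # 6 vs. 1
print(round(irv(careless), 2), round(irv(attentive), 2))  # 0.3 vs. 1.41
```

In practice, such per-respondent scores are compared against a cutoff (or the distribution across respondents) to flag likely careless cases, which is the detection decision whose accuracy the studies below evaluate.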
This paper therefore has three major aims. First, we examine how well a representative selection of 12 conceptually different indirect careless responding detection indices perform compared to each other. Second, we examine how well the selected indirect indices perform, individually and jointly, compared to three frequently applied bogus items. Third, we examine to what extent the detection performance of these indices is affected by the following three contextual factors: severity of the careless response pattern, type of item-keying, and type of item presentation. By examining these issues, we provide researchers with a differentiated basis on which they can select and combine indirect indices for detecting careless responding in their data.
To address our research questions, we conducted three studies in which the indices’ detection accuracy was examined based on experimentally induced careless response sets. In Studies 1 and 2, we used a between-subjects design in which participants rated the personality of an actor who presented himself in a 5-minute video-recorded speech. In Study 3, we used a mixed design with a within- and a between-subjects factor, in which participants rated their own personality across two measurements. Notably, and in contrast to previous experimental studies on careless responding detection (Goldammer et al., 2022; Huang et al., 2012; Niessen et al., 2016), each of our three studies was designed such that we could conduct a ‘true’ test of the effect of the response instructions on participants’ response behavior (e.g., by comparing the ratings of the experimental groups with those of a norm group). The results of Study 1 are presented in the research colloquium.