Is that still science, or can it go?

A look at the scientific nature of personnel selection tests

Aptitude diagnostics rather than intuition and gut instinct will be the decisive success factor in personnel selection in the future: that is the credo of leading scientists. The use of aptitude diagnostics is often touted as a guarantee for reliably identifying high potentials. However, a look at personnel selection practice shows that reality is much more complex and that HR managers often fall victim to a fallacy here, because the accuracy of this conclusion depends on one decisive factor: the measurement-theoretical basis of the methods used. This means that the tests were constructed on the basis of a scientifically grounded theory, which usually comes from academic psychology. In addition to this theoretical foundation, the measurement-theoretical basis of a test also includes the professional use of mathematical and statistical models and evaluation methods.

Definition: aptitude diagnostics

Aptitude diagnostics, also called personnel diagnostics, is a collective term for psychological selection procedures based on measurement theory that are used to test the fit between applicants and the requirements of the workplace (Schuler & Höft 2007).

Taking stock: How scientific are selection tests?

In practice, there is now a plethora of ability and personality tests, and new providers with new tests are added almost daily. What they have in common is the promise of being able to reliably identify the best applicants in the candidate pool. In addition, this should usually be achieved in the shortest possible time and with little effort, and applicants should perceive it as entertaining and positive.

But care must be taken

Only some of these tests are based on psychological theories of competence and personality development and provide robust data on the key test quality criteria of reliability, validity, and norming. For at least as many tests, however, one searches in vain for a scientific basis. If a theory is used at all, it often falls within the realm of lay theories (e.g., people who speak softly are necessarily introverts). Reality is thereby simplified to such an extent that the scientific rigor of the procedures falls by the wayside. One example of this is the Myers-Briggs Type Indicator.

Negative example: Myers-Briggs Type Indicator

The Myers-Briggs Type Indicator (MBTI for short) is one of the world's best-known and most frequently used personality tests and is particularly common in the United States. The method is based on Carl Gustav Jung's theory of psychological types (1921). In a questionnaire, participants are asked to indicate which of two statements applies more to them (e.g., "I make decisions deliberately" vs. "I make decisions spontaneously"). The evaluation is purely categorical (yes vs. no); a gradation (e.g., on a 7-point Likert scale from "strongly agree" to "strongly disagree") is not possible.

After all questions have been answered, the result is a four-letter combination. The letters stand for four categories, each consisting of two opposing personality traits, such as introversion vs. extraversion. People with certain letter combinations are said to have talents for specific professional positions. For example, extraverted people who perceive things intuitively and holistically but still make rational and sustainable decisions are considered ideal managers (letter combination ENTJ). At first glance, that does not sound implausible.

However, a look at the scientific quality criteria shows that the test falls far short of the minimum requirements for aptitude diagnostic tests set out in DIN 33430 (e.g., Pittenger 1993). For example, the test is susceptible to candidates' self-presentation tendencies, its reliability is low (i.e., the result fluctuates when the test is taken several times), and neither high convergent validity (i.e., agreement with other personality tests) nor high predictive validity (i.e., usefulness for predicting professional success) can be demonstrated.
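To make the two criteria named above more tangible, here is a minimal sketch of how retest reliability and predictive validity are commonly quantified as correlation coefficients. It is an illustration only, not the procedure prescribed by DIN 33430 or used by any test provider. The function name `pearson` and all scores are hypothetical and invented for this example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


# Hypothetical data (assumption): six candidates take the same test twice,
# and their later job performance is rated on a 1-5 scale.
first_run = [12, 18, 25, 31, 40, 44]
second_run = [14, 17, 27, 29, 41, 45]
performance = [2.1, 2.8, 3.0, 3.9, 4.2, 4.6]

# Retest reliability: do the same people obtain similar scores both times?
retest_reliability = pearson(first_run, second_run)

# Predictive validity: do test scores go along with later performance?
predictive_validity = pearson(first_run, performance)

print(f"retest reliability:  {retest_reliability:.2f}")
print(f"predictive validity: {predictive_validity:.2f}")
```

A coefficient close to 1 means that the candidates keep their rank order from one measurement to the next. A test whose result fluctuates between runs yields a low coefficient, and a test that does not predict professional success yields a coefficient near 0 for the second comparison. Real validation studies rely on far larger samples than this toy example.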

The example shows that the goal of distinguishing suitable from unsuitable applicants does not seem to be achieved with the help of the MBTI. The same applies to similar tests (e.g., DISC) that are based on type classifications. Yet despite frequent criticism of their lack of scientific basis, these tests are still an integral part of personnel selection.

But why, when scientifically sound aptitude diagnostic methods are available?

The appeal of these processes lies in their simplicity

Because the tests build on simple lay theories from everyday life, they naturally seem quite plausible. In classical test theory, this is called face validity, i.e., the extent to which the test result and its use in the personnel selection context are logically comprehensible. In addition, the descriptions are so general (e.g., the classification into a few easily understood dimensions in the MBTI) that participants (almost) always recognize themselves in them, a phenomenon known in psychology as the Barnum effect (Forer 1949).

Applicants are therefore usually satisfied with the test result, which usually sounds good, and so are the HR managers. The extent to which the result ultimately contributes to predicting professional success is often secondary.

“While the simplicity of the testing procedures contributes to their great popularity, the added value for the actual goal of personnel selection, the identification of the most suitable applicants for a position, often falls by the wayside.”

To help you find your way around the market, we have compiled a list of key questions that help you distinguish scientific from unscientific methods.

10 key questions to identify scientific tests

Is the test based on an established theory, i.e., does it have a scientific origin?
How strong is the provider's expertise? Is the team heterogeneous and diverse, with experts in personnel diagnostics, psychology, and data science?
Does the test have a clear link to job requirements, i.e., is it based on a systematic requirements analysis?
Are data on the objectivity, reliability, and validity of the test procedure available? In particular, how high is the predictive validity for professional success? (For your review, we recommend following the requirements for scientific tests set out in DIN 33430.)
Are sufficiently large norm samples available, and are they updated regularly?
Are the test procedures continuously developed and adapted to changes in a complex, dynamic working environment? (See also the blog article on the continuous improvement process in aptitude diagnostics.)
Is the test fair, i.e., does it avoid discrimination based on gender, age, or ethnic background? Is unconscious bias prevented?
Is the test economical, in other words, is the benefit of the test procedure commensurate with its duration?
Are procedures in place for preventing deception (particularly in online assessments) and for reducing social desirability bias (especially in self-assessments)?
Is the test experienced as positive by the target group, and does it contribute to a positive candidate experience and a stronger employer brand?

Sources

Bosco, F., Allen, D.G., & Singh, K. (2015). Executive attention: An alternative perspective on general mental ability, performance, and subgroup differences. Personnel Psychology, 68 (4), 859-898. https://doi.org/10.1111/peps.12099
DIN German Institute for Standardization e.V. (2016). Requirements for procedures and their use in job-related aptitude assessments: DIN 33430. Berlin: Beuth. https://doi.org/10.31030/2514220
Forer, B.R. (1949). The fallacy of personal validation: A classroom demonstration of gullibility. Journal of Abnormal Psychology, 44, 118-123. PMID 18110193.
Jung, C.G. (1921). Psychological types. Düsseldorf: Solothurn.
Kanning, U.P. (2019). Personnel diagnostics standards. Design personnel selection professionally (2nd, revised and expanded edition). Göttingen: Hogrefe.
Meade, A.W., Pappalardo, G., Braddy, P.W., & Fleenor, J.W. (2020). Rapid Response Measurement: Development of a Faking-Resistant Assessment Method for Personality. Organizational Research Methods, 23 (1), 181-207. https://doi.org/10.1177/109442811879529
Pittenger, D.J. (1993). Measuring the MBTI... And Coming Up Short. Journal of Career Planning and Employment, 54 (1), 8-52.
Ryan, A. & Ployhart, R. (2000). Applicant Perceptions of Selection Procedures and Decisions: A Critical Review and Agenda for the Future. Journal of Management, 26, 565-606. https://doi.org/10.1177/014920630002600308
Schwarz, N. (1999). Self-reports: How the questions shape the answers. American Psychologist, 54 (2), 93-105. https://doi.org/10.1037/0003-066X.54.2.93
Schuler, H. & Höft, S. (2007). Diagnosis of professional aptitude and performance. In H. Schuler (Ed.), Textbook Organizational Psychology (pp. 289-343). Bern: Huber.