A New Way to Take the 16 Personalities Test (Without Faking It)

The reliability of results obtained through personality psychometric testing remains a subject of extensive discussion in academic and applied contexts alike. Despite decades of theoretical development and practical refinement of tests—such as the MBTI, the Big Five, or Socionics-based questionnaires—the fundamental problem of validating respondent answers remains highly relevant.
Traditionally, validity refers to the degree of alignment between the parameters measured and the actual psychological characteristics of the individual. However, under real-world conditions—particularly in corporate environments—test validity is undermined not only by internal methodological limitations but also by the behavioral tendencies of the test-taker.
II. The Problem of Validity in Typological Testing
A. Conceptual Foundations of Validity and Their Practical Limitations
In traditional psychometrics, validity is regarded as a fundamental criterion for test quality, encompassing several dimensions. Construct validity determines the extent to which a given scale accurately reflects the intended psychological construct (e.g., introversion or logical reasoning). Content validity assesses how comprehensively a test represents the theoretical domain it is based on, while predictive validity measures the test’s ability to forecast behavioral or professional outcomes.
However, in practice—particularly in large-scale applications within HR, education, or coaching—these types of validity often prove insufficient. They fail to account for variability in the motivation and cognitive state of the test-taker, leading to systematic distortions in the resulting data.
B. The Human Factor as a Source of Distortion
1. Intentional Distortion – Lying for the Sake of Outcome
One of the most well-documented phenomena is the respondent’s tendency to manipulate the outcome. A common example is “social desirability” (Paulhus, 1991), where individuals select answers they believe will be viewed positively by others or by employers. This is particularly relevant in corporate environments, where test results may influence promotion decisions, team composition, or access to projects.
Such distortions significantly compromise the construction of an accurate psychological profile, especially when the respondent is well-informed about the principles behind the test and intentionally crafts a desirable version of themselves.
2. Sabotage and Effort Minimization
Another risk category is the respondent’s overt unwillingness to engage in the process. An employee may perceive the test as an imposed obligation irrelevant to their actual job performance. In such cases, the primary motivation becomes completing the test as quickly as possible without processing the content of the questions.
This behavior is particularly evident in linear questionnaires with semantically repetitive items—fatigue and boredom lead to random or inconsistent responses.
3. Superficial Processing and Fragmented Thinking
Modern generations—especially members of Generation Z—have grown up in a high-speed information environment. Their cognitive profile is marked by reduced attention span, a preference for visual content, and a high degree of attention-switching. The influence of the internet and social media has fundamentally altered the way information is processed—from sequential to fragmented, from logical to associative (Carr, 2010; Twenge, 2017).
In testing contexts, this often results in questions being skimmed rather than thoughtfully considered. Thus, even in the absence of deliberate distortion, the validity of responses is threatened by shifts in generational cognitive architecture.
4. Cognitive Fatigue and Psychophysiological Strain
Even with sincere engagement, lengthy testing can itself become a source of distortion. After 30–40 similar questions, attention dulls, irritation increases, and perceptual errors begin to emerge. The effect of “decision fatigue” (Baumeister et al., 1998)—a decline in the quality of decision-making under prolonged cognitive load—becomes apparent. This is especially pronounced among individuals with high emotional reactivity and low tolerance for monotony.
Typological tests rarely account for these factors. They are based on the assumption of a rational decision-making process, overlooking the fact that responses may vary depending on fatigue levels, emotional state, or external pressures (e.g., a time-constrained office setting).
III. Analysis of Structural Issues in Traditional Tests
Contemporary typological tests, widely used in HR and psychological consulting, often retain a traditional linear architecture: a fixed set of statements, each requiring the selection of one predetermined response option. At first glance, this structure appears reliable and time-tested. However, it is precisely this format that frequently becomes a source of validity distortions, especially in real-world application contexts.
A. Linear Structure and Fatigue Effect
A typical linear personality type test includes between 60 and 100 items. For example, the MBTI Form M contains 93 items; Big Five assessments range from 50 to 120; and professional Socionics questionnaires include no fewer than 80. Each item requires a minimal degree of analysis, comparison with prior statements, and a decision. Under time constraints and limited attention, such a format quickly induces cognitive fatigue.
A decline in response quality typically begins in the second half of the test: the respondent accelerates their pace, more frequently selects extreme or patterned responses, and reduces the level of analytical engagement. This effect is documented in a study by Mikki Hebl (2000), which found that cognitive overload during the second half of lengthy questionnaires correlates with an increase in response inconsistency.
B. Repetitive Formulations and Loss of Engagement
Many items in traditional tests repeat essentially the same semantic content with slight variations in wording:
- "I enjoy spending time alone,"
- "Social interaction tires me,"
- "I find it difficult to be the center of attention."
For the attentive respondent, this comes across as redundancy; for the inattentive one—as an annoying repetition. In both cases, engagement is lost. The individual begins seeking a "quick exit"—mechanically completing the answers or clicking the first plausible option. This results in the illusion of data in the absence of real psychological substance.
C. Lack of Adaptivity
Most traditional tests disregard individual differences in information processing speed, cognitive style, or even basic psychophysiological traits: some people process text slowly, while others quickly lose focus. In contrast to modern adaptive testing systems (such as the GRE CAT), typological tests rarely employ logic that adjusts to the respondent’s answers.
In other words, regardless of how one responds, every respondent proceeds through the entire set of items in the same order and under the same cognitive load. This structural rigidity itself contributes to the degradation of validity.
IV. Validation Models Proposed by Opteamyzer
Acknowledging the limitations of classical psychometric approaches, Opteamyzer has developed a set of solutions aimed at enhancing the validity and reliability of data obtained during typological assessments. The primary goal of these solutions is not merely to achieve more “accurate” results, but to eliminate the root causes that lead respondents to distort their answers or lose engagement. The platform incorporates three key innovations into its architecture: scenario-based (nonlinear) testing, time tracking, and incongruent control items.
A. Scenario-Based Nonlinear Structure
As discussed earlier, traditional tests tend to overwhelm respondents through a linear, monotonous structure. Opteamyzer adopts a fundamentally different approach: each subsequent question depends on the respondent’s previous answer. This creates a nonlinear logic and reduces the total number of questions severalfold.
For example:
- A test originally containing 60–80 items can be condensed into 12–16 while maintaining accuracy.
- The time required to complete the test is reduced to five minutes, aligning with the actual time constraints typical in workplace or industrial settings.
- The participant experiences the process as dynamic, which increases motivation and reduces the likelihood of refusal or mechanical completion.
Similar approaches have long demonstrated effectiveness in adaptive systems such as GRE CAT and GMAT, where each subsequent question is calibrated based on the test-taker’s previous performance (Weiss, 1982). However, such logic has rarely been applied within typological assessments to date.
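The branching logic described above can be sketched as a simple decision tree: each answer selects the next question, so a respondent traverses one short path rather than a fixed list. This is a minimal illustrative sketch—the question IDs, wording, and two-way branching are hypothetical, not Opteamyzer’s actual items or routing rules.

```python
# Minimal sketch of scenario-based (nonlinear) testing.
# Each answer maps to the ID of the next question; None ends the path.
# All question content here is hypothetical, for illustration only.

QUESTIONS = {
    "q1": {
        "text": "A deadline moves up unexpectedly. What do you do first?",
        "answers": {
            "a": ("Re-plan the schedule alone", "q2"),
            "b": ("Call the team together", "q3"),
        },
    },
    "q2": {
        "text": "While re-planning, what matters most?",
        "answers": {
            "a": ("Logical consistency of the plan", None),
            "b": ("Keeping everyone comfortable", None),
        },
    },
    "q3": {
        "text": "In the meeting, what is your main role?",
        "answers": {
            "a": ("Driving decisions", None),
            "b": ("Gathering opinions", None),
        },
    },
}

def run_session(choices):
    """Walk the tree following a pre-recorded list of answer keys;
    returns the sequence of (question_id, answer_key) actually visited."""
    path, current = [], "q1"
    for key in choices:
        _, next_q = QUESTIONS[current]["answers"][key]
        path.append((current, key))
        if next_q is None:
            break
        current = next_q
    return path

# A respondent answers only the questions on their own branch:
print(run_session(["b", "a"]))  # [('q1', 'b'), ('q3', 'a')]
```

Because every answer prunes the remaining tree, a bank equivalent to 60–80 linear items can be covered by a dozen well-placed branching points, which is the compression effect described above.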
B. Time Tracking as a Tool for Validity Assessment
Unlike some modern tests that use countdown timers to pressure the respondent, Opteamyzer avoids forcing decision-making under artificial time constraints. Instead, it implements unobtrusive time tracking: the system records the total time from the first to the last response, as well as the duration of each individual reaction.
This allows the platform to:
- Identify unusually fast completions (e.g., 12 questions in 25 seconds), which strongly indicate non-reflective behavior.
- Detect instances where the respondent lingered on a specific question for an unusually long time, which may point to hesitation, stress, or technical difficulties.
In this way, time becomes a meta-indicator that helps differentiate valid responses from statistical noise.
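A minimal sketch of this meta-indicator, assuming per-question timestamps are available: flag sessions whose average pace is implausibly fast, and flag individual items whose duration is an extreme outlier relative to the session median. The thresholds here are illustrative assumptions, not Opteamyzer’s actual calibration.

```python
# Sketch of unobtrusive time tracking as a validity meta-indicator.
# Thresholds below are illustrative assumptions.

from statistics import median

MIN_SECONDS_PER_ITEM = 2.0   # assumed floor for a reflective answer
OUTLIER_FACTOR = 10.0        # an item taking >10x the median is flagged

def audit_session(timestamps):
    """timestamps: list of (question_id, seconds_spent) pairs.
    Returns total time plus any validity flags."""
    durations = [t for _, t in timestamps]
    total = sum(durations)
    flags = []
    if total / len(durations) < MIN_SECONDS_PER_ITEM:
        flags.append("non_reflective_pace")      # e.g. 12 questions in 25 s
    med = median(durations)
    for qid, t in timestamps:
        if t > OUTLIER_FACTOR * med:
            flags.append(f"long_hesitation:{qid}")  # stress or technical issue
    return {"total_seconds": total, "flags": flags}

print(audit_session([("q1", 1.1), ("q2", 0.9), ("q3", 1.4), ("q4", 18.0)]))
```

Note that neither flag invalidates a session on its own; both feed into the broader decision of whether the responses reflect genuine engagement or statistical noise.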
C. Incongruent Control Items
Another structural component implemented in Opteamyzer is the inclusion of control items that have no logical connection to the presented scenario and are formulated as “inserted anomalies.” These options are embedded among standard response choices. When selected, they signal potential invalidity of the test session.
For example, in a workplace conflict scenario, an available response option might be “have a coffee and forget the situation.” While seemingly benign, it is semantically inappropriate. If a respondent selects such options:
- Once — the test may be flagged as questionable.
- Twice or more — the session is either invalidated or requires review by a specialist.
This method is comparable to techniques used in clinical psychodiagnostics to detect invalid response protocols (e.g., the F and L scales in the MMPI).
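The scoring rule described above can be sketched as a simple counter over a set of known anomalous options. The item and option identifiers are hypothetical; only the thresholds (one hit marks the session questionable, two or more trigger invalidation or review) come from the text.

```python
# Sketch of incongruent-control-item scoring. Option names are hypothetical;
# the 1-hit / 2-hit thresholds follow the rule described in the text.

CONTROL_OPTIONS = {
    ("conflict_scenario", "coffee_and_forget"),  # semantically inappropriate
    ("deadline_scenario", "quote_a_poem"),
}

def session_status(responses):
    """responses: list of (question_id, option_id) pairs."""
    hits = sum(1 for r in responses if r in CONTROL_OPTIONS)
    if hits == 0:
        return "valid"
    if hits == 1:
        return "questionable"
    return "invalid_or_review"

print(session_status([("conflict_scenario", "talk_it_out")]))        # valid
print(session_status([("conflict_scenario", "coffee_and_forget")]))  # questionable
```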
V. Ethical and Practical Considerations in Implementing New Models
The development and implementation of new methods for validating data in typological testing inevitably raise ethical, motivational, and organizational concerns. Even the most precise technologies, if introduced without consideration of the human factor, risk not only losing their effectiveness but also alienating participants.
A. Voluntariness vs. Accuracy: Striking a Balance
One of the fundamental principles of personality testing is the voluntary nature of participation. However, in corporate practice, formal coercion is common—an employee is "invited" to take a test, but in reality, has no viable alternative. This undermines trust, reduces motivation, and leads to response distortion.
The Opteamyzer model accounts for this by emphasizing interface transparency: the test appears simple, short, and engaging, thereby giving users a sense of control over the process. The scenario-based format transforms the test into something akin to a dialogue or interactive task, reducing resistance and increasing sincerity.
B. Minimizing Cognitive Load as a Humanistic Approach
Traditional tests often operate under the implicit assumption that the more questions there are, the more reliable the result. However, recent research shows that fatigue, irritability, and anxiety during testing are direct sources of data distortion (Kaplan & Saccuzzo, 2017). Simplifying the test structure does not dilute its essence but rather reflects an adaptation to the cognitive reality of the modern individual.
Thus, a humanistic approach does not entail rejecting diagnostics, but rather caring about its form: Opteamyzer offers a format in which the respondent no longer feels like a research object but instead becomes a co-author of the process.
C. Increasing Engagement Without Manipulation
The use of gamification, adaptive branching, and a neutral visual environment all contribute to enhancing intrinsic motivation. However, it is important to avoid manipulative tactics such as intrusive timers, reward schemes for completion, or emotional pressure.
Opteamyzer maintains psychological neutrality in its interface and allows results to be used as part of a dialogue with the user—not as a final verdict, but as a hypothesis requiring collaborative interpretation. This restores a sense of agency to the individual and strengthens trust in the system.
Conclusion
Modern psychometrics faces a dual challenge: on the one hand, the need to improve the precision and reliability of measurements; on the other, a profound transformation in how individuals perceive information, sustain motivation, and process cognitive tasks. This tension is especially pronounced in typological testing, where the accuracy of data directly depends on the respondent’s attentiveness, engagement, and psychological state at the moment of participation.
As demonstrated above, response distortion can result not only from deliberate misrepresentation (e.g., impression management) but also from structural flaws in the tests themselves—excessive length, repetition, lack of adaptivity, and misalignment with actual conditions of information consumption. Superficial thinking, fatigue, irritability, and even indifference all contribute to data distortion, which in turn leads to flawed models of personality, team dynamics, and career planning.
Opteamyzer introduces a new generation of tools designed to identify and prevent invalid responses. Scenario-based (nonlinear) testing dramatically reduces cognitive load without sacrificing precision. Time tracking functions as an objective meta-indicator of response sincerity, and the inclusion of control items helps flag mechanical or disengaged behavior. These methods pave the way for a new ethics of assessment—one that is more respectful of the individual, more sensitive to context, and ultimately more accurate in its outcomes.