CLASSICAL TEST THEORY AND ITEM RESPONSE THEORY
PERCEPTIONS OF THE VALIDITY OF TEST SCORES
In testing, achievement results from the confrontational interaction between
the level of ability under measurement that a testee possesses and the level
of the trait demanded by the task developed to evoke that ability. Hence test
scores have meaning, that is, they are valid, to the extent that only the
ability under measurement sustained the responses to each item in the test
from which the scores were generated.
While classical test theory (CTT) thrives on reliability, that is, how well a
test repeatedly gives the same score, item response theory (IRT) sets out to
ensure that the score generated through a test represents the truth. That is,
CTT is tilted towards satisfying the dictates of reliability, the
repeatability of the same test score across time, forms, and parts of the same
test, whereas IRT is tilted towards ensuring a valid test, that is, ensuring
that only what the test was designed to measure underlies responses to its
items. In other words, reliability is closer to the heart of CTT than to that
of IRT; hence CTT is often referred to as reliability theory. For CTT,
validity is loosely defined because there is no attempt to ensure that what
underlies responses to the items in a test is what the test was designed to
measure, whereas for IRT the assumption of unidimensionality implies both a
recognition of this requirement and a step taken to ensure that items that fit
any of its models measure one and only one ability.
Key words: Classical test theory; item response theory; reliability; validity;
unidimensionality.
H. Johnson Nenty, PhD (Prof.)