Why does item response theory matter?
by Nazia Rahman, a member of the Law School Admission Council’s Psychometric Research group.
A great deal of intellectual effort goes into creating the LSAT—India™, and a crucial part of this work is carried out by measurement scientists, also known as psychometricians. The goal is always to work with subject-matter experts to design and build tests, according to industry best practices, that are valid, reliable, and fair. To this end, I recently participated in presenting a paper on one aspect of this work: using item response theory to assemble the LSAT—India™.
Item response theory (IRT) is a statistical approach used in the analysis of test data, and it is widely applied throughout North America and Europe. In the testing world, questions are known as “items.” By allowing for greater precision in the estimation of psychometric characteristics at both the item level and the test level, IRT provides better control over the characteristics of the assembled test forms and of the test scores they produce.
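The post does not say which IRT model the LSAT—India™ uses, but a common choice in large-scale testing is the three-parameter logistic (3PL) model, in which each item is described by a discrimination, a difficulty, and a pseudo-guessing parameter. As a minimal illustration (not LSAC's actual implementation), the model gives the probability that a test taker of a given ability answers an item correctly:

```python
import math

def p_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model.

    theta: test-taker ability (conventionally on a scale centered at 0)
    a: item discrimination (how sharply the item separates abilities)
    b: item difficulty (ability level at the item's inflection point)
    c: pseudo-guessing parameter (lower asymptote of the curve)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# An average-ability test taker (theta = 0) on an average-difficulty item
# (b = 0) with some guessing (c = 0.2) answers correctly with probability
# 0.2 + 0.8 * 0.5 = 0.6.
print(p_3pl(0.0, a=1.0, b=0.0, c=0.2))
```

Because the curve rises with ability, items of different difficulties pin down different parts of the ability scale, which is what makes item-level control of a test form possible.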
The Law School Admission Council (LSAC) employs IRT methodology in support of the LSAT—India™. IRT is used to describe test characteristics, such as difficulty, at both the question level and the test level. IRT parameters describing the characteristics of test items are used in the assembly of LSAT—India™ forms to ensure proper levels of difficulty and score precision.
At LSAC, psychometric support is provided for the development of items, the assembly of test forms, and post-test analyses. The construction process for LSAT—India™ test forms is guided by IRT-based descriptions of item characteristics, such as difficulty, as well as by strict content requirements. Should the testing program expand, IRT equating methods could allow for the comparison of LSAT—India™ scores earned across multiple administrations using multiple test forms.
Statistical support spans the entire process from test creation to score reporting. In assembling a test form, statistical characteristics such as question difficulty are used so that the overall difficulty of the form to be administered conforms to predefined specifications. After the test has been administered, further statistical analyses are carried out to verify whether individual items and the test form as a whole performed as expected. Ultimately, statistical analyses are used to generate the final test scores.
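One simple way to picture the post-administration check described above is to compare, for each item, the observed proportion of correct responses against the proportion the fitted IRT model predicts. Operational programs use formal item-fit statistics rather than this raw comparison, so the sketch below is only a hypothetical screening step; the 3PL model, the tolerance, and all parameter values are assumptions for illustration:

```python
import math

def p_3pl(theta, a, b, c):
    # 3PL probability of a correct response
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def flag_misfitting_items(responses, abilities, params, tol=0.05):
    """Return indices of items whose observed proportion correct differs
    from the model-predicted proportion by more than tol.

    responses: per-item lists of 0/1 scores, one inner list per item
    abilities: estimated theta for each test taker
    params: (a, b, c) tuple for each item
    """
    flagged = []
    for i, (scores, (a, b, c)) in enumerate(zip(responses, params)):
        observed = sum(scores) / len(scores)
        expected = sum(p_3pl(t, a, b, c) for t in abilities) / len(abilities)
        if abs(observed - expected) > tol:
            flagged.append(i)
    return flagged
```

An item flagged this way would then get the kind of expert scrutiny the next paragraph describes, rather than being dropped automatically.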
LSAC staff with advanced degrees in areas such as logic, English, linguistics, and psychometrics review item and test-form statistics for the administered exam, looking for any items that perform differently from what is expected. If this review determines that an item is unfair in any way to any segment of the test-taking population, the item is removed from the scoring of the test. Thanks to the very stringent review process that occurs before the test is administered, however, removing an item has never been necessary for the LSAT—India™.
Schools, colleges, and other educational institutions can have greater confidence in test scores backed by these stringent statistical analyses. It also means that the exam candidates themselves can have far greater confidence in the accuracy of their test results, and ultimately in the important high-stakes decisions that are made based on those scores.