Monday, February 2, 2015

Why does item response theory matter?



by Nazia Rahman, a member of the Law School Admission Council’s Psychometric Research group.

A great deal of intellectual effort goes into creating the LSAT—India™, and a crucial part of this work is carried out by measurement scientists, also known as psychometricians. The goal is always to work with subject-matter experts to design and build tests that are valid, reliable, and fair, in keeping with industry best practices. To this end, I recently took part in presenting a paper on one aspect of this work: using item response theory to assemble the LSAT—India™.

Item response theory (IRT) is a statistical approach used in the analysis of test data, and it is widely applied throughout North America and Europe. In the testing world, questions are known as “items.” By allowing for greater precision in the estimation of psychometric characteristics at both the item level and the test level, IRT provides better control over the characteristics of both the assembled test forms and the test scores provided.

IRT methodology is employed by the Law School Admission Council (LSAC) in support of the LSAT—India™. IRT is used to describe test characteristics, such as difficulty at both the question level and the test level. IRT parameters describing the characteristics of test items are used in the assembly of LSAT—India™ forms to ensure proper levels of difficulty and score precision.

At LSAC, psychometric support is provided for the development of items, assembly of test forms, and post-test analyses. The construction process for LSAT—India™ test forms is guided by IRT-based descriptions of item characteristics, such as difficulty, as well as strict content requirements. Should the testing program expand, IRT equating methods could allow for the comparison of LSAT—India™ scores earned across multiple administrations using multiple test forms.

Statistical support spans the entire process from test creation to score reporting.  In assembling a test form, statistical characteristics, such as question difficulty, are used so that the overall difficulty of a test form to be administered conforms to predefined specifications. After the test has been administered, further statistical analyses are carried out to verify whether individual items and the test form as a whole performed as expected.  Ultimately, statistical analyses are used to generate final test scores. 
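The idea of assembling a form so that its overall difficulty matches a specification can be sketched with a simple greedy selection. This is only an illustration under stated assumptions: the item pool, the item identifiers, the target value, and the `assemble_form` function are all hypothetical, and operational test assembly also has to satisfy content requirements that this sketch ignores.

```python
def assemble_form(pool, form_length, target_mean_difficulty):
    """Greedily choose items so the form's mean difficulty nears a target.

    pool: dict mapping a (hypothetical) item id to its difficulty value.
    Returns the chosen item ids and the resulting mean difficulty.
    """
    if form_length < 1 or form_length > len(pool):
        raise ValueError("form_length must be between 1 and the pool size")

    remaining = dict(pool)  # copy so the caller's pool is left untouched
    chosen = []
    total = 0.0
    for n in range(1, form_length + 1):
        # Pick the item that brings the running mean closest to the target.
        best = min(
            remaining,
            key=lambda item_id: abs(
                (total + remaining[item_id]) / n - target_mean_difficulty
            ),
        )
        total += remaining.pop(best)
        chosen.append(best)
    return chosen, total / form_length


# Hypothetical pool of six items with made-up difficulty values.
pool = {"Q1": -1.5, "Q2": -0.5, "Q3": 0.0, "Q4": 0.4, "Q5": 1.2, "Q6": 2.0}
items, mean_difficulty = assemble_form(pool, form_length=3, target_mean_difficulty=0.0)
print(items, round(mean_difficulty, 3))
```

A greedy pass like this is easy to follow but is not guaranteed to find the best possible combination of items; it is meant only to show how item-level difficulty values can steer the difficulty of the whole form.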

LSAC staff with advanced degrees in areas such as logic, English, linguistics, and psychometrics review item and test form statistics for the administered exam in search of any items that perform differently from what is expected. If this review determines that an item is unfair in any way to any segment of the test-taking population, the item is removed from the scoring of the test. However, because of the very stringent review process that occurs before the test is administered, removal of an item has never been necessary for the LSAT—India™.
 
Schools, colleges, and educational institutions can have greater confidence in the test scores produced through these stringent statistical analyses. Exam candidates themselves can likewise have far greater confidence in the accuracy of their test results, and ultimately in the important high-stakes decisions that are made based on these test scores.