The reliability and precision of total scores and IRT estimates as a function of polytomous IRT parameters and latent trait distribution created by Steven Andrew Culpepper

By:

Culpepper, Steven Andrew [author]

Material type: Text

Publication details: USA : Sage; 2013

Content type: text

Media type: unmediated

Carrier type: volume

Summary: A classic topic in the fields of psychometrics and measurement has been the impact of the number of scale categories on test score reliability. This study builds on previous research by further articulating the relationship between item response theory (IRT) and classical test theory (CTT). Equations are presented for comparing the reliability and precision of scores within the CTT and IRT frameworks. This study presented new results pertaining to the relative precision (i.e., the test score conditional standard error of measurement for a given trait value) of CTT and IRT, and the new results shed light on the conditions where total scores and IRT estimates are more or less precisely measured. The relative reliability of CTT and IRT scores is examined as a function of item characteristics (e.g., locations, category thresholds, and discriminations) and subject characteristics (e.g., the skewness and kurtosis of the latent distribution). CTT total scores were more reliable when the latent distribution was mismatched with category thresholds, but the discrepancy between CTT and IRT declined as the number of scale categories increased. This article also considered the appropriateness of linear approximations of polytomous items and presented circumstances where linear approximations are viable. A linear approximation may be appropriate for items with two response options depending on the item discrimination and the match between the item location and latent distribution. However, linear approximations are biased whenever items are located in the tails of the latent distribution and the bias is larger for more discriminating items.

Holdings
Item type	Current library	Call number	Vol info	Copy number	Status	Notes	Date due	Barcode
Journal Article	Main Library - Special Collections	BF39 APP	Vol. 37, No. 3 pages 201-225	SP17305	Not for loan	For in-house use only

Browsing Main Library shelves, Shelving location: Special Collections

BF39 APP	Optimal test design with rule-based item generation
BF39 APP	A review of DIMPACK version 1.0: conditional covariance–based test dimensionality analysis package
BF39 APP	The random-threshold generalized unfolding model and its application of computerized adaptive testing
BF39 APP	The reliability and precision of total scores and IRT estimates as a function of polytomous IRT parameters and latent trait distribution
BF39 APP	Two approaches to estimation of classification accuracy rate under item response theory
BF39 APP	IRTPRO 2.1 for Windows (Item Response Theory for Patient-Reported Outcomes)
BF39 APP	RaschFit.sas: A SAS Macro for Generating Rasch Model Expected Values, Residuals, and Fit Statistics
