-
Notifications
You must be signed in to change notification settings - Fork 3
/
Copy path09-Item-Response-Theory.Rmd
2689 lines (2036 loc) · 177 KB
/
09-Item-Response-Theory.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
# Item Response Theory {#irt}
In the chapter on [reliability](#reliability), we introduced [classical test theory](#ctt).\index{reliability}\index{classical test theory}
[Classical test theory](#ctt) is a measurement theory of how test scores relate to a construct.\index{classical test theory}
[Classical test theory](#ctt) provides a way to estimate the relation between the measure (or item) and the construct.\index{classical test theory}
For instance, with a [classical test theory](#ctt) approach, to estimate the relation between an item and the construct, you would compute an [item–total correlation](#itemTotalCorrelation-reliability).\index{classical test theory}\index{reliability!internal consistency!average item–total correlation}
An [item–total correlation](#itemTotalCorrelation-reliability) is the correlation of an item with the total score on the measure (e.g., sum score).\index{reliability!internal consistency!average item–total correlation}
The [item–total correlation](#itemTotalCorrelation-reliability) approximates the relation between an item and the construct.\index{reliability!internal consistency!average item–total correlation}
However, the [item–total correlation](#itemTotalCorrelation-reliability) is a crude estimate of the relation between an item and the construct.\index{reliability!internal consistency!average item–total correlation}
And there are many other ways to characterize the relation between an item and a construct.
One such way is with item response theory (IRT).\index{item response theory}
## Overview of IRT {#overview-irt}
Unlike [classical test theory](#ctt), which is a measurement theory of how test scores relate to a construct, IRT is a measurement theory that describes how an *item* is related to a construct.\index{classical test theory}\index{item response theory}
For instance, given a particular person's level on the construct, what is their chance of answering "TRUE" on a particular item?\index{item response theory}
IRT is an approach to [latent variable modeling](#latentVariableModeling).\index{item response theory}\index{latent variable}
In IRT, we estimate a person's construct score (i.e., level on the construct) based on their item responses.
The construct is estimated as a latent factor that represents the common variance among all items as in [structural equation modeling](#sem) or [confirmatory factor analysis](#cfa).\index{item response theory}
The person's level on the construct is called theta ($\theta$).\index{item response theory}\index{item response theory!theta}
When dealing with performance-based tests, theta is sometimes called "ability."\index{item response theory}\index{item response theory!theta}
### Item Characteristic Curve {#icc}
In IRT, we can plot an *item characteristic curve* (ICC).\index{item response theory}\index{item response theory!item characteristic curve}
The ICC is a plot of the model-derived probability of a symptom being present (or a correct response) as a function of a person's standing on a latent continuum.\index{item response theory!item characteristic curve}
For instance, we can create empirical ICCs that can take any shape (see Figure \@ref(fig:empiricalICC)).\index{item response theory!item characteristic curve}
```{r, include = FALSE}
empiricalICCdata <- data.frame(
item1 = c(.40, .63, .73, .85, .93, .95, .97, .99, .99),
item2 = c(.20, .53, .56, .76, .90, .94, .95, .98, .99),
item3 = c(.10, .28, .39, .58, .75, .82, .88, .94, .99),
item4 = c(.05, .24, .30, .50, .68, .73, .84, .92, .99),
item5 = c(.03, .19, .27, .38, .49, .53, .79, .97, .99),
item6 = c(.02, .15, .25, .43, .57, .65, .75, .85, .95),
item7 = c(.01, .07, .15, .29, .42, .59, .70, .82, .90),
item8 = c(.01, .05, .10, .26, .35, .46, .60, .80, .85),
item9 = c(.01, .02, .05, .16, .23, .35, .46, .55, .70),
item10 = c(.01, .02, .03, .05, .06, .07, .10, .14, .20),
itemSum = 1:9
)
```
```{r empiricalICC, echo = FALSE, results = "hide", out.width = "100%", fig.align = "center", fig.height = 8, fig.cap = "Empirical Item Characteristic Curves of the Probability of Endorsement of a Given Item as a Function of the Person's Sum Score.", fig.scap = "Empirical Item Characteristic Curves of the Probability of Endorsement of a Given Item as a Function of the Person's Sum Score."}
plot(empiricalICCdata$itemSum, empiricalICCdata$item10, type = "n", xlim = c(1,9), ylim = c(0,1), xlab = "Person's Sum Score", ylab = "Probability of Item Endorsement", xaxt = "n")
lines(empiricalICCdata$itemSum, empiricalICCdata$item1, type = "b", pch = "1")
lines(empiricalICCdata$itemSum, empiricalICCdata$item2, type = "b", pch = "2")
lines(empiricalICCdata$itemSum, empiricalICCdata$item3, type = "b", pch = "3")
lines(empiricalICCdata$itemSum, empiricalICCdata$item4, type = "b", pch = "4")
lines(empiricalICCdata$itemSum, empiricalICCdata$item5, type = "b", pch = "5")
lines(empiricalICCdata$itemSum, empiricalICCdata$item6, type = "b", pch = "6")
lines(empiricalICCdata$itemSum, empiricalICCdata$item7, type = "b", pch = "7")
lines(empiricalICCdata$itemSum, empiricalICCdata$item8, type = "b", pch = "8")
lines(empiricalICCdata$itemSum, empiricalICCdata$item9, type = "b", pch = "9")
lines(empiricalICCdata$itemSum, empiricalICCdata$item10, type = "l")
points(empiricalICCdata$itemSum, empiricalICCdata$item10, type = "p", pch = 19, col = "white", cex = 3)
text(empiricalICCdata$itemSum, empiricalICCdata$item10, labels = "10")
axis(1, at = 1:9, labels = 1:9)
```
In a model-implied ICC, we fit a logistic (sigmoid) curve to each item's probability of a symptom being present as a function of a person's level on the latent construct.\index{item response theory!item characteristic curve}
The model-implied ICCs for the same 10 items from Figure \@ref(fig:empiricalICC) are depicted in Figure \@ref(fig:modelImpliedICC).\index{item response theory!item characteristic curve}
```{r, include = FALSE}
#https://www.statforbiology.com/nonlinearregression/usefulequations#logistic_curve; archived at https://perma.cc/8WFX-FCEQ
#https://github.com/OnofriAndreaPG/aomisc/blob/1eb698b3bc5f55a718c37bfd2028b9ac73a6fbbe/R/SSL.R; archived at https://perma.cc/94CM-PSGY
library("viridis")
#Log-Logistic Function nlsL.2 (similar to L.3 from drc package)
L2.fun <- function(predictor, a, b) {
x <- predictor
1/(1 + exp( - a* (x - b)))
}
L2.Init <- function(mCall, LHS, data, ...) {
xy <- sortedXyData(mCall[["predictor"]], LHS, data)
x <- xy[, "x"]; y <- xy[, "y"]
d <- 1
## Linear regression on pseudo y values
pseudoY <- log((d - y)/(y+0.00001))
coefs <- coef( lm(pseudoY ~ x))
k <- coefs[1]; a <- - coefs[2]
b <- k/a
value <- c(a, b)
names(value) <- mCall[c("a", "b")]
value
}
NLS.L2 <- selfStart(L2.fun, L2.Init, parameters = c("a", "b"))
twoPLitem1 <- nls(item1 ~ NLS.L2(itemSum, a, b), data = empiricalICCdata, control = list(warnOnly = TRUE))
twoPLitem2 <- nls(item2 ~ NLS.L2(itemSum, a, b), data = empiricalICCdata, control = list(warnOnly = TRUE))
twoPLitem3 <- nls(item3 ~ NLS.L2(itemSum, a, b), data = empiricalICCdata, control = list(warnOnly = TRUE))
twoPLitem4 <- nls(item4 ~ NLS.L2(itemSum, a, b), data = empiricalICCdata, control = list(warnOnly = TRUE))
twoPLitem5 <- nls(item5 ~ NLS.L2(itemSum, a, b), data = empiricalICCdata, control = list(warnOnly = TRUE))
twoPLitem6 <- nls(item6 ~ NLS.L2(itemSum, a, b), data = empiricalICCdata, control = list(warnOnly = TRUE))
twoPLitem7 <- nls(item7 ~ NLS.L2(itemSum, a, b), data = empiricalICCdata, control = list(warnOnly = TRUE))
twoPLitem8 <- nls(item8 ~ NLS.L2(itemSum, a, b), data = empiricalICCdata, control = list(warnOnly = TRUE))
twoPLitem9 <- nls(item9 ~ NLS.L2(itemSum, a, b), data = empiricalICCdata, control = list(warnOnly = TRUE))
twoPLitem10 <- nls(item10 ~ NLS.L2(itemSum, a, b), data = empiricalICCdata, control = list(warnOnly = TRUE))
newdata <- data.frame(itemSum = seq(from = 1, to = 9, length.out = 1000))
newdata$item1 <- predict(twoPLitem1, newdata = newdata)
newdata$item2 <- predict(twoPLitem2, newdata = newdata)
newdata$item3 <- predict(twoPLitem3, newdata = newdata)
newdata$item4 <- predict(twoPLitem4, newdata = newdata)
newdata$item5 <- predict(twoPLitem5, newdata = newdata)
newdata$item6 <- predict(twoPLitem6, newdata = newdata)
newdata$item7 <- predict(twoPLitem7, newdata = newdata)
newdata$item8 <- predict(twoPLitem8, newdata = newdata)
newdata$item9 <- predict(twoPLitem9, newdata = newdata)
newdata$item10 <- predict(twoPLitem10, newdata = newdata)
newdata$itemTotal <- rowSums(newdata[,paste("item", 1:10, sep = "")])
```
```{r modelImpliedICC, echo = FALSE, results = "hide", out.width = "100%", fig.align = "center", fig.cap = "Item Characteristic Curves of the Probability of Endorsement of a Given Item as a Function of the Person's Level on the Latent Construct."}
plot(newdata$itemSum, newdata$item1, type = "n", ylim = c(0,1), xlab = expression(paste("Person's Level on the Latent Construct (", theta, ")", sep = "")), ylab = "Probability of Item Endorsement", xaxt = "n")
lines(newdata$itemSum, newdata$item1, type = "l", lwd = 2, col = viridis(10)[1])
lines(newdata$itemSum, newdata$item2, type = "l", lwd = 2, col = viridis(10)[2])
lines(newdata$itemSum, newdata$item3, type = "l", lwd = 2, col = viridis(10)[3])
lines(newdata$itemSum, newdata$item4, type = "l", lwd = 2, col = viridis(10)[4])
lines(newdata$itemSum, newdata$item5, type = "l", lwd = 2, col = viridis(10)[5])
lines(newdata$itemSum, newdata$item6, type = "l", lwd = 2, col = viridis(10)[6])
lines(newdata$itemSum, newdata$item7, type = "l", lwd = 2, col = viridis(10)[7])
lines(newdata$itemSum, newdata$item8, type = "l", lwd = 2, col = viridis(10)[8])
lines(newdata$itemSum, newdata$item9, type = "l", lwd = 2, col = viridis(10)[9])
lines(newdata$itemSum, newdata$item10, type = "l", lwd = 2, col = viridis(10)[10])
axis(1, at = seq(from = 1, to = 9, length.out = 9), labels = c(-4:4))
legend("topleft", legend = paste("item", 1:10, sep = " "), col = viridis(10), lwd = 2, cex = 0.6)
```
ICCs can be summed across items to get the test characteristic curve (TCC):\index{item response theory!item characteristic curve}\index{item response theory!test characteristic curve}
```{r modelImpliedTCC, echo = FALSE, results = "hide", out.width = "100%", fig.align = "center", fig.cap = "Test Characteristic Curve of the Expected Total Score on the Test as a Function of the Person's Level on the Latent Construct."}
plot(newdata$itemSum, newdata$itemTotal, type = "n", ylim = c(0,10), xlab = expression(paste("Person's Level on the Latent Construct (", theta, ")", sep = "")), ylab = "Expected Total Score on Test", xaxt = "n")
lines(newdata$itemSum, newdata$itemTotal, type = "l", lwd = 2)
axis(1, at = seq(from = 1, to = 9, length.out = 9), labels = c(-4:4))
```
An ICC provides more information than an [item–total correlation](#itemTotalCorrelation-reliability).\index{item response theory!item characteristic curve}\index{reliability!internal consistency!average item–total correlation}
Visually, we can see the utility of various items by looking at the items' ICC plots.\index{item response theory!item characteristic curve}
For instance, consider what might be a useless item for diagnostic purposes.\index{item response theory!item characteristic curve}
For a particular item, among those with a low total score (level on the construct), 90% respond with "TRUE" to the item, whereas among everyone else, 100% respond with "TRUE" (see Figure \@ref(fig:iccCeilingEffect)).\index{item response theory!item characteristic curve}
This item has a ceiling effect and provides only a little information about who would be considered above clinical threshold for a disorder.\index{item response theory!item characteristic curve}\index{ceiling effect}
So, the item is not very clinically useful.\index{item response theory!item characteristic curve}
```{r, include = FALSE}
library("drc")
empiricalICCdata$ceilingEffect <- c(.90, .95, .98, 1, 1, 1, 1, 1, 1)
empiricalICCdata$diagnosticallyUseful <- c(0, 0, 0, 0, 0, 0, .69, .70, .70)
threePL_ceilingeffect <- drm(ceilingEffect ~ itemSum, data = empiricalICCdata, fct = LL.3u(), type = "continuous") #fix upper asymptote at 1
threePL_diagnosticallyUseful <- drm(diagnosticallyUseful ~ itemSum, data = empiricalICCdata, fct = LL.3(), type = "continuous") #fix lower asymptote at 0
newdata$ceilingEffect <- predict(threePL_ceilingeffect, newdata = newdata)
newdata$diagnosticallyUseful <- predict(threePL_diagnosticallyUseful, newdata = newdata)
```
```{r iccCeilingEffect, echo = FALSE, results = "hide", out.width = "100%", fig.align = "center", fig.cap = "Item Characteristic Curve of an Item with a Ceiling Effect That is not Diagnostically Useful."}
plot(newdata$itemSum, newdata$ceilingEffect, type = "l", lwd = 2, ylim = c(0,1), xlab = expression(paste("Person's Level on the Latent Construct (", theta, ")", sep = "")), ylab = "Probability of Item Endorsement", xaxt = "n")
axis(1, at = seq(from = 1, to = 9, length.out = 9), labels = c(-4:4))
```
Now, consider a different item.\index{item response theory!item characteristic curve}
For those with a low level on the construct, 0% respond with "TRUE", so it has a floor effect and tells us nothing about the lower end of the construct.\index{item response theory!item characteristic curve}
But for those with a higher level on the construct, 70% respond with true (see Figure \@ref(fig:iccDiagnosticallyUseful)).\index{item response theory!item characteristic curve}
So, the item tells us something about the higher end of the distribution, and could be diagnostically useful.\index{item response theory!item characteristic curve}
Thus, an ICC allows us to immediately tell the utility of items.\index{item response theory!item characteristic curve}
```{r iccDiagnosticallyUseful, echo = FALSE, results = "hide", out.width = "100%", fig.align = "center", fig.cap = "Item Characteristic Curve of an Item With a Floor Effect That is Diagnostically Useful."}
plot(newdata$itemSum, newdata$diagnosticallyUseful, type = "l", lwd = 2, ylim = c(0,1), xlab = expression(paste("Person's Level on the Latent Construct (", theta, ")", sep = "")), ylab = "Probability of Item Endorsement", xaxt = "n")
axis(1, at = seq(from = 1, to = 9, length.out = 9), labels = c(-4:4))
```
### Parameters {#parameters-irt}
We can estimate up to four parameters in an IRT model and can glean up to four key pieces of information from an item's ICC:\index{item response theory}\index{item response theory!item characteristic curve}
1. Difficulty (severity)\index{item response theory!item difficulty}
1. Discrimination\index{item response theory!item discrimination}
1. Guessing\index{item response theory!item guessing}
1. Inattention/careless errors\index{item response theory!item careless errors}
#### Difficulty (Severity) {#itemDifficulty}
The item's *difficulty* parameter is the item's location on the latent construct.\index{item response theory!item difficulty}
It is quantified by the intercept, i.e., the location on the x-axis of the inflection point of the [ICC](#icc).\index{item response theory!item difficulty}\index{item response theory!item characteristic curve}
In a 1- or 2-parameter model, the inflection point is where 50% of the sample endorses the item (or gets the item correct), that is, the point on the x-axis where the [ICC](#icc) crosses .5 probability on the y-axis (i.e., the level on the construct at which the probability of endorsing the item is equal to the probability of not endorsing the item).\index{item response theory!item difficulty}\index{item response theory!item characteristic curve}
Item difficulty is similar to item means or intercepts in [structural equation modeling](#sem) or [factor analysis](#factorAnalysis).\index{item response theory!item difficulty}\index{structural equation modeling!intercept}
Some items are more useful at the higher levels of the construct, whereas other items are more useful at the lower levels of the construct.
See Figure \@ref(fig:iccDifficulty) for an example of an item with a low difficulty and an item with a high difficulty.\index{item response theory!item difficulty}
```{r, include = FALSE}
library("tidyverse")
library("psych")
difficultyData <- data.frame(itemSum = 1:length(-4:4))
difficultyData$lowDifficulty <- psych::logistic(-4:4, d = -0.8)
difficultyData$highDifficulty <- psych::logistic(-4:4, d = 0.8)
L1.fun <- function(predictor, b) {
x <- predictor
1/(1 + exp( -(x - b)))
}
L1.Init <- function(mCall, LHS, data, ...) {
xy <- sortedXyData(mCall[["predictor"]], LHS, data)
x <- xy[, "x"]; y <- xy[, "y"]
d <- 1
## Linear regression on pseudo y values
pseudoY <- log((d - y)/(y+0.00001))
coefs <- coef( lm(pseudoY ~ x))
k <- coefs[1]
b <- coefs[2]
value <- c(b)
names(value) <- mCall[c("b")]
value
}
NLS.L1 <- selfStart(L1.fun, L1.Init, parameters = c("b"))
onePL_lowDifficulty <- nls(lowDifficulty ~ NLS.L1(itemSum, b), data = difficultyData, control = list(warnOnly = TRUE))
onePL_highDifficulty <- nls(highDifficulty ~ NLS.L1(itemSum, b), data = difficultyData, control = list(warnOnly = TRUE))
newdata$lowDifficulty <- predict(onePL_lowDifficulty, newdata = newdata)
newdata$highDifficulty <- predict(onePL_highDifficulty, newdata = newdata)
newdata$theta <- seq(from = -4, to = 4, length.out = 1000)
midpoint_lowDifficulty <- newdata$theta[which.min(abs(newdata$lowDifficulty - 0.5))]
midpoint_highDifficulty <- newdata$theta[which.min(abs(newdata$highDifficulty - 0.5))]
difficulty_long <- pivot_longer(newdata, cols = lowDifficulty:highDifficulty) %>%
rename(item = name)
difficulty_long$Difficulty <- NA
difficulty_long$Difficulty[which(difficulty_long$item == "lowDifficulty")] <- "Low"
difficulty_long$Difficulty[which(difficulty_long$item == "highDifficulty")] <- "High"
```
(ref:iccDifficulty) Item Characteristic Curves of an Item With Low Difficulty Versus High Difficulty. The dashed horizontal line indicates a probability of item endorsement of .50. The dashed vertical line is the item difficulty, i.e., the person's level on the construct (the location on the x-axis) at the inflection point of the item characteristic curve. In a two-parameter logistic model, the inflection point corresponds to the probability of item endorsement is 50%. Thus, in a two-parameter logistic model, the difficulty of an item is the person's level on the construct where the probability of endorsing the item is 50%.
```{r iccDifficulty, echo = FALSE, results = "hide", out.width = "100%", fig.align = "center", fig.cap = "(ref:iccDifficulty)", fig.scap = "Item Characteristic Curves of an Item With Low Difficulty Versus High Difficulty."}
ggplot(difficulty_long, aes(theta, value, group = Difficulty, color = Difficulty)) +
geom_line(linewidth = 1.5) +
scale_color_viridis_d() +
geom_hline(yintercept = 0.5, linetype = "dashed") +
geom_segment(aes(x = midpoint_lowDifficulty, xend = midpoint_lowDifficulty, y = 0, yend = 0.5), linewidth = 0.5, col = "black", linetype = "dashed") +
geom_segment(aes(x = midpoint_highDifficulty, xend = midpoint_highDifficulty, y = 0, yend = 0.5), linewidth = 0.5, col = "black", linetype = "dashed") +
scale_x_continuous(name = expression(paste("Person's Level on the Latent Construct (", theta, ")", sep = "")), breaks = -4:4) +
scale_y_continuous(name = "Probability of Item Endorsement") +
theme_bw()
```
When dealing with a measure of clinical symptoms (e.g., depression), the difficulty parameter is sometimes called severity, because symptoms that are endorsed less frequently tend to be more severe [e.g., suicidal behavior; @Krueger2004].\index{item response theory!item difficulty}
One way of thinking about the severity parameter of an item is: "How severe does your psychopathology have to be for half of people to endorse the symptom?"\index{item response theory!item difficulty}
When dealing with a measure of performance, aptitude, or intelligence, the parameter would be more likely to be called difficulty: "How high does your ability have to be for half of people to pass the item?"\index{item response theory!item difficulty}
An item with a low difficulty would be considered easy, because even people with a low ability tend to pass the item.\index{item response theory!item difficulty}
An item with a high difficulty would be considered difficult, because only people with a high ability tend to pass the item.\index{item response theory!item difficulty}
#### Discrimination {#itemDiscrimination}
The item's *discrimination* parameter is how well the item can distinguish between those who were higher versus lower on the construct, that is, how strongly the item is correlated with the construct (i.e., the latent factor).\index{item response theory!item discrimination}
It is similar to the factor loading in [structural equation modeling](#sem) or [factor analysis](#factorAnalysis).\index{item response theory!item discrimination}\index{structural equation modeling!factor loading}
It is quantified by the slope of the [ICC](#icc), i.e., the steepness of the line at its steepest point.\index{item response theory!item discrimination}\index{item response theory!item characteristic curve}
The slope reflects the inverse of how much range of construct levels it would take to flip 50/50 whether a person is likely to pass or fail an item.\index{item response theory!item discrimination}
Some items have [ICCs](#icc) that go up fast (have a steep slope).\index{item response theory!item discrimination}\index{item response theory!item characteristic curve}
These items provide a fine distinction between people with lower versus higher levels on the construct and therefore have high discrimination.\index{item response theory!item discrimination}
Some items go up gradually (less steep slope), so it provides less precision and information, and has a low discrimination.\index{item response theory!item discrimination}\index{item response theory!item characteristic curve}\index{item response theory!information}
See Figure \@ref(fig:iccDiscrimination) for an example of an item with a low discrimination and an item with a high discrimination.\index{item response theory!item discrimination}\index{item response theory!item characteristic curve}
```{r, include = FALSE}
discriminationData <- data.frame(itemSum = 1:length(-4:4))
discriminationData$lowDiscrimination <- psych::logistic(-4:4, d = 0, a = 0.7)
discriminationData$highDiscrimination <- psych::logistic(-4:4, d = 0, a = 2)
twoPL_lowDiscrimination <- nls(lowDiscrimination ~ NLS.L2(itemSum, a, b), data = discriminationData, control = list(warnOnly = TRUE))
twoPL_highDiscrimination <- nls(highDiscrimination ~ NLS.L2(itemSum, a, b), data = discriminationData, control = list(warnOnly = TRUE))
newdata$lowDiscrimination <- predict(twoPL_lowDiscrimination, newdata = newdata)
newdata$highDiscrimination <- predict(twoPL_highDiscrimination, newdata = newdata)
newdata$theta <- seq(from = -4, to = 4, length.out = 1000)
discrimination_long <- pivot_longer(newdata, cols = lowDiscrimination:highDiscrimination) %>%
rename(item = name)
discrimination_long$Discrimination <- NA
discrimination_long$Discrimination[which(discrimination_long$item == "lowDiscrimination")] <- "Low"
discrimination_long$Discrimination[which(discrimination_long$item == "highDiscrimination")] <- "High"
```
```{r iccDiscrimination, echo = FALSE, results = "hide", out.width = "100%", fig.align = "center", fig.cap = "Item Characteristic Curves of an Item With Low Discrimination Versus High Discrimination. The discrimination of an item is the slope of the line at its inflection point.", fig.scap = "Item Characteristic Curves of an Item With Low Discrimination Versus High Discrimination."}
ggplot(discrimination_long, aes(theta, value, group = Discrimination, color = Discrimination)) +
geom_line(linewidth = 1.5) +
scale_color_viridis_d() +
scale_x_continuous(name = expression(paste("Person's Level on the Latent Construct (", theta, ")", sep = "")), breaks = -4:4) +
scale_y_continuous(name = "Probability of Item Endorsement") +
theme_bw()
```
#### Guessing {#itemGuessing}
The item's *guessing* parameter is reflected by the lower asymptote of the [ICC](#icc).\index{item response theory!item guessing}
If the item has a lower asymptote above zero, it suggests that the probability of getting the item correct (or endorsing the item) never reaches zero, for any level of the construct.\index{item response theory!item guessing}
On an educational test, this could correspond to the person's likelihood of being able to answer the item correctly by chance just by guessing.\index{item response theory!item guessing}
For example, for a 4-option multiple choice test, a respondent would be expected to get a given item correct 25% of the time just by guessing.\index{item response theory!item guessing}
See Figure \@ref(fig:iccGuessingTF) for an example of an item from a true/false exam and Figure \@ref(fig:iccGuessingMC) for an example of an item from a 4-option multiple choice exam.\index{item response theory!item guessing}
```{r, include = FALSE}
empiricalICCdata$guessingTF <- psych::logistic(-4:4, c = 0.5)
empiricalICCdata$guessingMC <- psych::logistic(-4:4, c = 0.25)
threePL_guessingTF <- drm(guessingTF ~ itemSum, data = empiricalICCdata, fct = LL.3u(), type = "continuous") #fix upper asymptote at 1
threePL_guessingMC <- drm(guessingMC ~ itemSum, data = empiricalICCdata, fct = LL.3u(), type = "continuous") #fix upper asymptote at 1
newdata$guessingTF <- predict(threePL_guessingTF, newdata = newdata)
newdata$guessingMC <- predict(threePL_guessingMC, newdata = newdata)
```
(ref:iccGuessingTFCaption) Item Characteristic Curve of an Item from a True/False Exam, There Test Takers Get the Item Correct at Least 50% of the Time.
```{r iccGuessingTF, echo = FALSE, results = "hide", out.width = "100%", fig.align = "center", fig.cap = "(ref:iccGuessingTFCaption)"}
plot(newdata$itemSum, newdata$guessingTF, type = "l", lwd = 2, ylim = c(0,1), xlab = expression(paste("Person's Level on the Latent Construct (", theta, ")", sep = "")), ylab = "Probability of Item Endorsement", xaxt = "n", yaxt = "n")
axis(1, at = seq(from = 1, to = 9, length.out = 9), labels = c(-4:4))
axis(2, at = c(0, .25, .5, .75, 1), labels = c(0, .25, .5, .75, 1))
```
(ref:iccGuessingMCCaption) Item Characteristic Curve of an Item From a 4-Option Multiple Choice Exam, Where Test Takers Get the Item Correct at Least 25% of the Time.
```{r iccGuessingMC, echo = FALSE, results = "hide", out.width = "100%", fig.align = "center", fig.cap = "(ref:iccGuessingMCCaption)"}
plot(newdata$itemSum, newdata$guessingMC, type = "l", lwd = 2, ylim = c(0,1), xlab = expression(paste("Person's Level on the Latent Construct (", theta, ")", sep = "")), ylab = "Probability of Item Endorsement", xaxt = "n", yaxt = "n")
axis(1, at = seq(from = 1, to = 9, length.out = 9), labels = c(-4:4))
axis(2, at = c(0, .25, .5, .75, 1), labels = c(0, .25, .5, .75, 1))
```
#### Inattention/Careless Errors {#itemCarelessErrors}
The item's *inattention* (or *careless error*) parameter is the reflected by the upper asymptote of the [ICC](#icc).\index{item response theory!item careless errors}
If the item has an upper asymptote below one, it suggests that the probability of getting the item correct (or endorsing the item) never reaches one, for any level on the construct.\index{item response theory!item careless errors}
See Figure \@ref(fig:iccInattention) for an example of an item whose probability of endorsement (or getting it correct) exceeds .85.\index{item response theory!item careless errors}
```{r, include = FALSE}
empiricalICCdata$inattention <- psych::logistic(-4:4, z = 0.85, a = 2)
threePL_inattention <- drm(inattention ~ itemSum, data = empiricalICCdata, fct = LL.3(), type = "continuous") #fix lower asymptote at 0
newdata$inattention <- predict(threePL_inattention, newdata = newdata)
```
```{r iccInattention, echo = FALSE, results = "hide", out.width = "100%", fig.align = "center", fig.cap = "Item Characteristic Curve of an Item Where the Probability of Getting an Item Correct Never Exceeds .85."}
plot(newdata$itemSum, newdata$inattention, type = "l", lwd = 2, ylim = c(0,1), xlab = expression(paste("Person's Level on the Latent Construct (", theta, ")", sep = "")), ylab = "Probability of Item Endorsement", xaxt = "n")
axis(1, at = seq(from = 1, to = 9, length.out = 9), labels = c(-4:4))
```
### Models {#models-irt}
IRT models can be fit that estimate one or more of these four item parameters.\index{item response theory}
#### 1-Parameter and Rasch models {#onePL}
A Rasch model estimates the [item difficulty](#itemDifficulty) parameter and holds everything else fixed across items.\index{item response theory!one-parameter model}\index{item response theory!item difficulty}
It fixes the [item discrimination](#itemDiscrimination) to be one for each item.\index{item response theory!one-parameter model}\index{item response theory!item discrimination}
In the Rasch model, the probability that a person $j$ with a level on the construct of $\theta$ gets a score of one (instead of zero) on item $i$, based on the difficulty ($b$) of the item, is estimated using Equation \@ref(eq:raschModel):\index{item response theory!one-parameter model}\index{item response theory!item difficulty}\index{item response theory!theta}
\begin{equation}
P(X = 1|\theta_j, b_i) = \frac{e^{\theta_j - b_i}}{1 + e^{\theta_j - b_i}}
(\#eq:raschModel)
\end{equation}
The [`petersenlab`](https://github.com/DevPsyLab/petersenlab) package [@R-petersenlab] contains the `fourPL()` function that estimates the probability of item endorsement as function of the item characteristics from the Rasch model and the person's level on the construct (theta).\index{petersenlab package}\index{item response theory!one-parameter model}\index{item response theory!item difficulty}\index{item response theory!theta}
To estimate the probability of endorsement from the Rasch model, specify $b$ and $\theta$, while keeping the defaults for the other parameters.\index{item response theory!theta}
```{r, class.source = "fold-hide"}
library("petersenlab") #to install: install.packages("remotes"); remotes::install_github("DevPsyLab/petersenlab")
```
```{r, eval = FALSE, class.source = "fold-hide"}
fourPL <- function(a = 1, b, c = 0, d = 1, theta){
c + (d - c) * (exp(a * (theta - b))) / (1 + exp(a * (theta - b)))
}
```
```{r, eval = FALSE}
fourPL(b, theta)
```
```{r}
fourPL(b = 1, theta = 0)
```
A one-parameter logistic (1-PL) IRT model, similar to a Rasch model, estimates the [item difficulty](#itemDifficulty) parameter, and holds everything else fixed across items (see Figure \@ref(fig:irt1PL)).\index{item response theory!one-parameter model}\index{item response theory!item difficulty}
The one-parameter logistic model holds the [item discrimination](#itemDiscrimination) fixed across items, but does not fix it to one, unlike the Rasch model.\index{item response theory!one-parameter model}\index{item response theory!item difficulty}
In the one-parameter logistic model, the probability that a person $j$ with a level on the construct of $\theta$ gets a score of one (instead of zero) on item $i$, based on the [difficulty](#itemDifficulty) ($b$) of the item and the items' (fixed) [discrimination](#itemDiscrimination) ($a$), is estimated using Equation \@ref(eq:onePL):\index{item response theory!one-parameter model}\index{item response theory!item difficulty}\index{item response theory!item discrimination}\index{item response theory!theta}
\begin{equation}
P(X = 1|\theta_j, b_i, a) = \frac{e^{a(\theta_j - b_i)}}{1 + e^{a(\theta_j - b_i)}}
(\#eq:onePL)
\end{equation}
The [`petersenlab`](https://github.com/DevPsyLab/petersenlab) package [@R-petersenlab] contains the `fourPL()` function that estimates the probability of item endorsement as function of the item characteristics from the one-parameter logistic model and the person's level on the construct (theta).\index{petersenlab package}\index{item response theory!one-parameter model}\index{item response theory!theta}
To estimate the probability of endorsement from the one-parameter logistic model, specify $a$, $b$, and $\theta$, while keeping the defaults for the other parameters.\index{item response theory!one-parameter model}\index{item response theory!item difficulty}\index{item response theory!item discrimination}\index{item response theory!theta}
```{r, eval = FALSE}
fourPL(a, b, theta)
```
Rasch and one-parameter logistic models are common and are the easiest to fit.\index{item response theory!one-parameter model}\index{item response theory!item difficulty}
However, they make fairly strict assumptions.
They assume that items have the same [discrimination](#itemDiscrimination).\index{item response theory!one-parameter model}\index{item response theory!item discrimination}
```{r, include = FALSE}
library("tidyverse")
onePLitems <- data.frame(
theta = -4:4,
item1 = fourPL(b = -2, theta = -4:4),
item2 = fourPL(b = -1, theta = -4:4),
item3 = fourPL(b = 0, theta = -4:4),
item4 = fourPL(b = 1, theta = -4:4),
item5 = fourPL(b = 2, theta = -4:4))
onePL_item1 <- nls(item1 ~ NLS.L1(theta, b), data = onePLitems, control = list(warnOnly = TRUE))
onePL_item2 <- nls(item2 ~ NLS.L1(theta, b), data = onePLitems, control = list(warnOnly = TRUE))
onePL_item3 <- nls(item3 ~ NLS.L1(theta, b), data = onePLitems, control = list(warnOnly = TRUE))
onePL_item4 <- nls(item4 ~ NLS.L1(theta, b), data = onePLitems, control = list(warnOnly = TRUE))
onePL_item5 <- nls(item5 ~ NLS.L1(theta, b), data = onePLitems, control = list(warnOnly = TRUE))
onePLitems_newdata <- data.frame(theta = seq(from = -4, to = 4, length.out = 1000))
onePLitems_newdata$item1 <- predict(onePL_item1, newdata = onePLitems_newdata)
onePLitems_newdata$item2 <- predict(onePL_item2, newdata = onePLitems_newdata)
onePLitems_newdata$item3 <- predict(onePL_item3, newdata = onePLitems_newdata)
onePLitems_newdata$item4 <- predict(onePL_item4, newdata = onePLitems_newdata)
onePLitems_newdata$item5 <- predict(onePL_item5, newdata = onePLitems_newdata)
onePLitems_long <- pivot_longer(onePLitems_newdata, cols = item1:item5) %>%
rename(item = name)
onePLitems_long$Item <- NA
onePLitems_long$Item[which(onePLitems_long$item == "item1")] <- 1
onePLitems_long$Item[which(onePLitems_long$item == "item2")] <- 2
onePLitems_long$Item[which(onePLitems_long$item == "item3")] <- 3
onePLitems_long$Item[which(onePLitems_long$item == "item4")] <- 4
onePLitems_long$Item[which(onePLitems_long$item == "item5")] <- 5
```
```{r irt1PL, echo = FALSE, results = "hide", out.width = "100%", fig.align = "center", fig.cap = "One-Parameter Logistic Model in Item Response Theory."}
ggplot(onePLitems_long, aes(theta, value, group = factor(Item), color = factor(Item))) +
geom_line(linewidth = 1.5) +
labs(color = "Item") +
scale_color_viridis_d() +
scale_x_continuous(name = expression(paste("Person's Level on the Latent Construct (", theta, ")", sep = "")), breaks = -4:4) +
scale_y_continuous(name = "Probability of Item Endorsement") +
theme_bw()
```
A one-parameter logistic model is only valid if there is not crossing of lines in empirical [ICCs](#icc) (see Figure \@ref(fig:empiricalICCnoCrossing)).\index{item response theory!one-parameter model}\index{item response theory!item characteristic curve}
```{r, include = FALSE}
empiricalICCdata_noCrossing <- data.frame(
item1 = c(.40, .63, .73, .85, .93, .95, .97, .99, .99),
item2 = c(.20, .53, .56, .76, .90, .94, .95, .98, .99),
item3 = c(.10, .28, .39, .58, .75, .82, .88, .94, .99),
item4 = c(.05, .24, .30, .50, .68, .73, .84, .92, .99),
item5 = c(.03, .19, .27, .48, .57, .65, .79, .89, .99),
item6 = c(.02, .15, .25, .38, .48, .59, .75, .85, .95),
item7 = c(.01, .07, .15, .29, .42, .53, .70, .82, .90),
item8 = c(.01, .05, .10, .26, .35, .46, .60, .80, .85),
item9 = c(.01, .02, .05, .16, .23, .35, .46, .55, .70),
item10 = c(.01, .02, .03, .05, .06, .07, .10, .14, .20),
itemSum = 1:9
)
```
```{r empiricalICCnoCrossing, echo = FALSE, results = "hide", out.width = "100%", fig.align = "center", fig.height = 8, fig.cap = "Empirical Item Characteristic Curves of the Probability of Endorsement of a Given Item as a Function of the Person's Sum Score. The empirical item characteristic curves of these items do not cross each other.", fig.scap = "Empirical Item Characteristic Curves of the Probability of Endorsement of a Given Item as a Function of the Person's Sum Score."}
plot(empiricalICCdata_noCrossing$itemSum, empiricalICCdata_noCrossing$item10, type = "n", xlim = c(1,9), ylim = c(0,1), xlab = "Person's Sum Score", ylab = "Probability of Item Endorsement", xaxt = "n")
lines(empiricalICCdata_noCrossing$itemSum, empiricalICCdata_noCrossing$item1, type = "b", pch = "1")
lines(empiricalICCdata_noCrossing$itemSum, empiricalICCdata_noCrossing$item2, type = "b", pch = "2")
lines(empiricalICCdata_noCrossing$itemSum, empiricalICCdata_noCrossing$item3, type = "b", pch = "3")
lines(empiricalICCdata_noCrossing$itemSum, empiricalICCdata_noCrossing$item4, type = "b", pch = "4")
lines(empiricalICCdata_noCrossing$itemSum, empiricalICCdata_noCrossing$item5, type = "b", pch = "5")
lines(empiricalICCdata_noCrossing$itemSum, empiricalICCdata_noCrossing$item6, type = "b", pch = "6")
lines(empiricalICCdata_noCrossing$itemSum, empiricalICCdata_noCrossing$item7, type = "b", pch = "7")
lines(empiricalICCdata_noCrossing$itemSum, empiricalICCdata_noCrossing$item8, type = "b", pch = "8")
lines(empiricalICCdata_noCrossing$itemSum, empiricalICCdata_noCrossing$item9, type = "b", pch = "9")
lines(empiricalICCdata_noCrossing$itemSum, empiricalICCdata_noCrossing$item10, type = "l")
points(empiricalICCdata_noCrossing$itemSum, empiricalICCdata_noCrossing$item10, type = "p", pch = 19, col = "white", cex = 3)
text(empiricalICCdata_noCrossing$itemSum, empiricalICCdata_noCrossing$item10, labels = "10")
axis(1, at = 1:9, labels = 1:9)
```
#### 2-Parameter {#twoPL}
A two-parameter logistic (2-PL) IRT model estimates item [difficulty](#itemDifficulty) and [discrimination](#itemDiscrimination), and it holds the asymptotes fixed across items (see Figure \@ref(fig:irt2PL)).\index{item response theory!two-parameter model}\index{item response theory!item difficulty}\index{item response theory!item discrimination}
Two-parameter logistic models are also common.\index{item response theory!two-parameter model}
In the two-parameter logistic model, the probability that a person $j$ with a level on the construct of $\theta$ gets a score of one (instead of zero) on item $i$, based on the [difficulty](#itemDifficulty) ($b$) and [discrimination](#itemDiscrimination) ($a$) of the item, is estimated using Equation \@ref(eq:twoPL):\index{item response theory!two-parameter model}\index{item response theory!item difficulty}\index{item response theory!item discrimination}\index{item response theory!theta}
\begin{equation}
P(X = 1|\theta_j, b_i, a_i) = \frac{e^{a_i(\theta_j - b_i)}}{1 + e^{a_i(\theta_j - b_i)}}
(\#eq:twoPL)
\end{equation}
The [`petersenlab`](https://github.com/DevPsyLab/petersenlab) package [@R-petersenlab] contains the `fourPL()` function that estimates the probability of item endorsement as function of the item characteristics from the two-parameter logistic model and the person's level on the construct (theta).\index{petersenlab package}\index{item response theory!two-parameter model}\index{item response theory!theta}
To estimate the probability of endorsement from the two-parameter logistic model, specify $a$, $b$, and $\theta$, while keeping the defaults for the other parameters.\index{item response theory!two-parameter model}\index{item response theory!item difficulty}\index{item response theory!item discrimination}\index{item response theory!theta}
```{r, eval = FALSE}
fourPL(a, b, theta)
```
```{r}
fourPL(a = 0.6, b = 0, theta = -1)
```
```{r, include = FALSE}
twoPLitems <- data.frame(theta = -6:6,
item1 = fourPL(a = 1.3, b = -2, theta = -6:6),
item2 = fourPL(a = 0.8, b = -1, theta = -6:6),
item3 = fourPL(a = 0.6, b = 0, theta = -6:6),
item4 = fourPL(a = 1.5, b = 1, theta = -6:6),
item5 = fourPL(a = 2.3, b = 2, theta = -6:6))
twoPL_item1 <- nls(item1 ~ NLS.L2(theta, a, b), data = twoPLitems, control = list(warnOnly = TRUE))
twoPL_item2 <- nls(item2 ~ NLS.L2(theta, a, b), data = twoPLitems, control = list(warnOnly = TRUE))
twoPL_item3 <- nls(item3 ~ NLS.L2(theta, a, b), data = twoPLitems, control = list(warnOnly = TRUE))
twoPL_item4 <- nls(item4 ~ NLS.L2(theta, a, b), data = twoPLitems, control = list(warnOnly = TRUE))
twoPL_item5 <- nls(item5 ~ NLS.L2(theta, a, b), data = twoPLitems, control = list(warnOnly = TRUE))
twoPLitems_newdata <- data.frame(theta = seq(from = -6, to = 6, length.out = 1000))
twoPLitems_newdata$item1 <- predict(twoPL_item1, newdata = twoPLitems_newdata)
twoPLitems_newdata$item2 <- predict(twoPL_item2, newdata = twoPLitems_newdata)
twoPLitems_newdata$item3 <- predict(twoPL_item3, newdata = twoPLitems_newdata)
twoPLitems_newdata$item4 <- predict(twoPL_item4, newdata = twoPLitems_newdata)
twoPLitems_newdata$item5 <- predict(twoPL_item5, newdata = twoPLitems_newdata)
twoPLitems_long <- pivot_longer(twoPLitems_newdata, cols = item1:item5) %>%
rename(item = name)
twoPLitems_long$Item <- NA
twoPLitems_long$Item[which(twoPLitems_long$item == "item1")] <- 1
twoPLitems_long$Item[which(twoPLitems_long$item == "item2")] <- 2
twoPLitems_long$Item[which(twoPLitems_long$item == "item3")] <- 3
twoPLitems_long$Item[which(twoPLitems_long$item == "item4")] <- 4
twoPLitems_long$Item[which(twoPLitems_long$item == "item5")] <- 5
```
```{r irt2PL, echo = FALSE, results = "hide", out.width = "100%", fig.align = "center", fig.cap = "Two-Parameter Logistic Model in Item Response Theory."}
ggplot(twoPLitems_long, aes(theta, value, group = factor(Item), color = factor(Item))) +
geom_line(linewidth = 1.5) +
labs(color = "Item") +
scale_color_viridis_d() +
scale_x_continuous(name = expression(paste("Person's Level on the Latent Construct (", theta, ")", sep = "")), breaks = seq(from = -6, to = 6, by = 2), limits = c(-6,6)) +
scale_y_continuous(name = "Probability of Item Endorsement") +
theme_bw()
```
#### 3-Parameter {#threePL}
A three-parameter logistic (3-PL) IRT model estimates item [difficulty](#itemDifficulty), [discrimination](#itemDiscrimination), and [guessing](#itemGuessing) (lower asymptote), and it holds the upper asymptote fixed across items (see Figure \@ref(fig:irt3PL)).\index{item response theory!three-parameter model}\index{item response theory!item difficulty}\index{item response theory!item discrimination}\index{item response theory!item guessing}
This model would provide information about where an item drops out.\index{item response theory!three-parameter model}\index{item response theory!item difficulty}\index{item response theory!item discrimination}\index{item response theory!item guessing}
Three-parameter logistic models are less common to estimate because it adds considerable computational complexity and requires a large sample size, and the [guessing](#itemGuessing) parameter is often not as important as [difficulty](#itemDifficulty) and [discrimination](#itemDiscrimination).\index{item response theory!three-parameter model}\index{item response theory!item difficulty}\index{item response theory!item discrimination}\index{item response theory!item guessing}
Nevertheless, 3-parameter logistic models are sometimes estimated in the education literature to account for getting items correct by random guessing.\index{item response theory!three-parameter model}\index{item response theory!item difficulty}\index{item response theory!item discrimination}\index{item response theory!item guessing}
In the three-parameter logistic model, the probability that a person $j$ with a level on the construct of $\theta$ gets a score of one (instead of zero) on item $i$, based on the [difficulty](#itemDifficulty) ($b$), [discrimination](#itemDiscrimination) ($a$), and [guessing parameter](#itemGuessing) ($c$) of the item, is estimated using Equation \@ref(eq:threePL):\index{item response theory!three-parameter model}\index{item response theory!item difficulty}\index{item response theory!item discrimination}\index{item response theory!item guessing}\index{item response theory!theta}
\begin{equation}
P(X = 1|\theta_j, b_i, a_i, c_i) = c_i + (1 - c_i) \frac{e^{a_i(\theta_j - b_i)}}{1 + e^{a_i(\theta_j - b_i)}}
(\#eq:threePL)
\end{equation}
The [`petersenlab`](https://github.com/DevPsyLab/petersenlab) package [@R-petersenlab] contains the `fourPL()` function that estimates the probability of item endorsement as function of the item characteristics from the three-parameter logistic model and the person's level on the construct (theta).\index{petersenlab package}\index{item response theory!three-parameter model}\index{item response theory!item guessing}\index{item response theory!theta}
To estimate the probability of endorsement from the three-parameter logistic model, specify $a$, $b$, $c$, and $\theta$, while keeping the defaults for the other parameters.\index{item response theory!three-parameter model}\index{item response theory!item difficulty}\index{item response theory!item discrimination}\index{item response theory!item guessing}\index{item response theory!theta}
```{r, eval = FALSE}
fourPL(a, b, c, theta)
```
```{r}
fourPL(a = 0.8, b = -1, c = .25, theta = -1)
```
```{r, include = FALSE}
threePLitems <- data.frame(theta = 1:13,
item1 = fourPL(a = 1.3, b = -2, c = 0, theta = -6:6),
item2 = fourPL(a = 0.8, b = -1, c = .25, theta = -6:6),
item3 = fourPL(a = 0.6, b = 0, c = .3, theta = -6:6),
item4 = fourPL(a = 1.5, b = 1, c = .15, theta = -6:6),
item5 = fourPL(a = 2.3, b = 2, c = 0, theta = -6:6))
threePL_item1 <- drm(item1 ~ theta, data = threePLitems, fct = LL.3u(), type = "continuous")
threePL_item2 <- drm(item2 ~ theta, data = threePLitems, fct = LL.3u(), type = "continuous")
threePL_item3 <- drm(item3 ~ theta, data = threePLitems, fct = LL.3u(), type = "continuous")
threePL_item4 <- drm(item4 ~ theta, data = threePLitems, fct = LL.3u(), type = "continuous")
threePL_item5 <- drm(item5 ~ theta, data = threePLitems, fct = LL.3u(), type = "continuous")
threePLitems_newdata <- data.frame(theta = seq(from = 1, to = 13, length.out = 1000))
threePLitems_newdata$item1 <- predict(threePL_item1, newdata = threePLitems_newdata)
threePLitems_newdata$item2 <- predict(threePL_item2, newdata = threePLitems_newdata)
threePLitems_newdata$item3 <- predict(threePL_item3, newdata = threePLitems_newdata)
threePLitems_newdata$item4 <- predict(threePL_item4, newdata = threePLitems_newdata)
threePLitems_newdata$item5 <- predict(threePL_item5, newdata = threePLitems_newdata)
threePLitems_newdata$theta <- seq(from = -6, to = 6, length.out = 1000)
threePLitems_long <- pivot_longer(threePLitems_newdata, cols = item1:item5) %>%
rename(item = name)
threePLitems_long$Item <- NA
threePLitems_long$Item[which(threePLitems_long$item == "item1")] <- 1
threePLitems_long$Item[which(threePLitems_long$item == "item2")] <- 2
threePLitems_long$Item[which(threePLitems_long$item == "item3")] <- 3
threePLitems_long$Item[which(threePLitems_long$item == "item4")] <- 4
threePLitems_long$Item[which(threePLitems_long$item == "item5")] <- 5
```
```{r irt3PL, echo = FALSE, results = "hide", out.width = "100%", fig.align = "center", fig.cap = "Three-Parameter Logistic Model in Item Response Theory."}
ggplot(threePLitems_long, aes(theta, value, group = factor(Item), color = factor(Item))) +
geom_line(linewidth = 1.5) +
labs(color = "Item") +
scale_color_viridis_d() +
scale_x_continuous(name = expression(paste("Person's Level on the Latent Construct (", theta, ")", sep = "")), breaks = seq(from = -6, to = 6, by = 2), limits = c(-6,6)) +
scale_y_continuous(name = "Probability of Item Endorsement") +
theme_bw()
```
#### 4-Parameter {#fourPL}
A four-parameter logistic (4-PL) IRT model estimates item [difficulty](#itemDifficulty), [discrimination](#itemDiscrimination), [guessing](#itemGuessing), and [careless errors](#itemCarelessErrors) (see Figure \@ref(fig:irt4PL)).\index{item response theory!four-parameter model}\index{item response theory!item difficulty}\index{item response theory!item discrimination}\index{item response theory!item guessing}\index{item response theory!item careless errors}
The fourth parameter adds considerable computational complexity and is rare to estimate.\index{item response theory!four-parameter model}
In the four-parameter logistic model, the probability that a person $j$ with a level on the construct of $\theta$ gets a score of one (instead of zero) on item $i$, based on the [difficulty](#itemDifficulty) ($b$), [discrimination](#itemDiscrimination) ($a$), [guessing parameter](#itemGuessing) ($c$), and [careless error parameter](#itemCarelessErrors) ($d$) of the item, is estimated using Equation \@ref(eq:fourPL) [@Magis2013]:\index{item response theory!four-parameter model}\index{item response theory!item difficulty}\index{item response theory!item discrimination}\index{item response theory!item guessing}\index{item response theory!item careless errors}\index{item response theory!theta}
\begin{equation}
P(X = 1|\theta_j, b_i, a_i, c_i, d_i) = c_i + (d_i - c_i) \frac{e^{a_i(\theta_j - b_i)}}{1 + e^{a_i(\theta_j - b_i)}}
(\#eq:fourPL)
\end{equation}
The [`petersenlab`](https://github.com/DevPsyLab/petersenlab) package [@R-petersenlab] contains the `fourPL()` function that estimates the probability of item endorsement as function of the item characteristics from the four-parameter logistic model and the person's level on the construct (theta).\index{petersenlab package}
To estimate the probability of endorsement from the four-parameter logistic model, specify $a$, $b$, $c$, $d$, and $\theta$.\index{item response theory!four-parameter model}\index{item response theory!item difficulty}\index{item response theory!item discrimination}\index{item response theory!item guessing}\index{item response theory!item careless errors}\index{item response theory!theta}
```{r, eval = FALSE}
fourPL(a, b, c, d, theta)
```
```{r}
fourPL(a = 1.5, b = 1, c = .15, d = 0.85, theta = 3)
```
```{r, include = FALSE}
fourPLitems <- data.frame(theta = 1:13,
item1 = fourPL(a = 1.3, b = -2, c = 0, d = 1, theta = -6:6),
item2 = fourPL(a = 0.8, b = -1, c = .25, d = 0.9, theta = -6:6),
item3 = fourPL(a = 0.6, b = 0, c = .3, d = 0.95, theta = -6:6),
item4 = fourPL(a = 1.5, b = 1, c = .15, d = 0.85, theta = -6:6),
item5 = fourPL(a = 2.3, b = 2, c = 0, d = 0.98, theta = -6:6))
fourPL_item1 <- drm(item1 ~ theta, data = fourPLitems, fct = LL.4(), type = "continuous")
fourPL_item2 <- drm(item2 ~ theta, data = fourPLitems, fct = LL.4(), type = "continuous")
fourPL_item3 <- drm(item3 ~ theta, data = fourPLitems, fct = LL.4(), type = "continuous")
fourPL_item4 <- drm(item4 ~ theta, data = fourPLitems, fct = LL.4(), type = "continuous")
fourPL_item5 <- drm(item5 ~ theta, data = fourPLitems, fct = LL.4(), type = "continuous")
fourPLitems_newdata <- data.frame(theta = seq(from = 1, to = 13, length.out = 1000))
fourPLitems_newdata$item1 <- predict(fourPL_item1, newdata = fourPLitems_newdata)
fourPLitems_newdata$item2 <- predict(fourPL_item2, newdata = fourPLitems_newdata)
fourPLitems_newdata$item3 <- predict(fourPL_item3, newdata = fourPLitems_newdata)
fourPLitems_newdata$item4 <- predict(fourPL_item4, newdata = fourPLitems_newdata)
fourPLitems_newdata$item5 <- predict(fourPL_item5, newdata = fourPLitems_newdata)
fourPLitems_newdata$theta <- seq(from = -6, to = 6, length.out = 1000)
fourPLitems_long <- pivot_longer(fourPLitems_newdata, cols = item1:item5) %>%
rename(item = name)
fourPLitems_long$Item <- NA
fourPLitems_long$Item[which(fourPLitems_long$item == "item1")] <- 1
fourPLitems_long$Item[which(fourPLitems_long$item == "item2")] <- 2
fourPLitems_long$Item[which(fourPLitems_long$item == "item3")] <- 3
fourPLitems_long$Item[which(fourPLitems_long$item == "item4")] <- 4
fourPLitems_long$Item[which(fourPLitems_long$item == "item5")] <- 5
```
```{r irt4PL, echo = FALSE, results = "hide", out.width = "100%", fig.align = "center", fig.cap = "Four-Parameter Logistic Model in Item Response Theory."}
ggplot(fourPLitems_long, aes(theta, value, group = factor(Item), color = factor(Item))) +
geom_line(linewidth = 1.5) +
labs(color = "Item") +
scale_color_viridis_d() +
scale_x_continuous(name = expression(paste("Person's Level on the Latent Construct (", theta, ")", sep = "")), breaks = seq(from = -6, to = 6, by = 2), limits = c(-6,6)) +
scale_y_continuous(name = "Probability of Item Endorsement") +
theme_bw()
```
#### Graded Response Model {#grm}
*Graded response models* and *generalized partial credit models* can be estimated with one, two, three, or four parameters.\index{item response theory!graded response model}\index{item response theory!one-parameter model}\index{item response theory!two-parameter model}\index{item response theory!three-parameter model}\index{item response theory!four-parameter model}
However, they use polytomous data (not dichotomous data), as described in the section below.\index{item response theory!graded response model}
The [two-parameter](#twoPL) graded response model takes the general form of Equation \@ref(eq:gradedResponseModel1):\index{item response theory!graded response model}\index{item response theory!two-parameter model}
\begin{equation}
P(X_{ji} = x_{ji}|\theta_j) = P^*_{x_{ji}}(\theta_j) - P^*_{x_{ji} + 1}(\theta_j)
(\#eq:gradedResponseModel1)
\end{equation}
where:
\begin{equation}
P^*_{x_{ji}}(\theta_j) = P(X_{ji} \geq x_{ji}|\theta_j, b_{ic}, a_i) = \frac{1}{1 + e^{a_i(\theta_j - b_{ic})}}
(\#eq:gradedResponseModel2)
\end{equation}
In the model, $a_i$ an item-specific [discrimination parameter](#itemDiscrimination), $b_{ic}$ is an item- and category-specific [difficulty parameter](#itemDifficulty), and $θ_n$ is an estimate of a person's standing on the latent variable.\index{item response theory!graded response model}\index{item response theory!two-parameter model}\index{item response theory!item difficulty}\index{item response theory!item discrimination}
In the model, $i$ represents unique items, $c$ represents different categories that are rated, and $j$ represents participants.\index{item response theory!graded response model}
### Type of Data {#dataTypes-irt}
IRT models are most commonly estimated with binary or dichotomous data.\index{item response theory}\index{data!dichotomous}
For example, the measures have questions or items that can be considered collapsed into two groups (e.g., true/false, correct/incorrect, endorsed/not endorsed).\index{item response theory}\index{data!dichotomous}
IRT models can also be estimated with polytomous data (e.g., likert scale), which adds computational complexity.\index{data!polytomous}\index{Likert scale}
IRT models with polytomous data can be fit with a graded response model or generalized partial credit model.\index{item response theory}\index{item response theory!graded response model}\index{data!polytomous}
For example, see Figure \@ref(fig:polytomousItemBoundaryCurves) for an example of an item boundary characteristic curve for an item from a 5-level likert scale (based on a cumulative distribution).\index{item response theory}\index{item response theory!item boundary characteristic curve}\index{Likert scale}
If an item has $k$ response categories, it has $k - 1$ thresholds.\index{item response theory}\index{item response theory!item boundary characteristic curve}
For example, an item with 5-level likert scale (1 = strongly disagree; 2 = disagree; 3 = neither agree nor disagree; 4 = agree; 5 = strongly agree) has 4 thresholds: one from 1–2, one from 2–3, one from 3–4, and one from 4–5.\index{item response theory}\index{item response theory!item boundary characteristic curve}\index{Likert scale}
The item boundary characteristic curve is the probability that a person selects a response category higher than $k$ of a polytomous item.\index{item response theory}\index{item response theory!item boundary characteristic curve}\index{data!polytomous}
As depicted, one likert scale item does equivalent work as 4 binary items.\index{item response theory}\index{item response theory!item boundary characteristic curve}\index{data!dichotomous}\index{data!polytomous}\index{Likert scale}
See Figure \@ref(fig:polytomousItemResponseCategoryCurves) for the same 5-level likert scale item plotted with an item response category characteristic curve (based on a static, non-cumulative distribution).\index{item response theory}\index{item response theory!item response category characteristic curve}\index{Likert scale}
```{r, include = FALSE}
polytomousItemBoundary <- data.frame(theta = seq(from = -4, to = 4, length.out= 1000),
itemBoundary1 = psych::logistic(seq(from = -4, to = 4, length.out = 1000), d = -2),
itemBoundary2 = psych::logistic(seq(from = -4, to = 4, length.out = 1000), d = -0.6667),
itemBoundary3 = psych::logistic(seq(from = -4, to = 4, length.out = 1000), d = 0.6667),
itemBoundary4 = psych::logistic(seq(from = -4, to = 4, length.out = 1000), d = 2))
polytomousItemResponseCategory <- data.frame(theta = seq(from = -4, to = 4, length.out = 1000))
polytomousItemResponseCategory$itemResponseCategory1 <- 1 - polytomousItemBoundary$itemBoundary1
polytomousItemResponseCategory$itemResponseCategory5 <- polytomousItemBoundary$itemBoundary4
polytomousItemResponseCategory$itemResponseCategory4 <- polytomousItemBoundary$itemBoundary3 - polytomousItemResponseCategory$itemResponseCategory5
polytomousItemResponseCategory$itemResponseCategory3 <- polytomousItemBoundary$itemBoundary2 - rowSums(polytomousItemResponseCategory[,c("itemResponseCategory4","itemResponseCategory5")])
polytomousItemResponseCategory$itemResponseCategory2 <- polytomousItemBoundary$itemBoundary1 - rowSums(polytomousItemResponseCategory[,c("itemResponseCategory3","itemResponseCategory4","itemResponseCategory5")])
polytomousItemResponseCategory <- polytomousItemResponseCategory %>%
dplyr::select(theta, itemResponseCategory1, itemResponseCategory2, itemResponseCategory3, itemResponseCategory4, itemResponseCategory5)
polytomousItemBoundary_long <- pivot_longer(polytomousItemBoundary, cols = itemBoundary1:itemBoundary4) %>%
rename(item = name)
polytomousItemBoundary_long$boundary <- NA
polytomousItemBoundary_long$boundary[which(polytomousItemBoundary_long$item == "itemBoundary1")] <- 1
polytomousItemBoundary_long$boundary[which(polytomousItemBoundary_long$item == "itemBoundary2")] <- 2
polytomousItemBoundary_long$boundary[which(polytomousItemBoundary_long$item == "itemBoundary3")] <- 3
polytomousItemBoundary_long$boundary[which(polytomousItemBoundary_long$item == "itemBoundary4")] <- 4
polytomousItemResponseCategory_long <- pivot_longer(polytomousItemResponseCategory, cols = itemResponseCategory1:itemResponseCategory5) %>%
rename(item = name)
polytomousItemResponseCategory_long$responseCategory <- NA
polytomousItemResponseCategory_long$responseCategory[which(polytomousItemResponseCategory_long$item == "itemResponseCategory1")] <- 1
polytomousItemResponseCategory_long$responseCategory[which(polytomousItemResponseCategory_long$item == "itemResponseCategory2")] <- 2
polytomousItemResponseCategory_long$responseCategory[which(polytomousItemResponseCategory_long$item == "itemResponseCategory3")] <- 3
polytomousItemResponseCategory_long$responseCategory[which(polytomousItemResponseCategory_long$item == "itemResponseCategory4")] <- 4
polytomousItemResponseCategory_long$responseCategory[which(polytomousItemResponseCategory_long$item == "itemResponseCategory5")] <- 5
```
```{r polytomousItemBoundaryCurves, echo = FALSE, results = "hide", out.width = "100%", fig.align = "center", fig.cap = "Item Boundary Characteristic Curves From Two-Parameter Graded Response Model in Item Response Theory."}
ggplot(polytomousItemBoundary_long, aes(theta, value, group = factor(boundary), color = factor(boundary))) +
geom_line(linewidth = 1.5) +
labs(color = "Boundary") +
scale_color_viridis_d() +
scale_x_continuous(name = expression(paste("Person's Level on the Latent Construct (", theta, ")", sep = "")), breaks = -4:4) +
scale_y_continuous(name = "Probability of Endorsing an Item Response Category that is Higher than the Boundary") +
theme_bw() +
theme(axis.title.y = element_text(size = 9))
```
```{r polytomousItemResponseCategoryCurves, echo = FALSE, results = "hide", out.width = "100%", fig.align = "center", fig.cap = "Item Response Category Characteristic Curves From Two-Parameter Graded Response Model in Item Response Theory."}
ggplot(polytomousItemResponseCategory_long, aes(theta, value, group = factor(responseCategory), color = factor(responseCategory))) +
geom_line(linewidth = 1.5) +
labs(color = "Response Category") +
scale_color_viridis_d() +
scale_x_continuous(name = expression(paste("Person's Level on the Latent Construct (", theta, ")", sep = "")), breaks = -4:4) +
scale_y_continuous(name = "Probability of Item Response Category Endorsement") +
theme_bw()
```
IRT does not handle continuous data well, with some exceptions [@Chen2019] such as in a Bayesian framework [@Buerkner2021].\index{item response theory}
If you want to use continuous data, you might consider moving to a [factor analysis](#factorAnalysis) framework.\index{factor analysis}
### Sample Size {#sampleSize-irt}
Sample size requirements depend on the complexity of the model.
A 1-parameter model often requires ~100 participants.\index{item response theory!one-parameter model}
A 2-parameter model often requires ~1,000 participants.\index{item response theory!two-parameter model}
A 3-parameter model often requires ~10,000 participants.\index{item response theory!three-parameter model}
### Reliability (Information) {#irtReliability}
IRT conceptualizes [reliability](#reliability) in a different way than [classical test theory](#ctt) does.\index{item response theory}\index{classical test theory}\index{item response theory!information}\index{reliability}
Both IRT and [classical test theory](#ctt) conceptualize [reliability](#reliability) as involving the *precision* of a measure's scores.\index{item response theory}\index{classical test theory}\index{item response theory!information}\index{reliability}\index{reliability!precision}
In [classical test theory](#ctt), (im)precision—as operationalized by the [standard error of measurement](#standardErrorOfMeasurement)—is estimated with a single index across the whole range of the construct.\index{classical test theory}\index{reliability}\index{reliability!precision}\index{reliability!standard error of measurement}
That is, in [classical test theory](#ctt), the same [standard error of measurement](#standardErrorOfMeasurement) applies to all scores in the population [@Embretson1996].\index{classical test theory}\index{reliability}\index{reliability!precision}\index{reliability!standard error of measurement}
However, IRT estimates how much measurement precision (information) or imprecision ([standard error of measurement](#standardErrorOfMeasurement)) each item, and the test as a whole, has at different construct levels.\index{item response theory!information}\index{reliability}
This allows IRT to conceptualize [reliability](#reliability) in such a way that precision/[reliability](#reliability) can *differ* at different construct levels, unlike in [classical test theory](#ctt) [@Embretson1996].\index{item response theory}\index{classical test theory}\index{item response theory!information}\index{reliability}
Thus, IRT does not have one index of [reliability](#reliability); rather, its estimate of [reliability](#reliability) differs at different levels on the construct.\index{item response theory}\index{classical test theory}\index{item response theory!information}\index{reliability}
Based on an item's [difficulty](#itemDifficulty) and [discrimination](#itemDiscrimination), we can calculate how much information each item provides.\index{item response theory}\index{item response theory!information}\index{reliability}\index{item response theory!item difficulty}\index{item response theory!item discrimination}
In IRT, *information* is how much measurement precision or consistency an item (or the measure) provides.\index{item response theory}\index{item response theory!information}\index{reliability}\index{reliability!precision}
In other words, information is the degree to which an item (or measure) reduces the [standard error of measurement](#standardErrorOfMeasurement), that is, how much it reduces uncertainty of a person's level on the construct.\index{item response theory}\index{item response theory!information}\index{reliability}\index{reliability!standard error of measurement}
As a reminder (from Equation \@ref(eq:standardErrorOfMeasurement)), the [standard error of measurement](#standardErrorOfMeasurement) is calculated as:\index{reliability!standard error of measurement}
$$
\text{standard error of measurement (SEM)} = \sigma_x \sqrt{1 - r_{xx}}
$$
where $\sigma_x = \text{standard deviation of observed scores on the item } x$, and $r_{xx} = \text{reliability of the item } x$.\index{reliability!standard error of measurement}
The [standard error of measurement](#standardErrorOfMeasurement) is used to generate confidence intervals for people's scores.\index{reliability!standard error of measurement}
In IRT, the [standard error of measurement](#standardErrorOfMeasurement) (at a given construct level) can be calculated as the inverse of the square root of the amount of test information at that construct level, as in Equation \@ref(eq:semIRT):\index{item response theory}\index{item response theory!information}\index{reliability}\index{reliability!standard error of measurement}\index{item response theory!standard error of measurement}\index{item response theory!test information curve}
\begin{equation}
\text{SEM}(\theta) = \frac{1}{\sqrt{\text{information}(\theta)}}
(\#eq:semIRT)
\end{equation}
The [`petersenlab`](https://github.com/DevPsyLab/petersenlab) package [@R-petersenlab] contains the `standardErrorIRT()` function that estimates the [standard error of measurement](#standardErrorOfMeasurement) at a person's level on the construct (theta) from the amount of information that the item (or test) provides.\index{petersenlab package}\index{item response theory}\index{item response theory!information}\index{reliability}\index{reliability!standard error of measurement}\index{item response theory!standard error of measurement}\index{item response theory!theta}
```{r, eval = FALSE, class.source = "fold-hide"}
standardErrorIRT <- function(information){
1/sqrt(information)
}
```
```{r, eval = FALSE}
standardErrorIRT(information)
```
```{r}
standardErrorIRT(0.6)
```
The [standard error of measurement](#standardErrorOfMeasurement) tends to be higher (i.e., [reliability](#reliability)/information tends to be lower) at the extreme levels of the construct where there are fewer items.\index{item response theory}\index{item response theory!information}\index{reliability}\index{reliability!standard error of measurement}\index{item response theory!standard error of measurement}
The formula for information for item $i$ at construct level $\theta$ in a Rasch model is in Equation \@ref(eq:itemInformationRasch) [@Baker2017]:\index{item response theory}\index{item response theory!information}\index{reliability}\index{item response theory!theta}
\begin{equation}
\text{information}_i(\theta) = P_i(\theta)Q_i(\theta)
(\#eq:itemInformationRasch)
\end{equation}
where $P_i(\theta)$ is the probability of getting a one instead of a zero on item $i$ at a given level on the latent construct, and $Q_i(\theta) = 1 - P_i(\theta)$.\index{item response theory}\index{item response theory!information}\index{reliability}\index{item response theory!theta}
The [`petersenlab`](https://github.com/DevPsyLab/petersenlab) package [@R-petersenlab] contains the `itemInformation()` function that estimates the amount of information an item provides as function of the item characteristics from the Rasch model and the person's level on the construct (theta).\index{petersenlab package}\index{item response theory}\index{item response theory!information}\index{reliability}\index{item response theory!theta}
To estimate the amount of information an item provides in a Rasch model, specify $b$ and $\theta$, while keeping the defaults for the other parameters.\index{item response theory}\index{item response theory!information}\index{reliability}\index{item response theory!theta}\index{item response theory!one-parameter model}\index{item response theory!item difficulty}\index{item response theory!theta}
```{r, eval = FALSE, class.source = "fold-hide"}
itemInformation <- function(a = 1, b, c = 0, d = 1, theta){
P <- NULL
information <- NULL
for(i in 1:length(theta)){
P[i] <- fourPL(b = b, a = a, c = c, d = d, theta = theta[i])
information[i] <- ((a^2) * (P[i] - c)^2 * (d - P[i])^2) / ((d - c)^2 * P[i] * (1 - P[i]))
}
return(information)
}
```
```{r, eval = FALSE}
itemInformation(b, theta)
```
```{r}
itemInformation(b = 1, theta = 0)
```
The formula for information for item $i$ at construct level $\theta$ in a two-parameter logistic model is in Equation \@ref(eq:itemInformation2PL) [@Baker2017]:\index{item response theory}\index{item response theory!information}\index{reliability}\index{item response theory!theta}\index{item response theory!two-parameter model}
\begin{equation}
\text{information}_i(\theta) = a^2_iP_i(\theta)Q_i(\theta)
(\#eq:itemInformation2PL)
\end{equation}
The [`petersenlab`](https://github.com/DevPsyLab/petersenlab) package [@R-petersenlab] contains the `itemInformation()` function that estimates the amount of information an item provides as function of the item characteristics from the two-parameter logistic model and the person's level on the construct (theta).\index{petersenlab package}\index{item response theory}\index{item response theory!information}\index{reliability}\index{item response theory!theta}\index{item response theory!two-parameter model}
To estimate the amount of information an item provides in a two-parameter logistic model, specify $a$, $b$, and $\theta$, while keeping the defaults for the other parameters.\index{item response theory}\index{item response theory!information}\index{reliability}\index{item response theory!theta}\index{item response theory!one-parameter model}\index{item response theory!item difficulty}\index{item response theory!item discrimination}\index{item response theory!theta}
```{r, eval = FALSE}
itemInformation(a, b, theta)
```
```{r}
itemInformation(a = 0.6, b = 0, theta = -1)
```
The formula for information for item $i$ at construct level $\theta$ in a three-parameter logistic model is in Equation \@ref(eq:itemInformation3PL) [@Baker2017]:\index{item response theory}\index{item response theory!information}\index{reliability}\index{item response theory!theta}\index{item response theory!three-parameter model}
\begin{equation}
\text{information}_i(\theta) = a^2_i\bigg[\frac{Q_i(\theta)}{P_i(\theta)}\bigg]\bigg[\frac{(P_i(\theta) - c_i)^2}{(1 - c_i)^2}\bigg]
(\#eq:itemInformation3PL)
\end{equation}
The [`petersenlab`](https://github.com/DevPsyLab/petersenlab) package [@R-petersenlab] contains the `itemInformation()` function that estimates the amount of information an item provides as function of the item characteristics from the three-parameter logistic model and the person's level on the construct (theta).\index{petersenlab package}\index{item response theory}\index{item response theory!information}\index{reliability}\index{item response theory!theta}\index{item response theory!three-parameter model}\index{item response theory!theta}
To estimate the amount of information an item provides in a three-parameter logistic model, specify $a$, $b$, $c$, and $\theta$, while keeping the defaults for the other parameters.\index{item response theory}\index{item response theory!information}\index{reliability}\index{item response theory!theta}\index{item response theory!three-parameter model}\index{item response theory!item difficulty}\index{item response theory!item discrimination}\index{item response theory!item guessing}\index{item response theory!theta}
```{r, eval = FALSE}
itemInformation(a, b, c, theta)
```
```{r}
itemInformation(a = 0.8, b = -1, c = .25, theta = -1)
```
The formula for information for item $i$ at construct level $\theta$ in a four-parameter logistic model is in Equation \@ref(eq:itemInformation4PL) [@Magis2013]:\index{item response theory}\index{item response theory!information}\index{reliability}\index{item response theory!theta}\index{item response theory!four-parameter model}
\begin{equation}
\text{information}_i(\theta) = \frac{a^2_i[P_i(\theta) - c_i]^2[d_i - P_i(\theta)^2]}{(d_i - c_i)^2 P_i(\theta)[1 - P_i(\theta)]}
(\#eq:itemInformation4PL)
\end{equation}
The [`petersenlab`](https://github.com/DevPsyLab/petersenlab) package [@R-petersenlab] contains the `itemInformation()` function that estimates the amount of information an item provides as function of the item characteristics from the four-parameter logistic model and the person's level on the construct (theta).\index{petersenlab package}\index{item response theory}\index{item response theory!information}\index{reliability}\index{item response theory!theta}\index{item response theory!four-parameter model}\index{item response theory!theta}
To estimate the amount of information an item provides in a four-parameter logistic model, specify $a$, $b$, $c$, $d$, and $\theta$.\index{item response theory}\index{item response theory!information}\index{reliability}\index{item response theory!theta}\index{item response theory!four-parameter model}\index{item response theory!item difficulty}\index{item response theory!item discrimination}\index{item response theory!item guessing}\index{item response theory!item careless errors}\index{item response theory!theta}
```{r, eval = FALSE}
itemInformation(a, b, c, d, theta)
```
```{r}
itemInformation(a = 1.5, b = 1, c = .15, d = 0.85, theta = 3)
```
[Reliability](#irtReliability) at a given level of the construct ($\theta$) can be estimated as in Equation \@ref(eq:reliabilityIRT):\index{item response theory}\index{item response theory!information}\index{reliability}\index{item response theory!theta}
$$
\begin{aligned}
\text{reliability}(\theta) &= \frac{\text{information}(\theta)}{\text{information}(\theta) + \sigma^2(\theta)} \\
&= \frac{\text{information}(\theta)}{\text{information}(\theta) + 1}
\end{aligned}
(\#eq:reliabilityIRT)
$$
where $\sigma^2(\theta)$ is the variance of theta, which is fixed to one in most IRT models.\index{item response theory}\index{item response theory!information}\index{reliability}\index{item response theory!theta}
The [`petersenlab`](https://github.com/DevPsyLab/petersenlab) package [@R-petersenlab] contains the `reliabilityIRT()` function that estimates the amount of [reliability](#irtReliability) an item or a measure provides as function of its information and the variance of people's construct levels ($\theta$).\index{petersenlab package}\index{item response theory}\index{item response theory!information}\index{reliability}\index{item response theory!theta}
```{r, eval = FALSE, class.source = "fold-hide"}
reliabilityIRT <- function(information, varTheta = 1){
information / (information + varTheta)
}
```
```{r, eval = FALSE}
reliabilityIRT(information, varTheta = 1)
```
```{r}
reliabilityIRT(10)
```
Consider some hypothetical items depicted with [ICCs](#icc) in Figure \@ref(fig:reliabilityIRTicc).\index{item response theory!item characteristic curve}
```{r, include = FALSE}
#https://journals.sagepub.com/doi/full/10.1177/0146621613475471
irtReliability <- data.frame(theta = seq(from = -4, to = 4, length.out = 1000))
irtReliability$item1 <- fourPL(b = -1, a = 1, theta = irtReliability$theta)
irtReliability$item2 <- fourPL(b = 0, a = 0.6, theta = irtReliability$theta)
irtReliability$item3 <- fourPL(b = 1, a = 1.5, theta = irtReliability$theta)
irtReliability$item4 <- fourPL(b = 2, a = 2, theta = irtReliability$theta)
irtInformation <- data.frame(theta = seq(from = -4, to = 4, length.out = 1000))
irtInformation$information1 <- itemInformation(b = -1, a = 1, theta = irtInformation$theta)
irtInformation$information2 <- itemInformation(b = 0, a = 0.6, theta = irtInformation$theta)
irtInformation$information3 <- itemInformation(b = 1, a = 1.5, theta = irtInformation$theta)
irtInformation$information4 <- itemInformation(b = 2, a = 2, theta = irtInformation$theta)
midpoint_item1 <- irtReliability$theta[which.min(abs(irtReliability$item1 - 0.5))]
midpoint_item2 <- irtReliability$theta[which.min(abs(irtReliability$item2 - 0.5))]
midpoint_item3 <- irtReliability$theta[which.min(abs(irtReliability$item3 - 0.5))]
midpoint_item4 <- irtReliability$theta[which.min(abs(irtReliability$item4 - 0.5))]
irtInformation$testInformation <- rowSums(irtInformation[,c("information1","information2","information3","information4")])
irtInformation$standardError <- standardErrorIRT(irtInformation$testInformation)
irtInformation$reliability <- reliabilityIRT(irtInformation$testInformation)
irtReliability_long <- pivot_longer(irtReliability, cols = item1:item4) %>%
rename(item = name)