Improve minor inconsistencies. #142
base: master
@@ -41,9 +41,14 @@ In educational measurement, cognitive diagnosis models (CDMs) have been used to
 
 The *deterministic inputs, noisy "and"" gate* (DINA) model [@Junker2001] is a popular conjunctive CDM, which assumes that a respondent must have mastered all required attributes in order to correctly respond to an item on an assessment.
 
-To estimate respondents' knowledge of attributes, we need information about which attributes are required for each item. For this, we use a Q-matrix which is an $I \times K$ matrix where $q_{ik}$=1 if item $i$ requires attribute $k$ and 0 if not. $I$ is the number of items and $K$ is the number of attributes in the assessment.
+To estimate respondents' mastery of attributes, we need information about which attributes are required for each item. For this, we use a Q-matrix which is an $I \times K$ matrix where $q_{ik}$=1 if item $i$ requires attribute $k$ and 0 if not. $I$ is the number of items and $K$ is the number of attributes in the assessment.
 
-A binary latent variable $\alpha_{jk}$ indicates respondent $j$'s knowledge of attribute $k$, where $\alpha_{jk}=1$ if respondent $j$ has mastered attribute $k$ and 0 if he or she has not. Then, an underlying attribute profile of respondent $j$, $\boldsymbol{\alpha_j}$, is a binary vector of length $K$ that indicates whether or not the respondent has mastered each of the $K$ attributes.
+A binary latent variable $\alpha_{jk}$ indicates respondent $j$'s mastery of
+attribute $k$, where $\alpha_{jk}=1$ if respondent $j$ has mastered attribute
+$k$ and 0 if he or she has not. Then, an underlying attribute profile of
+respondent $j$, $\boldsymbol{\alpha_j}$, is a binary vector of length $K$ that
+indicates whether or not the respondent has mastered each of the $K$
+attributes.
 
 The deterministic element of the DINA model is a latent variable $\xi_{ij}$ that indicates whether or not respondent $j$ has mastered all attributes required for item $i$:
 $$
@@ -110,9 +115,9 @@ $$
 \mathrm{Pr}(\alpha_{jk}=1 \, | \, \boldsymbol{y}_j)=\sum_{c=1}^{C}\mathrm{Pr}(\boldsymbol{\alpha_j}=\boldsymbol{\alpha_c} \, | \, \boldsymbol{y}_j)\times\alpha_{ck}.
 $$
 
-Instead of conditioning on the parameters $\nu_c,s_i,g_i$ to obtain $\mathrm{Pr}(\boldsymbol{\alpha_j}=\boldsymbol{\alpha_c}|\boldsymbol{Y}_j=\boldsymbol{y}_j)$, we want to derive the posterior probabilities, averaged over the posterior distribution of the parameters. This is achieved by evaluating the expressions above for posterior draws of the parameters and averaging these over the MCMC iterations. Let the vector of all parameters be denoted $\boldsymbol{\theta}$ and let the posterior draw in iteration $s$ be denoted $\boldsymbol{\theta}^{(s)}_{.}$ Then we estimate the posterior probability, not conditioning on the parameters, as
+Instead of conditioning on the parameters $\nu_c,s_i,g_i$ to obtain $\mathrm{Pr}(\boldsymbol{\alpha_j}=\boldsymbol{\alpha_c}|\boldsymbol{Y}_j=\boldsymbol{y}_j)$, we want to derive the posterior probabilities, averaged over the posterior distribution of the parameters. This is achieved by evaluating the expressions above for posterior draws of the parameters and averaging these over the MCMC iterations. Let the vector of all parameters be denoted $\boldsymbol{\theta}$ and let the posterior draw in iteration $t$ be denoted $\boldsymbol{\theta}^{(t)}_{.}$ Then we estimate the posterior probability, not conditioning on the parameters, as
@bob-carpenter Here, the change is only about using
 
 $$
-\frac{1}{S}\sum_{s=1}^{S}\mathrm{Pr}(\boldsymbol{\alpha_j}=\boldsymbol{\alpha_c} \, | \, \boldsymbol{y}_j,\boldsymbol{\theta}^{(s)}).
+\frac{1}{T}\sum_{t=1}^{T}\mathrm{Pr}(\boldsymbol{\alpha_j}=\boldsymbol{\alpha_c} \, | \, \boldsymbol{y}_j,\boldsymbol{\theta}^{(t)}).
 $$
 
 In [Section 1.4](#stan_nostructure), we introduce the **Stan** program with no structure for $\nu_c$. [Section 2](#stan_ind) describes modification of this **Stan** program to specify the independence model for $\nu_c$ and presents simulation results.
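
Editorial note, not part of the PR: the two formulas in the hunk above can be illustrated with a small worked example. The numbers are invented for illustration and do not come from the case study. Assume $K=2$ attributes, so $C=4$ attribute profiles, and assume only $T=2$ posterior draws, with conditional class probabilities $\mathrm{Pr}(\boldsymbol{\alpha_j}=\boldsymbol{\alpha_c} \, | \, \boldsymbol{y}_j,\boldsymbol{\theta}^{(t)})$ for one respondent $j$ given by

$$
\begin{array}{c|cccc}
 & \boldsymbol{\alpha_1}=(0,0) & \boldsymbol{\alpha_2}=(1,0) & \boldsymbol{\alpha_3}=(0,1) & \boldsymbol{\alpha_4}=(1,1) \\ \hline
t=1 & 0.10 & 0.20 & 0.30 & 0.40 \\
t=2 & 0.20 & 0.10 & 0.20 & 0.50
\end{array}
$$

Averaging each column over the draws gives the estimated class probabilities, and summing those over the profiles with $\alpha_{ck}=1$ gives the attribute-level probabilities:

$$
\begin{aligned}
\frac{1}{T}\sum_{t=1}^{T}\mathrm{Pr}(\boldsymbol{\alpha_j}=\boldsymbol{\alpha_c} \, | \, \boldsymbol{y}_j,\boldsymbol{\theta}^{(t)}) &= (0.15,\ 0.15,\ 0.25,\ 0.45) \quad \text{for } c=1,\dots,4, \\
\mathrm{Pr}(\alpha_{j1}=1 \, | \, \boldsymbol{y}_j) &\approx 0.15 + 0.45 = 0.60, \\
\mathrm{Pr}(\alpha_{j2}=1 \, | \, \boldsymbol{y}_j) &\approx 0.25 + 0.45 = 0.70.
\end{aligned}
$$

In practice $T$ would be the number of retained MCMC iterations rather than 2. The renaming from $s$ to $t$ in this hunk does not change any of these quantities.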
@@ -309,7 +314,6 @@ for (k in 1:K){
 wanted_pars <- c(paste0("prob_resp_attr[", 1:dina_data_ind$J, ",", k, "]"))
 # Get predicted posterior probabilities of each attribute mastery for all respondents
 posterior_prob_attr <- sim_summary[wanted_pars, c("mean")]
-dim(posterior_prob_attr)
 # Calculate mean of the probabilities for respondents who have mastered the attributes and for those who do not
 table_mean[k,"Group 1"] <- mean(posterior_prob_attr[A[,k]==1])
 table_mean[k,"Group 2"] <- mean(posterior_prob_attr[A[,k]==0])
@@ -371,10 +375,10 @@ colnames(alpha_patt) <- paste0("A", 1:5)
 alpha_patt
 
 # Assemble data list for Stan
-I=ncol(y)
-J=nrow(y)
-K=ncol(Q)
-C=nrow(alpha_patt)
+I <- ncol(y)
+J <- nrow(y)
+K <- ncol(Q)
+C <- nrow(alpha_patt)
 
 xi <- matrix(0,I,C)
 for (i in 1:I){
@@ -449,4 +453,4 @@ ggplot(data=estimates, aes(x=mle, y=post.means, shape=pars)) + geom_point() + g
 
 # References
 
-<!-- This comment causes section to be numbered -->
+<!-- This comment causes section to be numbered -->
[not required for this PR] Are these lines different? You can also just specify whether sections are numbered in the top-level yaml.
Hi @bob-carpenter ! Thanks for reviewing.
I would think it's a healthy change although very minor.
Ah, missed that---thanks!
Seung Yeon Lee's original text says:
So I think either "knowledge" or "mastery" would work here (knowledge being the set of attributes mastered), so I'd be reluctant to change this.
Right, so attributes are 'mastered' (not 'known'). The high-level concept of knowledge is characterized by (low-level) attribute mastery. I'm going by the author's terminology. It's not a big deal, it's just that using another term to mean the same thing or using metonymy (as is the case here) can create unnecessary extra cognitive load, if not confusion. And you don't want this in materials which are already difficult to digest. But, if you insist, I will, of course, revert this change.
I also find this terminology confusing. It's doubly confusing because it's trying to overload two commonly used nouns with technical meanings. The original author of the DINA model introduced the terminology, it looks like. We're often walking the line between our general preferences for naming, etc., and that of a subfield we work with. Andrew's particularly prone to try to change entrenched terminology.