ENH: inter_rater.fleiss_kappa p-values and confidence interval #9207
I needed this today as well, coincidentally, so I coded something up based on Fleiss, Nee, and Landis (1979), "Large sample variance of kappa in the case of different sets of raters", equation 3 (which the paper says not to do). This is what Stata uses. If the number of raters is not the same for each subject, they don't produce anything for inference.

```python
import numpy as np

def fleiss_standard_error(table):
    """Standard error of Fleiss' kappa under the null hypothesis of no
    agreement beyond chance (Fleiss, Nee, and Landis 1979, equation 3)."""
    n, k = table.shape                   # n_subjects, n_categories
    m = table.sum(axis=1)[0]             # assumes every subject has the same number of ratings
    p_bar = table.sum(axis=0) / (n * m)  # overall proportion of ratings in each category
    q_bar = 1 - p_bar
    return (
        (2 ** 0.5 / (p_bar.dot(q_bar) * np.sqrt(n * m * (m - 1))))
        * ((p_bar.dot(q_bar) ** 2) - np.sum(p_bar * q_bar * (q_bar - p_bar))) ** 0.5
    )
```
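To illustrate how this standard error would feed into inference, here is a hedged sketch of a two-sided z-test of kappa against zero. The `fleiss_kappa` helper and the example table are my own illustrative additions (the table is random data, not from the issue); the SE function is repeated so the snippet is self-contained:

```python
import numpy as np
from scipy import stats

def fleiss_standard_error(table):
    """Null-hypothesis SE of Fleiss' kappa (Fleiss, Nee, and Landis 1979, eq. 3)."""
    n, k = table.shape
    m = table.sum(axis=1)[0]             # raters per subject (assumed constant)
    p_bar = table.sum(axis=0) / (n * m)
    q_bar = 1 - p_bar
    return (
        (2 ** 0.5 / (p_bar.dot(q_bar) * np.sqrt(n * m * (m - 1))))
        * ((p_bar.dot(q_bar) ** 2) - np.sum(p_bar * q_bar * (q_bar - p_bar))) ** 0.5
    )

def fleiss_kappa(table):
    """Fleiss' kappa for a subjects x categories table of rating counts
    (standard textbook formula, included here for a self-contained example)."""
    n, k = table.shape
    m = table.sum(axis=1)[0]                          # raters per subject
    p_bar = table.sum(axis=0) / (n * m)               # category proportions
    p_i = (np.sum(table ** 2, axis=1) - m) / (m * (m - 1))  # per-subject agreement
    p_o = p_i.mean()                                  # observed agreement
    p_e = np.sum(p_bar ** 2)                          # chance agreement
    return (p_o - p_e) / (1 - p_e)

# hypothetical data: 10 subjects, 3 categories, 5 raters per subject
rng = np.random.default_rng(0)
table = rng.multinomial(5, [0.5, 0.3, 0.2], size=10)

kappa = fleiss_kappa(table)
se = fleiss_standard_error(table)
z = kappa / se
p_value = 2 * stats.norm.sf(abs(z))  # two-sided test of H0: kappa == 0
```

Note that because this variance is derived under the null hypothesis, it supports a significance test but is not strictly appropriate for building a confidence interval around a nonzero kappa.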
https://stackoverflow.com/questions/78323943/statistic-values-of-fleiss-kappa-using-statsmodels-stats-inter-rater/78324041#78324041
Note that our fleiss_kappa also includes Randolph's kappa, i.e. we would need p-values for those as well (needs a reference; I have not looked at this in a long time).
copy from answer