credibility

credible_interval(positive, negative, credibility=0.5, prior=(1, 1))

What is the shortest interval that contains P(positive) with probability `credibility`?

Parameters:

positive (int) – number of times the first possible outcome has been seen
negative (int) – number of times the second possible outcome has been seen
credibility (float) – probability that the true P(positive) is contained within the reported interval
prior (tuple) – pseudocounts for positives and negatives

Returns:

(lower bound, upper bound)
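
To make "shortest interval" concrete, here is a minimal sketch that approximates it by sampling. It assumes the posterior over P(positive) is a Beta distribution whose parameters are the observed counts plus the prior pseudocounts; the source does not state how credible_interval computes its result, so this is an illustration, not the library's implementation. The name `shortest_interval_sketch` and the `n_samples` and `seed` parameters are hypothetical.

```python
import random


def shortest_interval_sketch(positive, negative, credibility=0.5,
                             prior=(1, 1), n_samples=50_000, seed=0):
    """Approximate the shortest interval holding `credibility` posterior mass."""
    rng = random.Random(seed)
    # Assumed Beta posterior: observed counts plus prior pseudocounts.
    alpha = positive + prior[0]
    beta = negative + prior[1]
    samples = sorted(rng.betavariate(alpha, beta) for _ in range(n_samples))
    # Number of sample gaps a window must span to cover the requested mass.
    span = max(1, min(n_samples - 1, round(credibility * n_samples)))
    # Slide the window across the sorted samples and keep the narrowest one.
    start = min(range(n_samples - span),
                key=lambda i: samples[i + span] - samples[i])
    return samples[start], samples[start + span]


# 8 positives and 2 negatives with the default uniform prior.
lower, upper = shortest_interval_sketch(8, 2)
print(lower, upper)
```

With a skewed posterior such as this one, the shortest interval is not centered on the mean; it sits around the mode, which is why it differs from an equal-tailed interval.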

prob_below(positive, negative, cutoff, prior=(1, 1))

What is the probability that P(positive) is unacceptably low, i.e. below `cutoff`?

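Parameters:

positive (int) – number of times the first possible outcome has been seen
negative (int) – number of times the second possible outcome has been seen
cutoff (float) – threshold below which P(positive) is considered unacceptably low
prior (tuple) – pseudocounts for positives and negatives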

Returns:

Probability that P(positive) < cutoff
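
The quantity described above can be sketched as a Beta cumulative distribution function evaluated at the cutoff. This sketch assumes a Beta posterior with integer parameters (integer counts and integer pseudocounts), which allows an exact evaluation through the binomial tail identity; it is not taken from the library's source. The name `prob_below_sketch` is hypothetical.

```python
from math import comb


def prob_below_sketch(positive, negative, cutoff, prior=(1, 1)):
    """P(p < cutoff) for p ~ Beta(positive + prior[0], negative + prior[1]).

    Assumes integer counts and pseudocounts, so the Beta CDF equals the
    probability that a Binomial(alpha + beta - 1, cutoff) variable is
    at least alpha.
    """
    alpha = positive + prior[0]
    beta = negative + prior[1]
    n = alpha + beta - 1
    return sum(comb(n, j) * cutoff**j * (1 - cutoff)**(n - j)
               for j in range(alpha, n + 1))


# One positive, no negatives, uniform prior: the posterior is Beta(2, 1),
# whose CDF is cutoff**2.
print(prob_below_sketch(1, 0, 0.5))  # 0.25
```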

prob_greater_cmp(positive1, negative1, positive2, negative2, prior1=(1, 1), prior2=(1, 1), err=1e-05)

Probability that the first dataset comes from a distribution with a greater proportion of positives than the second.

Parameters:

positive1 (int) – number of positive instances in the first dataset
negative1 (int) – number of negative instances in the first dataset
positive2 (int) – number of positive instances in the second dataset
negative2 (int) – number of negative instances in the second dataset
prior1 (tuple) – pseudocounts for positives and negatives in the first dataset
prior2 (tuple) – pseudocounts for positives and negatives in the second dataset
err (float) – upper bound on the frequentist sample standard deviation of the Monte Carlo simulation
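
The comparison above can be sketched as a plain Monte Carlo estimate. This assumes independent Beta posteriors for the two datasets and reads `err` as a bound on the standard error of the estimate: since a Bernoulli mean over n draws has standard error at most 0.5 / sqrt(n), drawing ceil(0.25 / err**2) pairs is enough. Both the reading of `err` and the name `prob_greater_sketch` are assumptions, and the default here is a looser 1e-2 so that the pure-Python loop stays fast.

```python
import math
import random


def prob_greater_sketch(positive1, negative1, positive2, negative2,
                        prior1=(1, 1), prior2=(1, 1), err=1e-2, seed=0):
    """Monte Carlo estimate of P(p1 > p2) under independent Beta posteriors."""
    rng = random.Random(seed)
    # Standard error of a Bernoulli mean is sqrt(p * (1 - p) / n) <= 0.5 / sqrt(n),
    # so this many draws keeps it at or below `err`.
    n_draws = math.ceil(0.25 / err**2)
    wins = 0
    for _ in range(n_draws):
        p1 = rng.betavariate(positive1 + prior1[0], negative1 + prior1[1])
        p2 = rng.betavariate(positive2 + prior2[0], negative2 + prior2[1])
        wins += p1 > p2
    return wins / n_draws


# Identical evidence on both sides should give an estimate near 0.5.
print(prob_greater_sketch(5, 5, 5, 5))
```

Because the number of draws grows with 1 / err**2, tightening `err` by a factor of ten costs a hundred times as many samples.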

roc_auc_preprocess(positives, negatives, roc_auc)

Before an ROC AUC can be analyzed, it must be preprocessed using the number of positive and negative instances in the entire dataset and the AUC itself.

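Parameters:

positives (int) – number of positive instances in the entire dataset
negatives (int) – number of negative instances in the entire dataset
roc_auc (float) – the ROC AUC itself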

Returns:

(positive, negative) tuple that can be used for prob_below and credible_interval