statistics Package
compute_CI Module
- WORC.statistics.compute_CI.compute_confidence(metric, N_train, N_test, alpha=0.95)
Calculate the adjusted confidence interval (CI) for cross-validation results.
- Args:
metric: numpy array containing the value of a metric for each cross-validation iteration (e.g. if 20 cross-validations are performed, an array of length 20 with the accuracy of each iteration)
N_train: integer, number of training samples
N_test: integer, number of test samples
alpha: float between 0 and 1; the alpha*100% CI is calculated (default 0.95)
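The role of N_train and N_test in an adjusted interval can be illustrated with the corrected resampled t-interval of Nadeau and Bengio, which inflates the variance of the cross-validation results by a factor depending on N_test / N_train. The sketch below is an assumption about what "adjusted" refers to, not a copy of the WORC implementation; the function name corrected_cv_interval is hypothetical.

```python
import numpy as np
from scipy import stats


def corrected_cv_interval(metric, n_train, n_test, alpha=0.95):
    """Corrected resampled t-interval for a cross-validated metric (sketch)."""
    metric = np.asarray(metric, dtype=float)
    n_iter = metric.size
    if n_iter < 2:
        raise ValueError("need at least two cross-validation results")

    mean = metric.mean()
    sample_var = metric.var(ddof=1)

    # A naive interval would use sample_var / n_iter. The correction adds
    # n_test / n_train because overlapping training sets make the
    # cross-validation results dependent.
    scale = np.sqrt((1.0 / n_iter + n_test / n_train) * sample_var)

    # Two-sided critical value of the t-distribution with n_iter - 1 dof.
    t_crit = stats.t.ppf(0.5 + alpha / 2.0, df=n_iter - 1)
    return mean - t_crit * scale, mean + t_crit * scale


scores = [0.80, 0.82, 0.78, 0.81, 0.79]
low, high = corrected_cv_interval(scores, n_train=80, n_test=20)
```

The resulting interval is symmetric around the mean of the scores and always wider than the uncorrected t-interval for the same data.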
- WORC.statistics.compute_CI.compute_confidence_bootstrap(bootstrap_metric, test_metric, N_1, alpha=0.95)
Calculate the confidence interval (CI) for bootstrapped samples.
- Args:
bootstrap_metric: numpy array containing the value of a metric for each bootstrap iteration
test_metric: the value of the metric evaluated on the true, full test set
alpha: float between 0 and 1; the alpha*100% CI is calculated (default 0.95)
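One simple way to turn bootstrap results into an interval is a normal approximation: centre the interval on the metric of the full test set and use the standard deviation of the bootstrap values as its spread. The sketch below assumes this approach purely for illustration; it may differ from what WORC does, the name bootstrap_normal_interval is hypothetical, and N_1 is left out because the source does not describe it.

```python
from statistics import NormalDist

import numpy as np


def bootstrap_normal_interval(bootstrap_metric, test_metric, alpha=0.95):
    """Normal-approximation CI centred on the full-test-set metric (sketch)."""
    boot = np.asarray(bootstrap_metric, dtype=float)

    # Spread of the metric across bootstrap iterations.
    spread = boot.std(ddof=1)

    # Two-sided critical value of the standard normal distribution.
    z_crit = NormalDist().inv_cdf(0.5 + alpha / 2.0)
    return test_metric - z_crit * spread, test_metric + z_crit * spread


rng = np.random.default_rng(0)
boot_values = rng.normal(loc=0.80, scale=0.02, size=1000)  # synthetic data
low, high = bootstrap_normal_interval(boot_values, test_metric=0.80)
```

A percentile interval taken directly from the bootstrap values is a common alternative when the bootstrap distribution is clearly skewed.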
- WORC.statistics.compute_CI.compute_confidence_logit(metric, N_train, N_test, alpha=0.95)
Calculate the adjusted confidence interval (CI) for cross-validation results.
- Args:
metric: numpy array containing the value of a metric for each cross-validation iteration (e.g. if 20 cross-validations are performed, an array of length 20 with the accuracy of each iteration)
N_train: integer, number of training samples
N_test: integer, number of test samples
alpha: float between 0 and 1; the alpha*100% CI is calculated (default 0.95)
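The function name suggests that the interval is built on the logit scale, which keeps the bounds of a metric such as accuracy or AUC inside (0, 1). The following is a minimal sketch under that assumption, again using the Nadeau and Bengio variance correction; the name logit_cv_interval and the clipping constant eps are hypothetical and not taken from WORC.

```python
import numpy as np
from scipy import stats
from scipy.special import expit, logit


def logit_cv_interval(metric, n_train, n_test, alpha=0.95, eps=1e-6):
    """Corrected t-interval computed on the logit scale (sketch)."""
    # Clip so that values of exactly 0 or 1 do not map to +/- infinity.
    p = np.clip(np.asarray(metric, dtype=float), eps, 1.0 - eps)
    z = logit(p)
    n_iter = z.size
    if n_iter < 2:
        raise ValueError("need at least two cross-validation results")

    # Same variance correction as for the untransformed metric.
    scale = np.sqrt((1.0 / n_iter + n_test / n_train) * z.var(ddof=1))
    t_crit = stats.t.ppf(0.5 + alpha / 2.0, df=n_iter - 1)

    # Transform the bounds back to the original (0, 1) scale.
    low = expit(z.mean() - t_crit * scale)
    high = expit(z.mean() + t_crit * scale)
    return float(low), float(high)


low, high = logit_cv_interval([0.95, 0.97, 0.99, 0.96, 0.98], 80, 20)
```

Because of the back-transformation, the interval is no longer symmetric around the mean, but it can never exceed 1 or drop below 0.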
delong Module
Adopted from https://github.com/yandexdataschool/roc_comparison.
- WORC.statistics.delong.calc_pvalue(aucs, sigma)
Computes the base-10 logarithm of the p-value.
- Args:
aucs: 1D array of AUCs
sigma: AUC DeLong covariances
- Returns:
log10(pvalue)
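To show how two AUCs and their covariance matrix lead to a log10 p-value, here is a minimal sketch of a two-sided z-test on the AUC difference. It assumes exactly two AUCs and a 2x2 covariance matrix; the function name log10_two_sided_pvalue is hypothetical and the code is not copied from the module.

```python
import numpy as np
from scipy import stats


def log10_two_sided_pvalue(aucs, sigma):
    """log10 of the two-sided p-value for AUC_1 - AUC_2 = 0 (sketch)."""
    aucs = np.asarray(aucs, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    contrast = np.array([1.0, -1.0])

    # Variance of the difference: var_1 + var_2 - 2 * cov_12.
    diff_var = contrast @ sigma @ contrast
    z = abs(contrast @ aucs) / np.sqrt(diff_var)

    # p = 2 * sf(z); working in log space stays accurate for tiny p-values.
    return np.log10(2.0) + stats.norm.logsf(z) / np.log(10.0)


cov = [[0.0010, 0.0005], [0.0005, 0.0010]]
log10_p = log10_two_sided_pvalue([0.85, 0.80], cov)
p_value = 10.0 ** log10_p  # convert back when an ordinary p-value is needed
```

Returning the logarithm rather than the p-value itself avoids underflow to zero when the two AUCs differ strongly.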
- WORC.statistics.delong.compute_midrank(x)
Computes midranks.
- Args:
x: a 1D numpy array
- Returns:
array of midranks
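A midrank is the 1-based rank of a value in the sorted array, with tied values sharing the average of the ranks they occupy. The sketch below illustrates that definition with a hypothetical helper named midranks; it is written for readability and is not the module's own code.

```python
import numpy as np


def midranks(x):
    """Return 1-based ranks of x, averaging the ranks of tied values."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="stable")
    sorted_x = x[order]
    n = x.size

    ranks_sorted = np.empty(n, dtype=float)
    i = 0
    while i < n:
        # Find the end (exclusive) of the run of values equal to sorted_x[i].
        j = i
        while j < n and sorted_x[j] == sorted_x[i]:
            j += 1
        # Sorted positions i..j-1 hold 1-based ranks i+1..j; use their mean.
        ranks_sorted[i:j] = 0.5 * (i + 1 + j)
        i = j

    # Undo the sort so that each rank lines up with its original element.
    ranks = np.empty(n, dtype=float)
    ranks[order] = ranks_sorted
    return ranks


example = midranks([10, 20, 20, 30])
```

In this example the two tied values of 20 occupy ranks 2 and 3, so both receive the midrank 2.5.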
- WORC.statistics.delong.compute_midrank_weight(x, sample_weight)
Computes midranks, taking sample weights into account.
- Args:
x: a 1D numpy array
sample_weight: weights for the samples in x
- Returns:
array of midranks
- WORC.statistics.delong.delong_roc_test(ground_truth, predictions_one, predictions_two)
Computes log10(p-value) for the hypothesis that two ROC AUCs are different.
- Args:
ground_truth: np.array of 0 and 1
predictions_one: predictions of the first model, np.array of floats of the probability of being class 1
predictions_two: predictions of the second model, np.array of floats of the probability of being class 1
- WORC.statistics.delong.delong_roc_variance(ground_truth, predictions)
Computes ROC AUC variance for a single set of predictions.
- Args:
ground_truth: np.array of 0 and 1
predictions: np.array of floats of the probability of being class 1
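The DeLong variance of a single AUC can be written in terms of "structural components": for every positive sample, the fraction of negatives it outranks, and for every negative sample, the fraction of positives that outrank it. The sketch below computes these directly with an O(m*n) comparison matrix, which is slower than the fast algorithm used in this module but easier to follow. The name auc_and_delong_variance is hypothetical.

```python
import numpy as np


def auc_and_delong_variance(ground_truth, predictions):
    """AUC and its DeLong variance via structural components (sketch)."""
    y = np.asarray(ground_truth)
    scores = np.asarray(predictions, dtype=float)
    pos = scores[y == 1]
    neg = scores[y == 0]
    m, n = pos.size, neg.size
    if m < 2 or n < 2:
        raise ValueError("need at least two samples of each class")

    # psi[i, j] = 1 if positive i outranks negative j, 0.5 on a tie, else 0.
    psi = (pos[:, None] > neg[None, :]) + 0.5 * (pos[:, None] == neg[None, :])
    auc = psi.mean()

    v10 = psi.mean(axis=1)  # one component per positive sample
    v01 = psi.mean(axis=0)  # one component per negative sample

    # Sample variances of the components, scaled by the class sizes.
    variance = v10.var(ddof=1) / m + v01.var(ddof=1) / n
    return float(auc), float(variance)


auc, var = auc_and_delong_variance([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

For this toy input three of the four positive-negative pairs are ordered correctly, giving an AUC of 0.75.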
- WORC.statistics.delong.fastDeLong(predictions_sorted_transposed, label_1_count)
Fast DeLong test computation.
The fast version of DeLong's method for computing the covariance of unadjusted AUC.
- Args:
predictions_sorted_transposed: a 2D numpy.array[n_classifiers, n_examples], sorted such that the examples with label "1" are first
label_1_count: number of examples with label "1"
- Returns:
(AUC value, DeLong covariance)
- Reference:
@article{sun2014fast,
title={Fast Implementation of DeLong's Algorithm for Comparing the Areas Under Correlated Receiver Operating Characteristic Curves},
author={Xu Sun and Weichao Xu},
journal={IEEE Signal Processing Letters},
volume={21},
number={11},
pages={1389--1393},
year={2014},
publisher={IEEE}
}
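The input layout expected by fastDeLong (one row per classifier, examples with label "1" first) can be prepared as in the sketch below. The helper name sort_positives_first is hypothetical; only the layout described in the arguments above is taken from the documentation.

```python
import numpy as np


def sort_positives_first(ground_truth, *predictions):
    """Stack predictions row-wise with label-1 examples in the first columns."""
    y = np.asarray(ground_truth).astype(int)

    # Sorting on -y puts the 1s before the 0s; a stable sort keeps the
    # original order within each class.
    order = np.argsort(-y, kind="stable")
    label_1_count = int(y.sum())

    stacked = np.vstack([np.asarray(p, dtype=float) for p in predictions])
    return stacked[:, order], label_1_count


labels = [0, 1, 0, 1]
model_one = [0.1, 0.9, 0.2, 0.8]
model_two = [0.3, 0.6, 0.4, 0.7]
sorted_preds, n_pos = sort_positives_first(labels, model_one, model_two)
```

Here sorted_preds has shape (2, 4) and n_pos is 2, so the first two columns belong to the label-1 examples.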