{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Introduction to Adaptive Thresholding\n", "This tutorial will go over some basic concepts you may wish to consider when setting thresholds for production models or otherwise." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Make Some Data\n", "This tutorial doesn't actually require real data--nor even a model! We'll make some fake data to get the idea. Don't worry too much about this step. Let's just assume we have a series of scores. These could represent model performance, divergences, or model scores themselves. Throughout this tutorial, we'll assume that increasing values of this score will be increasingly likely to represent a good alert. Then we are left to determine an appropriate threshold to balance true/false positive/negatives. This is balancing time wasted on bad alerts with the utility gained from finding a good alert that resulted from a lower score." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Ground Truth</th>\n",
       "      <th>Model Score</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.677909</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.327262</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3598</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.185715</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3599</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.113929</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>3600 rows × 2 columns</p>\n",
       "