utils
- column_indexes(df: DataFrame, cols: List[str])[source]
- Parameters:
df – The dataframe
cols – A list of column names
- Returns:
The column indexes of the column names
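The source of column_indexes is not shown here, so the following is only a minimal sketch of the lookup it describes, assuming the returned indexes are integer column positions and that column names are unique. The name column_indexes_sketch is hypothetical and is not part of the library.

```python
from typing import List

import pandas as pd


def column_indexes_sketch(df: pd.DataFrame, cols: List[str]) -> List[int]:
    """Hypothetical stand-in: map column names to integer positions."""
    # Index.get_loc returns an int when the label is unique in df.columns
    # and raises KeyError when the label is missing.
    return [df.columns.get_loc(c) for c in cols]


df = pd.DataFrame({"a": [1], "b": [2], "c": [3]})
print(column_indexes_sketch(df, ["c", "a"]))  # [2, 0]
```

The positions follow the order of cols, not the order of the columns in df.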
- compute_divergence_crosstabs(data, datecol=None, format=None, show_progress=True, divergence=None)[source]
Compute the divergence crosstabs.
- Parameters:
data – The data to compute the divergences on
datecol – The column representing the date. If None, the index is used, provided it is a DatetimeIndex
format – A function applied to datecol values for formatting, e.g. format_date
show_progress – Whether the progress bar will be shown
divergence – The divergence function to use
- compute_divergence_crosstabs_split(subsets, dates, format=None, show_progress=True, divergence=None)[source]
Compute the divergence crosstabs from pre-split subsets and their corresponding dates.
- Parameters:
subsets – The data to compute the divergences on
dates – The list of dates for the subsets
format – A function applied to the date values for formatting, e.g. format_date
show_progress – Whether the progress bar will be shown
divergence – The divergence function to use
- parallel(func, arr: Collection, max_workers=None, show_progress: bool = False)[source]
- NOTE: This code was adapted from the parallel function within Fastai’s Fastcore library. Key differences include returning a list with order preserved.
Run a function on a collection (list, set, etc.) of items.
- Parameters:
func – The function to run
arr – The collection to run on
max_workers – How many workers to use. Will use multiprocessing.cpu_count() if this is not provided
- Returns:
a list of the results
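The behaviour described above (apply a function to each item, default the worker count to the CPU count, return a list in input order) can be sketched with the standard library. This is not the library's implementation: parallel_sketch is a hypothetical name, the progress bar is omitted, and a thread pool is used here to keep the example self-contained, whereas the documented function may use processes.

```python
import multiprocessing
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Collection, List, Optional


def parallel_sketch(
    func: Callable[[Any], Any],
    arr: Collection,
    max_workers: Optional[int] = None,
) -> List[Any]:
    """Hypothetical stand-in: run func over arr, preserving input order."""
    if max_workers is None:
        # Mirror the documented default of one worker per CPU.
        max_workers = multiprocessing.cpu_count()
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Executor.map yields results in the order of the inputs,
        # regardless of which task finishes first.
        return list(executor.map(func, arr))


def square(x: int) -> int:
    return x * x


results = parallel_sketch(square, [3, 1, 2])
print(results)  # [9, 1, 4]
```

For an unordered collection such as a set, "order preserved" can only mean the collection's iteration order.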
- NOTE: This code was adapted from the