Utils

votingsys.utils

Contain utility functions.

votingsys.utils.counter

Contain counter utility functions.

votingsys.utils.counter.check_non_empty_count

check_non_empty_count(counter: Counter) -> None

Check if the counter is not empty.

Parameters:

Name      Type      Description             Default
counter   Counter   The counter to check.   required

Raises:

Type         Description
ValueError   if the counter is empty.

Example usage:

>>> from collections import Counter
>>> from votingsys.utils.counter import check_non_empty_count
>>> check_non_empty_count(Counter({"a": 10, "b": 2, "c": 5, "d": 3}))
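
The example above covers only the passing case. As a minimal sketch of the failure path, the stand-in below mirrors the documented behavior with the standard library only. `check_non_empty_count_sketch` is a hypothetical name, not part of votingsys, and both the error message and the reading of "empty" as "has no keys" are assumptions.

```python
from collections import Counter


def check_non_empty_count_sketch(counter: Counter) -> None:
    """Hypothetical stand-in mirroring the documented behavior."""
    # An empty Counter is falsy, like any empty mapping.
    if not counter:
        msg = "The counter is empty"  # assumed wording, not the library's message
        raise ValueError(msg)


# Passing case: the check returns None without raising.
result = check_non_empty_count_sketch(Counter({"a": 10, "b": 2}))

# Failing case: an empty counter raises ValueError.
try:
    check_non_empty_count_sketch(Counter())
    raised = False
except ValueError:
    raised = True
```

Because the check returns `None`, it is meant to be called for its side effect (raising) before any computation that assumes at least one vote.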

votingsys.utils.counter.check_non_negative_count

check_non_negative_count(counter: Counter) -> None

Check if all the count values are non-negative (>=0).

Parameters:

Name      Type      Description             Default
counter   Counter   The counter to check.   required

Raises:

Type         Description
ValueError   if at least one count is negative (<0).

Example usage:

>>> from collections import Counter
>>> from votingsys.utils.counter import check_non_negative_count
>>> check_non_negative_count(Counter({"a": 10, "b": 2, "c": 5, "d": 3}))
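
A `Counter` can legitimately hold negative counts, for example after `Counter.subtract`, which is the situation this check guards against. The sketch below illustrates that case with a hypothetical stand-in (`check_non_negative_count_sketch` and its error message are assumptions, not the library's code).

```python
from collections import Counter


def check_non_negative_count_sketch(counter: Counter) -> None:
    """Hypothetical stand-in mirroring the documented behavior."""
    negatives = {key: count for key, count in counter.items() if count < 0}
    if negatives:
        msg = f"Found negative counts: {negatives}"  # assumed wording
        raise ValueError(msg)


votes = Counter({"a": 2, "b": 1})
votes.subtract({"b": 3})  # Counter.subtract keeps negative results: b is now -2

try:
    check_non_negative_count_sketch(votes)
    raised = False
except ValueError:
    raised = True

# Zero is allowed, since the condition is "non-negative (>=0)".
zero_ok = check_non_negative_count_sketch(Counter({"a": 0}))
```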

votingsys.utils.dataframe

Contain DataFrame utility functions.

votingsys.utils.dataframe.check_column_exist

check_column_exist(frame: DataFrame, col: str) -> None

Check if a column exists in a DataFrame.

Parameters:

Name    Type        Description                                      Default
frame   DataFrame   The DataFrame to check.                          required
col     str         The column that should exist in the DataFrame.   required

Raises:

Type         Description
ValueError   if the column is missing in the DataFrame.

Example usage:

>>> import polars as pl
>>> from votingsys.utils.dataframe import check_column_exist
>>> check_column_exist(
...     pl.DataFrame({"a": [0, 1, 2, 1, 0], "b": [1, 2, 0, 2, 1], "c": [2, 0, 1, 0, 2]}),
...     col="a",
... )

votingsys.utils.dataframe.check_column_missing

check_column_missing(frame: DataFrame, col: str) -> None

Check if a column is missing in a DataFrame.

Parameters:

Name    Type        Description                                           Default
frame   DataFrame   The DataFrame to check.                               required
col     str         The column that should be missing in the DataFrame.   required

Raises:

Type         Description
ValueError   if the column exists in the DataFrame.

Example usage:

>>> import polars as pl
>>> from votingsys.utils.dataframe import check_column_missing
>>> check_column_missing(
...     pl.DataFrame({"a": [0, 1, 2, 1, 0], "b": [1, 2, 0, 2, 1], "c": [2, 0, 1, 0, 2]}),
...     col="col",
... )
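
`check_column_exist` and `check_column_missing` are complementary: each raises exactly where the other passes. The sketch below shows that relationship on a plain list of column names (the kind of list `frame.columns` holds in polars) so that it runs without polars installed. The `*_sketch` functions, the `raises_value_error` helper, and the error messages are hypothetical.

```python
def check_column_exist_sketch(columns: list[str], col: str) -> None:
    """Hypothetical stand-in: raise if ``col`` is not among ``columns``."""
    if col not in columns:
        msg = f"column '{col}' is missing: {columns}"  # assumed wording
        raise ValueError(msg)


def check_column_missing_sketch(columns: list[str], col: str) -> None:
    """Hypothetical stand-in: raise if ``col`` is among ``columns``."""
    if col in columns:
        msg = f"column '{col}' already exists: {columns}"  # assumed wording
        raise ValueError(msg)


def raises_value_error(check, columns: list[str], col: str) -> bool:
    """Return True if the check raises ValueError, False otherwise."""
    try:
        check(columns, col)
    except ValueError:
        return True
    return False


columns = ["a", "b", "c"]  # column names of the example frame above

exist_ok = not raises_value_error(check_column_exist_sketch, columns, "a")
exist_fails = raises_value_error(check_column_exist_sketch, columns, "col")
missing_ok = not raises_value_error(check_column_missing_sketch, columns, "col")
missing_fails = raises_value_error(check_column_missing_sketch, columns, "a")
```

A typical pairing would be to check that an input column exists and that the name of a column about to be created is still free.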

votingsys.utils.dataframe.remove_zero_weight_rows

remove_zero_weight_rows(
    frame: DataFrame, weight_col: str
) -> DataFrame

Remove all rows from a DataFrame where the weight value is zero.

Parameters:

Name         Type        Description                                               Default
frame        DataFrame   The input DataFrame from which rows should be filtered.   required
weight_col   str         The name of the column that contains the weight values.   required

Returns:

Type        Description
DataFrame   A new DataFrame with all rows removed where the weight is zero.

Raises:

Type         Description
ValueError   if weight_col does not exist in the DataFrame.

Example usage:

>>> import polars as pl
>>> from votingsys.utils.dataframe import remove_zero_weight_rows
>>> out = remove_zero_weight_rows(
...     pl.DataFrame(
...         {
...             "a": [0, 1, 2, 0, 1, 2],
...             "b": [1, 2, 0, 1, 2, 0],
...             "c": [2, 0, 1, 2, 0, 1],
...             "weight": [3, 0, 2, 1, 2, 0],
...         }
...     ),
...     weight_col="weight",
... )
>>> out
shape: (4, 4)
┌─────┬─────┬─────┬────────┐
│ a   ┆ b   ┆ c   ┆ weight │
│ --- ┆ --- ┆ --- ┆ ---    │
│ i64 ┆ i64 ┆ i64 ┆ i64    │
╞═════╪═════╪═════╪════════╡
│ 0   ┆ 1   ┆ 2   ┆ 3      │
│ 2   ┆ 0   ┆ 1   ┆ 2      │
│ 0   ┆ 1   ┆ 2   ┆ 1      │
│ 1   ┆ 2   ┆ 0   ┆ 2      │
└─────┴─────┴─────┴────────┘
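
Read literally, the description removes only rows whose weight is exactly zero, which would leave negative weights in place. The pure-Python sketch below illustrates that reading on a list of row dictionaries instead of a polars DataFrame. `remove_zero_weight_rows_sketch` is hypothetical, and the handling of negative weights is an inference from the description rather than confirmed library behavior.

```python
def remove_zero_weight_rows_sketch(rows: list[dict], weight_col: str) -> list[dict]:
    """Hypothetical stand-in operating on row dictionaries."""
    if any(weight_col not in row for row in rows):
        msg = f"column '{weight_col}' is missing"  # assumed wording
        raise ValueError(msg)
    # Build a new list; the input rows are left untouched.
    return [row for row in rows if row[weight_col] != 0]


rows = [
    {"a": 0, "weight": 3},
    {"a": 1, "weight": 0},
    {"a": 2, "weight": -2},
]
kept = remove_zero_weight_rows_sketch(rows, weight_col="weight")
kept_weights = [row["weight"] for row in kept]  # the zero-weight row is gone
```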

votingsys.utils.dataframe.sum_weights_by_group

sum_weights_by_group(
    frame: DataFrame, weight_col: str
) -> DataFrame

Aggregate a DataFrame by summing the weight values for rows with identical values in all columns except the weight column.

Parameters:

Name         Type        Description                                                            Default
frame        DataFrame   The input DataFrame to aggregate.                                      required
weight_col   str         The name of the column that contains the weight values to be summed.   required

Returns:

Type        Description
DataFrame   A new DataFrame with rows grouped by all non-weight columns, and the weight column summed within each group.

Raises:

Type         Description
ValueError   if weight_col does not exist in the DataFrame.

Example usage:

>>> import polars as pl
>>> from votingsys.utils.dataframe import sum_weights_by_group
>>> out = sum_weights_by_group(
...     pl.DataFrame(
...         {
...             "a": [0, 1, 2, 0, 1, 2],
...             "b": [1, 2, 0, 1, 2, 0],
...             "c": [2, 0, 1, 2, 0, 1],
...             "weight": [3, 5, 2, 1, 2, -2],
...         }
...     ),
...     weight_col="weight",
... )
>>> out.sort("weight", descending=True)
shape: (3, 4)
┌─────┬─────┬─────┬────────┐
│ a   ┆ b   ┆ c   ┆ weight │
│ --- ┆ --- ┆ --- ┆ ---    │
│ i64 ┆ i64 ┆ i64 ┆ i64    │
╞═════╪═════╪═════╪════════╡
│ 1   ┆ 2   ┆ 0   ┆ 7      │
│ 0   ┆ 1   ┆ 2   ┆ 4      │
│ 2   ┆ 0   ┆ 1   ┆ 0      │
└─────┴─────┴─────┴────────┘
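
The grouping in the example can be traced by hand: rows `(0, 1, 2)` sum to 3 + 1 = 4, rows `(1, 2, 0)` to 5 + 2 = 7, and rows `(2, 0, 1)` to 2 + (-2) = 0. The pure-Python sketch below reproduces that arithmetic on row dictionaries. `sum_weights_by_group_sketch` is a hypothetical stand-in, not the library's polars implementation.

```python
from collections import defaultdict


def sum_weights_by_group_sketch(rows: list[dict], weight_col: str) -> list[dict]:
    """Hypothetical stand-in: sum weights over rows equal in all other columns."""
    if any(weight_col not in row for row in rows):
        msg = f"column '{weight_col}' is missing"  # assumed wording
        raise ValueError(msg)
    totals: dict[tuple, int] = defaultdict(int)
    for row in rows:
        # The group key is every (column, value) pair except the weight.
        key = tuple((name, value) for name, value in row.items() if name != weight_col)
        totals[key] += row[weight_col]
    return [{**dict(key), weight_col: total} for key, total in totals.items()]


# Same data as the example above, written row by row.
a = [0, 1, 2, 0, 1, 2]
b = [1, 2, 0, 1, 2, 0]
c = [2, 0, 1, 2, 0, 1]
weight = [3, 5, 2, 1, 2, -2]
rows = [{"a": i, "b": j, "c": k, "weight": w} for i, j, k, w in zip(a, b, c, weight)]

grouped = sum_weights_by_group_sketch(rows, weight_col="weight")
grouped.sort(key=lambda row: row["weight"], reverse=True)
```

Note that the group whose weights cancel out is still present with a total of 0, as in the documented output. Passing the result through `remove_zero_weight_rows` is one way to drop it.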

votingsys.utils.dataframe.value_count

value_count(frame: DataFrame, value: Any) -> dict[str, int]

Count the occurrences of a given value in each column of a DataFrame.

This function computes how many times a specified value appears in each column. Null values are ignored during the counting process.

Parameters:

Name    Type        Description                          Default
frame   DataFrame   The input DataFrame.                 required
value   Any         The value to count in each column.   required

Returns:

Type             Description
dict[str, int]   A dictionary mapping each column name to the number of times the specified value appears.

Raises:

Type         Description
ValueError   if the specified value is None.

Example usage:

>>> import polars as pl
>>> from votingsys.utils.dataframe import value_count
>>> counts = value_count(
...     pl.DataFrame({"a": [0, 1, 2, 1, 0], "b": [1, 2, 0, 2, 1], "c": [2, 0, 1, 0, 2]}),
...     value=1,
... )
>>> counts
{'a': 2, 'b': 2, 'c': 1}
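
To make the null handling concrete, the sketch below counts a value per column on a plain dictionary of lists, with `None` playing the role of a null. `value_count_sketch` is a hypothetical stand-in and does not use polars.

```python
from typing import Any


def value_count_sketch(columns: dict[str, list], value: Any) -> dict[str, int]:
    """Hypothetical stand-in: count ``value`` per column, skipping None entries."""
    if value is None:
        msg = "value cannot be None"  # assumed wording
        raise ValueError(msg)
    return {
        name: sum(1 for item in items if item is not None and item == value)
        for name, items in columns.items()
    }


# Same data as the example above.
counts = value_count_sketch(
    {"a": [0, 1, 2, 1, 0], "b": [1, 2, 0, 2, 1], "c": [2, 0, 1, 0, 2]},
    value=1,
)

# None entries are skipped, so they never contribute to a count.
counts_with_nulls = value_count_sketch({"a": [1, None, 1], "b": [None, None, 0]}, value=1)
```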

votingsys.utils.dataframe.weighted_value_count

weighted_value_count(
    frame: DataFrame, value: int, weight_col: str
) -> dict[str, int | float]

Count the weighted occurrences of a given value in each column of a DataFrame.

This function computes how many times a specified value appears in each column, with each occurrence weighted by the value in the weight column for that row. Null values are ignored during the counting process.

Parameters:

Name         Type        Description                                                  Default
frame        DataFrame   The input DataFrame.                                         required
value        int         The value to count in each column.                           required
weight_col   str         The name of the column that holds the weight for each row.   required

Returns:

Type                     Description
dict[str, int | float]   A dictionary mapping each column name (excluding the weight column) to the weighted number of times the specified value appears.

Raises:

Type         Description
ValueError   if the weight column is missing in the DataFrame.

Example usage:

>>> import polars as pl
>>> from votingsys.utils.dataframe import weighted_value_count
>>> counts = weighted_value_count(
...     pl.DataFrame({"a": [0, 1, 2], "b": [1, 2, 0], "c": [2, 0, 1], "count": [3, 5, 2]}),
...     value=1,
...     weight_col="count",
... )
>>> counts
{'a': 5, 'b': 3, 'c': 2}

votingsys.utils.mapping

Contain mapping utility functions.

votingsys.utils.mapping.find_max_in_mapping

find_max_in_mapping(
    mapping: Mapping[str, float],
) -> tuple[tuple[str, ...], float]

Find the maximum value in a mapping and return the corresponding key(s) and the value.

If multiple keys share the maximum value, all such keys are returned in a tuple.

Parameters:

Name      Type                  Description                              Default
mapping   Mapping[str, float]   A mapping from keys to numeric values.   required

Returns:

Type                            Description
tuple[tuple[str, ...], float]   A tuple containing the tuple of keys with the maximum value and the maximum value itself.

Raises:

Type         Description
ValueError   if the mapping is empty.

Example usage:

>>> from votingsys.utils.mapping import find_max_in_mapping
>>> out = find_max_in_mapping({"x": 3, "y": 1})
>>> out
(('x',), 3)
>>> out = find_max_in_mapping({"a": 10, "b": 20, "c": 20})
>>> out
(('b', 'c'), 20)
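
The tie handling can be sketched in a few lines of standard-library Python. `find_max_in_mapping_sketch` is a hypothetical stand-in. In this sketch tied keys come back in the mapping's iteration order, and whether the library guarantees any particular order is not stated here.

```python
from collections.abc import Mapping


def find_max_in_mapping_sketch(mapping: Mapping[str, float]) -> tuple[tuple[str, ...], float]:
    """Hypothetical stand-in: return all keys holding the maximum value."""
    if not mapping:
        msg = "The mapping is empty"  # assumed wording
        raise ValueError(msg)
    max_value = max(mapping.values())
    # Collect every key tied for the maximum, in iteration order.
    keys = tuple(key for key, value in mapping.items() if value == max_value)
    return keys, max_value


single = find_max_in_mapping_sketch({"x": 3, "y": 1})
tied = find_max_in_mapping_sketch({"a": 10, "b": 20, "c": 20})

# Unpacking makes tie detection explicit for the caller.
winners, score = tied
is_tie = len(winners) > 1
```

Returning a tuple of keys even for a single winner keeps the return type uniform, so callers can detect ties with a length check instead of a type check.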

votingsys.utils.timing

Contain utility functions to measure time.

votingsys.utils.timing.timeblock

timeblock(
    message: str = "Total time: {time}",
) -> Generator[None]

Implement a context manager to measure the execution time of a block of code.

Parameters:

Name      Type   Description                                      Default
message   str    The message displayed when the time is logged.   'Total time: {time}'

Example usage:

>>> from votingsys.utils.timing import timeblock
>>> with timeblock():
...     x = [1, 2, 3]
...
>>> with timeblock("Training: {time}"):
...     y = [1, 2, 3]
...
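
For readers curious how such a context manager can be built, the sketch below uses `contextlib.contextmanager`, `time.perf_counter`, and the `logging` module. It is not the library's implementation: `timeblock_sketch`, the logger name, the `ListHandler` class, and the way the elapsed time is formatted into the `{time}` placeholder are all assumptions made for illustration.

```python
import logging
import time
from collections.abc import Generator
from contextlib import contextmanager

logger = logging.getLogger("timeblock_sketch")  # hypothetical logger name


class ListHandler(logging.Handler):
    """Collect log messages in a list so they can be inspected."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


@contextmanager
def timeblock_sketch(message: str = "Total time: {time}") -> Generator[None, None, None]:
    """Hypothetical stand-in: log the wall-clock time spent in the block."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs even if the block raises, so the time is always reported.
        elapsed = time.perf_counter() - start
        logger.info(message.format(time=f"{elapsed:.6f}s"))  # assumed formatting


handler = ListHandler()
logger.addHandler(handler)
logger.setLevel(logging.INFO)

with timeblock_sketch():
    x = [1, 2, 3]

with timeblock_sketch("Training: {time}"):
    y = [1, 2, 3]
```

After both blocks run, `handler.messages` holds one entry per block, the first starting with `Total time: ` and the second with `Training: `.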