arkas.plot

Contain plotting functions.

arkas.plot.bar_discrete

bar_discrete(
    ax: Axes,
    names: Sequence,
    counts: Sequence[int],
    yscale: str = "auto",
) -> None

Plot the histogram of an array containing discrete values.

Parameters:

Name Type Description Default
ax Axes

The axes of the matplotlib figure to update.

required
names Sequence

The names of the values to plot.

required
counts Sequence[int]

The number of value occurrences.

required
yscale str

The y-axis scale. If 'auto', the 'linear' or 'log'/'symlog' scale is chosen based on the distribution.

'auto'

Example usage:

>>> from matplotlib import pyplot as plt
>>> from arkas.plot import bar_discrete
>>> fig, ax = plt.subplots()
>>> bar_discrete(ax, names=["a", "b", "c", "d"], counts=[5, 100, 42, 27])

arkas.plot.bar_discrete_temporal

bar_discrete_temporal(
    ax: Axes,
    counts: ndarray,
    steps: Sequence | None = None,
    values: Sequence | None = None,
    proportion: bool = False,
) -> None

Plot the temporal distribution of discrete values.

Parameters:

Name Type Description Default
ax Axes

The axes of the matplotlib figure to update.

required
counts ndarray

A 2-d array that indicates the number of occurrences for each value and time step. The first dimension represents the value and the second dimension represents the steps.

required
steps Sequence | None

The name associated with each step.

None
values Sequence | None

The name associated with each value.

None
proportion bool

If True, it plots the normalized number of occurrences for each step.

False

Example usage:

>>> import numpy as np
>>> from matplotlib import pyplot as plt
>>> from arkas.plot import bar_discrete_temporal
>>> fig, ax = plt.subplots()
>>> bar_discrete_temporal(
...     ax, counts=np.ones((5, 20)), values=list(range(5)), steps=list(range(20))
... )
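
The proportion option is described as plotting the normalized number of occurrences for each step. The following is a minimal sketch of one way such a normalization can be computed with NumPy. It is not taken from the arkas implementation: dividing each column by its total and leaving empty steps at 0 are both assumptions made for illustration.

```python
import numpy as np

# Rows are values and columns are time steps, matching the layout described for `counts`.
counts = np.array([[1, 2, 0], [3, 2, 0], [0, 4, 0]], dtype=float)

# Total number of occurrences per step (one total per column).
totals = counts.sum(axis=0, keepdims=True)

# Divide each column by its total. Steps with no occurrences are left at 0
# (an assumption; the library may handle empty steps differently).
proportions = np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)
```

With this layout, every non-empty column of proportions sums to 1, so the bars of different steps become comparable even when the steps contain different numbers of samples.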

arkas.plot.binary_precision_recall_curve

binary_precision_recall_curve(
    ax: Axes,
    y_true: ndarray,
    y_pred: ndarray,
    **kwargs: Any
) -> None

Plot the precision-recall curve for binary labels.

Parameters:

Name Type Description Default
ax Axes

The axes of the matplotlib figure to update.

required
y_true ndarray

The ground truth target labels. This input must be an array of shape (n_samples,) with 0 and 1 values.

required
y_pred ndarray

The predicted labels. This input must be an array of shape (n_samples,) with 0 and 1 values.

required
**kwargs Any

Arbitrary keyword arguments that are passed to PrecisionRecallDisplay.from_predictions.

{}

Example usage:

>>> import numpy as np
>>> from matplotlib import pyplot as plt
>>> from arkas.plot import binary_precision_recall_curve
>>> fig, ax = plt.subplots()
>>> binary_precision_recall_curve(
...     ax=ax, y_true=np.array([1, 0, 0, 1, 1]), y_pred=np.array([1, 0, 0, 1, 1])
... )

arkas.plot.binary_roc_curve

binary_roc_curve(
    ax: Axes,
    y_true: ndarray,
    y_score: ndarray,
    **kwargs: Any
) -> None

Plot the Receiver Operating Characteristic Curve (ROC) for binary labels.

Parameters:

Name Type Description Default
ax Axes

The axes of the matplotlib figure to update.

required
y_true ndarray

The ground truth target labels. This input must be an array of shape (n_samples,).

required
y_score ndarray

The target scores, which can be probability estimates of the positive class, confidence values, or non-thresholded decision values. This input must be an array of shape (n_samples,).

required
**kwargs Any

Arbitrary keyword arguments that are passed to RocCurveDisplay.from_predictions.

{}

Example usage:

>>> import numpy as np
>>> from matplotlib import pyplot as plt
>>> from arkas.plot import binary_roc_curve
>>> fig, ax = plt.subplots()
>>> binary_roc_curve(
...     ax=ax, y_true=np.array([1, 0, 0, 1, 1]), y_score=np.array([2, -1, 0, 3, 1])
... )

arkas.plot.boxplot_continuous

boxplot_continuous(
    ax: Axes,
    array: ndarray,
    xmin: float | str | None = None,
    xmax: float | str | None = None,
) -> None

Plot the boxplot of an array containing continuous values.

Parameters:

Name Type Description Default
ax Axes

The axes of the matplotlib figure to update.

required
array ndarray

The array with the data.

required
xmin float | str | None

The minimum value of the range or its associated quantile. q0.1 means the 10% quantile. 0 is the minimum value and 1 is the maximum value.

None
xmax float | str | None

The maximum value of the range or its associated quantile. q0.9 means the 90% quantile. 0 is the minimum value and 1 is the maximum value.

None

Example usage:

>>> import numpy as np
>>> from matplotlib import pyplot as plt
>>> from arkas.plot import boxplot_continuous
>>> fig, ax = plt.subplots()
>>> boxplot_continuous(ax, array=np.arange(101))
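
The xmin and xmax parameters accept either a number or a quantile string such as q0.1. The sketch below shows how such a bound could be resolved to a concrete value with NumPy. The helper name resolve_bound is hypothetical and is not part of arkas; the actual parsing done by the library may differ.

```python
import numpy as np


def resolve_bound(array, bound):
    """Hypothetical helper: turn a bound such as 1.5, "q0.1", or None into a float or None."""
    if bound is None:
        return None
    if isinstance(bound, str):
        # Assumed string format: the letter "q" followed by a quantile level in [0, 1].
        if not bound.startswith("q"):
            raise ValueError(f"expected a string like 'q0.1', got {bound!r}")
        return float(np.quantile(array, float(bound[1:])))
    return float(bound)


array = np.arange(101)
xmin = resolve_bound(array, "q0.1")  # value at the 10% quantile
xmax = resolve_bound(array, "q0.9")  # value at the 90% quantile
```

Under this reading, q0 resolves to the minimum of the array and q1 to its maximum, which matches the parameter descriptions above.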

arkas.plot.boxplot_continuous_temporal

boxplot_continuous_temporal(
    ax: Axes,
    data: Sequence[ndarray],
    steps: Sequence,
    ymin: float | str | None = None,
    ymax: float | str | None = None,
    yscale: str = "linear",
) -> None

Plot the boxplot of continuous values for each time step.

Parameters:

Name Type Description Default
ax Axes

The axes of the matplotlib figure to update.

required
data Sequence[ndarray]

The sequence of data where each item is a 1-d array with the values of the time step.

required
steps Sequence

The sequence of time step names.

required
ymin float | str | None

The minimum value of the range or its associated quantile. q0.1 means the 10% quantile. 0 is the minimum value and 1 is the maximum value.

None
ymax float | str | None

The maximum value of the range or its associated quantile. q0.9 means the 90% quantile. 0 is the minimum value and 1 is the maximum value.

None
yscale str

The y-axis scale. If 'auto', the 'linear' or 'log'/'symlog' scale is chosen based on the distribution.

'linear'

Raises:

Type Description
RuntimeError

if data and steps have different lengths.

Example usage:

>>> import numpy as np
>>> from matplotlib import pyplot as plt
>>> from arkas.plot import boxplot_continuous_temporal
>>> fig, ax = plt.subplots()
>>> rng = np.random.default_rng()
>>> data = [rng.standard_normal(1000) for _ in range(10)]
>>> boxplot_continuous_temporal(ax, data=data, steps=list(range(len(data))))

arkas.plot.hist_continuous

hist_continuous(
    ax: Axes,
    array: ndarray,
    nbins: int | None = None,
    density: bool = False,
    yscale: str = "linear",
    xmin: float | str | None = None,
    xmax: float | str | None = None,
    cdf: bool = True,
    quantile: bool = True,
) -> None

Plot the histogram of an array containing continuous values.

Parameters:

Name Type Description Default
ax Axes

The axes of the matplotlib figure to update.

required
array ndarray

The array with the data.

required
nbins int | None

The number of bins to use to plot.

None
density bool

If True, draw a probability density: each bin will display the bin's raw count divided by the total number of counts and the bin width, so that the area under the histogram integrates to 1.

False
yscale str

The y-axis scale. If 'auto', the 'linear' or 'log'/'symlog' scale is chosen based on the distribution.

'linear'
xmin float | str | None

The minimum value of the range or its associated quantile. q0.1 means the 10% quantile. 0 is the minimum value and 1 is the maximum value.

None
xmax float | str | None

The maximum value of the range or its associated quantile. q0.9 means the 90% quantile. 0 is the minimum value and 1 is the maximum value.

None
cdf bool

If True, the CDF is added to the plot.

True
quantile bool

If True, the 5% and 95% quantiles are added to the plot.

True

Example usage:

>>> import numpy as np
>>> from matplotlib import pyplot as plt
>>> from arkas.plot import hist_continuous
>>> fig, ax = plt.subplots()
>>> hist_continuous(ax, array=np.arange(101))
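
The density parameter is described as dividing each bin's raw count by the total number of counts and the bin width. The sketch below reproduces that arithmetic with np.histogram, independently of arkas, and checks it against NumPy's own density=True option. The choice of 10 bins is arbitrary.

```python
import numpy as np

array = np.arange(101)

# Raw counts per bin and the bin edges.
counts, edges = np.histogram(array, bins=10)
widths = np.diff(edges)

# Density as described: count / (total count * bin width).
density = counts / (counts.sum() * widths)

# The area under the histogram is the sum of bar height times bar width.
area = float((density * widths).sum())

# NumPy computes the same quantity when density=True.
reference, _ = np.histogram(array, bins=10, density=True)
```

Here area is equal to 1 up to floating-point rounding, and density matches reference.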

arkas.plot.hist_continuous2

hist_continuous2(
    ax: Axes,
    array1: ndarray,
    array2: ndarray,
    label1: str = "first",
    label2: str = "second",
    nbins: int | None = None,
    density: bool = False,
    yscale: str = "linear",
    xmin: float | str | None = None,
    xmax: float | str | None = None,
) -> None

Plot the histogram of two arrays to compare the distributions.

Parameters:

Name Type Description Default
ax Axes

The axes of the matplotlib figure to update.

required
array1 ndarray

The first array with the data.

required
array2 ndarray

The second array with the data.

required
label1 str

The label associated with the first array.

'first'
label2 str

The label associated with the second array.

'second'
nbins int | None

The number of bins to use to plot.

None
density bool

If True, draw a probability density: each bin will display the bin's raw count divided by the total number of counts and the bin width, so that the area under the histogram integrates to 1.

False
yscale str

The y-axis scale. If 'auto', the 'linear' or 'log'/'symlog' scale is chosen based on the distribution.

'linear'
xmin float | str | None

The minimum value of the range or its associated quantile. q0.1 means the 10% quantile. 0 is the minimum value and 1 is the maximum value.

None
xmax float | str | None

The maximum value of the range or its associated quantile. q0.9 means the 90% quantile. 0 is the minimum value and 1 is the maximum value.

None

Example usage:

>>> import numpy as np
>>> from matplotlib import pyplot as plt
>>> from arkas.plot import hist_continuous2
>>> fig, ax = plt.subplots()
>>> hist_continuous2(ax, array1=np.arange(101), array2=np.arange(51))

arkas.plot.plot_cdf

plot_cdf(
    ax: Axes,
    array: ndarray,
    nbins: int | None = None,
    xmin: float = float("-inf"),
    xmax: float = float("inf"),
    color: str = "tab:blue",
    labelcolor: str = "black",
) -> None

Plot the cumulative distribution function (CDF).

Parameters:

Name Type Description Default
ax Axes

The axes of the matplotlib figure to update.

required
array ndarray

The array with the data.

required
nbins int | None

The number of bins to use to plot the CDF.

None
xmin float

The minimum value of the range or its associated quantile. q0.1 means the 10% quantile. 0 is the minimum value and 1 is the maximum value.

float('-inf')
xmax float

The maximum value of the range or its associated quantile. q0.9 means the 90% quantile. 0 is the minimum value and 1 is the maximum value.

float('inf')
color str

The plot color.

'tab:blue'
labelcolor str

The label color.

'black'

Example usage:

>>> import numpy as np
>>> from matplotlib import pyplot as plt
>>> from arkas.plot import plot_cdf
>>> fig, ax = plt.subplots()
>>> plot_cdf(ax, array=np.arange(101))
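
For readers who want to see what a binned CDF looks like numerically, here is a minimal sketch that builds an empirical CDF from a histogram with NumPy. It is an illustration only: how plot_cdf computes the curve internally is not stated here, and the value nbins = 20 is an arbitrary choice.

```python
import numpy as np

array = np.arange(101)
nbins = 20  # arbitrary number of bins for the illustration

# Count the samples falling in each bin.
counts, edges = np.histogram(array, bins=nbins)

# Accumulate the counts and normalize by the total so the curve ends at 1.
cdf = np.cumsum(counts) / counts.sum()

# The value cdf[i] is the fraction of samples that are <= edges[i + 1].
right_edges = edges[1:]
```

The resulting cdf is non-decreasing and reaches 1 at the last bin edge. A larger number of bins gives a finer approximation of the empirical distribution.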

arkas.plot.plot_null_temporal

plot_null_temporal(
    ax: Axes,
    nulls: Sequence,
    totals: Sequence,
    labels: Sequence,
) -> None

Plot the temporal distribution of the number of missing values.

nulls, totals, and labels must have the same length and follow the same order.

Parameters:

Name Type Description Default
ax Axes

The Axes object that encapsulates all the elements of an individual (sub-)plot in a figure.

required
nulls Sequence

The number of null values for each temporal period.

required
totals Sequence

The number of total values for each temporal period.

required
labels Sequence

The labels for each temporal period.

required

Raises:

Type Description
RuntimeError

if nulls, totals, and labels have different lengths.

Example usage:

>>> from matplotlib import pyplot as plt
>>> from arkas.plot import plot_null_temporal
>>> fig, ax = plt.subplots()
>>> plot_null_temporal(
...     ax, nulls=[1, 2, 3, 4], totals=[10, 12, 14, 16], labels=["jan", "feb", "mar", "apr"]
... )
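
Because the three sequences must be aligned, it can help to validate them before plotting. The sketch below is a hypothetical helper, not part of arkas, that mirrors the documented length check and also derives the fraction of null values per period. Reporting NaN for a period with zero total values is an assumption made for the illustration.

```python
def null_fractions(nulls, totals, labels):
    """Hypothetical helper: check the lengths, then map each label to its null fraction."""
    if not (len(nulls) == len(totals) == len(labels)):
        raise RuntimeError(
            f"nulls ({len(nulls)}), totals ({len(totals)}), and labels ({len(labels)}) "
            "must have the same length"
        )
    # Periods with no values get NaN instead of raising a division error (assumption).
    return {
        label: (n / t if t else float("nan"))
        for label, n, t in zip(labels, nulls, totals)
    }


fractions = null_fractions(
    nulls=[1, 2, 3, 4], totals=[10, 12, 14, 16], labels=["jan", "feb", "mar", "apr"]
)
```

Running a check like this before the plotting call surfaces misaligned inputs with a message that names the offending lengths.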