Summary

coola.summary

Contains functionality to compute a text summary of nested data, based on the type of the data.

coola.summary.BaseCollectionSummarizer

Bases: BaseSummarizer[T]

Base class for summarizing collection-based data structures.

This class provides the foundation for summarizing various collection types with configurable formatting options. It handles item limiting and indentation for readable output.

Parameters:

Name Type Description Default
max_items int

The maximum number of items to display in the summary. If set to a negative value (e.g., -1), all items in the collection will be shown without truncation. Defaults to 5.

5
num_spaces int

The number of spaces to use for indentation in the formatted output. This affects the visual structure of nested summaries. Defaults to 2.

2

Attributes:

Name Type Description
_max_items

Stores the maximum number of items to display.

_num_spaces

Stores the number of spaces for indentation.

Example
>>> from coola.summary import SummarizerRegistry, MappingSummarizer, DefaultSummarizer
>>> registry = SummarizerRegistry({object: DefaultSummarizer()})
>>> summarizer = MappingSummarizer()
>>> output = summarizer.summarize({"key1": 1.2, "key2": "abc", "key3": 42}, registry)
>>> print(output)
<class 'dict'> (length=3)
  (key1): 1.2
  (key2): abc
  (key3): 42
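
The item limiting and indentation described above can be illustrated with a minimal, standard-library-only sketch. The helpers `limit_items` and `format_items` are hypothetical names introduced here for illustration; they are not part of coola, and the way the library marks truncated output is not shown in this page, so the sketch simply drops the extra items.

```python
from collections.abc import Iterable
from itertools import islice


def limit_items(items: Iterable[object], max_items: int = 5) -> list[object]:
    """Return at most ``max_items`` items; a negative value keeps them all."""
    if max_items < 0:
        return list(items)
    return list(islice(items, max_items))


def format_items(items: Iterable[object], max_items: int = 5, num_spaces: int = 2) -> str:
    """Format the kept items one per line, indented by ``num_spaces`` spaces."""
    indent = " " * num_spaces
    shown = limit_items(items, max_items)
    return "\n".join(f"{indent}({i}): {item}" for i, item in enumerate(shown))


# Only the first three of ten items are kept.
print(format_items(range(10), max_items=3))
```

Treating any negative `max_items` as "no limit" mirrors the parameter description above.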

coola.summary.BaseSummarizer

Bases: ABC, Generic[T]

Abstract base class for implementing data summarizers.

A summarizer converts Python objects into formatted string representations, with support for nested structures and configurable depth limits. This is useful for debugging, logging, and displaying complex data in a readable format.

The class is generic over type T, allowing concrete implementations to specialize for specific data types while maintaining type safety.

Notes

Concrete implementations must override the summarize method to define how data should be formatted and displayed.

The depth mechanism allows for progressive disclosure of nested structures, preventing overwhelming output for deeply nested data.

Example
>>> from coola.summary import SummarizerRegistry, DefaultSummarizer
>>> registry = SummarizerRegistry()
>>> summarizer = DefaultSummarizer()
>>> print(summarizer.summarize(1, registry))
<class 'int'> 1

coola.summary.BaseSummarizer.equal abstractmethod

equal(other: object) -> bool

Check equality between this summarizer and another object.

Two summarizers are considered equal if they are of the exact same type and have identical configuration parameters.

Parameters:

Name Type Description Default
other object

The object to compare with this summarizer.

required

Returns:

Type Description
bool

True if the objects are equal, otherwise False.

Example
>>> from coola.summary import DefaultSummarizer, MappingSummarizer
>>> summarizer1 = DefaultSummarizer()
>>> summarizer2 = DefaultSummarizer()
>>> summarizer3 = MappingSummarizer()
>>> summarizer1.equal(summarizer2)
True
>>> summarizer1.equal(summarizer3)
False
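
As a rough illustration of the equality rule above (exact same type, identical configuration), here is a small sketch. `ToySummarizer` and `ToySubSummarizer` are hypothetical classes written for this example; they do not derive from `BaseSummarizer` and do not show how coola implements `equal`.

```python
class ToySummarizer:
    """Hypothetical summarizer with a single configuration parameter."""

    def __init__(self, max_items: int = 5) -> None:
        self._max_items = max_items

    def equal(self, other: object) -> bool:
        # Exact type match: a subclass instance is not considered equal.
        if type(other) is not type(self):
            return False
        # Same type, so compare the configuration.
        return self._max_items == other._max_items


class ToySubSummarizer(ToySummarizer):
    """Subclass used to show that the type check is strict."""


print(ToySummarizer().equal(ToySummarizer()))  # True
print(ToySummarizer().equal(ToySummarizer(max_items=3)))  # False
print(ToySummarizer().equal(ToySubSummarizer()))  # False
```

Using `type(other) is type(self)` rather than `isinstance` is what makes the comparison strict about subclasses.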

coola.summary.BaseSummarizer.summarize abstractmethod

summarize(
    data: T,
    registry: SummarizerRegistry,
    depth: int = 0,
    max_depth: int = 1,
) -> str

Generate a formatted string summary of the provided data.

This method creates a human-readable representation of the input data, with support for nested structures up to a specified depth. When the current depth exceeds max_depth, nested structures are typically shown in a compact form without further expansion.

Parameters:

Name Type Description Default
data T

The data object to summarize. Can be any Python object, though behavior depends on the concrete implementation.

required
registry SummarizerRegistry

The summarizer registry used to look up summarizers for nested data structures of different types.

required
depth int

The current nesting level in the data structure. Used internally during recursive summarization. Typically starts at 0 for top-level calls. Must be non-negative.

0
max_depth int

The maximum nesting level to expand when summarizing. Structures deeper than this level are shown in compact form. Must be non-negative. Default is 1, which expands only the top level of nested structures.

1

Returns:

Type Description
str

A formatted string representation of the data. The exact format depends on the concrete implementation, but typically includes type information, size/length metadata, and indented content for nested structures.

Raises:

Type Description
ValueError

May be raised by implementations for invalid depth parameters or other issues based on the data type being summarized. The base class doesn't specify exceptions, but implementations may raise this or other exceptions.

Notes
  • The depth parameter is primarily for internal use during recursion. Most external callers should use the default value of 0.
  • Setting max_depth=0 typically shows only top-level information without expanding any nested structures.
  • Higher max_depth values provide more detail but can produce very long output for deeply nested data.
Example
>>> from coola.summary import SummarizerRegistry, DefaultSummarizer
>>> registry = SummarizerRegistry()
>>> summarizer = DefaultSummarizer()
>>> print(summarizer.summarize(1, registry))
<class 'int'> 1
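
To make the interplay of depth and max_depth more concrete, the following is a simplified sketch of depth-limited recursion using only the standard library. The function `toy_summarize` is hypothetical and is not coola's implementation: it only handles lists and dicts, falls back to `str()` for everything else, and assumes that expansion stops once `depth >= max_depth`.

```python
def toy_summarize(data: object, depth: int = 0, max_depth: int = 1, num_spaces: int = 2) -> str:
    """Expand lists and dicts while ``depth < max_depth``; otherwise use ``str()``."""
    if depth >= max_depth or not isinstance(data, (list, dict)):
        return str(data)
    pairs = data.items() if isinstance(data, dict) else enumerate(data)
    indent = " " * num_spaces
    lines = [f"{type(data)} (length={len(data)})"]
    for key, value in pairs:
        # Recurse one level deeper for each child.
        child = toy_summarize(value, depth + 1, max_depth, num_spaces)
        # Indent continuation lines so nested summaries stay aligned.
        child = child.replace("\n", "\n" + indent * 2)
        lines.append(f"{indent}({key}): {child}")
    return "\n".join(lines)


nested = [[0, 1], {"k": "v"}]
print(toy_summarize(nested))  # inner containers stay compact
print(toy_summarize(nested, max_depth=2))  # inner containers are expanded
```

External callers would normally leave `depth` at 0 and only adjust `max_depth`; the recursion increments `depth` on their behalf.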

coola.summary.DefaultSummarizer

Bases: BaseSummarizer[object]

Implement the default summarizer.

Parameters:

Name Type Description Default
max_characters int

The maximum number of characters to show. If a negative value is provided, all the characters are shown.

-1
Example
>>> from coola.summary import SummarizerRegistry, DefaultSummarizer
>>> registry = SummarizerRegistry()
>>> summarizer = DefaultSummarizer()
>>> print(summarizer.summarize(1, registry))
<class 'int'> 1

coola.summary.MappingSummarizer

Bases: BaseCollectionSummarizer[Mapping[Any, Any]]

Summarizer for mapping-based data structures like dictionaries.

This class formats mapping types (dict, OrderedDict, etc.) into readable summaries that display the type, length, and key-value pairs with proper indentation. It respects the max_items limit and handles nested structures through the registry system.

This class creates a multi-line summary showing the mapping's type, length, and contents. It handles depth limiting to prevent excessively deep nested summaries and truncates the output when the number of items exceeds max_items.

Parameters:

Name Type Description Default
max_items int

The maximum number of key-value pairs to display. If negative, shows all pairs. Defaults to 5.

5
num_spaces int

The number of spaces for indenting each level. Defaults to 2.

2
Example
>>> from coola.summary import SummarizerRegistry, MappingSummarizer, DefaultSummarizer
>>> registry = SummarizerRegistry({object: DefaultSummarizer()})
>>> summarizer = MappingSummarizer()
>>> output = summarizer.summarize({"key1": 1.2, "key2": "abc", "key3": 42}, registry)
>>> print(output)
<class 'dict'> (length=3)
  (key1): 1.2
  (key2): abc
  (key3): 42

coola.summary.NDArraySummarizer

Bases: BaseSummarizer[ndarray]

Implement a summarizer for numpy.ndarray objects.

This summarizer generates compact string representations of NumPy arrays. By default, it displays metadata (type, shape, dtype) rather than array values, making it suitable for logging and debugging large arrays. Optionally, it can show the full array representation.

Parameters:

Name Type Description Default
show_data bool

If True, returns the default array string representation (same as repr(array)), displaying actual values. If False (default), returns only metadata in a compact format: <class> | shape=<shape> | dtype=<dtype>.

False

Raises:

Type Description
RuntimeError

If NumPy is not installed or available.

Example
>>> import numpy as np
>>> from coola.summary import SummarizerRegistry, NDArraySummarizer
>>> registry = SummarizerRegistry()

>>> # Default behavior: show metadata only
>>> summarizer = NDArraySummarizer()
>>> print(summarizer.summarize(np.arange(11), registry))
<class 'numpy.ndarray'> | shape=(11,) | dtype=int64

>>> # Works with arrays of any shape and dtype
>>> print(summarizer.summarize(np.ones((2, 3, 4)), registry))
<class 'numpy.ndarray'> | shape=(2, 3, 4) | dtype=float64

>>> # Show full array data
>>> summarizer = NDArraySummarizer(show_data=True)
>>> print(summarizer.summarize(np.arange(11), registry))
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

coola.summary.SequenceSummarizer

Bases: BaseCollectionSummarizer[Sequence[Any]]

Summarizer for sequence-based data structures like lists and tuples.

This class formats sequence types (list, tuple, etc.) into readable summaries that display the type, length, and indexed items with proper indentation. It respects the max_items limit and handles nested structures through the registry system.

This class creates a multi-line summary showing the sequence's type, length, and contents. It handles depth limiting to prevent excessively deep nested summaries and truncates the output when the number of items exceeds max_items.

Parameters:

Name Type Description Default
max_items int

The maximum number of items to display. If negative, shows all items. Defaults to 5.

5
num_spaces int

The number of spaces for indenting each level. Defaults to 2.

2
Example
>>> from coola.summary import SummarizerRegistry, SequenceSummarizer, DefaultSummarizer
>>> registry = SummarizerRegistry({object: DefaultSummarizer()})
>>> summarizer = SequenceSummarizer()
>>> output = summarizer.summarize([1, 2, 3], registry)
>>> print(output)
<class 'list'> (length=3)
  (0): 1
  (1): 2
  (2): 3

coola.summary.SetSummarizer

Bases: BaseCollectionSummarizer[Set[Any]]

Summarizer for set-based data structures.

This class formats set types (set, frozenset, etc.) into readable summaries that display the type, length, and items with proper indentation. It respects the max_items limit and handles nested structures through the registry system.

This class creates a multi-line summary showing the set's type, length, and contents. It handles depth limiting to prevent excessively deep nested summaries and truncates the output when the number of items exceeds max_items.

Parameters:

Name Type Description Default
max_items int

The maximum number of items to display. If negative, shows all items. Defaults to 5.

5
num_spaces int

The number of spaces for indenting each level. Defaults to 2.

2
Example
>>> from coola.summary import SummarizerRegistry, SetSummarizer, DefaultSummarizer
>>> registry = SummarizerRegistry({object: DefaultSummarizer()})
>>> summarizer = SetSummarizer()
>>> output = summarizer.summarize({1}, registry)
>>> print(output)
<class 'set'> (length=1)
  (0): 1

coola.summary.SummarizerRegistry

Registry that manages and dispatches summarizers based on data type.

This registry maintains a mapping from Python types to summarizer instances and uses the Method Resolution Order (MRO) for type lookup. When summarizing data, it automatically selects the most specific registered summarizer for the data's type, falling back to parent types or a default summarizer if needed.

The registry includes an LRU cache for type lookups to optimize performance in applications that repeatedly summarize similar data structures.

Parameters:

Name Type Description Default
initial_state dict[type, BaseSummarizer[Any]] | None

Optional initial mapping of types to summarizers. If provided, the state is copied to prevent external mutations.

None

Attributes:

Name Type Description
_state TypeRegistry[BaseSummarizer]

Internal mapping of registered types to summarizers

Example

Basic usage with a sequence summarizer:

>>> from coola.summary import SummarizerRegistry, SequenceSummarizer, DefaultSummarizer
>>> registry = SummarizerRegistry({object: DefaultSummarizer(), list: SequenceSummarizer()})
>>> registry
SummarizerRegistry(
  (state): TypeRegistry(
      (<class 'object'>): DefaultSummarizer(max_characters=-1)
      (<class 'list'>): SequenceSummarizer(max_items=5, num_spaces=2)
    )
)
>>> print(registry.summarize([1, 2, 3]))
<class 'list'> (length=3)
  (0): 1
  (1): 2
  (2): 3

Registering custom summarizers:

>>> from coola.summary import SummarizerRegistry, SequenceSummarizer, DefaultSummarizer
>>> registry = SummarizerRegistry({object: DefaultSummarizer()})
>>> registry.register(tuple, SequenceSummarizer())
>>> print(registry.summarize((1, 2, 3)))
<class 'tuple'> (length=3)
  (0): 1
  (1): 2
  (2): 3

Working with nested structures:

>>> from coola.summary import get_default_registry
>>> registry = get_default_registry()
>>> print(registry.summarize({"a": [1, 2], "b": [3, 4]}))
<class 'dict'> (length=2)
  (a): [1, 2]
  (b): [3, 4]

coola.summary.SummarizerRegistry.find_summarizer

find_summarizer(data_type: type) -> BaseSummarizer[Any]

Find the appropriate summarizer for a given type.

Uses the Method Resolution Order (MRO) to find the most specific registered summarizer. For example, if you register a summarizer for Sequence but not for list, lists will use the Sequence summarizer.

Results are cached using an LRU cache (256 entries) for performance, as summarizer lookup is a hot path in recursive summarizations.

Parameters:

Name Type Description Default
data_type type

The Python type to find a summarizer for

required

Returns:

Type Description
BaseSummarizer[Any]

The most specific registered summarizer for this type, a parent type's summarizer via MRO, or the default summarizer.

Example
>>> from coola.summary import get_default_registry
>>> registry = get_default_registry()
>>> summarizer = registry.find_summarizer(list)
>>> summarizer
SequenceSummarizer(max_items=5, num_spaces=2)
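
The lookup strategy described above (walk the MRO, return the first registered match, cache the result) can be sketched with the standard library alone. `REGISTRY` and `find_handler` are hypothetical names, and plain strings stand in for summarizer instances; this is an illustration of the idea, not coola's internal code.

```python
from functools import lru_cache

# Hypothetical registry: plain strings stand in for summarizer instances.
REGISTRY: dict[type, str] = {object: "default", int: "integer"}


@lru_cache(maxsize=256)
def find_handler(data_type: type) -> str:
    """Return the handler of the first registered class in the MRO."""
    for base in data_type.__mro__:
        if base in REGISTRY:
            return REGISTRY[base]
    raise KeyError(f"no handler registered for {data_type}")


print(find_handler(int))  # integer (registered directly)
print(find_handler(bool))  # integer (bool's MRO is bool, int, object)
print(find_handler(str))  # default (falls back to object)
```

One caveat of this simple sketch: a class that is only a virtual subclass of an ABC (for example `list` with `collections.abc.Sequence`) does not list that ABC in its `__mro__`, so a plain MRO walk would not match it. The library may handle that case differently.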

coola.summary.SummarizerRegistry.has_summarizer

has_summarizer(data_type: type) -> bool

Check if a summarizer is explicitly registered for the given type.

Note that this only checks for direct registration. Even if this returns False, find_summarizer() may still return a summarizer via MRO lookup or the default summarizer.

Parameters:

Name Type Description Default
data_type type

The type to check

required

Returns:

Type Description
bool

True if a summarizer is explicitly registered for this type, False otherwise.

Example
>>> from coola.summary import SummarizerRegistry, SequenceSummarizer
>>> registry = SummarizerRegistry({list: SequenceSummarizer()})
>>> registry.has_summarizer(list)
True
>>> registry.has_summarizer(tuple)
False

coola.summary.SummarizerRegistry.register

register(
    data_type: type,
    summarizer: BaseSummarizer[Any],
    exist_ok: bool = False,
) -> None

Register a summarizer for a given data type.

This method associates a summarizer instance with a specific Python type. When data of this type is summarized, the registered summarizer will be used. The cache is automatically cleared after registration to ensure consistency.

Parameters:

Name Type Description Default
data_type type

The Python type to register (e.g., list, dict, custom classes)

required
summarizer BaseSummarizer[Any]

The summarizer instance that handles this type

required
exist_ok bool

If False (default), raises an error if the type is already registered. If True, overwrites the existing registration silently.

False

Raises:

Type Description
RuntimeError

If the type is already registered and exist_ok is False

Example
>>> from coola.summary import SummarizerRegistry, SequenceSummarizer
>>> registry = SummarizerRegistry()
>>> registry.register(list, SequenceSummarizer())
>>> registry.has_summarizer(list)
True
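
The registration behaviour described above (reject duplicates unless exist_ok is set, then clear the lookup cache) could look roughly like the sketch below. `ToyRegistry` is a hypothetical stand-in that stores handler names as strings; it is meant to show why the cache must be cleared and is not a copy of `SummarizerRegistry`.

```python
from functools import lru_cache


class ToyRegistry:
    """Hypothetical registry mapping types to handler names."""

    def __init__(self) -> None:
        self._state: dict[type, str] = {}
        # Per-instance cache wrapped around the uncached lookup.
        self._find = lru_cache(maxsize=256)(self._find_uncached)

    def register(self, data_type: type, handler: str, exist_ok: bool = False) -> None:
        if data_type in self._state and not exist_ok:
            msg = f"{data_type} is already registered; pass exist_ok=True to overwrite"
            raise RuntimeError(msg)
        self._state[data_type] = handler
        # Drop cached lookups that may now be stale.
        self._find.cache_clear()

    def find(self, data_type: type) -> str:
        return self._find(data_type)

    def _find_uncached(self, data_type: type) -> str:
        for base in data_type.__mro__:
            if base in self._state:
                return self._state[base]
        raise KeyError(data_type)


registry = ToyRegistry()
registry.register(object, "default")
print(registry.find(bool))  # default
registry.register(int, "integer")
print(registry.find(bool))  # integer, because the cache was cleared
```

Without the `cache_clear()` call, the second lookup for `bool` would still return the stale cached value.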

coola.summary.SummarizerRegistry.register_many

register_many(
    mapping: Mapping[type, BaseSummarizer[Any]],
    exist_ok: bool = False,
) -> None

Register multiple summarizers at once.

This is a convenience method for bulk registration that internally calls register() for each type-summarizer pair.

Parameters:

Name Type Description Default
mapping Mapping[type, BaseSummarizer[Any]]

Dictionary mapping Python types to summarizer instances

required
exist_ok bool

If False (default), raises an error if any type is already registered. If True, overwrites existing registrations silently.

False

Raises:

Type Description
RuntimeError

If any type is already registered and exist_ok is False

Example
>>> from coola.summary import SummarizerRegistry, SequenceSummarizer, MappingSummarizer
>>> registry = SummarizerRegistry()
>>> registry.register_many(
...     {
...         list: SequenceSummarizer(),
...         dict: MappingSummarizer(),
...     }
... )
>>> registry
SummarizerRegistry(
  (state): TypeRegistry(
      (<class 'list'>): SequenceSummarizer(max_items=5, num_spaces=2)
      (<class 'dict'>): MappingSummarizer(max_items=5, num_spaces=2)
    )
)

coola.summary.SummarizerRegistry.summarize

summarize(
    data: object, depth: int = 0, max_depth: int = 1
) -> str

Generate a formatted string summary of the provided data.

This method creates a human-readable representation of the input data, with support for nested structures up to a specified depth. When the current depth exceeds max_depth, nested structures are typically shown in a compact form without further expansion.

Parameters:

Name Type Description Default
data object

The data object to summarize. Can be any Python object, though behavior depends on the registered summarizers.

required
depth int

The current nesting level in the data structure. Used internally during recursive summarization. Typically starts at 0 for top-level calls. Must be non-negative.

0
max_depth int

The maximum nesting level to expand when summarizing. Structures deeper than this level are shown in compact form. Must be non-negative. Default is 1, which expands only the top level of nested structures.

1

Returns:

Type Description
str

A formatted string representation of the data. The exact format depends on the registered summarizer, but typically includes type information, size/length metadata, and indented content for nested structures.

Raises:

Type Description
ValueError

May be raised by individual summarizers for invalid depth parameters or other issues based on the data type being summarized. The registry itself doesn't raise exceptions, but delegates to registered summarizers which may raise this or other exceptions.

Notes
  • The depth parameter is primarily for internal use during recursion. Most external callers should use the default value of 0.
  • Setting max_depth=0 typically shows only top-level information without expanding any nested structures.
  • Higher max_depth values provide more detail but can produce very long output for deeply nested data.
Example
>>> from coola.summary import get_default_registry
>>> registry = get_default_registry()

>>> # Simple value
>>> print(registry.summarize(1))
<class 'int'> 1

>>> # List with default depth (expands first level only)
>>> print(registry.summarize(["abc", "def"]))
<class 'list'> (length=2)
  (0): abc
  (1): def

>>> # Nested list, default max_depth=1 (inner list not expanded)
>>> print(registry.summarize([[0, 1, 2], {"key1": "abc", "key2": "def"}]))
<class 'list'> (length=2)
  (0): [0, 1, 2]
  (1): {'key1': 'abc', 'key2': 'def'}

>>> # Nested list with max_depth=2 (expands both levels)
>>> print(registry.summarize([[0, 1, 2], {"key1": "abc", "key2": "def"}], max_depth=2))
<class 'list'> (length=2)
  (0): <class 'list'> (length=3)
      (0): 0
      (1): 1
      (2): 2
  (1): <class 'dict'> (length=2)
      (key1): abc
      (key2): def

>>> # Deeply nested structure with the default max_depth=1 (inner lists stay compact)
>>> deeply_nested = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
>>> print(registry.summarize(deeply_nested))
<class 'list'> (length=2)
  (0): [[1, 2], [3, 4]]
  (1): [[5, 6], [7, 8]]

coola.summary.TensorSummarizer

Bases: BaseSummarizer[Tensor]

Implement a summarizer for torch.Tensor objects.

This summarizer generates compact string representations of PyTorch tensors. By default, it displays metadata (type, shape, dtype, device, requires_grad) rather than tensor values, making it suitable for logging and debugging large tensors. Optionally, it can show the full tensor representation.

Parameters:

Name Type Description Default
show_data bool

If True, returns the default tensor string representation (same as repr(tensor)), displaying actual values. If False (default), returns only metadata in a compact format: <class> | shape=<shape> | dtype=<dtype> | device=<device> | requires_grad=<requires_grad>.

False

Raises:

Type Description
RuntimeError

If PyTorch is not installed or available.

Example
>>> import torch
>>> from coola.summary import SummarizerRegistry, TensorSummarizer
>>> registry = SummarizerRegistry()

>>> # Default behavior: show metadata only
>>> summarizer = TensorSummarizer()
>>> print(summarizer.summarize(torch.arange(11), registry))  # doctest: +ELLIPSIS
<class 'torch.Tensor'> | shape=torch.Size([11]) | dtype=torch.int64 | device=cpu | requires_grad=False

>>> # Works with tensors of any shape and dtype
>>> print(summarizer.summarize(torch.ones(2, 3, 4), registry))  # doctest: +ELLIPSIS
<class 'torch.Tensor'> | shape=torch.Size([2, 3, 4]) | dtype=torch.float32 | device=cpu | requires_grad=False

>>> # Show full tensor data
>>> summarizer = TensorSummarizer(show_data=True)
>>> print(summarizer.summarize(torch.arange(11), registry))  # doctest: +ELLIPSIS
tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

coola.summary.get_default_registry

get_default_registry() -> SummarizerRegistry

Get or create the default global registry with common Python types.

Returns a singleton registry instance that is pre-configured with summarizers for Python's built-in types including sequences (list, tuple), mappings (dict), sets, and scalar types (int, float, str, bool).

This function uses a singleton pattern to ensure the same registry instance is returned on subsequent calls, which is efficient and maintains consistency across an application.

Returns:

Type Description
SummarizerRegistry

A SummarizerRegistry instance with summarizers registered for:
  • Scalar types (int, float, complex, bool, str)
  • Sequences (list, tuple, Sequence ABC)
  • Sets (set, frozenset)
  • Mappings (dict, Mapping ABC)

Notes

The singleton pattern means modifications to the returned registry affect all future calls to this function. If you need an isolated registry, create a new SummarizerRegistry instance directly.

Example
>>> from coola.summary import get_default_registry
>>> registry = get_default_registry()
>>> # Registry is ready to use with common Python types
>>> print(registry.summarize([1, 2, 3]))
<class 'list'> (length=3)
  (0): 1
  (1): 2
  (2): 3
>>> print(registry.summarize({"a": 1, "b": 2}))
<class 'dict'> (length=2)
  (a): 1
  (b): 2
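
The consequence of the singleton pattern noted above can be demonstrated with a small, self-contained sketch. `get_shared_registry` is a hypothetical function that returns a plain dict; it only illustrates the sharing behaviour and says nothing about how coola stores its default registry.

```python
from functools import lru_cache


@lru_cache(maxsize=1)
def get_shared_registry() -> dict[type, str]:
    """Create the registry on the first call and reuse it afterwards."""
    return {object: "default"}


first = get_shared_registry()
first[int] = "integer"  # the mutation is visible to every later caller
second = get_shared_registry()
print(first is second)  # True
print(int in second)  # True

# An independent object is unaffected by changes to the shared one.
isolated = {object: "default"}
print(int in isolated)  # False
```

This is why creating a separate registry instance is the safer choice when changes should not leak into the rest of an application.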

coola.summary.register_summarizers

register_summarizers(
    mapping: Mapping[type, BaseSummarizer[Any]],
    exist_ok: bool = False,
) -> None

Register custom summarizers to the default global registry.

This lets users add support for custom types without having to access the global registry directly.

Parameters:

Name Type Description Default
mapping Mapping[type, BaseSummarizer[Any]]

Dictionary mapping types to summarizer instances

required
exist_ok bool

If False (default), raises an error if any type is already registered.

False
Example
>>> from coola.summary import register_summarizers, BaseSummarizer, SummarizerRegistry
>>> class MyType:
...     def __init__(self, value):
...         self.value = value
...
>>> class MySummarizer(BaseSummarizer[MyType]):
...     def equal(self, other: object) -> bool:
...         return type(other) is type(self)
...     def summarize(
...         self,
...         data: MyType,
...         registry: SummarizerRegistry,
...         depth: int = 0,
...         max_depth: int = 1,
...     ) -> str:
...         return f"<MyType> value={data.value}"
...
>>> register_summarizers({MyType: MySummarizer()})

coola.summary.summarize

summarize(
    data: object,
    max_depth: int = 1,
    registry: SummarizerRegistry | None = None,
) -> str

Create a summary string representation of nested data.

Parameters:

Name Type Description Default
data object

Input data (can be nested)

required
max_depth int

The maximum nesting level to expand when summarizing. Structures deeper than this level are shown in compact form. Must be non-negative. Default is 1, which expands only the top level of nested structures.

1
registry SummarizerRegistry | None

Registry to resolve summarizers for nested data. If None, uses the default registry.

None

Returns:

Type Description
str

String summary of the data

Example
>>> from coola.summary import summarize
>>> print(summarize({"a": 1, "b": "abc"}))
<class 'dict'> (length=2)
  (a): 1
  (b): abc