Summary

coola.summary ¶

Contains functionality to compute a text summary of nested data based on the data type.

coola.summary.BaseCollectionSummarizer ¶

Bases: BaseSummarizer[T]

Base class for summarizing collection-based data structures.

This class provides the foundation for summarizing various collection types with configurable formatting options. It handles item limiting and indentation for readable output.

Parameters:

Name	Type	Description	Default
`max_items`	`int`	The maximum number of items to display in the summary. If set to a negative value (e.g., -1), all items in the collection will be shown without truncation. Defaults to 5.	`5`
`num_spaces`	`int`	The number of spaces to use for indentation in the formatted output. This affects the visual structure of nested summaries. Defaults to 2.	`2`

Attributes:

Name	Type	Description
`_max_items`		Stores the maximum number of items to display.
`_num_spaces`		Stores the number of spaces for indentation.

Example

>>> from coola.summary import SummarizerRegistry, MappingSummarizer, DefaultSummarizer
>>> registry = SummarizerRegistry({object: DefaultSummarizer()})
>>> summarizer = MappingSummarizer()
>>> output = summarizer.summarize({"key1": 1.2, "key2": "abc", "key3": 42}, registry)
>>> print(output)
<class 'dict'> (length=3)
  (key1): 1.2
  (key2): abc
  (key3): 42
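
The item limit and indentation options described above can be pictured with a small stand-alone sketch. It does not use coola's internals: `limit_items` and `format_lines` are hypothetical helpers written for illustration, and how coola marks omitted items is not shown here.

```python
from __future__ import annotations

from collections.abc import Iterable
from itertools import islice


def limit_items(items: Iterable[object], max_items: int = 5) -> list[object]:
    """Return at most ``max_items`` items; a negative limit keeps everything."""
    if max_items < 0:
        return list(items)
    return list(islice(items, max_items))


def format_lines(items: Iterable[object], max_items: int = 5, num_spaces: int = 2) -> str:
    """Write one kept item per line, indented by ``num_spaces`` spaces."""
    indent = " " * num_spaces
    kept = limit_items(items, max_items)
    return "\n".join(f"{indent}({i}): {item}" for i, item in enumerate(kept))


# Only the first three of ten items are kept.
print(format_lines(range(10), max_items=3))
```

Passing `max_items=-1` to the sketch keeps every item, mirroring the negative-value convention documented for `max_items`.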

coola.summary.BaseSummarizer ¶

Bases: ABC, Generic[T]

Abstract base class for implementing data summarizers.

A summarizer converts Python objects into formatted string representations, with support for nested structures and configurable depth limits. This is useful for debugging, logging, and displaying complex data in a readable format.

The class is generic over type T, allowing concrete implementations to specialize for specific data types while maintaining type safety.

Notes

Concrete implementations must override the summarize method to define how data should be formatted and displayed, and the equal method to define how two summarizers are compared.

The depth mechanism allows for progressive disclosure of nested structures, preventing overwhelming output for deeply nested data.

Example

>>> from coola.summary import SummarizerRegistry, DefaultSummarizer
>>> registry = SummarizerRegistry()
>>> summarizer = DefaultSummarizer()
>>> print(summarizer.summarize(1, registry))
<class 'int'> 1

coola.summary.BaseSummarizer.equal `abstractmethod` ¶

equal(other: object) -> bool

Check equality between this summarizer and another object.

Two summarizers are considered equal if they are of the exact same type and have identical configuration parameters.

Parameters:

Name	Type	Description	Default
`other`	`object`	The object to compare with this summarizer.	required

Returns:

Type	Description
`bool`	`True` if the objects are equal, otherwise `False`.

Example

>>> from coola.summary import DefaultSummarizer, MappingSummarizer
>>> summarizer1 = DefaultSummarizer()
>>> summarizer2 = DefaultSummarizer()
>>> summarizer3 = MappingSummarizer()
>>> summarizer1.equal(summarizer2)
True
>>> summarizer1.equal(summarizer3)
False
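
The equality rule stated above (same exact type, identical configuration) could be implemented along these lines. This is a minimal sketch with a hypothetical `SketchSummarizer` class; it is not coola's code, and the two attributes simply mirror the `max_items` and `num_spaces` parameters documented for collection summarizers.

```python
class SketchSummarizer:
    """Hypothetical stand-in that only carries configuration."""

    def __init__(self, max_items: int = 5, num_spaces: int = 2) -> None:
        self._max_items = max_items
        self._num_spaces = num_spaces

    def equal(self, other: object) -> bool:
        # Exact type match: a subclass instance is not considered equal.
        if type(other) is not type(self):
            return False
        return (
            self._max_items == other._max_items
            and self._num_spaces == other._num_spaces
        )


class SubSummarizer(SketchSummarizer):
    """Hypothetical subclass used to show the exact-type check."""


print(SketchSummarizer().equal(SketchSummarizer()))
print(SketchSummarizer().equal(SketchSummarizer(max_items=10)))
print(SketchSummarizer().equal(SubSummarizer()))
```

Comparing with `type(other) is type(self)` rather than `isinstance` is what makes the check reject subclasses.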

coola.summary.BaseSummarizer.summarize `abstractmethod` ¶

summarize(
    data: T,
    registry: SummarizerRegistry,
    depth: int = 0,
    max_depth: int = 1,
) -> str

Generate a formatted string summary of the provided data.

This method creates a human-readable representation of the input data, with support for nested structures up to a specified depth. When the current depth reaches max_depth, nested structures are typically shown in a compact form without further expansion.

Parameters:

Name	Type	Description	Default
`data`	`T`	The data object to summarize. Can be any Python object, though behavior depends on the concrete implementation.	required
`registry`	`SummarizerRegistry`	The summarizer registry used to look up summarizers for nested data structures of different types.	required
`depth`	`int`	The current nesting level in the data structure. Used internally during recursive summarization. Typically starts at 0 for top-level calls. Must be non-negative.	`0`
`max_depth`	`int`	The maximum nesting level to expand when summarizing. Structures deeper than this level are shown in compact form. Must be non-negative. Default is 1, which expands only the top level of nested structures.	`1`

Returns:

Type	Description
`str`	A formatted string representation of the data. The exact format depends on the concrete implementation, but typically includes type information, size/length metadata, and indented content for nested structures.

Raises:

Type	Description
`ValueError`	May be raised by implementations for invalid depth parameters or other issues based on the data type being summarized. The base class doesn't specify exceptions, but implementations may raise this or other exceptions.

Notes

The depth parameter is primarily for internal use during recursion. Most external callers should use the default value of 0.
Setting max_depth=0 typically shows only top-level information without expanding any nested structures.
Higher max_depth values provide more detail but can produce very long output for deeply nested data.

Example

>>> from coola.summary import SummarizerRegistry, DefaultSummarizer
>>> registry = SummarizerRegistry()
>>> summarizer = DefaultSummarizer()
>>> print(summarizer.summarize(1, registry))
<class 'int'> 1
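
To make the interplay of depth and max_depth concrete, here is a stand-alone sketch of a recursive summarizer for lists, tuples, and dicts. It is not coola's implementation: `sketch_summarize` is a hypothetical function, and the rule "switch to compact form once depth reaches max_depth" is an assumption inferred from the example outputs on this page.

```python
def sketch_summarize(
    data: object, depth: int = 0, max_depth: int = 1, num_spaces: int = 2
) -> str:
    """Expand lists, tuples, and dicts until ``depth`` reaches ``max_depth``."""
    if depth >= max_depth:
        # Compact form: plain string, no type header, no further expansion.
        return str(data)
    if isinstance(data, dict):
        pairs = list(data.items())
    elif isinstance(data, (list, tuple)):
        pairs = list(enumerate(data))
    else:
        # Scalars at an expanded level show their type and value.
        return f"{type(data)} {data}"
    indent = " " * num_spaces
    lines = [f"{type(data)} (length={len(pairs)})"]
    for key, value in pairs:
        child = sketch_summarize(value, depth + 1, max_depth, num_spaces)
        # Push continuation lines of a nested summary under their key.
        child = child.replace("\n", "\n" + indent * 2)
        lines.append(f"{indent}({key}): {child}")
    return "\n".join(lines)


nested = [[0, 1, 2], {"key1": "abc"}]
print(sketch_summarize(nested))  # inner containers stay compact
print(sketch_summarize(nested, max_depth=2))  # inner containers are expanded
```

Each recursive call passes `depth + 1`, which is why external callers normally leave depth at its default of 0.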

coola.summary.DefaultSummarizer ¶

Bases: BaseSummarizer[object]

Implement the default summarizer.

Parameters:

Name	Type	Description	Default
`max_characters`	`int`	The maximum number of characters to show. If a negative value is provided, all the characters are shown.	`-1`

Example

>>> from coola.summary import SummarizerRegistry, DefaultSummarizer
>>> registry = SummarizerRegistry()
>>> summarizer = DefaultSummarizer()
>>> print(summarizer.summarize(1, registry))
<class 'int'> 1
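
The max_characters convention (a negative value disables the limit) can be sketched as follows. This is only an illustration with a hypothetical `truncate` helper; whether coola appends a marker such as an ellipsis after cutting is not stated here, so the sketch simply slices the string.

```python
def truncate(text: str, max_characters: int = -1) -> str:
    """Keep at most ``max_characters`` characters; negative means no limit."""
    if max_characters < 0:
        return text
    return text[:max_characters]


print(truncate("abcdefghij"))  # default: nothing is cut
print(truncate("abcdefghij", max_characters=4))
```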

coola.summary.MappingSummarizer ¶

Bases: BaseCollectionSummarizer[Mapping[Any, Any]]

Summarizer for mapping-based data structures like dictionaries.

This class formats mapping types (dict, OrderedDict, etc.) into readable summaries that display the type, length, and key-value pairs with proper indentation. It respects the max_items limit and handles nested structures through the registry system.

This class creates a multi-line summary showing the mapping's type, length, and contents. It handles depth limiting to prevent excessively deep nested summaries and truncates the output when the number of items exceeds max_items.

Parameters:

Name	Type	Description	Default
`max_items`	`int`	The maximum number of key-value pairs to display. If negative, shows all pairs. Defaults to 5.	`5`
`num_spaces`	`int`	The number of spaces for indenting each level. Defaults to 2.	`2`

Example

>>> from coola.summary import SummarizerRegistry, MappingSummarizer, DefaultSummarizer
>>> registry = SummarizerRegistry({object: DefaultSummarizer()})
>>> summarizer = MappingSummarizer()
>>> output = summarizer.summarize({"key1": 1.2, "key2": "abc", "key3": 42}, registry)
>>> print(output)
<class 'dict'> (length=3)
  (key1): 1.2
  (key2): abc
  (key3): 42

coola.summary.NDArraySummarizer ¶

Bases: BaseSummarizer[ndarray]

Implement a summarizer for numpy.ndarray objects.

This summarizer generates compact string representations of NumPy arrays. By default, it displays metadata (type, shape, dtype) rather than array values, making it suitable for logging and debugging large arrays. Optionally, it can show the full array representation.

Parameters:

Name	Type	Description	Default
`show_data`	`bool`	If `True`, returns the default array string representation (same as `repr(array)`), displaying actual values. If `False` (default), returns only metadata in a compact format: `<class> | shape=<shape> | dtype=<dtype>`.	`False`

Raises:

Type	Description
`RuntimeError`	If NumPy is not installed or available.

Example

>>> import numpy as np
>>> from coola.summary import SummarizerRegistry, NDArraySummarizer
>>> registry = SummarizerRegistry()

>>> # Default behavior: show metadata only
>>> summarizer = NDArraySummarizer()
>>> print(summarizer.summarize(np.arange(11), registry))
<class 'numpy.ndarray'> | shape=(11,) | dtype=int64

>>> # Works with arrays of any shape and dtype
>>> print(summarizer.summarize(np.ones((2, 3, 4)), registry))
<class 'numpy.ndarray'> | shape=(2, 3, 4) | dtype=float64

>>> # Show full array data
>>> summarizer = NDArraySummarizer(show_data=True)
>>> print(summarizer.summarize(np.arange(11), registry))
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

coola.summary.SequenceSummarizer ¶

Bases: BaseCollectionSummarizer[Sequence[Any]]

Summarizer for sequence-based data structures like lists and tuples.

This class formats sequence types (list, tuple, etc.) into readable summaries that display the type, length, and indexed items with proper indentation. It respects the max_items limit and handles nested structures through the registry system.

This class creates a multi-line summary showing the sequence's type, length, and contents. It handles depth limiting to prevent excessively deep nested summaries and truncates the output when the number of items exceeds max_items.

Parameters:

Name	Type	Description	Default
`max_items`	`int`	The maximum number of items to display. If negative, shows all items. Defaults to 5.	`5`
`num_spaces`	`int`	The number of spaces for indenting each level. Defaults to 2.	`2`

Example

>>> from coola.summary import SummarizerRegistry, SequenceSummarizer, DefaultSummarizer
>>> registry = SummarizerRegistry({object: DefaultSummarizer()})
>>> summarizer = SequenceSummarizer()
>>> output = summarizer.summarize([1, 2, 3], registry)
>>> print(output)
<class 'list'> (length=3)
  (0): 1
  (1): 2
  (2): 3

coola.summary.SetSummarizer ¶

Bases: BaseCollectionSummarizer[Set[Any]]

Summarizer for set-based data structures.

This class formats set types (set, frozenset, etc.) into readable summaries that display the type, length, and items with proper indentation. It respects the max_items limit and handles nested structures through the registry system.

This class creates a multi-line summary showing the set's type, length, and contents. It handles depth limiting to prevent excessively deep nested summaries and truncates the output when the number of items exceeds max_items.

Parameters:

Name	Type	Description	Default
`max_items`	`int`	The maximum number of items to display. If negative, shows all items. Defaults to 5.	`5`
`num_spaces`	`int`	The number of spaces for indenting each level. Defaults to 2.	`2`

Example

>>> from coola.summary import SummarizerRegistry, SetSummarizer, DefaultSummarizer
>>> registry = SummarizerRegistry({object: DefaultSummarizer()})
>>> summarizer = SetSummarizer()
>>> output = summarizer.summarize({1}, registry)
>>> print(output)
<class 'set'> (length=1)
  (0): 1

coola.summary.SummarizerRegistry ¶

Registry that manages and dispatches summarizers based on data type.

This registry maintains a mapping from Python types to summarizer instances and uses the Method Resolution Order (MRO) for type lookup. When summarizing data, it automatically selects the most specific registered summarizer for the data's type, falling back to parent types or a default summarizer if needed.

The registry includes an LRU cache for type lookups to optimize performance in applications that repeatedly summarize similar data structures.

Parameters:

Name	Type	Description	Default
`initial_state`	`dict[type, BaseSummarizer[Any]] | None`	Optional initial mapping of types to summarizers. If provided, the state is copied to prevent external mutations.	`None`

Attributes:

Name	Type	Description
`_state`	`TypeRegistry[BaseSummarizer]`	Internal mapping of registered types to summarizers

Example

Basic usage with a sequence summarizer:

>>> from coola.summary import SummarizerRegistry, SequenceSummarizer, DefaultSummarizer
>>> registry = SummarizerRegistry({object: DefaultSummarizer(), list: SequenceSummarizer()})
>>> registry
SummarizerRegistry(
  (state): TypeRegistry(
      (<class 'object'>): DefaultSummarizer(max_characters=-1)
      (<class 'list'>): SequenceSummarizer(max_items=5, num_spaces=2)
    )
)
>>> print(registry.summarize([1, 2, 3]))
<class 'list'> (length=3)
  (0): 1
  (1): 2
  (2): 3

Registering custom summarizers:

>>> from coola.summary import SummarizerRegistry, SequenceSummarizer, DefaultSummarizer
>>> registry = SummarizerRegistry({object: DefaultSummarizer()})
>>> registry.register(tuple, SequenceSummarizer())
>>> print(registry.summarize((1, 2, 3)))
<class 'tuple'> (length=3)
  (0): 1
  (1): 2
  (2): 3

Working with nested structures:

>>> from coola.summary import get_default_registry
>>> registry = get_default_registry()
>>> print(registry.summarize({"a": [1, 2], "b": [3, 4]}))
<class 'dict'> (length=2)
  (a): [1, 2]
  (b): [3, 4]

coola.summary.SummarizerRegistry.find_summarizer ¶

find_summarizer(data_type: type) -> BaseSummarizer[Any]

Find the appropriate summarizer for a given type.

Uses the Method Resolution Order (MRO) to find the most specific registered summarizer. For example, if you register a summarizer for Sequence but not for list, lists will use the Sequence summarizer.

Results are cached using an LRU cache (256 entries) for performance, as summarizer lookup is a hot path in recursive summarizations.

Parameters:

Name	Type	Description	Default
`data_type`	`type`	The Python type to find a summarizer for	required

Returns:

Type	Description
`BaseSummarizer[Any]`	The most specific registered summarizer for this type, a parent type's summarizer via MRO, or the default summarizer.

Example

>>> from coola.summary import get_default_registry
>>> registry = get_default_registry()
>>> summarizer = registry.find_summarizer(list)
>>> summarizer
SequenceSummarizer(max_items=5, num_spaces=2)
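
The lookup strategy described above (walk the MRO, cache the result, clear the cache when registrations change) can be sketched with the standard library alone. Everything here is hypothetical and for illustration: `Animal`, `Dog`, the `handlers` dictionary, and `find_handler` stand in for registered types and summarizers and are not part of coola.

```python
from __future__ import annotations

from functools import lru_cache


class Animal:
    """Hypothetical base type."""


class Dog(Animal):
    """Hypothetical subtype with no handler of its own at first."""


# Stand-in for the registry state: type -> handler name.
handlers: dict[type, str] = {object: "default", Animal: "animal"}


@lru_cache(maxsize=256)
def find_handler(data_type: type) -> str:
    """Return the handler of the first class in the MRO that is registered."""
    for base in data_type.__mro__:
        if base in handlers:
            return handlers[base]
    msg = f"no handler registered for {data_type}"
    raise KeyError(msg)


print(find_handler(Dog))  # Dog -> Animal -> object: Animal is the first hit
print(find_handler(int))  # int -> object: falls back to the default

# A new registration must invalidate cached lookups, otherwise Dog would
# keep resolving to the Animal handler.
handlers[Dog] = "dog"
find_handler.cache_clear()
print(find_handler(Dog))
```

The same sketch shows the difference between an explicit registration check and a lookup: before `Dog` is registered, `Dog in handlers` is false even though `find_handler(Dog)` succeeds through its parent class.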

coola.summary.SummarizerRegistry.has_summarizer ¶

has_summarizer(data_type: type) -> bool

Check if a summarizer is explicitly registered for the given type.

Note that this only checks for direct registration. Even if this returns False, find_summarizer() may still return a summarizer via MRO lookup or the default summarizer.

Parameters:

Name	Type	Description	Default
`data_type`	`type`	The type to check	required

Returns:

Type	Description
`bool`	True if a summarizer is explicitly registered for this type, False otherwise.

Example

>>> from coola.summary import SummarizerRegistry, SequenceSummarizer
>>> registry = SummarizerRegistry({list: SequenceSummarizer()})
>>> registry.has_summarizer(list)
True
>>> registry.has_summarizer(tuple)
False

coola.summary.SummarizerRegistry.register ¶

register(
    data_type: type,
    summarizer: BaseSummarizer[Any],
    exist_ok: bool = False,
) -> None

Register a summarizer for a given data type.

This method associates a summarizer instance with a specific Python type. When data of this type is summarized, the registered summarizer will be used. The cache is automatically cleared after registration to ensure consistency.

Parameters:

Name	Type	Description	Default
`data_type`	`type`	The Python type to register (e.g., list, dict, custom classes)	required
`summarizer`	`BaseSummarizer[Any]`	The summarizer instance that handles this type	required
`exist_ok`	`bool`	If False (default), raises an error if the type is already registered. If True, overwrites the existing registration silently.	`False`

Raises:

Type	Description
`RuntimeError`	If the type is already registered and exist_ok is False

Example

>>> from coola.summary import SummarizerRegistry, SequenceSummarizer
>>> registry = SummarizerRegistry()
>>> registry.register(list, SequenceSummarizer())
>>> registry.has_summarizer(list)
True
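
The exist_ok behavior can be pictured with a tiny dictionary-backed registry. This is a sketch under stated assumptions, not coola's implementation: `TinyRegistry` and its `has` method are hypothetical, and strings stand in for summarizer instances.

```python
from __future__ import annotations


class TinyRegistry:
    """Hypothetical registry that maps types to handlers."""

    def __init__(self, initial_state: dict[type, str] | None = None) -> None:
        # Copy so later changes to the caller's dict do not leak in.
        self._state: dict[type, str] = dict(initial_state or {})

    def register(self, data_type: type, handler: str, exist_ok: bool = False) -> None:
        if data_type in self._state and not exist_ok:
            msg = f"{data_type} is already registered; pass exist_ok=True to overwrite"
            raise RuntimeError(msg)
        self._state[data_type] = handler

    def has(self, data_type: type) -> bool:
        """Report direct registrations only."""
        return data_type in self._state


registry = TinyRegistry()
registry.register(list, "first")
try:
    registry.register(list, "second")  # duplicate without exist_ok
except RuntimeError as exc:
    print(f"refused: {exc}")
registry.register(list, "second", exist_ok=True)  # explicit overwrite
print(registry.has(list), registry.has(tuple))
```

Defaulting exist_ok to False makes accidental overwrites fail loudly, which is the behavior documented for register.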

coola.summary.SummarizerRegistry.register_many ¶

register_many(
    mapping: Mapping[type, BaseSummarizer[Any]],
    exist_ok: bool = False,
) -> None

Register multiple summarizers at once.

This is a convenience method for bulk registration that internally calls register() for each type-summarizer pair.

Parameters:

Name	Type	Description	Default
`mapping`	`Mapping[type, BaseSummarizer[Any]]`	Dictionary mapping Python types to summarizer instances	required
`exist_ok`	`bool`	If False (default), raises an error if any type is already registered. If True, overwrites existing registrations silently.	`False`

Raises:

Type	Description
`RuntimeError`	If any type is already registered and exist_ok is False

Example

>>> from coola.summary import SummarizerRegistry, SequenceSummarizer, MappingSummarizer
>>> registry = SummarizerRegistry()
>>> registry.register_many(
...     {
...         list: SequenceSummarizer(),
...         dict: MappingSummarizer(),
...     }
... )
>>> registry
SummarizerRegistry(
  (state): TypeRegistry(
      (<class 'list'>): SequenceSummarizer(max_items=5, num_spaces=2)
      (<class 'dict'>): MappingSummarizer(max_items=5, num_spaces=2)
    )
)

coola.summary.SummarizerRegistry.summarize ¶

summarize(
    data: object, depth: int = 0, max_depth: int = 1
) -> str

Generate a formatted string summary of the provided data.

This method creates a human-readable representation of the input data, with support for nested structures up to a specified depth. When the current depth reaches max_depth, nested structures are typically shown in a compact form without further expansion.

Parameters:

Name	Type	Description	Default
`data`	`object`	The data object to summarize. Can be any Python object, though behavior depends on the registered summarizers.	required
`depth`	`int`	The current nesting level in the data structure. Used internally during recursive summarization. Typically starts at 0 for top-level calls. Must be non-negative.	`0`
`max_depth`	`int`	The maximum nesting level to expand when summarizing. Structures deeper than this level are shown in compact form. Must be non-negative. Default is 1, which expands only the top level of nested structures.	`1`

Returns:

Type	Description
`str`	A formatted string representation of the data. The exact format depends on the registered summarizer, but typically includes type information, size/length metadata, and indented content for nested structures.

Raises:

Type	Description
`ValueError`	May be raised by individual summarizers for invalid depth parameters or other issues based on the data type being summarized. The registry itself doesn't raise exceptions, but delegates to registered summarizers which may raise this or other exceptions.

Notes

The depth parameter is primarily for internal use during recursion. Most external callers should use the default value of 0.
Setting max_depth=0 typically shows only top-level information without expanding any nested structures.
Higher max_depth values provide more detail but can produce very long output for deeply nested data.

Example

>>> from coola.summary import get_default_registry
>>> registry = get_default_registry()

>>> # Simple value
>>> print(registry.summarize(1))
<class 'int'> 1

>>> # List with default depth (expands first level only)
>>> print(registry.summarize(["abc", "def"]))
<class 'list'> (length=2)
  (0): abc
  (1): def

>>> # Nested list, default max_depth=1 (inner list not expanded)
>>> print(registry.summarize([[0, 1, 2], {"key1": "abc", "key2": "def"}]))
<class 'list'> (length=2)
  (0): [0, 1, 2]
  (1): {'key1': 'abc', 'key2': 'def'}

>>> # Nested list with max_depth=2 (expands both levels)
>>> print(registry.summarize([[0, 1, 2], {"key1": "abc", "key2": "def"}], max_depth=2))
<class 'list'> (length=2)
  (0): <class 'list'> (length=3)
      (0): 0
      (1): 1
      (2): 2
  (1): <class 'dict'> (length=2)
      (key1): abc
      (key2): def

>>> # Control depth for very nested structures
>>> deeply_nested = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
>>> print(registry.summarize(deeply_nested))
<class 'list'> (length=2)
  (0): [[1, 2], [3, 4]]
  (1): [[5, 6], [7, 8]]

coola.summary.TensorSummarizer ¶

Bases: BaseSummarizer[Tensor]

Implement a summarizer for torch.Tensor objects.

This summarizer generates compact string representations of PyTorch tensors. By default, it displays metadata (type, shape, dtype, device, requires_grad) rather than tensor values, making it suitable for logging and debugging large tensors. Optionally, it can show the full tensor representation.

Parameters:

Name	Type	Description	Default
`show_data`	`bool`	If `True`, returns the default tensor string representation (same as `repr(tensor)`), displaying actual values. If `False` (default), returns only metadata in a compact format: `<class> | shape=<shape> | dtype=<dtype> | device=<device> | requires_grad=<requires_grad>`.	`False`

Raises:

Type	Description
`RuntimeError`	If PyTorch is not installed or available.

Example

>>> import torch
>>> from coola.summary import SummarizerRegistry, TensorSummarizer
>>> registry = SummarizerRegistry()

>>> # Default behavior: show metadata only
>>> summarizer = TensorSummarizer()
>>> print(summarizer.summarize(torch.arange(11), registry))  # doctest: +ELLIPSIS
<class 'torch.Tensor'> | shape=torch.Size([11]) | dtype=torch.int64 | device=cpu | requires_grad=False

>>> # Works with tensors of any shape and dtype
>>> print(summarizer.summarize(torch.ones(2, 3, 4), registry))  # doctest: +ELLIPSIS
<class 'torch.Tensor'> | shape=torch.Size([2, 3, 4]) | dtype=torch.float32 | device=cpu | requires_grad=False

>>> # Show full tensor data
>>> summarizer = TensorSummarizer(show_data=True)
>>> print(summarizer.summarize(torch.arange(11), registry))  # doctest: +ELLIPSIS
tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

coola.summary.get_default_registry ¶

get_default_registry() -> SummarizerRegistry

Get or create the default global registry with common Python types.

Returns a singleton registry instance that is pre-configured with summarizers for Python's built-in types, including sequences (list, tuple), mappings (dict), sets (set, frozenset), and scalar types (int, float, complex, bool, str).

This function uses a singleton pattern to ensure the same registry instance is returned on subsequent calls, which is efficient and maintains consistency across an application.

Returns:

Type	Description
`SummarizerRegistry`	A SummarizerRegistry instance with summarizers registered for scalar types (int, float, complex, bool, str), sequences (list, tuple, Sequence ABC), sets (set, frozenset), and mappings (dict, Mapping ABC).

Notes

The singleton pattern means modifications to the returned registry affect all future calls to this function. If you need an isolated registry, create a new SummarizerRegistry instance directly.

Example

>>> from coola.summary import get_default_registry
>>> registry = get_default_registry()
>>> # Registry is ready to use with common Python types
>>> print(registry.summarize([1, 2, 3]))
<class 'list'> (length=3)
  (0): 1
  (1): 2
  (2): 3
>>> print(registry.summarize({"a": 1, "b": 2}))
<class 'dict'> (length=2)
  (a): 1
  (b): 2
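
The singleton behavior and its consequence for shared state can be sketched with a module-level variable. This is an illustration only: `get_registry` is a hypothetical function, and a plain dictionary with string values stands in for the registry and its summarizers.

```python
from __future__ import annotations

_default_registry: dict[type, str] | None = None


def get_registry() -> dict[type, str]:
    """Create the shared registry on first use, then always return it."""
    global _default_registry
    if _default_registry is None:
        _default_registry = {list: "sequence", dict: "mapping"}
    return _default_registry


shared = get_registry()
shared[set] = "set"  # visible to every later caller
print(get_registry() is shared)
print(set in get_registry())

isolated = dict(get_registry())  # a copy can be changed without side effects
isolated[tuple] = "sequence"
print(tuple in get_registry())
```

The last line prints `False`: changes to the copy do not reach the shared instance, which is the reason to build a separate registry when isolation is needed.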

coola.summary.register_summarizers ¶

register_summarizers(
    mapping: Mapping[type, BaseSummarizer[Any]],
    exist_ok: bool = False,
) -> None

Register custom summarizers to the default global registry.

This lets users add support for custom types without having to access the global registry object directly.

Parameters:

Name	Type	Description	Default
`mapping`	`Mapping[type, BaseSummarizer[Any]]`	Dictionary mapping types to summarizer instances	required
`exist_ok`	`bool`	If False, raises an error if any type is already registered.	`False`

Example

>>> from coola.summary import register_summarizers, BaseSummarizer, SummarizerRegistry
>>> class MyType:
...     def __init__(self, value):
...         self.value = value
...
>>> class MySummarizer(BaseSummarizer[MyType]):
...     def equal(self, other: object) -> bool:
...         return type(other) is type(self)
...     def summarize(
...         self,
...         data: MyType,
...         registry: SummarizerRegistry,
...         depth: int = 0,
...         max_depth: int = 1,
...     ) -> str:
...         return f"<MyType> value={data.value}"
...
>>> register_summarizers({MyType: MySummarizer()})

coola.summary.summarize ¶

summarize(
    data: object,
    max_depth: int = 1,
    registry: SummarizerRegistry | None = None,
) -> str

Create a summary string representation of nested data.

Parameters:

Name	Type	Description	Default
`data`	`object`	Input data (can be nested)	required
`max_depth`	`int`	The maximum nesting level to expand when summarizing. Structures deeper than this level are shown in compact form. Must be non-negative. Default is 1, which expands only the top level of nested structures.	`1`
`registry`	`SummarizerRegistry | None`	Registry to resolve summarizers for nested data. If None, uses the default registry.	`None`

Returns:

Type	Description
`str`	String summary of the data

Example

>>> from coola.summary import summarize
>>> print(summarize({"a": 1, "b": "abc"}))
<class 'dict'> (length=2)
  (a): 1
  (b): abc
