Summary
coola.summary
Contains functionality to compute a text summary of nested data based on its type.
coola.summary.BaseCollectionSummarizer
Bases: BaseSummarizer[T]
Base class for summarizing collection-based data structures.
This class provides the foundation for summarizing various collection types with configurable formatting options. It handles item limiting and indentation for readable output.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| max_items | int | The maximum number of items to display in the summary. If set to a negative value (e.g., -1), all items in the collection will be shown without truncation. Defaults to 5. | 5 |
| num_spaces | int | The number of spaces to use for indentation in the formatted output. This affects the visual structure of nested summaries. Defaults to 2. | 2 |
Attributes:
| Name | Type | Description |
|---|---|---|
| _max_items | int | Stores the maximum number of items to display. |
| _num_spaces | int | Stores the number of spaces for indentation. |
Example
>>> from coola.summary import SummarizerRegistry, MappingSummarizer, DefaultSummarizer
>>> registry = SummarizerRegistry({object: DefaultSummarizer()})
>>> summarizer = MappingSummarizer()
>>> output = summarizer.summarize({"key1": 1.2, "key2": "abc", "key3": 42}, registry)
>>> print(output)
<class 'dict'> (length=3)
  (key1): 1.2
  (key2): abc
  (key3): 42
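The item limit and indentation described above can be illustrated with a minimal, standalone sketch. It does not use coola: `format_items` and the `...` truncation marker are hypothetical names chosen for illustration, and the library's actual output may differ.

```python
from collections.abc import Iterable


def format_items(lines: Iterable[str], max_items: int = 5, num_spaces: int = 2) -> str:
    """Join item lines, keeping at most ``max_items`` and indenting each one."""
    lines = list(lines)
    # A negative max_items means "no truncation".
    shown = lines if max_items < 0 else lines[:max_items]
    if len(shown) < len(lines):
        shown.append("...")  # hypothetical truncation marker
    indent = " " * num_spaces
    return "\n".join(indent + line for line in shown)


items = [f"({i}): {i * i}" for i in range(8)]
truncated = format_items(items, max_items=3)           # first 3 items, then the marker
full = format_items(items, max_items=-1, num_spaces=4)  # all 8 items, 4-space indent
print(truncated)
```

Treating any negative value as "show everything" keeps the parameter a plain int, at the cost of a sentinel that callers need to know about.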
coola.summary.BaseSummarizer
Bases: ABC, Generic[T]
Abstract base class for implementing data summarizers.
A summarizer converts Python objects into formatted string representations, with support for nested structures and configurable depth limits. This is useful for debugging, logging, and displaying complex data in a readable format.
The class is generic over type T, allowing concrete implementations to specialize for specific data types while maintaining type safety.
Notes
Concrete implementations must override the summarize method to define how data should be formatted and displayed.
The depth mechanism allows for progressive disclosure of nested structures, preventing overwhelming output for deeply nested data.
Example
>>> from coola.summary import SummarizerRegistry, DefaultSummarizer
>>> registry = SummarizerRegistry()
>>> summarizer = DefaultSummarizer()
>>> print(summarizer.summarize(1, registry))
<class 'int'> 1
coola.summary.BaseSummarizer.equal (abstractmethod)
equal(other: object) -> bool
Check equality between this summarizer and another object.
Two summarizers are considered equal if they are of the exact same type and have identical configuration parameters.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| other | object | The object to compare with this summarizer. | required |
Returns:
| Type | Description |
|---|---|
| bool | True if the two summarizers are equal (same type and identical configuration), otherwise False. |
Example
>>> from coola.summary import DefaultSummarizer, MappingSummarizer
>>> summarizer1 = DefaultSummarizer()
>>> summarizer2 = DefaultSummarizer()
>>> summarizer3 = MappingSummarizer()
>>> summarizer1.equal(summarizer2)
True
>>> summarizer1.equal(summarizer3)
False
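The equality rule above (exact same type, identical configuration) can be sketched with a small standalone class. `ToySummarizer` and `ToySubSummarizer` are hypothetical stand-ins, not coola classes, and the real implementations may compare different fields.

```python
class ToySummarizer:
    """Hypothetical stand-in for a summarizer with two configuration fields."""

    def __init__(self, max_items: int = 5, num_spaces: int = 2) -> None:
        self._max_items = max_items
        self._num_spaces = num_spaces

    def equal(self, other: object) -> bool:
        # Exact type match: subclasses are deliberately not considered equal.
        if type(other) is not type(self):
            return False
        return (
            self._max_items == other._max_items
            and self._num_spaces == other._num_spaces
        )


class ToySubSummarizer(ToySummarizer):
    """Subclass used to show that the type check is exact."""


same = ToySummarizer().equal(ToySummarizer())
different_config = ToySummarizer().equal(ToySummarizer(max_items=10))
different_type = ToySummarizer().equal(ToySubSummarizer())
not_a_summarizer = ToySummarizer().equal("abc")
```

Using `type(other) is type(self)` rather than `isinstance` is what makes the comparison reject subclasses.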
coola.summary.BaseSummarizer.summarize (abstractmethod)
summarize(
data: T,
registry: SummarizerRegistry,
depth: int = 0,
max_depth: int = 1,
) -> str
Generate a formatted string summary of the provided data.
This method creates a human-readable representation of the input data, with support for nested structures up to a specified depth. Once the current depth reaches max_depth, nested structures are typically shown in a compact form without further expansion.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | T | The data object to summarize. Can be any Python object, though behavior depends on the concrete implementation. | required |
| registry | SummarizerRegistry | The summarizer registry used to look up summarizers for nested data structures of different types. | required |
| depth | int | The current nesting level in the data structure. Used internally during recursive summarization. Typically starts at 0 for top-level calls. Must be non-negative. | 0 |
| max_depth | int | The maximum nesting level to expand when summarizing. Structures deeper than this level are shown in compact form. Must be non-negative. Default is 1, which expands only the top level of nested structures. | 1 |
Returns:
| Type | Description |
|---|---|
| str | A formatted string representation of the data. The exact format depends on the concrete implementation, but typically includes type information, size/length metadata, and indented content for nested structures. |
Raises:
| Type | Description |
|---|---|
| ValueError | May be raised by implementations for invalid depth parameters or other issues based on the data type being summarized. The base class doesn't specify exceptions, but implementations may raise this or other exceptions. |
Notes
- The depth parameter is primarily for internal use during recursion. Most external callers should use the default value of 0.
- Setting max_depth=0 typically shows only top-level information without expanding any nested structures.
- Higher max_depth values provide more detail but can produce very long output for deeply nested data.
Example
>>> from coola.summary import SummarizerRegistry, DefaultSummarizer
>>> registry = SummarizerRegistry()
>>> summarizer = DefaultSummarizer()
>>> print(summarizer.summarize(1, registry))
<class 'int'> 1
coola.summary.DefaultSummarizer
Bases: BaseSummarizer[object]
Implement the default summarizer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| max_characters | int | The maximum number of characters to show. If a negative value is provided, all the characters are shown. | -1 |
Example
>>> from coola.summary import SummarizerRegistry, DefaultSummarizer
>>> registry = SummarizerRegistry()
>>> summarizer = DefaultSummarizer()
>>> print(summarizer.summarize(1, registry))
<class 'int'> 1
coola.summary.MappingSummarizer
Bases: BaseCollectionSummarizer[Mapping[Any, Any]]
Summarizer for mapping-based data structures like dictionaries.
This class formats mapping types (dict, OrderedDict, etc.) into readable summaries that display the type, length, and key-value pairs with proper indentation. It respects the max_items limit and handles nested structures through the registry system.
This class creates a multi-line summary showing the mapping's type, length, and contents. It handles depth limiting to prevent excessively deep nested summaries and truncates the output when the number of items exceeds max_items.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| max_items | int | The maximum number of key-value pairs to display. If negative, shows all pairs. Defaults to 5. | 5 |
| num_spaces | int | The number of spaces for indenting each level. Defaults to 2. | 2 |
Example
>>> from coola.summary import SummarizerRegistry, MappingSummarizer, DefaultSummarizer
>>> registry = SummarizerRegistry({object: DefaultSummarizer()})
>>> summarizer = MappingSummarizer()
>>> output = summarizer.summarize({"key1": 1.2, "key2": "abc", "key3": 42}, registry)
>>> print(output)
<class 'dict'> (length=3)
  (key1): 1.2
  (key2): abc
  (key3): 42
coola.summary.NDArraySummarizer
Bases: BaseSummarizer[ndarray]
Implement a summarizer for numpy.ndarray objects.
This summarizer generates compact string representations of NumPy arrays. By default, it displays metadata (type, shape, dtype) rather than array values, making it suitable for logging and debugging large arrays. Optionally, it can show the full array representation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
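| show_data | bool | If True, the full array representation is shown; if False, only the metadata (type, shape, dtype) is shown. | False |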
Raises:
| Type | Description |
|---|---|
| RuntimeError | If NumPy is not installed or available. |
Example
>>> import numpy as np
>>> from coola.summary import SummarizerRegistry, NDArraySummarizer
>>> registry = SummarizerRegistry()
>>> # Default behavior: show metadata only
>>> summarizer = NDArraySummarizer()
>>> print(summarizer.summarize(np.arange(11), registry))
<class 'numpy.ndarray'> | shape=(11,) | dtype=int64
>>> # Works with arrays of any shape and dtype
>>> print(summarizer.summarize(np.ones((2, 3, 4)), registry))
<class 'numpy.ndarray'> | shape=(2, 3, 4) | dtype=float64
>>> # Show full array data
>>> summarizer = NDArraySummarizer(show_data=True)
>>> print(summarizer.summarize(np.arange(11), registry))
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
coola.summary.SequenceSummarizer
Bases: BaseCollectionSummarizer[Sequence[Any]]
Summarizer for sequence-based data structures like lists and tuples.
This class formats sequence types (list, tuple, etc.) into readable summaries that display the type, length, and indexed items with proper indentation. It respects the max_items limit and handles nested structures through the registry system.
This class creates a multi-line summary showing the sequence's type, length, and contents. It handles depth limiting to prevent excessively deep nested summaries and truncates the output when the number of items exceeds max_items.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| max_items | int | The maximum number of items to display. If negative, shows all items. Defaults to 5. | 5 |
| num_spaces | int | The number of spaces for indenting each level. Defaults to 2. | 2 |
Example
>>> from coola.summary import SummarizerRegistry, SequenceSummarizer, DefaultSummarizer
>>> registry = SummarizerRegistry({object: DefaultSummarizer()})
>>> summarizer = SequenceSummarizer()
>>> output = summarizer.summarize([1, 2, 3], registry)
>>> print(output)
<class 'list'> (length=3)
  (0): 1
  (1): 2
  (2): 3
coola.summary.SetSummarizer
Bases: BaseCollectionSummarizer[Set[Any]]
Summarizer for set-based data structures.
This class formats set types (set, frozenset, etc.) into readable summaries that display the type, length, and items with proper indentation. It respects the max_items limit and handles nested structures through the registry system.
This class creates a multi-line summary showing the set's type, length, and contents. It handles depth limiting to prevent excessively deep nested summaries and truncates the output when the number of items exceeds max_items.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| max_items | int | The maximum number of items to display. If negative, shows all items. Defaults to 5. | 5 |
| num_spaces | int | The number of spaces for indenting each level. Defaults to 2. | 2 |
Example
>>> from coola.summary import SummarizerRegistry, SetSummarizer, DefaultSummarizer
>>> registry = SummarizerRegistry({object: DefaultSummarizer()})
>>> summarizer = SetSummarizer()
>>> output = summarizer.summarize({1}, registry)
>>> print(output)
<class 'set'> (length=1)
  (0): 1
coola.summary.SummarizerRegistry
Registry that manages and dispatches summarizers based on data type.
This registry maintains a mapping from Python types to summarizer instances and uses the Method Resolution Order (MRO) for type lookup. When summarizing data, it automatically selects the most specific registered summarizer for the data's type, falling back to parent types or a default summarizer if needed.
The registry includes an LRU cache for type lookups to optimize performance in applications that repeatedly summarize similar data structures.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| initial_state | dict[type, BaseSummarizer[Any]] \| None | Optional initial mapping of types to summarizers. If provided, the state is copied to prevent external mutations. | None |
Attributes:
| Name | Type | Description |
|---|---|---|
| _state | TypeRegistry[BaseSummarizer] | Internal mapping of registered types to summarizers. |
Example
Basic usage with a sequence summarizer:
>>> from coola.summary import SummarizerRegistry, SequenceSummarizer, DefaultSummarizer
>>> registry = SummarizerRegistry({object: DefaultSummarizer(), list: SequenceSummarizer()})
>>> registry
SummarizerRegistry(
(state): TypeRegistry(
(<class 'object'>): DefaultSummarizer(max_characters=-1)
(<class 'list'>): SequenceSummarizer(max_items=5, num_spaces=2)
)
)
>>> print(registry.summarize([1, 2, 3]))
<class 'list'> (length=3)
  (0): 1
  (1): 2
  (2): 3
Registering custom summarizers:
>>> from coola.summary import SummarizerRegistry, SequenceSummarizer, DefaultSummarizer
>>> registry = SummarizerRegistry({object: DefaultSummarizer()})
>>> registry.register(tuple, SequenceSummarizer())
>>> print(registry.summarize((1, 2, 3)))
<class 'tuple'> (length=3)
  (0): 1
  (1): 2
  (2): 3
Working with nested structures:
>>> from coola.summary import get_default_registry
>>> registry = get_default_registry()
>>> print(registry.summarize({"a": [1, 2], "b": [3, 4]}))
<class 'dict'> (length=2)
  (a): [1, 2]
  (b): [3, 4]
coola.summary.SummarizerRegistry.find_summarizer
find_summarizer(data_type: type) -> BaseSummarizer[Any]
Find the appropriate summarizer for a given type.
Uses the Method Resolution Order (MRO) to find the most specific registered summarizer. For example, if you register a summarizer for Sequence but not for list, lists will use the Sequence summarizer.
Results are cached using an LRU cache (256 entries) for performance, as summarizer lookup is a hot path in recursive summarizations.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_type | type | The Python type to find a summarizer for. | required |
Returns:
| Type | Description |
|---|---|
| BaseSummarizer[Any] | The most specific registered summarizer for this type, a parent type's summarizer via MRO, or the default summarizer. |
Example
>>> from coola.summary import get_default_registry
>>> registry = get_default_registry()
>>> summarizer = registry.find_summarizer(list)
>>> summarizer
SequenceSummarizer(max_items=5, num_spaces=2)
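The lookup strategy described above (walk the MRO, cache the result) can be sketched without coola. `ToyRegistry`, its `find` and `has` methods, and the string handlers are hypothetical; only the cache size of 256 is taken from the text, and the library's internals may be organized differently.

```python
from functools import lru_cache


class ToyRegistry:
    """Hypothetical registry that resolves handlers by walking the MRO."""

    def __init__(self, state: dict[type, str]) -> None:
        self._state = dict(state)  # copy to avoid external mutation
        # Per-instance cache; 256 mirrors the cache size quoted above.
        self._find = lru_cache(maxsize=256)(self._find_uncached)

    def _find_uncached(self, data_type: type) -> str:
        # __mro__ lists the type itself first, then its bases, ending with object.
        for base in data_type.__mro__:
            if base in self._state:
                return self._state[base]
        raise KeyError(data_type)

    def find(self, data_type: type) -> str:
        return self._find(data_type)

    def has(self, data_type: type) -> bool:
        # Direct registration only, no MRO walk.
        return data_type in self._state


registry = ToyRegistry({object: "default", int: "int-handler"})
via_parent = registry.find(bool)   # bool is not registered; int is next in its MRO
via_default = registry.find(str)   # falls through to the object entry
explicit = registry.has(bool)      # direct-registration check only
```

Note that a plain `__mro__` walk, as in this sketch, does not see virtual subclasses registered on an ABC (for example, `list.__mro__` does not contain `collections.abc.Sequence`); how the library handles that case is not shown here.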
coola.summary.SummarizerRegistry.has_summarizer
has_summarizer(data_type: type) -> bool
Check if a summarizer is explicitly registered for the given type.
Note that this only checks for direct registration. Even if this returns False, find_summarizer() may still return a summarizer via MRO lookup or the default summarizer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_type | type | The type to check. | required |
Returns:
| Type | Description |
|---|---|
| bool | True if a summarizer is explicitly registered for this type, False otherwise. |
Example
>>> from coola.summary import SummarizerRegistry, SequenceSummarizer
>>> registry = SummarizerRegistry({list: SequenceSummarizer()})
>>> registry.has_summarizer(list)
True
>>> registry.has_summarizer(tuple)
False
coola.summary.SummarizerRegistry.register
register(
data_type: type,
summarizer: BaseSummarizer[Any],
exist_ok: bool = False,
) -> None
Register a summarizer for a given data type.
This method associates a summarizer instance with a specific Python type. When data of this type is summarized, the registered summarizer will be used. The cache is automatically cleared after registration to ensure consistency.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_type | type | The Python type to register (e.g., list, dict, custom classes). | required |
| summarizer | BaseSummarizer[Any] | The summarizer instance that handles this type. | required |
| exist_ok | bool | If False (default), raises an error if the type is already registered. If True, overwrites the existing registration silently. | False |
Raises:
| Type | Description |
|---|---|
| RuntimeError | If the type is already registered and exist_ok is False. |
Example
>>> from coola.summary import SummarizerRegistry, SequenceSummarizer
>>> registry = SummarizerRegistry()
>>> registry.register(list, SequenceSummarizer())
>>> registry.has_summarizer(list)
True
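To illustrate why the cache is cleared on registration and how exist_ok behaves, here is a minimal standalone sketch. `CachingRegistry` and its string handlers are hypothetical and only mimic the behavior described above; they are not coola's implementation.

```python
from functools import lru_cache


class CachingRegistry:
    """Hypothetical registry with cached MRO lookup and guarded registration."""

    def __init__(self) -> None:
        self._state: dict[type, str] = {}
        self._find = lru_cache(maxsize=256)(self._find_uncached)

    def _find_uncached(self, data_type: type) -> str:
        for base in data_type.__mro__:
            if base in self._state:
                return self._state[base]
        raise KeyError(data_type)

    def find(self, data_type: type) -> str:
        return self._find(data_type)

    def register(self, data_type: type, handler: str, exist_ok: bool = False) -> None:
        if data_type in self._state and not exist_ok:
            msg = f"{data_type} is already registered"
            raise RuntimeError(msg)
        self._state[data_type] = handler
        # Drop cached lookups: they may now point at a stale handler.
        self._find.cache_clear()


registry = CachingRegistry()
registry.register(object, "default")
before = registry.find(bool)           # resolved via object, then cached
registry.register(int, "int-handler")
after = registry.find(bool)            # cache was cleared, so int now wins

try:
    registry.register(int, "other")    # duplicate without exist_ok
    duplicate_rejected = False
except RuntimeError:
    duplicate_rejected = True

registry.register(int, "other", exist_ok=True)
overwritten = registry.find(int)
```

Without the `cache_clear()` call, the second `find(bool)` would keep returning the handler cached before `int` was registered.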
coola.summary.SummarizerRegistry.register_many
register_many(
mapping: Mapping[type, BaseSummarizer[Any]],
exist_ok: bool = False,
) -> None
Register multiple summarizers at once.
This is a convenience method for bulk registration that internally calls register() for each type-summarizer pair.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| mapping | Mapping[type, BaseSummarizer[Any]] | Dictionary mapping Python types to summarizer instances. | required |
| exist_ok | bool | If False (default), raises an error if any type is already registered. If True, overwrites existing registrations silently. | False |
Raises:
| Type | Description |
|---|---|
| RuntimeError | If any type is already registered and exist_ok is False. |
Example
>>> from coola.summary import SummarizerRegistry, SequenceSummarizer, MappingSummarizer
>>> registry = SummarizerRegistry()
>>> registry.register_many(
...     {
...         list: SequenceSummarizer(),
...         dict: MappingSummarizer(),
...     }
... )
>>> registry
SummarizerRegistry(
(state): TypeRegistry(
(<class 'list'>): SequenceSummarizer(max_items=5, num_spaces=2)
(<class 'dict'>): MappingSummarizer(max_items=5, num_spaces=2)
)
)
coola.summary.SummarizerRegistry.summarize
summarize(
data: object, depth: int = 0, max_depth: int = 1
) -> str
Generate a formatted string summary of the provided data.
This method creates a human-readable representation of the input data, with support for nested structures up to a specified depth. Once the current depth reaches max_depth, nested structures are typically shown in a compact form without further expansion.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | object | The data object to summarize. Can be any Python object, though behavior depends on the registered summarizers. | required |
| depth | int | The current nesting level in the data structure. Used internally during recursive summarization. Typically starts at 0 for top-level calls. Must be non-negative. | 0 |
| max_depth | int | The maximum nesting level to expand when summarizing. Structures deeper than this level are shown in compact form. Must be non-negative. Default is 1, which expands only the top level of nested structures. | 1 |
Returns:
| Type | Description |
|---|---|
| str | A formatted string representation of the data. The exact format depends on the registered summarizer, but typically includes type information, size/length metadata, and indented content for nested structures. |
Raises:
| Type | Description |
|---|---|
| ValueError | May be raised by individual summarizers for invalid depth parameters or other issues based on the data type being summarized. The registry itself doesn't raise exceptions, but delegates to registered summarizers which may raise this or other exceptions. |
Notes
- The depth parameter is primarily for internal use during recursion. Most external callers should use the default value of 0.
- Setting max_depth=0 typically shows only top-level information without expanding any nested structures.
- Higher max_depth values provide more detail but can produce very long output for deeply nested data.
Example
>>> from coola.summary import get_default_registry
>>> registry = get_default_registry()
>>> # Simple value
>>> print(registry.summarize(1))
<class 'int'> 1
>>> # List with default depth (expands first level only)
>>> print(registry.summarize(["abc", "def"]))
<class 'list'> (length=2)
  (0): abc
  (1): def
>>> # Nested list, default max_depth=1 (inner list not expanded)
>>> print(registry.summarize([[0, 1, 2], {"key1": "abc", "key2": "def"}]))
<class 'list'> (length=2)
  (0): [0, 1, 2]
  (1): {'key1': 'abc', 'key2': 'def'}
>>> # Nested list with max_depth=2 (expands both levels)
>>> print(registry.summarize([[0, 1, 2], {"key1": "abc", "key2": "def"}], max_depth=2))
<class 'list'> (length=2)
  (0): <class 'list'> (length=3)
      (0): 0
      (1): 1
      (2): 2
  (1): <class 'dict'> (length=2)
      (key1): abc
      (key2): def
>>> # Control depth for very nested structures
>>> deeply_nested = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
>>> print(registry.summarize(deeply_nested))
<class 'list'> (length=2)
  (0): [[1, 2], [3, 4]]
  (1): [[5, 6], [7, 8]]
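The interaction between depth and max_depth shown in the examples above can be sketched as a small recursive function. `toy_summarize` is a hypothetical, standalone illustration that handles only lists and dicts. It assumes expansion stops once depth reaches max_depth, which is inferred from the example outputs rather than taken from the library's source, and its indentation of nested levels is not meant to match coola's exactly.

```python
def toy_summarize(data: object, depth: int = 0, max_depth: int = 1, num_spaces: int = 2) -> str:
    """Expand lists and dicts until ``depth`` reaches ``max_depth``."""
    if depth >= max_depth or not isinstance(data, (list, dict)):
        # Compact form: no further expansion.
        return str(data)
    pairs = data.items() if isinstance(data, dict) else enumerate(data)
    indent = " " * num_spaces
    lines = [f"{type(data)} (length={len(data)})"]
    for key, value in pairs:
        child = toy_summarize(value, depth + 1, max_depth, num_spaces)
        # Indent continuation lines of a multi-line child summary.
        child = child.replace("\n", "\n" + indent)
        lines.append(f"{indent}({key}): {child}")
    return "\n".join(lines)


nested = [[0, 1], {"k": "v"}]
shallow = toy_summarize(nested)             # only the outer list is expanded
deep = toy_summarize(nested, max_depth=2)   # inner list and dict are expanded too
flat = toy_summarize(nested, max_depth=0)   # nothing is expanded
print(deep)
```

Passing `depth + 1` on each recursive call is what lets a single max_depth value bound the whole traversal, which is why external callers normally leave depth at 0.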
coola.summary.TensorSummarizer
Bases: BaseSummarizer[Tensor]
Implement a summarizer for torch.Tensor objects.
This summarizer generates compact string representations of PyTorch tensors. By default, it displays metadata (type, shape, dtype, device) rather than tensor values, making it suitable for logging and debugging large tensors. Optionally, it can show the full tensor representation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
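| show_data | bool | If True, the full tensor representation is shown; if False, only the metadata (type, shape, dtype, device) is shown. | False |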
Raises:
| Type | Description |
|---|---|
| RuntimeError | If PyTorch is not installed or available. |
Example
>>> import torch
>>> from coola.summary import SummarizerRegistry, TensorSummarizer
>>> registry = SummarizerRegistry()
>>> # Default behavior: show metadata only
>>> summarizer = TensorSummarizer()
>>> print(summarizer.summarize(torch.arange(11), registry)) # doctest: +ELLIPSIS
<class 'torch.Tensor'> | shape=torch.Size([11]) | dtype=torch.int64 | device=cpu | requires_grad=False
>>> # Works with tensors of any shape and dtype
>>> print(summarizer.summarize(torch.ones(2, 3, 4), registry)) # doctest: +ELLIPSIS
<class 'torch.Tensor'> | shape=torch.Size([2, 3, 4]) | dtype=torch.float32 | device=cpu | requires_grad=False
>>> # Show full tensor data
>>> summarizer = TensorSummarizer(show_data=True)
>>> print(summarizer.summarize(torch.arange(11), registry)) # doctest: +ELLIPSIS
tensor([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
coola.summary.get_default_registry
get_default_registry() -> SummarizerRegistry
Get or create the default global registry with common Python types.
Returns a singleton registry instance that is pre-configured with summarizers for Python's built-in types including sequences (list, tuple), mappings (dict), sets, and scalar types (int, float, complex, bool, str).
This function uses a singleton pattern to ensure the same registry instance is returned on subsequent calls, which is efficient and maintains consistency across an application.
Returns:
| Type | Description |
|---|---|
| SummarizerRegistry | A SummarizerRegistry instance with summarizers registered for scalar types (int, float, complex, bool, str), sequences (list, tuple, Sequence ABC), sets (set, frozenset), and mappings (dict, Mapping ABC). |
Notes
The singleton pattern means modifications to the returned registry affect all future calls to this function. If you need an isolated registry, create a new SummarizerRegistry instance directly.
Example
>>> from coola.summary import get_default_registry
>>> registry = get_default_registry()
>>> # Registry is ready to use with common Python types
>>> print(registry.summarize([1, 2, 3]))
<class 'list'> (length=3)
  (0): 1
  (1): 2
  (2): 3
>>> print(registry.summarize({"a": 1, "b": 2}))
<class 'dict'> (length=2)
  (a): 1
  (b): 2
coola.summary.register_summarizers
register_summarizers(
mapping: Mapping[type, BaseSummarizer[Any]],
exist_ok: bool = False,
) -> None
Register custom summarizers to the default global registry.
This allows users to add support for custom types without modifying global state directly.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| mapping | Mapping[type, BaseSummarizer[Any]] | Dictionary mapping types to summarizer instances. | required |
| exist_ok | bool | If False, raises an error if any type is already registered. | False |
Example
>>> from coola.summary import register_summarizers, BaseSummarizer, SummarizerRegistry
>>> class MyType:
...     def __init__(self, value):
...         self.value = value
...
>>> class MySummarizer(BaseSummarizer[MyType]):
...     def equal(self, other: object) -> bool:
...         # Compare against the argument, not the builtin `object`.
...         return type(other) is type(self)
...     def summarize(
...         self,
...         data: MyType,
...         registry: SummarizerRegistry,
...         depth: int = 0,
...         max_depth: int = 1,
...     ) -> str:
...         return f"<MyType> value={data.value}"
...
>>> register_summarizers({MyType: MySummarizer()})
coola.summary.summarize
summarize(
data: object,
max_depth: int = 1,
registry: SummarizerRegistry | None = None,
) -> str
Create a summary string representation of nested data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | object | Input data (can be nested). | required |
| max_depth | int | The maximum nesting level to expand when summarizing. Structures deeper than this level are shown in compact form. Must be non-negative. Default is 1, which expands only the top level of nested structures. | 1 |
| registry | SummarizerRegistry \| None | Registry to resolve summarizers for nested data. If None, uses the default registry. | None |
Returns:
| Type | Description |
|---|---|
| str | String summary of the data. |
Example
>>> from coola.summary import summarize
>>> print(summarize({"a": 1, "b": "abc"}))
<class 'dict'> (length=2)
  (a): 1
  (b): abc