Home¶
Overview¶
coola is a lightweight Python library that makes it easy to compare complex and nested data structures.
It provides simple, extensible functions to check equality between objects containing PyTorch tensors, NumPy arrays, pandas/polars DataFrames, and other scientific computing objects.
Why coola?¶
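Python's native equality operator (==) doesn't work well with complex nested structures
containing tensors, arrays, or DataFrames. For example, comparing two NumPy arrays with == returns an element-wise array rather than a single boolean, so comparing two dicts that hold multi-element arrays raises a ValueError about an ambiguous truth value.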
coola solves this with intuitive comparison functions:
Check exact equality:
>>> import numpy as np
>>> import torch
>>> from coola.equality import objects_are_equal
>>> data1 = {"torch": torch.ones(2, 3), "numpy": np.zeros((2, 3))}
>>> data2 = {"torch": torch.ones(2, 3), "numpy": np.zeros((2, 3))}
>>> objects_are_equal(data1, data2)
True
Compare with numerical tolerance:
>>> from coola.equality import objects_are_allclose
>>> data1 = {"value": 1.0}
>>> data2 = {"value": 1.0 + 1e-9}
>>> objects_are_allclose(data1, data2)
True
See the user guide for detailed examples.
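To make the idea of tolerance-based comparison concrete, here is a minimal pure-Python sketch of a recursive "all close" check. It is not coola's implementation: the function name `nested_allclose` and the default tolerances `rel_tol=1e-5` and `abs_tol=1e-8` are assumptions chosen for illustration, and it only handles dicts, lists, tuples, and plain numbers.

```python
import math
from collections.abc import Mapping


def nested_allclose(a, b, rel_tol=1e-5, abs_tol=1e-8):
    """Recursively compare two nested objects, tolerating small float differences.

    Hypothetical helper for illustration only; not part of coola.
    """
    # bool is a subclass of int, so handle it before the numeric branch.
    if isinstance(a, bool) or isinstance(b, bool):
        return a is b
    # Numbers are compared with a relative and an absolute tolerance.
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
    # Mappings must have the same keys and pairwise-close values.
    if isinstance(a, Mapping) and isinstance(b, Mapping):
        return a.keys() == b.keys() and all(
            nested_allclose(a[key], b[key], rel_tol, abs_tol) for key in a
        )
    # Sequences must have the same type, the same length, and close items.
    if isinstance(a, (list, tuple)) and isinstance(b, (list, tuple)):
        return (
            type(a) is type(b)
            and len(a) == len(b)
            and all(nested_allclose(x, y, rel_tol, abs_tol) for x, y in zip(a, b))
        )
    # Anything else falls back to the native equality operator.
    return a == b


print(nested_allclose({"value": 1.0}, {"value": 1.0 + 1e-9}))
print(nested_allclose({"value": 1.0}, {"value": 1.1}))
```

The first call prints `True` and the second prints `False`. A real library also has to dispatch on tensor, array, and DataFrame types, which is where a dedicated comparison function earns its keep.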
Features¶
coola provides a comprehensive set of utilities for working with complex data structures:
Equality Comparison¶
Compare complex nested objects with support for multiple data types:
- Exact equality: objects_are_equal() for strict comparison
- Approximate equality: objects_are_allclose() for numerical tolerance
- User-friendly difference reporting: Clear, structured output showing exactly what differs
- Extensible: Add custom comparators for your own types
Supported types: JAX • NumPy • pandas • polars • PyArrow • PyTorch • xarray • Python built-ins (dict, list, tuple, set, etc.)
Learn more about supported types →
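One common way to make a comparison function extensible is a registry that maps types to comparator callables. The sketch below illustrates that general pattern only. The names `register_comparator`, `are_equal`, and `Point` are hypothetical and are not coola's API; see the user guide for the actual extension mechanism.

```python
# Hypothetical registry mapping a type to a comparison function.
# This illustrates the general pattern; it is not coola's API.
_COMPARATORS = {}


def register_comparator(data_type, comparator):
    """Associate a comparison function with a type."""
    _COMPARATORS[data_type] = comparator


def are_equal(a, b):
    """Compare two objects using the comparator registered for their type."""
    if type(a) is not type(b):
        return False
    # Walk the MRO so that subclasses fall back to a parent's comparator.
    for cls in type(a).__mro__:
        if cls in _COMPARATORS:
            return _COMPARATORS[cls](a, b)
    # No comparator registered: use the native equality operator.
    return a == b


class Point:
    """Example user type that does not define __eq__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


register_comparator(Point, lambda p, q: (p.x, p.y) == (q.x, q.y))

print(are_equal(Point(1.0, 2.0), Point(1.0, 2.0)))
print(are_equal(Point(1.0, 2.0), Point(1.0, 3.0)))
```

The first call prints `True` and the second prints `False`. Without the registered comparator, two distinct `Point` instances would compare unequal, because the default `==` falls back to identity.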
Data Summarization¶
Generate human-readable summaries of nested data structures for debugging and logging:
- Configurable depth control
- Type-specific formatting
- Truncation for large collections
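The three ideas above (depth control, type-specific formatting, truncation) can be sketched in a few lines of plain Python. This is an illustrative sketch, not coola's summarizer: the function name `summarize`, its parameters `max_depth` and `max_items`, and the output format are all assumptions.

```python
def summarize(obj, max_depth=2, max_items=3, _depth=0):
    """Return a short, single-line description of a nested object.

    Hypothetical helper for illustration only; not part of coola.
    """
    type_name = type(obj).__name__
    if isinstance(obj, (dict, list, tuple)):
        header = f"<{type_name} (length={len(obj)})>"
        # Depth control: stop descending once max_depth is reached.
        if _depth >= max_depth:
            return header
        if isinstance(obj, dict):
            # Type-specific formatting: dicts show their keys.
            items = [
                f"{key!r}: {summarize(value, max_depth, max_items, _depth + 1)}"
                for key, value in list(obj.items())[:max_items]
            ]
            opening, closing = "{", "}"
        else:
            items = [
                summarize(value, max_depth, max_items, _depth + 1)
                for value in obj[:max_items]
            ]
            opening, closing = "[", "]"
        # Truncation: mark collections that have more than max_items entries.
        if len(obj) > max_items:
            items.append("...")
        return f"{header} {opening}" + ", ".join(items) + closing
    return f"<{type_name}> {obj!r}"


data = {"a": [1, 2, 3, 4, 5], "b": "x"}
print(summarize(data, max_depth=1))
print(summarize(data, max_depth=2))
```

With `max_depth=1` the inner list is reduced to its type and length. With `max_depth=2` its first three items are shown, followed by `...`.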
Data Conversion¶
Transform data between different nested structures:
- Convert between list-of-dicts and dict-of-lists formats
- Useful for working with tabular data and different data representations
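The two layouts are easiest to see side by side. The sketch below converts between them in plain Python. The function names are hypothetical (not coola's API), and the sketch assumes every row has the same keys and every column has the same length.

```python
def list_of_dicts_to_dict_of_lists(rows):
    """Turn row-oriented records into column-oriented lists.

    Assumes every row has the same keys as the first row.
    """
    if not rows:
        return {}
    return {key: [row[key] for row in rows] for key in rows[0]}


def dict_of_lists_to_list_of_dicts(columns):
    """Turn column-oriented lists into row-oriented records.

    Assumes every column has the same length.
    """
    keys = list(columns)
    return [dict(zip(keys, values)) for values in zip(*columns.values())]


rows = [{"name": "a", "score": 1}, {"name": "b", "score": 2}]
columns = list_of_dicts_to_dict_of_lists(rows)
print(columns)  # {'name': ['a', 'b'], 'score': [1, 2]}
print(dict_of_lists_to_list_of_dicts(columns) == rows)  # True
```

The row layout is convenient for appending records one at a time, while the column layout matches what tabular libraries typically expect as input.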
Mapping Utilities¶
Work with nested dictionaries efficiently:
- Flatten nested dictionaries into flat key-value pairs
- Extract specific values from complex nested structures
- Filter dictionary keys based on patterns or criteria
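As an illustration of flattening and key filtering, here is a minimal pure-Python sketch. The function name `flatten_dict`, the `separator` parameter, and the dotted-key convention are assumptions made for this example rather than coola's API.

```python
def flatten_dict(data, separator=".", prefix=""):
    """Flatten a nested dictionary into a single level of joined keys.

    Hypothetical helper for illustration only; not part of coola.
    """
    flat = {}
    for key, value in data.items():
        full_key = f"{prefix}{separator}{key}" if prefix else str(key)
        if isinstance(value, dict) and value:
            # Recurse into non-empty nested dictionaries.
            flat.update(flatten_dict(value, separator, full_key))
        else:
            # Leaves (and empty dictionaries) are stored as-is.
            flat[full_key] = value
    return flat


config = {"model": {"lr": 0.1, "layers": {"n": 2}}, "seed": 42}
flat = flatten_dict(config)
print(flat)  # {'model.lr': 0.1, 'model.layers.n': 2, 'seed': 42}

# Filter keys by prefix once the structure is flat.
model_only = {key: value for key, value in flat.items() if key.startswith("model.")}
print(model_only)  # {'model.lr': 0.1, 'model.layers.n': 2}
```

Once the dictionary is flat, extracting a value or filtering by pattern becomes an ordinary dictionary operation.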
Iteration¶
Traverse nested data structures systematically:
- Depth-first search (DFS) traversal for nested containers
- Breadth-first search (BFS) traversal for level-by-level processing
- Filter and extract specific types from heterogeneous collections
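The difference between the two traversal orders is easiest to see on a small example. The sketch below yields the leaf values of nested dicts, lists, and tuples in depth-first and breadth-first order. The names `dfs_leaves` and `bfs_leaves` are hypothetical and the sketch is not coola's iterator implementation.

```python
from collections import deque


def _children(obj):
    """Return the child values of a container, or None for a leaf."""
    if isinstance(obj, dict):
        return list(obj.values())
    if isinstance(obj, (list, tuple)):
        return list(obj)
    return None


def dfs_leaves(obj):
    """Yield leaves depth-first: finish each branch before moving on."""
    children = _children(obj)
    if children is None:
        yield obj
        return
    for child in children:
        yield from dfs_leaves(child)


def bfs_leaves(obj):
    """Yield leaves breadth-first: visit shallow levels before deeper ones."""
    queue = deque([obj])
    while queue:
        current = queue.popleft()
        children = _children(current)
        if children is None:
            yield current
        else:
            queue.extend(children)


data = {"a": [1, [2, 3]], "b": "x", "c": {"d": 4.0}}
print(list(dfs_leaves(data)))  # [1, 2, 3, 'x', 4.0]
print(list(bfs_leaves(data)))  # ['x', 1, 4.0, 2, 3]

# Extract only the values of a given type from the heterogeneous structure.
print([leaf for leaf in dfs_leaves(data) if isinstance(leaf, int)])  # [1, 2, 3]
```

Depth-first order follows the structure as written, while breadth-first order surfaces the shallowest values first.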
Reduction¶
Compute statistics on sequences with flexible backends:
- Calculate min, max, mean, median, quantile, std on numeric sequences
- Support for multiple backends: native Python, NumPy, PyTorch
- Consistent API regardless of backend choice
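For reference, the statistics listed above can be computed on a plain Python sequence with the standard library alone. This sketch shows only what a native Python backend might compute. The function name `describe` is hypothetical, and the choice of the sample standard deviation and of quartiles as the quantiles are assumptions, not a statement about coola's behavior.

```python
import statistics


def describe(values):
    """Compute basic statistics on a numeric sequence using only the stdlib.

    Hypothetical helper for illustration only; not part of coola.
    """
    values = list(values)
    if not values:
        raise ValueError("cannot describe an empty sequence")
    has_spread = len(values) > 1
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        # Assumption: sample standard deviation (n - 1 in the denominator).
        "std": statistics.stdev(values) if has_spread else 0.0,
        # Assumption: quartiles, i.e. the three cut points for n=4.
        "quartiles": statistics.quantiles(values, n=4) if has_spread else [],
    }


print(describe([1, 2, 3, 4, 5]))
```

A backend built on NumPy or PyTorch would return the same kinds of values, which is what makes a consistent API across backends possible.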
Contributing¶
Contributions are welcome! We appreciate bug fixes, feature additions, documentation improvements, and more. Please check the contributing guidelines for details on:
- Setting up the development environment
- Code style and testing requirements
- Submitting pull requests
Whether you're fixing a bug or proposing a new feature, please open an issue first to discuss your changes.
API Stability¶
Important: As coola is under active development, its API is not yet stable and may
change between releases. We recommend pinning a specific version in your project's dependencies to
ensure consistent behavior.
License¶
coola is licensed under the BSD 3-Clause "New" or "Revised" license, available in the LICENSE file.