Constants¶
The batchtensor.constants module defines the constants that the library uses to
identify the batch and sequence dimensions of a tensor.
Overview¶
The constants module provides standardized dimension indices that ensure consistency across all batchtensor operations. These constants are used internally by the library and can also be used in your own code when working with batchtensor functions.
Available Constants¶
BATCH_DIM¶
The batch dimension constant identifies the dimension used for batching in tensors.
>>> from batchtensor.constants import BATCH_DIM
>>> BATCH_DIM
0
Usage: This constant is set to 0, indicating that the batch dimension is always the first
dimension (dimension 0) of tensors when using batchtensor functions.
Convention:
- For batch tensors: shape is (batch_size, *)
- The batch dimension contains independent samples
- Operations along this dimension act across the samples in the batch
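The convention above can be illustrated with a minimal sketch. It assumes plain PyTorch and defines BATCH_DIM locally with the documented value so that the snippet runs without batchtensor installed; in real code you would import the constant instead.

```python
import torch

# Documented value; in real code: from batchtensor.constants import BATCH_DIM
BATCH_DIM = 0

# A batch of 3 independent samples with 2 features each: shape (batch_size=3, 2)
batch = torch.tensor([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# Split the batch into its individual samples (a tuple of 3 tensors of shape (2,))
samples = torch.unbind(batch, dim=BATCH_DIM)

# Reduce across the samples: one mean value per feature, shape (2,)
feature_mean = batch.mean(dim=BATCH_DIM)

print(len(samples))
print(feature_mean)
```

Splitting along BATCH_DIM recovers the samples, while reducing along it combines them.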
SEQ_DIM¶
The sequence dimension constant identifies the dimension used for sequences in tensors.
>>> from batchtensor.constants import SEQ_DIM
>>> SEQ_DIM
1
Usage: This constant is set to 1, indicating that the sequence dimension is always the second
dimension (dimension 1) of tensors when using batchtensor functions.
Convention:
- For sequence tensors: shape is (batch_size, seq_len, *)
- The sequence dimension contains sequential/temporal data
- Operations along this dimension act across time steps or sequence positions
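As a rough sketch of what operating along the sequence dimension looks like, the snippet below selects the last time step and computes a running total for each sequence. It uses only standard PyTorch calls and defines SEQ_DIM locally with the documented value, which is an assumption made to keep the snippet standalone.

```python
import torch

# Documented value; in real code: from batchtensor.constants import SEQ_DIM
SEQ_DIM = 1

# Two sequences of three time steps each: shape (batch_size=2, seq_len=3)
seqs = torch.tensor([[1, 2, 3], [10, 20, 30]])

# Last time step of every sequence: shape (batch_size,)
last_step = seqs.select(dim=SEQ_DIM, index=-1)

# Running total over time, computed independently per sequence: shape (2, 3)
running_total = seqs.cumsum(dim=SEQ_DIM)

print(last_step)
print(running_total)
```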
Why Use Constants?¶
Using constants instead of hard-coded numbers provides several benefits:
- Code Clarity: BATCH_DIM is more readable than 0
- Consistency: Ensures all operations use the same dimension conventions
- Maintainability: If dimension conventions ever change, updating the constant updates all uses
- Self-Documentation: Code using these constants is self-explanatory
Practical Examples¶
Using Constants in Your Code¶
When working with batchtensor functions, you can use these constants to make your code more explicit:
>>> import torch
>>> from batchtensor.constants import BATCH_DIM, SEQ_DIM
>>> # Create a batch of sequences
>>> # Shape: (batch_size=2, seq_len=3, features=4)
>>> data = torch.randn(2, 3, 4)
>>> # Check dimensions
>>> batch_size = data.size(BATCH_DIM)
>>> seq_len = data.size(SEQ_DIM)
>>> print(f"Batch size: {batch_size}, Sequence length: {seq_len}")
Batch size: 2, Sequence length: 3
Manual Operations with Constants¶
When you need to perform operations directly with PyTorch but want to maintain consistency with batchtensor conventions:
>>> import torch
>>> from batchtensor.constants import BATCH_DIM, SEQ_DIM
>>> data = torch.tensor([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
>>> # Sum along batch dimension
>>> batch_sum = data.sum(dim=BATCH_DIM)
>>> batch_sum
tensor([[ 6,  8],
        [10, 12]])
>>> # Sum along sequence dimension
>>> seq_sum = data.sum(dim=SEQ_DIM)
>>> seq_sum
tensor([[ 4,  6],
        [12, 14]])
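A reduction removes the reduced dimension by default. When the result needs to be combined with the original tensor, keepdim=True keeps a dimension of size 1 so that broadcasting works. The sketch below is one possible illustration, with SEQ_DIM defined locally using the documented value:

```python
import torch

# Documented value; in real code: from batchtensor.constants import SEQ_DIM
SEQ_DIM = 1

# Shape: (batch_size=2, seq_len=2, features=2)
data = torch.tensor([[[1.0, 3.0], [3.0, 1.0]], [[2.0, 2.0], [6.0, 6.0]]])

# Keep the sequence dimension with size 1: shape (2, 1, 2)
totals = data.sum(dim=SEQ_DIM, keepdim=True)

# Broadcasting divides every time step by its sequence total
fractions = data / totals

print(totals.shape)
print(fractions)
```

Each feature of each sequence now sums to 1 along the sequence dimension.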
Verifying Tensor Shapes¶
Use constants to verify that your tensors have the expected shape:
>>> import torch
>>> from batchtensor.constants import BATCH_DIM, SEQ_DIM
>>> def validate_batch_tensor(tensor, expected_batch_size):
...     """Validate that a tensor has the expected batch size."""
...     actual_batch_size = tensor.size(BATCH_DIM)
...     if actual_batch_size != expected_batch_size:
...         raise ValueError(
...             f"Expected batch size {expected_batch_size}, got {actual_batch_size}"
...         )
...     return True
...
>>> tensor = torch.randn(32, 10) # batch_size=32, features=10
>>> validate_batch_tensor(tensor, expected_batch_size=32)
True
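The same idea extends to sequence tensors. The helper below is a hypothetical sketch, not part of batchtensor: validate_seq_shape is an illustrative name, and it works on a shape (a plain tuple or tensor.shape) so that it needs nothing beyond the standard library. The constants are defined locally with the documented values.

```python
from collections.abc import Sequence

# Documented values; in real code: from batchtensor.constants import BATCH_DIM, SEQ_DIM
BATCH_DIM = 0
SEQ_DIM = 1


def validate_seq_shape(
    shape: Sequence[int], expected_batch_size: int, expected_seq_len: int
) -> None:
    """Raise ValueError unless ``shape`` matches (batch_size, seq_len, *)."""
    # A sequence tensor needs at least a batch and a sequence dimension
    if len(shape) <= SEQ_DIM:
        raise ValueError(
            f"Expected at least {SEQ_DIM + 1} dimensions, got shape {tuple(shape)}"
        )
    if shape[BATCH_DIM] != expected_batch_size:
        raise ValueError(
            f"Expected batch size {expected_batch_size}, got {shape[BATCH_DIM]}"
        )
    if shape[SEQ_DIM] != expected_seq_len:
        raise ValueError(
            f"Expected sequence length {expected_seq_len}, got {shape[SEQ_DIM]}"
        )


# Passes silently: batch_size=32, seq_len=10, features=4
validate_seq_shape((32, 10, 4), expected_batch_size=32, expected_seq_len=10)
```

With a real tensor, the call would be validate_seq_shape(tensor.shape, ...).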
Creating Batches and Sequences¶
Use constants when manually creating or reshaping tensors:
>>> import torch
>>> from batchtensor.constants import BATCH_DIM, SEQ_DIM
>>> # Create individual samples
>>> sample1 = torch.tensor([[1, 2], [3, 4]]) # seq_len=2, features=2
>>> sample2 = torch.tensor([[5, 6], [7, 8]])
>>> # Stack into a batch along batch dimension
>>> batch = torch.stack([sample1, sample2], dim=BATCH_DIM)
>>> batch.shape
torch.Size([2, 2, 2])
>>> # Verify dimensions
>>> print(f"Batch size: {batch.size(BATCH_DIM)}")
Batch size: 2
>>> print(f"Sequence length: {batch.size(SEQ_DIM)}")
Sequence length: 2
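Two related cases are turning a single sample into a batch of size 1 and merging existing batches. The sketch below uses the standard PyTorch calls unsqueeze and torch.cat, and defines BATCH_DIM locally with the documented value as a standalone assumption:

```python
import torch

# Documented value; in real code: from batchtensor.constants import BATCH_DIM
BATCH_DIM = 0

# One sample: shape (seq_len=2, features=2)
sample = torch.tensor([[1, 2], [3, 4]])

# Add a batch dimension of size 1: shape (1, 2, 2)
single = sample.unsqueeze(BATCH_DIM)

# An existing batch of two samples: shape (2, 2, 2)
other = torch.tensor([[[5, 6], [7, 8]], [[9, 10], [11, 12]]])

# Concatenate along the batch dimension: shape (3, 2, 2)
merged = torch.cat([single, other], dim=BATCH_DIM)

print(single.shape)
print(merged.shape)
```

Unlike torch.stack, torch.cat does not create a new dimension, so every input must already have a batch dimension.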
Internal Usage¶
These constants are used internally throughout batchtensor. For example (simplified pseudocode):
import torch

from batchtensor.constants import BATCH_DIM, SEQ_DIM

# In batchtensor.tensor.reduction
def sum_along_batch(tensor: torch.Tensor, keepdim: bool = False) -> torch.Tensor:
    """Sum tensor along the batch dimension."""
    return tensor.sum(dim=BATCH_DIM, keepdim=keepdim)

# In batchtensor.tensor.slicing (simplified for illustration)
def select_along_seq(tensor: torch.Tensor, index: int) -> torch.Tensor:
    """Select a specific index along the sequence dimension."""
    return tensor.select(dim=SEQ_DIM, index=index)
This ensures consistency across all functions in the library.
Compatibility with Other Libraries¶
While batchtensor uses these specific dimension conventions, they are compatible with common PyTorch practices:
- Batch-first convention: Most PyTorch modules (like nn.Linear, nn.GRU with batch_first=True) expect batch as the first dimension
- Standard computer vision: Images are typically (batch, channels, height, width), compatible with BATCH_DIM=0
- NLP sequence models: With batch_first=True, sequences are (batch, seq_len, features), matching our conventions
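Data that arrives in sequence-first layout, (seq_len, batch, features), has to be rearranged before it matches these conventions. The following sketch shows one way to do that with plain PyTorch and then feeds the result to nn.GRU with batch_first=True. The sizes are arbitrary, and the constants are defined locally with the documented values.

```python
import torch
from torch import nn

# Documented values; in real code: from batchtensor.constants import BATCH_DIM, SEQ_DIM
BATCH_DIM = 0
SEQ_DIM = 1

# Sequence-first data: shape (seq_len=5, batch_size=2, features=4)
seq_first = torch.randn(5, 2, 4)

# Swap the first two dimensions to get (batch_size=2, seq_len=5, features=4)
batch_first = seq_first.transpose(0, 1).contiguous()

gru = nn.GRU(input_size=4, hidden_size=8, batch_first=True)
output, h_n = gru(batch_first)

# output follows the batch-first layout: (batch_size, seq_len, hidden_size)
print(output.shape)
# h_n is always (num_layers, batch_size, hidden_size), regardless of batch_first
print(h_n.shape)
```

Note that batch_first=True only affects the input and output tensors; the hidden state returned by nn.GRU keeps the layer dimension first, so BATCH_DIM does not apply to it directly.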
Best Practices¶
- Import the constants when you need to reference dimensions explicitly
- Use in assertions to validate tensor shapes in your code
- Prefer batchtensor functions over manual dimension handling when possible
- Document assumptions about tensor shapes in your own functions
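As a small sketch of these practices combined, the function below documents its shape assumptions in the docstring and checks them with an assertion. The name mean_over_time is hypothetical and not part of batchtensor, and SEQ_DIM is defined locally with the documented value so the snippet is standalone.

```python
import torch

# Documented value; in real code: from batchtensor.constants import SEQ_DIM
SEQ_DIM = 1


def mean_over_time(x: torch.Tensor) -> torch.Tensor:
    """Average a floating-point tensor over the sequence dimension.

    Args:
        x: Tensor of shape (batch_size, seq_len, *).

    Returns:
        Tensor of shape (batch_size, *).
    """
    # Fail early if the tensor has no sequence dimension
    assert x.dim() > SEQ_DIM, (
        f"expected shape (batch_size, seq_len, *), got {tuple(x.shape)}"
    )
    return x.mean(dim=SEQ_DIM)


# Input shape (4, 6, 3) gives output shape (4, 3)
pooled = mean_over_time(torch.ones(4, 6, 3))
print(pooled.shape)
```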
See Also¶
- Tensor Operations - Functions that use these dimension constants
- Nested Operations - Nested operations following the same conventions