Frequently Asked Questions (FAQ)¶

General Questions¶

What is minrecord?¶

minrecord is a minimalist Python library designed to record and track values in machine learning workflows. It helps you track the best values, monitor recent metrics, and determine if your models are improving.

Why use minrecord instead of storing values in a list?¶

While you can use a list, minrecord provides:

Memory efficiency: Only stores recent values (configurable)
Best value tracking: Automatically tracks the best value even if it's no longer in the recent history
Improvement detection: Easy to check if the latest value improved
Structured management: RecordManager helps organize multiple metrics
Serialization: Built-in support for saving/loading state

Is minrecord only for machine learning?¶

No! While designed with ML in mind, minrecord can be used anywhere you need to track:

Best values over time
Recent measurements
Whether metrics are improving

Installation & Setup¶

How do I install minrecord?¶

pip install minrecord

For all optional dependencies:

pip install minrecord[all]

What Python versions are supported?¶

Python 3.10 and higher are officially supported.

Do I need PyTorch or other ML frameworks?¶

No, minrecord has minimal dependencies. It only requires coola, which is used for equality testing.

Usage Questions¶

When should I use Record vs ComparableRecord?¶

Use Record when you only need to track recent values
Use ComparableRecord/MinScalarRecord/MaxScalarRecord when you need to:
- Track the best value
- Check if values are improving
- Implement early stopping

How do I choose max_size?¶

Consider:

Small values (10-50): For simple monitoring during training
Large values (100-1000): For detailed analysis or plotting trends
Memory constraints: Larger max_size × number of records × value size

Default is 10, which works well for most ML training scenarios.

Can I change max_size after creating a record?¶

No, max_size is fixed at creation. If you need to change it:

Create a new record with the desired max_size
Optionally transfer values using update(old_record.get_most_recent())

What happens when the record is full?¶

The oldest value is automatically removed when adding a new value. The record uses a deque internally with a fixed maximum length.

Why does ComparableRecord constructor behave differently?¶

If you pass elements to ComparableRecord, you must also pass best_value and improved to maintain correctness. This is because the best value might not be in the recent history.

Recommended approach:

# Good: Use from_elements
record = MinScalarRecord.from_elements("loss", [(0, 1.5), (1, 1.3)])

# Or: Add values after creation
record = MinScalarRecord("loss")
record.update([(0, 1.5), (1, 1.3)])

# Avoid: Passing elements to constructor without best_value
record = MinScalarRecord("loss", elements=[(0, 1.5), (1, 1.3)])  # ⚠️

Comparators & Improvements¶

How does has_improved() work?¶

has_improved() returns True if the last added value is equal to or better than the best value seen so far.

record = MinScalarRecord("loss")
record.add_value(1.5)
record.has_improved()  # True (first value is always improvement)

record.add_value(1.3)
record.has_improved()  # True (1.3 < 1.5)

record.add_value(1.4)
record.has_improved()  # False (1.4 > 1.3)

Why does has_improved() return True for equal values?¶

By design, equal values are considered improvements. This is useful when a metric plateaus at the optimal value.

Can I create custom comparators?¶

Yes! Implement BaseComparator:

from minrecord.comparator import BaseComparator


class MyComparator(BaseComparator[float]):
    def equal(self, other: object) -> bool:
        return isinstance(other, self.__class__)

    def get_initial_best_value(self) -> float:
        return 0.0  # Starting point

    def is_better(self, old_value: float, new_value: float) -> bool:
        # Define your comparison logic
        return new_value > old_value

RecordManager Questions¶

What's the benefit of RecordManager?¶

RecordManager provides:

Organized storage: All records in one place
Batch updates: Update multiple records at once
Convenience methods: get_best_values(), get_last_values()
State management: Save/load all records together

Does RecordManager create records automatically?¶

Yes, get_record() creates a default Record if the name doesn't exist:

manager = RecordManager()
record = manager.get_record("new_metric")  # Auto-created

To check if a record exists first:

if manager.has_record("my_metric"):
    record = manager.get_record("my_metric")

Can I use different record types in the same manager?¶

Yes! RecordManager accepts any BaseRecord implementation:

manager = RecordManager()
manager.add_record(MinScalarRecord("loss"))
manager.add_record(MaxScalarRecord("accuracy"))
manager.add_record(Record("custom_metric"))

Error Handling¶

What is EmptyRecordError?¶

This error occurs when you try to get a value from an empty record:

record = Record("my_metric")
# record.get_last_value()  # Raises EmptyRecordError

# Check first
if not record.is_empty():
    value = record.get_last_value()

What is NotAComparableRecordError?¶

This error occurs when you call comparison methods on a non-comparable record:

record = Record("my_metric")  # Not comparable
# record.get_best_value()  # Raises NotAComparableRecordError

# Use is_comparable() to check
if record.is_comparable():
    best = record.get_best_value()

Performance & Scalability¶

How many records can I have?¶

There's no hard limit. Each record has minimal overhead. With max_size=10:

Memory per record: ~200-500 bytes + stored values
1000 records with scalars: ~1-2 MB

Is minrecord thread-safe?¶

No, records are not thread-safe. If using multiple threads:

Use separate records per thread, or
Implement your own locking mechanism

Can I use minrecord in distributed training?¶

Yes, but each process needs its own records. After training:

Gather metrics from all processes
Aggregate as needed for logging

Integration Questions¶

How do I integrate with TensorBoard?¶

from torch.utils.tensorboard import SummaryWriter
from minrecord import RecordManager

writer = SummaryWriter()
manager = RecordManager()

for epoch in range(epochs):
    # Training...
    # Add values to records
    for name, value in metrics.items():
        manager.get_record(name).add_value(value, step=epoch)

    # Log to TensorBoard
    for name in manager.get_records().keys():
        record = manager.get_record(name)
        if not record.is_empty():
            writer.add_scalar(name, record.get_last_value(), epoch)

How do I integrate with Weights & Biases (wandb)?¶

import wandb
from minrecord import RecordManager

wandb.init(project="my-project")
manager = RecordManager()

for epoch in range(epochs):
    # Training...
    # Add values to records
    for name, value in metrics.items():
        manager.get_record(name).add_value(value, step=epoch)

    wandb.log(
        {
            "epoch": epoch,
            **manager.get_last_values(prefix="train/"),
            **manager.get_best_values(prefix="best/"),
        }
    )

Does minrecord work with PyTorch Lightning?¶

Yes! Use records in your LightningModule:

import pytorch_lightning as pl
from minrecord import MinScalarRecord


class MyModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.val_loss = MinScalarRecord("val_loss")

    def validation_epoch_end(self, outputs):
        avg_loss = torch.stack([x["loss"] for x in outputs]).mean()
        self.val_loss.add_value(avg_loss.item(), step=self.current_epoch)

        if self.val_loss.has_improved():
            self.save_checkpoint("best_model.ckpt")

Troubleshooting¶

Values seem incorrect after loading state¶

Make sure to use from_elements() or properly initialize best_value:

# Correct
record = MinScalarRecord.from_elements("loss", elements)

# Also correct
record = MinScalarRecord("loss")
record.update(elements)

get_best_values() returns fewer items than expected¶

get_best_values() only returns values for:

Non-empty records
Comparable records

Use get_last_values() for all records:

last_values = manager.get_last_values()  # All non-empty records
best_values = manager.get_best_values()  # Only comparable records

Record not found errors¶

Check that record names match exactly:

manager.add_record(MinScalarRecord("train/loss"))  # Note the "/"
# Wrong name:
manager.get_record("train_loss").add_value(1.5, step=0)  # KeyError if not auto-created
# Correct:
manager.get_record("train/loss").add_value(1.5, step=0)

Need More Help?¶

Check the Usage Guide for detailed patterns
See Examples for complete code
Open an Issue on GitHub
Read the API Reference for detailed documentation