Home
Overview
grizz is a lightweight library to ingest and transform data in polars DataFrames.
grizz uses an object-oriented strategy, where ingestors and transformers are building blocks that can be combined.
grizz can be extended with custom DataFrame ingestors and transformers.
For example, the following snippet shows how to cast some columns to a different data type.
>>> import polars as pl
>>> from grizz.transformer import InplaceCast
>>> transformer = InplaceCast(columns=["col1", "col3"], dtype=pl.Int32)
>>> frame = pl.DataFrame(
...     {
...         "col1": [1, 2, 3, 4, 5],
...         "col2": ["1", "2", "3", "4", "5"],
...         "col3": ["1", "2", "3", "4", "5"],
...         "col4": ["a", "b", "c", "d", "e"],
...     }
... )
>>> out = transformer.transform(frame)
>>> out
shape: (5, 4)
┌──────┬──────┬──────┬──────┐
│ col1 ┆ col2 ┆ col3 ┆ col4 │
│ ---  ┆ ---  ┆ ---  ┆ ---  │
│ i32  ┆ str  ┆ i32  ┆ str  │
╞══════╪══════╪══════╪══════╡
│ 1    ┆ 1    ┆ 1    ┆ a    │
│ 2    ┆ 2    ┆ 2    ┆ b    │
│ 3    ┆ 3    ┆ 3    ┆ c    │
│ 4    ┆ 4    ┆ 4    ┆ d    │
│ 5    ┆ 5    ┆ 5    ┆ e    │
└──────┴──────┴──────┴──────┘
API stability
While grizz is in the development stage, no API is guaranteed to be stable from one release to the next.
In fact, it is very likely that the API will change multiple times before a stable 1.0.0 release.
In practice, this means that upgrading grizz to a new version may break code that was written against an older version.
License
grizz is licensed under the BSD 3-Clause "New" or "Revised" license, available in the LICENSE file.