analyzer

flamme.analyzer ¶

Contain DataFrame analyzers.

flamme.analyzer.BaseAnalyzer ¶

Bases: ABC

Define the base class to analyze a DataFrame.

Example usage:

>>> import numpy as np
>>> import pandas as pd
>>> from flamme.analyzer import NullValueAnalyzer
>>> analyzer = NullValueAnalyzer()
>>> analyzer
NullValueAnalyzer(figsize=None)
>>> frame = pd.DataFrame(
...     {
...         "int": np.array([np.nan, 1, 0, 1]),
...         "float": np.array([1.2, 4.2, np.nan, 2.2]),
...         "str": np.array(["A", "B", None, np.nan]),
...     }
... )
>>> analyzer.analyze(frame)

flamme.analyzer.BaseAnalyzer.analyze ¶

analyze(frame: DataFrame) -> BaseSection

Analyze the data in a DataFrame.

Parameters:

Name	Type	Description	Default
`frame`	`DataFrame`	The DataFrame with the data to analyze.	required

Returns:

Type	Description
`BaseSection`	The section report.

Example usage:

>>> import numpy as np
>>> import pandas as pd
>>> from flamme.analyzer import NullValueAnalyzer
>>> analyzer = NullValueAnalyzer()
>>> frame = pd.DataFrame(
...     {
...         "int": np.array([np.nan, 1, 0, 1]),
...         "float": np.array([1.2, 4.2, np.nan, 2.2]),
...         "str": np.array(["A", "B", None, np.nan]),
...     }
... )
>>> analyzer.analyze(frame)

flamme.analyzer.ChoiceAnalyzer ¶

Bases: BaseAnalyzer

Implement an analyzer that selects one of several analyzers based on the data.

Parameters:

Name	Type	Description	Default
`analyzers`	`Mapping[str, BaseAnalyzer \| dict]`	The mapping of analyzers (or their configurations) to choose from. The key of each analyzer is used to organize the metrics and report.	required
`selection_fn`	`Callable[[DataFrame], str]`	Specifies a callable with the selection logic. The callable returns the key of the analyzer to use based on the data in the input DataFrame.	required

Example usage:

>>> import numpy as np
>>> import pandas as pd
>>> from flamme.analyzer import (
...     ChoiceAnalyzer,
...     NullValueAnalyzer,
...     DuplicatedRowAnalyzer,
... )
>>> analyzer = ChoiceAnalyzer(
...     {"null": NullValueAnalyzer(), "duplicate": DuplicatedRowAnalyzer()},
...     selection_fn=lambda frame: "null" if frame.isna().values.any() else "duplicate",
... )
>>> analyzer
ChoiceAnalyzer(
  (null): NullValueAnalyzer(figsize=None)
  (duplicate): DuplicatedRowAnalyzer(columns=None, figsize=None)
)
>>> frame = pd.DataFrame(
...     {
...         "int": np.array([np.nan, 1, 0, 1]),
...         "float": np.array([1.2, 4.2, np.nan, 2.2]),
...         "str": np.array(["A", "B", None, np.nan]),
...     }
... )
>>> section = analyzer.analyze(frame)
>>> section.__class__.__qualname__
'NullValueSection'
>>> frame = pd.DataFrame({"col": np.arange(10)})
>>> section = analyzer.analyze(frame)
>>> section.__class__.__qualname__
'DuplicatedRowSection'
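
The selection step can be pictured as a simple dispatch on the key returned by `selection_fn`. The sketch below is not flamme code: `choose`, `has_missing`, and the string placeholders standing in for analyzers are hypothetical names used only to illustrate the pattern with plain Python data.

```python
from typing import Callable, Dict, List, Mapping, Optional

# A row is modelled as a plain dict; None stands for a missing value.
Row = Dict[str, Optional[float]]


def choose(
    options: Mapping[str, str],
    selection_fn: Callable[[List[Row]], str],
    rows: List[Row],
) -> str:
    """Return the option whose key is produced by ``selection_fn``."""
    key = selection_fn(rows)
    if key not in options:
        raise KeyError(f"selection_fn returned an unknown key: {key!r}")
    return options[key]


def has_missing(rows: List[Row]) -> str:
    """Selection logic: 'null' if any cell is missing, else 'duplicate'."""
    missing = any(value is None for row in rows for value in row.values())
    return "null" if missing else "duplicate"


# Hypothetical placeholders: a real mapping would hold analyzers.
options = {"null": "null-value analysis", "duplicate": "duplicated-row analysis"}
with_gaps = [{"a": 1.0, "b": None}, {"a": 2.0, "b": 3.0}]
complete = [{"a": 1.0, "b": 2.0}]

print(choose(options, has_missing, with_gaps))
print(choose(options, has_missing, complete))
```

How the library reacts when `selection_fn` returns a key that is not in the mapping is not documented here; raising `KeyError` is an assumption of this sketch.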

flamme.analyzer.ColumnContinuousAdvancedAnalyzer ¶

Bases: BaseAnalyzer

Implement an analyzer to show the distribution of continuous values.

Parameters:

Name	Type	Description	Default
`column`	`str`	The column name.	required
`nbins`	`int \| None`	The number of bins in the histogram.	`None`
`yscale`	`str`	The y-axis scale. If `'auto'`, the `'linear'` or `'log'/'symlog'` scale is chosen based on the distribution.	`'auto'`
`figsize`	`tuple[float, float] \| None`	The figure size in inches. The first dimension is the width and the second is the height.	`None`

Example usage:

>>> import numpy as np
>>> import pandas as pd
>>> from flamme.analyzer import ColumnContinuousAdvancedAnalyzer
>>> analyzer = ColumnContinuousAdvancedAnalyzer(column="float")
>>> analyzer
ColumnContinuousAdvancedAnalyzer(column=float, nbins=None, yscale=auto, figsize=None)
>>> frame = pd.DataFrame(
...     {
...         "int": np.array([np.nan, 1, 0, 1]),
...         "float": np.array([1.2, 4.2, np.nan, 2.2]),
...         "str": np.array(["A", "B", None, np.nan]),
...     }
... )
>>> section = analyzer.analyze(frame)

flamme.analyzer.ColumnContinuousAnalyzer ¶

Bases: BaseAnalyzer

Implement an analyzer to show the distribution of continuous values.

Parameters:

Name	Type	Description	Default
`column`	`str`	The column name.	required
`nbins`	`int \| None`	The number of bins in the histogram.	`None`
`yscale`	`str`	The y-axis scale. If `'auto'`, the `'linear'` or `'log'/'symlog'` scale is chosen based on the distribution.	`'auto'`
`xmin`	`float \| str \| None`	The minimum value of the range or its associated quantile. `q0.1` means the 10% quantile. `q0` is the minimum value and `q1` is the maximum value.	`'q0'`
`xmax`	`float \| str \| None`	The maximum value of the range or its associated quantile. `q0.9` means the 90% quantile. `q0` is the minimum value and `q1` is the maximum value.	`'q1'`
`figsize`	`tuple[float, float] \| None`	The figure size in inches. The first dimension is the width and the second is the height.	`None`

Example usage:

>>> import numpy as np
>>> import pandas as pd
>>> from flamme.analyzer import ColumnContinuousAnalyzer
>>> analyzer = ColumnContinuousAnalyzer(column="float")
>>> analyzer
ColumnContinuousAnalyzer(column=float, nbins=None, yscale=auto, xmin=q0, xmax=q1, figsize=None)
>>> frame = pd.DataFrame(
...     {
...         "int": np.array([np.nan, 1, 0, 1]),
...         "float": np.array([1.2, 4.2, np.nan, 2.2]),
...         "str": np.array(["A", "B", None, np.nan]),
...     }
... )
>>> section = analyzer.analyze(frame)
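
The `xmin`/`xmax` convention (a number, or a string such as `q0.1` for a quantile) can be illustrated with a small resolver. This is only a sketch: `resolve_bound` is a hypothetical helper, the linear interpolation between ranks is an assumption, and treating `None` as "no bound" is also an assumption. The library may compute quantiles differently.

```python
from typing import Optional, Sequence, Union


def resolve_bound(
    bound: Union[float, str, None], values: Sequence[float]
) -> Optional[float]:
    """Turn an xmin/xmax-style bound into a number (hypothetical helper)."""
    if bound is None:
        return None  # assumption: None means "no bound"
    if isinstance(bound, str):
        if not bound.startswith("q"):
            raise ValueError(f"expected a string like 'q0.1', got {bound!r}")
        q = float(bound[1:])
        if not 0.0 <= q <= 1.0:
            raise ValueError(f"quantile must be in [0, 1], got {q}")
        data = sorted(values)
        if not data:
            raise ValueError("cannot compute a quantile of empty data")
        # Linear interpolation between the two closest ranks.
        position = q * (len(data) - 1)
        lower = int(position)
        upper = min(lower + 1, len(data) - 1)
        fraction = position - lower
        return data[lower] + fraction * (data[upper] - data[lower])
    return float(bound)  # a plain number is used as-is


# Non-null values of the "float" column from the example above.
values = [1.2, 4.2, 2.2]
print(resolve_bound("q0", values))  # smallest value
print(resolve_bound("q1", values))  # largest value
print(resolve_bound(3, values))     # explicit bound
```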

flamme.analyzer.ColumnDiscreteAnalyzer ¶

Bases: BaseAnalyzer

Implement a discrete distribution analyzer.

Parameters:

Name	Type	Description	Default
`column`	`str`	The column to analyze.	required
`dropna`	`bool`	If `True`, the NaN values are not included in the analysis.	`False`
`max_rows`	`int`	The maximum number of rows to show in the table.	`20`
`yscale`	`str`	The y-axis scale. If `'auto'`, the `'linear'` or `'log'` scale is chosen based on the distribution.	`'auto'`
`figsize`	`tuple[float, float] \| None`	The figure size in inches. The first dimension is the width and the second is the height.	`None`

Example usage:

>>> import numpy as np
>>> import pandas as pd
>>> from flamme.analyzer import ColumnDiscreteAnalyzer
>>> analyzer = ColumnDiscreteAnalyzer(column="str")
>>> analyzer
ColumnDiscreteAnalyzer(column=str, dropna=False, max_rows=20, yscale=auto, figsize=None)
>>> frame = pd.DataFrame(
...     {
...         "int": np.array([np.nan, 1, 0, 1]),
...         "float": np.array([1.2, 4.2, np.nan, 2.2]),
...         "str": np.array(["A", "B", None, np.nan]),
...     }
... )
>>> section = analyzer.analyze(frame)

flamme.analyzer.ColumnSubsetAnalyzer ¶

Bases: BaseAnalyzer

Implement an analyzer to analyze only a subset of the columns.

Parameters:

Name	Type	Description	Default
`columns`	`Sequence[str]`	Specifies the columns to select.	required
`analyzer`	`BaseAnalyzer \| dict`	The analyzer or its configuration.	required

Example usage:

>>> import numpy as np
>>> import pandas as pd
>>> from flamme.analyzer import ColumnSubsetAnalyzer, NullValueAnalyzer
>>> analyzer = ColumnSubsetAnalyzer(columns=["int", "float"], analyzer=NullValueAnalyzer())
>>> analyzer
ColumnSubsetAnalyzer(
  (columns): 2 ['int', 'float']
  (analyzer): NullValueAnalyzer(figsize=None)
)
>>> frame = pd.DataFrame(
...     {
...         "int": np.array([np.nan, 1, 0, 1]),
...         "float": np.array([1.2, 4.2, np.nan, 2.2]),
...         "str": np.array(["A", "B", None, np.nan]),
...     }
... )
>>> section = analyzer.analyze(frame)

flamme.analyzer.ColumnTemporalContinuousAnalyzer ¶

Bases: BaseAnalyzer

Implement an analyzer to show the temporal distribution of continuous values.

Parameters:

Name	Type	Description	Default
`column`	`str`	The column to analyze.	required
`dt_column`	`str`	The datetime column used to analyze the temporal distribution.	required
`period`	`str`	The temporal period e.g. monthly or daily.	required
`yscale`	`str`	The y-axis scale. If `'auto'`, the `'linear'` or `'log'/'symlog'` scale is chosen based on the distribution.	`'auto'`
`figsize`	`tuple[float, float] \| None`	The figure size in inches. The first dimension is the width and the second is the height.	`None`

Example usage:

>>> import numpy as np
>>> import pandas as pd
>>> from flamme.analyzer import ColumnTemporalContinuousAnalyzer
>>> analyzer = ColumnTemporalContinuousAnalyzer(
...     column="float", dt_column="datetime", period="M"
... )
>>> analyzer
ColumnTemporalContinuousAnalyzer(column=float, dt_column=datetime, period=M, yscale=auto, figsize=None)
>>> frame = pd.DataFrame(
...     {
...         "int": np.array([np.nan, 1, 0, 1]),
...         "float": np.array([1.2, 4.2, np.nan, 2.2]),
...         "str": np.array(["A", "B", None, np.nan]),
...         "datetime": pd.to_datetime(
...             ["2020-01-03", "2020-02-03", "2020-03-03", "2020-04-03"]
...         ),
...     }
... )
>>> section = analyzer.analyze(frame)

flamme.analyzer.ColumnTemporalDiscreteAnalyzer ¶

Bases: BaseAnalyzer

Implement an analyzer to show the temporal distribution of discrete values.

Parameters:

Name	Type	Description	Default
`column`	`str`	The column to analyze.	required
`dt_column`	`str`	The datetime column used to analyze the temporal distribution.	required
`period`	`str`	The temporal period e.g. monthly or daily.	required
`figsize`	`tuple[float, float] \| None`	The figure size in inches. The first dimension is the width and the second is the height.	`None`

Example usage:

>>> import numpy as np
>>> import pandas as pd
>>> from flamme.analyzer import ColumnTemporalDiscreteAnalyzer
>>> analyzer = ColumnTemporalDiscreteAnalyzer(
...     column="str", dt_column="datetime", period="M"
... )
>>> analyzer
ColumnTemporalDiscreteAnalyzer(column=str, dt_column=datetime, period=M, figsize=None)
>>> frame = pd.DataFrame(
...     {
...         "int": np.array([np.nan, 1, 0, 1]),
...         "float": np.array([1.2, 4.2, np.nan, 2.2]),
...         "str": np.array(["A", "B", None, np.nan]),
...         "datetime": pd.to_datetime(
...             ["2020-01-03", "2020-02-03", "2020-03-03", "2020-04-03"]
...         ),
...     }
... )
>>> section = analyzer.analyze(frame)

flamme.analyzer.ColumnTemporalNullValueAnalyzer ¶

Bases: BaseAnalyzer

Implement an analyzer to show the temporal distribution of null values for all columns.

A plot is generated for each column.

Parameters:

Name	Type	Description	Default
`dt_column`	`str`	The datetime column used to analyze the temporal distribution.	required
`period`	`str`	The temporal period e.g. monthly or daily.	required
`columns`	`Sequence[str] \| None`	The list of columns to analyze. A plot is generated for each column. `None` means all the columns.	`None`
`ncols`	`int`	The number of columns in the grid of plots.	`2`
`figsize`	`tuple[float, float]`	The figure size in inches. The first dimension is the width and the second is the height.	`(7, 5)`

Example usage:

>>> import numpy as np
>>> import pandas as pd
>>> from flamme.analyzer import ColumnTemporalNullValueAnalyzer
>>> analyzer = ColumnTemporalNullValueAnalyzer("datetime", period="M")
>>> analyzer
ColumnTemporalNullValueAnalyzer(
  (columns): None
  (dt_column): datetime
  (period): M
  (ncols): 2
  (figsize): (7, 5)
)
>>> frame = pd.DataFrame(
...     {
...         "int": np.array([np.nan, 1, 0, 1]),
...         "float": np.array([1.2, 4.2, np.nan, 2.2]),
...         "str": np.array(["A", "B", None, np.nan]),
...         "datetime": pd.to_datetime(
...             ["2020-01-03", "2020-02-03", "2020-03-03", "2020-04-03"]
...         ),
...     }
... )
>>> section = analyzer.analyze(frame)

flamme.analyzer.ContentAnalyzer ¶

Bases: BaseAnalyzer

Implement an analyzer that generates the given custom content.

Parameters:

Name	Type	Description	Default
`content`	`str`	The content to use in the HTML code.	required

Example usage:

>>> import numpy as np
>>> import pandas as pd
>>> from flamme.analyzer import ContentAnalyzer
>>> analyzer = ContentAnalyzer(content="meow")
>>> analyzer
ContentAnalyzer()
>>> frame = pd.DataFrame(
...     {
...         "int": np.array([np.nan, 1, 0, 1]),
...         "float": np.array([1.2, 4.2, np.nan, 2.2]),
...         "str": np.array(["A", "B", None, np.nan]),
...     }
... )
>>> section = analyzer.analyze(frame)

flamme.analyzer.DataFrameSummaryAnalyzer ¶

Bases: BaseAnalyzer

Implement an analyzer to show a summary of the DataFrame.

Parameters:

Name	Type	Description	Default
`top`	`int`	The number of most frequent values to show.	`5`
`sort`	`bool`	If `True`, sort the columns by alphabetical order.	`False`

Example usage:

>>> import numpy as np
>>> import pandas as pd
>>> from flamme.analyzer import DataFrameSummaryAnalyzer
>>> analyzer = DataFrameSummaryAnalyzer()
>>> analyzer
DataFrameSummaryAnalyzer(top=5, sort=False)
>>> frame = pd.DataFrame(
...     {
...         "col1": np.array([0, 1, 0, 1]),
...         "col2": np.array([1, 0, 1, 0]),
...         "col3": np.array([1, 1, 1, 1]),
...     }
... )
>>> section = analyzer.analyze(frame)

flamme.analyzer.DataTypeAnalyzer ¶

Bases: BaseAnalyzer

Implement an analyzer to find all the value types in each column.

Example usage:

>>> import numpy as np
>>> import pandas as pd
>>> from flamme.analyzer import DataTypeAnalyzer
>>> analyzer = DataTypeAnalyzer()
>>> analyzer
DataTypeAnalyzer()
>>> frame = pd.DataFrame(
...     {
...         "int": np.array([np.nan, 1, 0, 1]),
...         "float": np.array([1.2, 4.2, np.nan, 2.2]),
...         "str": np.array(["A", "B", None, np.nan]),
...     }
... )
>>> section = analyzer.analyze(frame)

flamme.analyzer.DuplicatedRowAnalyzer ¶

Bases: BaseAnalyzer

Implement an analyzer to show the number of duplicated rows.

Parameters:

Name	Type	Description	Default
`columns`	`Sequence[str] \| None`	The columns used to compute the duplicated rows. `None` means all the columns.	`None`
`figsize`	`tuple[float, float] \| None`	The figure size in inches. The first dimension is the width and the second is the height.	`None`

Example usage:

>>> import numpy as np
>>> import pandas as pd
>>> from flamme.analyzer import DuplicatedRowAnalyzer
>>> analyzer = DuplicatedRowAnalyzer()
>>> analyzer
DuplicatedRowAnalyzer(columns=None, figsize=None)
>>> frame = pd.DataFrame(
...     {
...         "col1": np.array([0, 1, 0, 1]),
...         "col2": np.array([1, 0, 1, 0]),
...         "col3": np.array([1, 1, 1, 1]),
...     }
... )
>>> section = analyzer.analyze(frame)
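
To make the role of `columns` concrete, the sketch below counts duplicated rows in plain Python. `count_duplicated_rows` is a hypothetical helper, not part of flamme, and it assumes that the first occurrence of a row is not counted as a duplicate. The library may count differently.

```python
from typing import Mapping, Optional, Sequence


def count_duplicated_rows(
    rows: Sequence[Mapping[str, object]],
    columns: Optional[Sequence[str]] = None,
) -> int:
    """Count rows that repeat an earlier row on the given columns."""
    seen = set()
    duplicates = 0
    for row in rows:
        # None means "compare on all the columns".
        names = sorted(row) if columns is None else columns
        key = tuple(row[name] for name in names)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return duplicates


# Same data as the example above, written row by row.
rows = [
    {"col1": 0, "col2": 1, "col3": 1},
    {"col1": 1, "col2": 0, "col3": 1},
    {"col1": 0, "col2": 1, "col3": 1},
    {"col1": 1, "col2": 0, "col3": 1},
]
print(count_duplicated_rows(rows))                    # 2
print(count_duplicated_rows(rows, columns=["col3"]))  # 3
```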

flamme.analyzer.FilteredAnalyzer ¶

Bases: BaseAnalyzer

Implement an analyzer that filters the data before analyzing it.

Parameters:

Name	Type	Description	Default
`query`	`str`	Specifies the query.	required
`analyzer`	`BaseAnalyzer \| dict`	The analyzer or its configuration.	required

Example usage:

>>> import numpy as np
>>> import pandas as pd
>>> from flamme.analyzer import FilteredAnalyzer, NullValueAnalyzer
>>> analyzer = FilteredAnalyzer(query="float >= 2.0", analyzer=NullValueAnalyzer())
>>> analyzer
FilteredAnalyzer(
  (query): float >= 2.0
  (analyzer): NullValueAnalyzer(figsize=None)
)
>>> frame = pd.DataFrame(
...     {
...         "int": np.array([np.nan, 1, 0, 1]),
...         "float": np.array([1.2, 4.2, np.nan, 2.2]),
...         "str": np.array(["A", "B", None, np.nan]),
...     }
... )
>>> section = analyzer.analyze(frame)

flamme.analyzer.MappingAnalyzer ¶

Bases: BaseAnalyzer

Implement an analyzer that combines multiple analyzers.

Parameters:

Name	Type	Description	Default
`analyzers`	`Mapping[str, BaseAnalyzer \| dict]`	The mapping of analyzers (or their configurations) to combine. The key of each analyzer is used to organize the metrics and report.	required
`max_toc_depth`	`int`	The maximum level to show in the table of contents. Set this value to `0` to hide the table of contents at the beginning of the section.	`0`

Example usage:

>>> import numpy as np
>>> import pandas as pd
>>> from flamme.analyzer import (
...     NullValueAnalyzer,
...     DuplicatedRowAnalyzer,
...     MappingAnalyzer,
... )
>>> analyzer = MappingAnalyzer(
...     {"null": NullValueAnalyzer(), "duplicate": DuplicatedRowAnalyzer()}
... )
>>> analyzer
MappingAnalyzer(
  (null): NullValueAnalyzer(figsize=None)
  (duplicate): DuplicatedRowAnalyzer(columns=None, figsize=None)
)
>>> frame = pd.DataFrame(
...     {
...         "int": np.array([np.nan, 1, 0, 1]),
...         "float": np.array([1.2, 4.2, np.nan, 2.2]),
...         "str": np.array(["A", "B", None, np.nan]),
...     }
... )
>>> section = analyzer.analyze(frame)

flamme.analyzer.MappingAnalyzer.add_analyzer ¶

add_analyzer(
    key: str,
    analyzer: BaseAnalyzer,
    replace_ok: bool = False,
) -> None

Add an analyzer to the current analyzer.

Parameters:

Name	Type	Description	Default
`key`	`str`	The key of the analyzer.	required
`analyzer`	`BaseAnalyzer`	The analyzer to add.	required
`replace_ok`	`bool`	If `False`, `KeyError` is raised if an analyzer with the same key exists. If `True`, the new analyzer will replace the existing analyzer.	`False`

Raises:

Type	Description
`KeyError`	if an analyzer with the same key exists.

Example usage:

>>> import numpy as np
>>> import pandas as pd
>>> from flamme.analyzer import MappingAnalyzer, NullValueAnalyzer, DuplicatedRowAnalyzer
>>> analyzer = MappingAnalyzer({"null": NullValueAnalyzer()})
>>> analyzer.add_analyzer("duplicate", DuplicatedRowAnalyzer())
>>> analyzer
MappingAnalyzer(
  (null): NullValueAnalyzer(figsize=None)
  (duplicate): DuplicatedRowAnalyzer(columns=None, figsize=None)
)
>>> frame = pd.DataFrame(
...     {
...         "int": np.array([np.nan, 1, 0, 1]),
...         "float": np.array([1.2, 4.2, np.nan, 2.2]),
...         "str": np.array(["A", "B", None, np.nan]),
...     }
... )
>>> section = analyzer.analyze(frame)
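
The `replace_ok` behaviour described above follows a common "guarded insert" pattern. The sketch below reproduces it on a plain dictionary. `add_entry` and the string values are hypothetical stand-ins, and the error message is illustrative only.

```python
from typing import Dict


def add_entry(
    registry: Dict[str, object],
    key: str,
    value: object,
    replace_ok: bool = False,
) -> None:
    """Add ``value`` under ``key``; refuse to overwrite unless allowed."""
    if key in registry and not replace_ok:
        raise KeyError(f"'{key}' already exists, use replace_ok=True to replace it")
    registry[key] = value


registry: Dict[str, object] = {"null": "first"}
add_entry(registry, "duplicate", "second")  # new key: always accepted

try:
    add_entry(registry, "null", "other")  # existing key: rejected
except KeyError as exc:
    print(f"rejected: {exc}")

add_entry(registry, "null", "other", replace_ok=True)  # explicit replacement
print(registry)
```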

flamme.analyzer.MarkdownAnalyzer ¶

Bases: BaseAnalyzer

Implement an analyzer that adds a markdown string to the report.

Parameters:

Name	Type	Description	Default
`desc`	`str`	The markdown description.	required

Example usage:

>>> import numpy as np
>>> import pandas as pd
>>> from flamme.analyzer import MarkdownAnalyzer
>>> analyzer = MarkdownAnalyzer(desc="hello cats!")
>>> analyzer
MarkdownAnalyzer()
>>> frame = pd.DataFrame({})
>>> section = analyzer.analyze(frame)

flamme.analyzer.MostFrequentValuesAnalyzer ¶

Bases: BaseAnalyzer

Implement a most frequent values analyzer for a given column.

Parameters:

Name	Type	Description	Default
`column`	`str`	The column to analyze.	required
`dropna`	`bool`	If `True`, the NaN values are not included in the analysis.	`False`
`top`	`int`	The maximum number of values to show.	`100`

Example usage:

>>> import numpy as np
>>> import pandas as pd
>>> from flamme.analyzer import MostFrequentValuesAnalyzer
>>> analyzer = MostFrequentValuesAnalyzer(column="col")
>>> analyzer
MostFrequentValuesAnalyzer(column=col, dropna=False, top=100)
>>> frame = pd.DataFrame({"col": np.array([np.nan, 1, 0, 1])})
>>> section = analyzer.analyze(frame)
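
The effect of `dropna` and `top` can be sketched with `collections.Counter`. This is an illustration only: `most_frequent` and `is_null` are hypothetical helpers, and grouping all null values under a single `None` entry is an assumption of the sketch rather than documented library behaviour.

```python
import math
from collections import Counter
from typing import Iterable, List, Tuple


def is_null(value: object) -> bool:
    """Treat None and float NaN as null values."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def most_frequent(
    values: Iterable[object], dropna: bool = False, top: int = 100
) -> List[Tuple[object, int]]:
    """Return up to ``top`` (value, count) pairs, most frequent first."""
    if dropna:
        cleaned = [v for v in values if not is_null(v)]
    else:
        # NaN is not equal to itself, so nulls are grouped under None.
        cleaned = [None if is_null(v) else v for v in values]
    return Counter(cleaned).most_common(top)


# Same values as the "col" column in the example above.
values = [float("nan"), 1.0, 0.0, 1.0]
print(most_frequent(values))
print(most_frequent(values, dropna=True, top=1))
```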

flamme.analyzer.NullValueAnalyzer ¶

Bases: BaseAnalyzer

Implement a null value analyzer.

Parameters:

Name	Type	Description	Default
`figsize`	`tuple[float, float] \| None`	The figure size in inches. The first dimension is the width and the second is the height.	`None`

Example usage:

>>> import numpy as np
>>> import pandas as pd
>>> from flamme.analyzer import NullValueAnalyzer
>>> analyzer = NullValueAnalyzer()
>>> analyzer
NullValueAnalyzer(figsize=None)
>>> frame = pd.DataFrame(
...     {
...         "int": np.array([np.nan, 1, 0, 1]),
...         "float": np.array([1.2, 4.2, np.nan, 2.2]),
...         "str": np.array(["A", "B", None, np.nan]),
...     }
... )
>>> section = analyzer.analyze(frame)

flamme.analyzer.TableOfContentAnalyzer ¶

Bases: BaseAnalyzer

Implement a wrapper around an analyzer to add a table of contents to the generated section report.

Parameters:

Name	Type	Description	Default
`analyzer`	`BaseAnalyzer \| dict`	The analyzer or its configuration.	required
`max_toc_depth`	`int`	The maximum level to show in the table of contents.	`1`

Example usage:

>>> import numpy as np
>>> import pandas as pd
>>> from flamme.analyzer import TableOfContentAnalyzer, DuplicatedRowAnalyzer
>>> analyzer = TableOfContentAnalyzer(DuplicatedRowAnalyzer())
>>> analyzer
TableOfContentAnalyzer(
  (analyzer): DuplicatedRowAnalyzer(columns=None, figsize=None)
  (max_toc_depth): 1
)
>>> frame = pd.DataFrame(
...     {
...         "col1": np.array([0, 1, 0, 1]),
...         "col2": np.array([1, 0, 1, 0]),
...         "col3": np.array([1, 1, 1, 1]),
...     }
... )
>>> section = analyzer.analyze(frame)

flamme.analyzer.TemporalNullValueAnalyzer ¶

Bases: BaseAnalyzer

Implement an analyzer to show the temporal distribution of null values for all columns.

Parameters:

Name	Type	Description	Default
`dt_column`	`str`	The datetime column used to analyze the temporal distribution.	required
`period`	`str`	The temporal period e.g. monthly or daily.	required
`figsize`	`tuple[float, float] \| None`	The figure size in inches. The first dimension is the width and the second is the height.	`None`

Example usage:

>>> import numpy as np
>>> import pandas as pd
>>> from flamme.analyzer import TemporalNullValueAnalyzer
>>> analyzer = TemporalNullValueAnalyzer(dt_column="datetime", period="M")
>>> analyzer
TemporalNullValueAnalyzer(
  (columns): None
  (dt_column): datetime
  (period): M
  (figsize): None
)
>>> frame = pd.DataFrame(
...     {
...         "col": np.array([np.nan, 1, 0, 1]),
...         "datetime": pd.to_datetime(
...             ["2020-01-03", "2020-02-03", "2020-03-03", "2020-04-03"]
...         ),
...     }
... )
>>> section = analyzer.analyze(frame)

flamme.analyzer.TemporalRowCountAnalyzer ¶

Bases: BaseAnalyzer

Implement an analyzer to show the number of rows per temporal window.

Parameters:

Name	Type	Description	Default
`dt_column`	`str`	The datetime column used to analyze the temporal distribution.	required
`period`	`str`	The temporal period e.g. monthly or daily.	required
`figsize`	`tuple[float, float] \| None`	The figure size in inches. The first dimension is the width and the second is the height.	`None`

Example usage:

>>> import numpy as np
>>> import pandas as pd
>>> from flamme.analyzer import TemporalRowCountAnalyzer
>>> analyzer = TemporalRowCountAnalyzer(dt_column="datetime", period="M")
>>> analyzer
TemporalRowCountAnalyzer(dt_column=datetime, period=M, figsize=None)
>>> frame = pd.DataFrame(
...     {
...         "datetime": pd.to_datetime(
...             ["2020-01-03", "2020-02-03", "2020-03-03", "2020-04-03"]
...         ),
...     }
... )
>>> section = analyzer.analyze(frame)
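
Counting rows per temporal window amounts to grouping the datetime values by period. The sketch below does this for a monthly period with the standard library only. `rows_per_month` is a hypothetical helper, and the assumption is that `period="M"` in the example above denotes a monthly window. Whether the library also reports periods that contain no rows is not specified here; this sketch simply omits them.

```python
from collections import Counter
from datetime import date
from typing import Dict, Iterable


def rows_per_month(dates: Iterable[date]) -> Dict[str, int]:
    """Count rows per calendar month, keyed as 'YYYY-MM' in sorted order."""
    counts = Counter(d.strftime("%Y-%m") for d in dates)
    return dict(sorted(counts.items()))


dates = [date(2020, 1, 3), date(2020, 1, 20), date(2020, 2, 3), date(2020, 4, 3)]
print(rows_per_month(dates))  # {'2020-01': 2, '2020-02': 1, '2020-04': 1}
```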

flamme.analyzer.is_analyzer_config ¶

is_analyzer_config(config: dict) -> bool

Indicate if the input configuration is a configuration for a BaseAnalyzer.

This function only checks if the value of the key `_target_` is valid. It does not check the other values. If `_target_` indicates a function, the return type hint of the function is used to check the class.

Parameters:

Name	Type	Description	Default
`config`	`dict`	The configuration to check.	required

Returns:

Type	Description
`bool`	`True` if the input configuration is a configuration for a `BaseAnalyzer` object.

Example usage:

>>> from flamme.analyzer import is_analyzer_config
>>> is_analyzer_config({"_target_": "flamme.analyzer.NullValueAnalyzer"})
True
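
The check described above (resolve `_target_`, then compare a class or a function's return type hint against a base class) can be sketched with the standard library. This is not the library's implementation: `resolve_target` and `is_config_for` are hypothetical names, and the example uses standard-library classes instead of analyzers.

```python
import importlib
import inspect
from typing import get_type_hints


def resolve_target(path: str) -> object:
    """Import the object named by a dotted path such as 'pkg.module.Name'."""
    module_name, _, attr = path.rpartition(".")
    return getattr(importlib.import_module(module_name), attr)


def is_config_for(config: dict, base: type) -> bool:
    """Check only ``_target_``; the other keys are ignored."""
    target = config.get("_target_")
    if not isinstance(target, str):
        return False
    try:
        obj = resolve_target(target)
    except (ImportError, AttributeError, ValueError):
        return False
    if inspect.isclass(obj):
        return issubclass(obj, base)
    if inspect.isfunction(obj):
        # For a function, fall back to its return type hint.
        try:
            returned = get_type_hints(obj).get("return")
        except (NameError, TypeError):
            return False
        return inspect.isclass(returned) and issubclass(returned, base)
    return False


print(is_config_for({"_target_": "collections.OrderedDict"}, dict))  # True
print(is_config_for({"_target_": "collections.Counter"}, list))      # False
print(is_config_for({"_target_": "missing.module.Thing"}, dict))     # False
```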

flamme.analyzer.setup_analyzer ¶

setup_analyzer(
    analyzer: BaseAnalyzer | dict,
) -> BaseAnalyzer

Set up an analyzer.

The analyzer is instantiated from its configuration by using the BaseAnalyzer factory function.

Parameters:

Name	Type	Description	Default
`analyzer`	`BaseAnalyzer \| dict`	Specifies an analyzer or its configuration.	required

Returns:

Type	Description
`BaseAnalyzer`	An instantiated analyzer.

Example usage:

>>> from flamme.analyzer import setup_analyzer
>>> analyzer = setup_analyzer({"_target_": "flamme.analyzer.NullValueAnalyzer"})
>>> analyzer
NullValueAnalyzer(figsize=None)
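
The "instance or configuration" convention can be sketched as follows: an already-built object is returned unchanged, while a dictionary is instantiated from its `_target_`. `setup_object` is a hypothetical helper, and passing the remaining keys as keyword arguments is an assumption of this sketch. The actual factory may behave differently.

```python
import importlib
from fractions import Fraction


def setup_object(obj_or_config: object) -> object:
    """Return the object as-is, or build it from a ``_target_`` config."""
    if isinstance(obj_or_config, dict):
        kwargs = dict(obj_or_config)  # copy so the caller's dict is untouched
        target = kwargs.pop("_target_")
        module_name, _, attr = target.rpartition(".")
        factory = getattr(importlib.import_module(module_name), attr)
        # Assumption: the remaining keys are constructor keyword arguments.
        return factory(**kwargs)
    return obj_or_config


config = {"_target_": "fractions.Fraction", "numerator": 1, "denominator": 2}
half = setup_object(config)
print(half)                        # 1/2
print(setup_object(half) is half)  # True
```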