Data Visualization

Plotting utilities for Macrodata Refinement (MDR).

This module provides functions for creating and saving data visualizations.

class mdr.visualization.plots.PlotConfig(title=None, figsize=(10.0, 6.0), dpi=100, xlabel=None, ylabel=None, xlim=None, ylim=None, legend=True, grid=True, style='seaborn-v0_8-whitegrid', palette='viridis', font_family='sans-serif', font_size=12, figure_bgcolor='#ffffff', axis_bgcolor='#f8f8f8')

Bases: object

Configuration for plot appearance and behavior.

Parameters:
title: str | None = None
figsize: Tuple[float, float] = (10.0, 6.0)
dpi: int = 100
xlabel: str | None = None
ylabel: str | None = None
xlim: Tuple[float, float] | None = None
ylim: Tuple[float, float] | None = None
legend: bool = True
grid: bool = True
style: str = 'seaborn-v0_8-whitegrid'
palette: str = 'viridis'
font_family: str = 'sans-serif'
font_size: int = 12
figure_bgcolor: str = '#ffffff'
axis_bgcolor: str = '#f8f8f8'

__post_init__()

Validate configuration parameters.

Return type:

None
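
The `__post_init__` hook suggests that PlotConfig is a dataclass whose fields are checked when an instance is created. The reference does not say which checks are performed, so the following is only a minimal sketch: `PlotConfigSketch` is a hypothetical stand-in, and every validation rule in it (positive figure size, positive dpi, ordered axis limits) is an assumption rather than MDR's actual behavior.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PlotConfigSketch:
    """Hypothetical stand-in for PlotConfig; the checks below are assumptions."""

    title: Optional[str] = None
    figsize: Tuple[float, float] = (10.0, 6.0)
    dpi: int = 100
    xlim: Optional[Tuple[float, float]] = None
    ylim: Optional[Tuple[float, float]] = None

    def __post_init__(self) -> None:
        # Runs automatically right after the generated __init__.
        if len(self.figsize) != 2 or any(side <= 0 for side in self.figsize):
            raise ValueError("figsize must be two positive numbers")
        if self.dpi <= 0:
            raise ValueError("dpi must be positive")
        for name in ("xlim", "ylim"):
            limits = getattr(self, name)
            if limits is not None and limits[0] >= limits[1]:
                raise ValueError(f"{name} must be (low, high) with low < high")


# A valid configuration keeps the defaults for unspecified fields.
config = PlotConfigSketch(title="Sensor Readings", xlim=(0.0, 4.0))

# An invalid configuration fails at construction time, not at plot time.
try:
    PlotConfigSketch(dpi=0)
except ValueError as exc:
    error_message = str(exc)
```

Validating in `__post_init__` means a bad setting is reported where the configuration is built, before any figure is created.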

mdr.visualization.plots.plot_time_series(data, timestamps=None, config=None)

Plot time series data.

Parameters:
  • data (Dict[str, numpy.ndarray]) – Dictionary mapping variable names to data arrays

  • timestamps (numpy.ndarray | None) – Optional array of timestamps for the x-axis

  • config (PlotConfig | None) – Plot configuration

Returns:

Tuple of (figure, axes)

Return type:

Tuple[matplotlib.figure.Figure, matplotlib.axes.Axes]

mdr.visualization.plots.plot_histogram(data, bins=30, density=False, config=None)

Plot a histogram of data.

Parameters:
  • data (numpy.ndarray | Dict[str, numpy.ndarray]) – Array of data or dictionary mapping variable names to data arrays

  • bins (int) – Number of histogram bins

  • density (bool) – Whether to normalize the histogram

  • config (PlotConfig | None) – Plot configuration

Returns:

Tuple of (figure, axes)

Return type:

Tuple[matplotlib.figure.Figure, matplotlib.axes.Axes]
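
The density flag is described only as normalizing the histogram. Assuming it follows the usual NumPy and Matplotlib convention, each bar height is the count divided by (number of samples × bin width), so the bar areas sum to 1. The sketch below illustrates that convention in plain Python; the `histogram` helper is hypothetical and is not MDR's implementation.

```python
from typing import List, Tuple


def histogram(
    values: List[float], bins: int = 30, density: bool = False
) -> Tuple[List[float], List[float]]:
    """Return (heights, bin_edges) for equal-width bins over [min, max]."""
    low, high = min(values), max(values)
    if low == high:
        # All values identical: widen the range to avoid zero-width bins.
        low, high = low - 0.5, high + 0.5
    width = (high - low) / bins
    edges = [low + i * width for i in range(bins + 1)]
    counts = [0] * bins
    for value in values:
        # The last bin is closed on the right so that max(values) is counted.
        index = min(int((value - low) / width), bins - 1)
        counts[index] += 1
    if not density:
        return [float(c) for c in counts], edges
    total = len(values)
    return [c / (total * width) for c in counts], edges


samples = [20.5, 21.3, 22.1, 21.7, 23.0]
counts, edges = histogram(samples, bins=5)
heights, _ = histogram(samples, bins=5, density=True)

# With density=True the total bar area is 1 (up to rounding error).
bin_width = edges[1] - edges[0]
area = sum(h * bin_width for h in heights)
```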

mdr.visualization.plots.plot_boxplot(data, vert=True, showfliers=True, config=None)

Plot a box plot of data.

Parameters:
  • data (numpy.ndarray | Dict[str, numpy.ndarray]) – Array of data or dictionary mapping variable names to data arrays

  • vert (bool) – Whether to draw the boxes vertically

  • showfliers (bool) – Whether to show outliers

  • config (PlotConfig | None) – Plot configuration

Returns:

Tuple of (figure, axes)

Return type:

Tuple[matplotlib.figure.Figure, matplotlib.axes.Axes]
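
The reference does not define which points count as outliers for showfliers. Matplotlib's box plot, by default, marks points more than 1.5 × IQR beyond the quartiles as fliers; the sketch below assumes MDR keeps that default. The `split_fliers` helper is hypothetical and exists only to illustrate the rule.

```python
from statistics import quantiles
from typing import List, Tuple


def split_fliers(
    values: List[float], whis: float = 1.5
) -> Tuple[List[float], List[float]]:
    """Separate points inside the whisker range from fliers (outliers)."""
    # method="inclusive" interpolates linearly between the data points.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - whis * iqr, q3 + whis * iqr
    inside = [v for v in values if lower <= v <= upper]
    fliers = [v for v in values if v < lower or v > upper]
    return inside, fliers


# Same sample as the refinement example: 20.0 lies far from the rest.
inside, fliers = split_fliers([1.0, 2.0, 3.0, 20.0, 5.0])
```

Under this rule, showfliers=False would hide the points in `fliers` while leaving the box and whiskers unchanged.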

mdr.visualization.plots.plot_heatmap(data, row_labels=None, col_labels=None, cmap='viridis', vmin=None, vmax=None, config=None)

Plot a heatmap of 2D data.

Parameters:
  • data (numpy.ndarray) – 2D array of data

  • row_labels (List[str] | None) – Labels for the rows

  • col_labels (List[str] | None) – Labels for the columns

  • cmap (str) – Colormap name

  • vmin (float | None) – Minimum value for color scaling

  • vmax (float | None) – Maximum value for color scaling

  • config (PlotConfig | None) – Plot configuration

Returns:

Tuple of (figure, axes)

Return type:

Tuple[matplotlib.figure.Figure, matplotlib.axes.Axes]
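
The vmin and vmax parameters set the data range that is mapped onto the colormap. As a rough illustration, the sketch below scales values linearly to [0, 1], clips anything outside the range, and falls back to the data minimum and maximum when a bound is None. That fallback mirrors Matplotlib's default autoscaling and is assumed, not confirmed, for MDR; `scale_to_unit` is a hypothetical helper.

```python
from typing import List, Optional


def scale_to_unit(
    data: List[List[float]],
    vmin: Optional[float] = None,
    vmax: Optional[float] = None,
) -> List[List[float]]:
    """Map each cell to a colormap position in [0, 1]."""
    flat = [value for row in data for value in row]
    low = min(flat) if vmin is None else vmin
    high = max(flat) if vmax is None else vmax
    if high == low:
        # Flat data: every cell gets the same color.
        return [[0.0 for _ in row] for row in data]
    span = high - low
    return [[min(max((value - low) / span, 0.0), 1.0) for value in row] for row in data]


grid = [[0.0, 5.0], [10.0, 20.0]]

# Bounds taken from the data: 20.0 maps to the top of the colormap.
auto_scaled = scale_to_unit(grid)

# Explicit bounds: everything at or above 10.0 saturates at 1.0.
clipped = scale_to_unit(grid, vmin=0.0, vmax=10.0)
```

Fixing vmin and vmax across several heatmaps keeps their colors comparable, which automatic scaling does not guarantee.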

mdr.visualization.plots.plot_scatter(x, y, labels=None, sizes=None, alpha=0.7, config=None)

Plot a scatter plot of data.

Parameters:
  • x (numpy.ndarray) – X-coordinates

  • y (numpy.ndarray) – Y-coordinates

  • labels (numpy.ndarray | None) – Labels or categories for the points

  • sizes (numpy.ndarray | None) – Sizes for the points

  • alpha (float) – Transparency for the points

  • config (PlotConfig | None) – Plot configuration

Returns:

Tuple of (figure, axes)

Return type:

Tuple[matplotlib.figure.Figure, matplotlib.axes.Axes]

mdr.visualization.plots.plot_correlation_matrix(data, method='pearson', cmap='coolwarm', config=None)

Plot a correlation matrix.

Parameters:
  • data (numpy.ndarray | Dict[str, numpy.ndarray] | pandas.DataFrame) – 2D array of data, dictionary mapping variable names to data arrays, or pandas DataFrame

  • method (str) – Correlation method ('pearson', 'kendall', 'spearman')

  • cmap (str) – Colormap name

  • config (PlotConfig | None) – Plot configuration

Returns:

Tuple of (figure, axes)

Return type:

Tuple[matplotlib.figure.Figure, matplotlib.axes.Axes]
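
To make the quantity being plotted concrete, here is a minimal sketch of a Pearson correlation matrix computed from a dictionary of named series. It uses only the standard library and does not reflect how MDR computes correlations internally; `pearson` and `correlation_matrix` are hypothetical helpers, and the 'kendall' and 'spearman' methods are not covered.

```python
from math import sqrt
from typing import Dict, List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # A constant series has zero variance and would divide by zero here.
    return cov / sqrt(var_x * var_y)


def correlation_matrix(data: Dict[str, List[float]]) -> List[List[float]]:
    """Rows and columns follow the insertion order of the dictionary keys."""
    names = list(data)
    return [[pearson(data[row], data[col]) for col in names] for row in names]


readings = {
    "temperature": [20.5, 21.3, 22.1, 21.7, 23.0],
    "pressure": [101.3, 101.4, 101.5, 101.2, 101.1],
    "humidity": [45.0, 47.0, 48.5, 50.2, 49.8],
}
matrix = correlation_matrix(readings)
```

The result is symmetric with ones on the diagonal, which is why a diverging colormap such as 'coolwarm' suits it: positive and negative correlations get visually distinct colors.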

mdr.visualization.plots.plot_validation_results(results, config=None)

Plot validation results.

Parameters:
  • results (Dict[str, Dict[str, Any]]) – Dictionary mapping variable names to validation results

  • config (PlotConfig | None) – Plot configuration

Returns:

Tuple of (figure, list of axes)

Return type:

Tuple[matplotlib.figure.Figure, List[matplotlib.axes.Axes]]

mdr.visualization.plots.plot_refinement_comparison(original_data, refined_data, timestamps=None, config=None)

Plot a comparison of original and refined data.

Parameters:
  • original_data (numpy.ndarray) – Original data array

  • refined_data (numpy.ndarray) – Refined data array

  • timestamps (numpy.ndarray | None) – Optional array of timestamps for the x-axis

  • config (PlotConfig | None) – Plot configuration

Returns:

Tuple of (figure, list of axes)

Return type:

Tuple[matplotlib.figure.Figure, List[matplotlib.axes.Axes]]

mdr.visualization.plots.save_plot(fig, filepath, dpi=None, format=None, transparent=False)

Save a figure to a file.

Parameters:
  • fig (matplotlib.figure.Figure) – Matplotlib figure

  • filepath (str) – Path to the output file

  • dpi (int | None) – Resolution in dots per inch

  • format (str | None) – File format (auto-detected from extension if None)

  • transparent (bool) – Whether to use a transparent background

Return type:

None
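
When format is None, the file format is taken from the extension of filepath. The sketch below shows one way such detection can work. The `infer_format` helper and its 'png' fallback for paths without an extension are assumptions for illustration, not MDR's documented behavior.

```python
from pathlib import Path
from typing import Optional


def infer_format(filepath: str, fmt: Optional[str] = None, default: str = "png") -> str:
    """Pick the output format: explicit value first, then the file extension."""
    if fmt is not None:
        return fmt.lower()
    suffix = Path(filepath).suffix  # includes the leading dot, e.g. ".pdf"
    return suffix[1:].lower() if suffix else default


from_extension = infer_format("out/report.PDF")
no_extension = infer_format("figure")
explicit = infer_format("figure.png", fmt="svg")
```

An explicit format takes precedence over the extension here, so a call with filepath "figure.png" and format "svg" would write SVG content to a file named figure.png.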

Overview

The plots module provides functions for visualizing data, refinement results, and validation outcomes. These visualizations help you understand the data and the effects of refinement operations.

Plot Types

The module provides the following types of plots:

  • Time Series: Visualize data variables over time

  • Refinement Comparison: Compare original and refined data

  • Validation Results: Visualize data quality assessment results

  • Distribution: Show data distributions before and after processing

  • Correlation: Display relationships between variables

Core Functions

The following functions cover the most common workflows. Their parameters and return types are listed in the reference at the top of this page:

  • plot_time_series(data, timestamps=None, config=None): Plot time series data

  • plot_refinement_comparison(original_data, refined_data, timestamps=None, config=None): Plot a comparison of original and refined data

  • plot_validation_results(results, config=None): Plot validation results

  • plot_correlation_matrix(data, method='pearson', cmap='coolwarm', config=None): Plot a correlation matrix

  • save_plot(fig, filepath, dpi=None, format=None, transparent=False): Save a figure to a file

Input Format Compatibility

Many of the visualization functions accept multiple input formats:

  • NumPy arrays: For single variable visualizations

  • Dictionary of arrays: For multi-variable visualizations with named variables

  • Pandas DataFrames: For direct use of pandas data structures (documented for plot_correlation_matrix)
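
One common way to support several input formats is to convert them all to a single internal shape before plotting. The sketch below normalizes the three formats listed above into a dictionary of named series. It is an illustration under stated assumptions: `to_named_series` is a hypothetical helper, the default name "value" is invented, and DataFrames are detected by duck typing so that pandas need not be imported.

```python
from typing import Any, Dict, List


def to_named_series(data: Any, default_name: str = "value") -> Dict[str, List[float]]:
    """Normalize array-like, dict, or DataFrame-like input to {name: values}."""
    if isinstance(data, dict):
        # Dictionary of arrays: keep the given variable names.
        return {str(name): list(values) for name, values in data.items()}
    if hasattr(data, "columns"):
        # DataFrame-like: one series per column.
        return {str(column): list(data[column]) for column in data.columns}
    # Plain array or list: a single, unnamed variable.
    return {default_name: list(data)}


single = to_named_series([1.0, 2.0, 3.0])
named = to_named_series({"temperature": (20.5, 21.3), "pressure": (101.3, 101.4)})
```

After this step, a plotting routine only needs to handle one case: iterate over the dictionary and draw one series per key.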

Customization Options

Plotting functions take their appearance settings through the config argument, a PlotConfig instance. Commonly adjusted fields include:

  • figsize: Tuple specifying the figure dimensions (default (10.0, 6.0))

  • title: Custom title for the plot

  • xlabel, ylabel: Axis labels

  • palette: Color palette for the plot (default 'viridis')

  • style: Matplotlib style sheet to use (default 'seaborn-v0_8-whitegrid')

Usage Examples

Time series plot:

import numpy as np
import matplotlib.pyplot as plt
from mdr.visualization.plots import PlotConfig, plot_time_series

# Create sample data; the dictionary keys name each variable
time = np.array([0, 1, 2, 3, 4])
data_dict = {
    "Temperature (°C)": np.array([20.5, 21.3, 22.1, 21.7, 23.0]),
    "Pressure (hPa)": np.array([101.3, 101.4, 101.5, 101.2, 101.1]),
}

# Create a time series plot; appearance is set through PlotConfig
fig, ax = plot_time_series(
    data_dict,
    timestamps=time,
    config=PlotConfig(title="Sensor Readings"),
)
plt.show()

Refinement comparison:

import numpy as np
import matplotlib.pyplot as plt
from mdr.core.refinement import RefinementConfig, refine_data
from mdr.visualization.plots import PlotConfig, plot_refinement_comparison

# Create sample data with an outlier (20.0)
data = np.array([1.0, 2.0, 3.0, 20.0, 5.0])

# Configure and apply refinement
refinement_config = RefinementConfig(
    smoothing_factor=0.2,
    outlier_threshold=2.5,
    imputation_method="linear",
    normalization_type="minmax"
)
refined_data = refine_data(data, refinement_config)

# Create a comparison plot; the title is set through PlotConfig
fig, axes = plot_refinement_comparison(
    data,
    refined_data,
    config=PlotConfig(title="Data Refinement Results")
)
plt.tight_layout()
plt.show()

Correlation matrix with pandas DataFrame:

import pandas as pd
import matplotlib.pyplot as plt
from mdr.visualization.plots import plot_correlation_matrix, PlotConfig

# Create a sample DataFrame
data = {
    "temperature": [20.5, 21.3, 22.1, 21.7, 23.0],
    "pressure": [101.3, 101.4, 101.5, 101.2, 101.1],
    "humidity": [45.0, 47.0, 48.5, 50.2, 49.8]
}
df = pd.DataFrame(data)

# Create a correlation matrix plot directly from the DataFrame
fig, ax = plot_correlation_matrix(
    df,
    method="pearson",
    cmap="coolwarm",
    config=PlotConfig(title="Correlation Matrix")
)
plt.show()