Data Visualization
Plotting utilities for Macrodata Refinement (MDR).
This module provides functions for creating and saving data visualizations.
- class mdr.visualization.plots.PlotConfig(title=None, figsize=(10.0, 6.0), dpi=100, xlabel=None, ylabel=None, xlim=None, ylim=None, legend=True, grid=True, style='seaborn-v0_8-whitegrid', palette='viridis', font_family='sans-serif', font_size=12, figure_bgcolor='#ffffff', axis_bgcolor='#f8f8f8')
Bases: object
Configuration for plot appearance and behavior.
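To make the defaults in the signature above concrete, here is a minimal stand-in, assuming PlotConfig behaves like a plain dataclass. The name PlotConfigSketch and the tuple types given to figsize, xlim, and ylim are assumptions made for illustration; the real class is mdr.visualization.plots.PlotConfig and may validate or store its fields differently.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class PlotConfigSketch:
    """Hypothetical stand-in mirroring the documented PlotConfig defaults."""

    title: Optional[str] = None
    figsize: Tuple[float, float] = (10.0, 6.0)
    dpi: int = 100
    xlabel: Optional[str] = None
    ylabel: Optional[str] = None
    xlim: Optional[Tuple[float, float]] = None  # assumed (low, high) pair
    ylim: Optional[Tuple[float, float]] = None  # assumed (low, high) pair
    legend: bool = True
    grid: bool = True
    style: str = "seaborn-v0_8-whitegrid"
    palette: str = "viridis"
    font_family: str = "sans-serif"
    font_size: int = 12
    figure_bgcolor: str = "#ffffff"
    axis_bgcolor: str = "#f8f8f8"


# Shared settings for a report, then one per-plot override.
base = PlotConfigSketch(figsize=(8.0, 4.0), dpi=150)
sensor_plot = replace(base, title="Sensor Readings", ylabel="Value")
```

Deriving per-plot variants from one base object keeps figure size, fonts, and colors consistent across a set of plots while changing only titles and labels.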
- mdr.visualization.plots.plot_time_series(data, timestamps=None, config=None)
Plot time series data.
- Parameters:
data (Dict[str, ndarray]) – Dictionary mapping variable names to data arrays
timestamps (ndarray | None) – Optional array of timestamps for the x-axis
config (PlotConfig | None) – Plot configuration
- Returns:
Tuple of (figure, axes)
- Return type:
Tuple[Figure, Axes]
- mdr.visualization.plots.plot_histogram(data, bins=30, density=False, config=None)
Plot a histogram of data.
- Parameters:
data (ndarray | Dict[str, ndarray]) – Array of data or dictionary mapping variable names to data arrays
bins (int) – Number of histogram bins
density (bool) – Whether to normalize the histogram
config (PlotConfig | None) – Plot configuration
- Returns:
Tuple of (figure, axes)
- Return type:
Tuple[Figure, Axes]
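The density flag is easiest to read next to the arithmetic. The sketch below assumes plot_histogram follows the usual convention (the one used by numpy.histogram and Matplotlib's hist), in which density=True divides each bin count by the total count times the bin width so that the bars integrate to 1. The helper histogram_sketch is hypothetical, uses only the standard library, and is not the module's implementation.

```python
from typing import List, Sequence, Tuple


def histogram_sketch(
    values: Sequence[float], bins: int = 30, density: bool = False
) -> Tuple[List[float], List[float]]:
    """Return (heights, bin_edges) for equal-width bins over [min, max]."""
    if bins < 1:
        raise ValueError("bins must be at least 1")
    if not values:
        raise ValueError("values must not be empty")
    lo, hi = min(values), max(values)
    if lo == hi:
        # Widen a zero-width range so every bin has a positive width.
        lo, hi = lo - 0.5, hi + 0.5
    width = (hi - lo) / bins
    edges = [lo + i * width for i in range(bins + 1)]
    counts = [0] * bins
    for v in values:
        # The maximum value belongs to the last bin, not to a bin past the end.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    if not density:
        return [float(c) for c in counts], edges
    total = len(values)
    return [c / (total * width) for c in counts], edges


heights, edges = histogram_sketch(
    [20.5, 21.3, 22.1, 21.7, 23.0], bins=5, density=True
)
bin_width = edges[1] - edges[0]
area = sum(h * bin_width for h in heights)  # total bar area under density=True
```

With density=False the heights are raw counts, so two datasets of different sizes are not directly comparable; density=True puts them on the same scale.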
- mdr.visualization.plots.plot_boxplot(data, vert=True, showfliers=True, config=None)
Plot a box plot of data.
- Parameters:
data (ndarray | Dict[str, ndarray]) – Array of data or dictionary mapping variable names to data arrays
vert (bool) – Whether to draw the boxes vertically
showfliers (bool) – Whether to show outliers
config (PlotConfig | None) – Plot configuration
- Returns:
Tuple of (figure, axes)
- Return type:
Tuple[Figure, Axes]
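showfliers controls whether points beyond the whiskers are drawn. As a rough illustration, and assuming plot_boxplot keeps Matplotlib's default whisker rule of 1.5 times the interquartile range (the source does not state this), the points that would count as outliers can be computed as follows. find_fliers is a hypothetical helper, not part of the module.

```python
import statistics
from typing import List, Sequence


def find_fliers(values: Sequence[float], whis: float = 1.5) -> List[float]:
    """Return the values lying outside [Q1 - whis*IQR, Q3 + whis*IQR]."""
    # method="inclusive" interpolates linearly between data points and
    # treats the sample minimum and maximum as the 0th and 100th percentiles.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - whis * iqr, q3 + whis * iqr
    return [v for v in values if v < lower or v > upper]


fliers = find_fliers([1.0, 2.0, 3.0, 20.0, 5.0])
```

For this sample the quartiles are 2.0 and 5.0, so the fences sit at -2.5 and 9.5 and only 20.0 falls outside them. Setting showfliers=False would hide such points without changing the box or whiskers.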
- mdr.visualization.plots.plot_heatmap(data, row_labels=None, col_labels=None, cmap='viridis', vmin=None, vmax=None, config=None)
Plot a heatmap of 2D data.
- Parameters:
data (ndarray) – 2D array of data
cmap (str) – Colormap name
vmin (float | None) – Minimum value for color scaling
vmax (float | None) – Maximum value for color scaling
config (PlotConfig | None) – Plot configuration
- Returns:
Tuple of (figure, axes)
- Return type:
Tuple[Figure, Axes]
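vmin and vmax set the data range that is mapped onto the colormap. The sketch below shows the linear scaling conventionally used for this: values at vmin map to 0.0, values at vmax map to 1.0, and anything outside that range takes the end colors. It assumes plot_heatmap falls back to the data minimum and maximum when either bound is None, which the source does not confirm. normalize_grid is a hypothetical helper.

```python
from typing import List, Optional, Sequence


def normalize_grid(
    grid: Sequence[Sequence[float]],
    vmin: Optional[float] = None,
    vmax: Optional[float] = None,
) -> List[List[float]]:
    """Scale a 2D grid linearly to [0, 1], clamping out-of-range values."""
    flat = [v for row in grid for v in row]
    if not flat:
        raise ValueError("grid must not be empty")
    lo = min(flat) if vmin is None else vmin
    hi = max(flat) if vmax is None else vmax
    if hi <= lo:
        raise ValueError("vmax must be greater than vmin")
    span = hi - lo
    return [[min(1.0, max(0.0, (v - lo) / span)) for v in row] for row in grid]


# 20.0 lies above vmax, so it is clamped to the top of the color scale.
scaled = normalize_grid([[0.0, 5.0], [10.0, 20.0]], vmin=0.0, vmax=10.0)
```

Fixing vmin and vmax explicitly is useful when several heatmaps must share one color scale, since automatic bounds would otherwise differ from plot to plot.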
- mdr.visualization.plots.plot_scatter(x, y, labels=None, sizes=None, alpha=0.7, config=None)
Plot a scatter plot of data.
- Parameters:
x (ndarray) – X-coordinates
y (ndarray) – Y-coordinates
labels (ndarray | None) – Labels or categories for the points
sizes (ndarray | None) – Sizes for the points
alpha (float) – Transparency for the points
config (PlotConfig | None) – Plot configuration
- Returns:
Tuple of (figure, axes)
- Return type:
Tuple[Figure, Axes]
- mdr.visualization.plots.plot_correlation_matrix(data, method='pearson', cmap='coolwarm', config=None)
Plot a correlation matrix.
- Parameters:
data (ndarray | Dict[str, ndarray] | DataFrame) – 2D array of data, dictionary mapping variable names to data arrays, or pandas DataFrame
method (str) – Correlation method ('pearson', 'kendall', 'spearman')
cmap (str) – Colormap name
config (PlotConfig | None) – Plot configuration
- Returns:
Tuple of (figure, axes)
- Return type:
Tuple[Figure, Axes]
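The three method values match the options of pandas' DataFrame.corr, which suggests (but the source does not confirm) that the function delegates to it. For intuition about how the choices differ, the sketch below computes a Pearson (linear) and a Spearman (rank-based) coefficient with the standard library only. The helper names are hypothetical and this is not the module's implementation.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation: covariance scaled by both standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return sum(a * b for a, b in zip(dx, dy)) / denom


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


temperature = [20.5, 21.3, 22.1, 21.7, 23.0]
humidity = [45.0, 47.0, 48.5, 50.2, 49.8]
r_pearson = pearson(temperature, humidity)
r_spearman = spearman(temperature, humidity)
```

Pearson measures how close the relationship is to a straight line, while Spearman only asks whether the two variables rise and fall together, which makes it less sensitive to outliers and to monotonic but nonlinear relationships.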
- mdr.visualization.plots.plot_validation_results(results, config=None)
Plot validation results.
- mdr.visualization.plots.plot_refinement_comparison(original_data, refined_data, timestamps=None, config=None)
Plot a comparison of original and refined data.
- Parameters:
original_data (ndarray) – Original data array
refined_data (ndarray) – Refined data array
timestamps (ndarray | None) – Optional array of timestamps for the x-axis
config (PlotConfig | None) – Plot configuration
- Returns:
Tuple of (figure, list of axes)
- Return type:
Tuple[Figure, List[Axes]]
- mdr.visualization.plots.save_plot(fig, filepath, dpi=None, format=None, transparent=False)
Save a figure to a file.
- Return type:
None
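The dpi, format, and transparent arguments resemble those of Matplotlib's Figure.savefig. Assuming save_plot forwards them to savefig (the source does not say so), leaving format=None would mean the output format is deduced from the file extension. The sketch below illustrates that deduction; resolve_format and its "png" fallback are hypothetical, and lowercasing is a convenience of the sketch.

```python
from pathlib import Path
from typing import Optional


def resolve_format(
    filepath: str, format: Optional[str] = None, default: str = "png"
) -> str:
    """Pick an output format: explicit argument, else extension, else default."""
    if format is not None:
        return format.lower()
    suffix = Path(filepath).suffix  # includes the leading dot, e.g. ".pdf"
    return suffix[1:].lower() if suffix else default


from_extension = resolve_format("out/report.PDF")
explicit = resolve_format("figure.png", format="svg")
fallback = resolve_format("figure")
```

Passing format explicitly is the safer choice when the file name is generated elsewhere and may lack an extension.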
Overview
The plots module provides functions for visualizing data, refinement results,
and validation outcomes. These visualizations help understand the data and the
effects of refinement operations.
Plot Types
The module provides the following types of plots:
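Time Series: Visualize data variables over time (plot_time_series)
Refinement Comparison: Compare original and refined data (plot_refinement_comparison)
Validation Results: Visualize data quality assessment results (plot_validation_results)
Distribution: Show data distributions before and after processing (plot_histogram, plot_boxplot)
Correlation: Display relationships between variables (plot_correlation_matrix, plot_scatter)
Heatmap: Display 2D data as a color-coded grid (plot_heatmap)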
Core Functions
The core functions are listed below; their parameters and return types are documented in full at the top of this page.
plot_time_series(data, timestamps=None, config=None): Plot time series data.
plot_refinement_comparison(original_data, refined_data, timestamps=None, config=None): Plot a comparison of original and refined data.
plot_validation_results(results, config=None): Plot validation results.
plot_correlation_matrix(data, method='pearson', cmap='coolwarm', config=None): Plot a correlation matrix.
save_plot(fig, filepath, dpi=None, format=None, transparent=False): Save a figure to a file.
Input Format Compatibility
Many of the visualization functions accept multiple input formats:
NumPy arrays: For single variable visualizations
Dictionary of arrays: For multi-variable visualizations with named variables
Pandas DataFrames: For direct use of pandas data structures (explicitly documented for plot_correlation_matrix)
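One common way to support several input shapes is to normalize them to a single internal form before plotting. The sketch below is an assumption about that approach rather than the module's code: it turns either a bare sequence or a dictionary of sequences into a dictionary of named series. to_named_series and the default name "data" are hypothetical.

```python
from collections import abc
from typing import Dict, List, Mapping, Sequence, Union


def to_named_series(
    data: Union[Sequence[float], Mapping[str, Sequence[float]]],
    default_name: str = "data",
) -> Dict[str, List[float]]:
    """Return {name: values} whether given one sequence or a mapping of them."""
    if isinstance(data, abc.Mapping):
        return {str(name): [float(v) for v in values] for name, values in data.items()}
    # A bare sequence is treated as a single, unnamed variable.
    return {default_name: [float(v) for v in data]}


single = to_named_series([20.5, 21.3, 22.1])
multi = to_named_series({"temperature": [20.5, 21.3], "pressure": [101.3, 101.4]})
```

A pandas DataFrame could be brought into the same form with df.to_dict("list") before calling such a helper.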
Customization Options
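The plotting functions are customized through their config parameter, which takes a PlotConfig. Its fields are:
title: Custom title for the plot
figsize: Tuple specifying the figure dimensions (default (10.0, 6.0))
dpi: Figure resolution (default 100)
xlabel, ylabel: Axis labels
xlim, ylim: Axis limits
legend, grid: Whether to show the legend and the grid (both default to True)
style: Matplotlib style sheet to use (default 'seaborn-v0_8-whitegrid')
palette: Color palette for the plot (default 'viridis')
font_family, font_size: Font settings (defaults 'sans-serif' and 12)
figure_bgcolor, axis_bgcolor: Background colors of the figure and the axes (defaults '#ffffff' and '#f8f8f8')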
Usage Examples
Time series plot:
import matplotlib.pyplot as plt
import numpy as np
from mdr.visualization.plots import PlotConfig, plot_time_series
# Create sample data
time = np.array([0, 1, 2, 3, 4])
data_dict = {
    "temperature": np.array([20.5, 21.3, 22.1, 21.7, 23.0]),
    "pressure": np.array([101.3, 101.4, 101.5, 101.2, 101.1]),
}
# Create a time series plot; the title is set through PlotConfig because
# plot_time_series accepts only data, timestamps, and config
fig, ax = plot_time_series(
    data_dict,
    timestamps=time,
    config=PlotConfig(title="Sensor Readings"),
)
plt.show()
Refinement comparison:
import matplotlib.pyplot as plt
import numpy as np
from mdr.core.refinement import RefinementConfig, refine_data
from mdr.visualization.plots import PlotConfig, plot_refinement_comparison
# Create sample data with one outlier (20.0)
data = np.array([1.0, 2.0, 3.0, 20.0, 5.0])
# Configure and apply refinement
refinement_config = RefinementConfig(
    smoothing_factor=0.2,
    outlier_threshold=2.5,
    imputation_method="linear",
    normalization_type="minmax",
)
refined_data = refine_data(data, refinement_config)
# Create a comparison plot; the title is set through PlotConfig because
# plot_refinement_comparison has no title argument
fig, axes = plot_refinement_comparison(
    data,
    refined_data,
    config=PlotConfig(title="Data Refinement Results"),
)
plt.tight_layout()
plt.show()
Correlation matrix with pandas DataFrame:
import matplotlib.pyplot as plt
import pandas as pd
from mdr.visualization.plots import PlotConfig, plot_correlation_matrix
# Create a sample DataFrame
data = {
    "temperature": [20.5, 21.3, 22.1, 21.7, 23.0],
    "pressure": [101.3, 101.4, 101.5, 101.2, 101.1],
    "humidity": [45.0, 47.0, 48.5, 50.2, 49.8],
}
df = pd.DataFrame(data)
# Create a correlation matrix plot directly from the DataFrame
fig, ax = plot_correlation_matrix(
    df,
    method="pearson",
    cmap="coolwarm",
    config=PlotConfig(title="Correlation Matrix"),
)
plt.show()