Data Writers

Data writers for Macrodata Refinement (MDR).

This module provides functions and classes for writing macrodata to various file formats.

class mdr.io.writers.DataDestination(value)

Bases: Enum

Types of data destinations.

FILE = 1

DATABASE = 2

API = 3

MEMORY = 4
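
As an illustration, the enumeration above could be declared and used roughly as follows. This is a sketch reconstructed from the listed member values, not the module's source, and the `describe` helper is hypothetical.

```python
from enum import Enum


class DataDestination(Enum):
    """Stand-in mirroring the members listed above (not the mdr source)."""

    FILE = 1
    DATABASE = 2
    API = 3
    MEMORY = 4


def describe(dest: DataDestination) -> str:
    """Hypothetical helper: render a destination type as readable text."""
    return f"{dest.name.lower()} destination (value {dest.value})"


# Members can be looked up by value or by name.
by_value = DataDestination(1)
by_name = DataDestination["MEMORY"]
print(describe(DataDestination.DATABASE))
```

Because Enum members are singletons, comparing them with `is` is safe when branching on the destination type.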

class mdr.io.writers.DataWriter(dest_type=DataDestination.FILE)

Bases: ABC

Abstract base class for data writers.

Parameters: dest_type (DataDestination)

__init__(dest_type=DataDestination.FILE)

Initialize the data writer.

Parameters: dest_type (DataDestination) – Type of data destination

abstract write(data, destination, **options)

Write data to the destination.

Parameters:

data (Dict[str, numpy.ndarray]) – Dictionary mapping variable names to data arrays
destination (str) – Destination identifier (file path, table name, etc.)
**options – Additional writing options

Return type:

None

abstract validate_destination(destination)

Check whether the destination can be written to.

Parameters: destination (str) – Destination identifier
Returns: True if the destination is valid, False otherwise
Return type: bool
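
A custom writer would implement both abstract methods. The sketch below rebuilds a stand-in `DataWriter` from the interface documented above so that it runs on its own; the `MemoryWriter` subclass, its `store` attribute, and the `dest_type` instance attribute are assumptions made for illustration, not part of the documented API.

```python
from abc import ABC, abstractmethod
from enum import Enum
from typing import Any, Dict


class DataDestination(Enum):
    FILE = 1
    DATABASE = 2
    API = 3
    MEMORY = 4


class DataWriter(ABC):
    """Stand-in with the interface documented above (not the mdr source)."""

    def __init__(self, dest_type: DataDestination = DataDestination.FILE) -> None:
        # Assumption: the destination type is kept as an instance attribute.
        self.dest_type = dest_type

    @abstractmethod
    def write(self, data: Dict[str, Any], destination: str, **options: Any) -> None:
        ...

    @abstractmethod
    def validate_destination(self, destination: str) -> bool:
        ...


class MemoryWriter(DataWriter):
    """Hypothetical subclass that keeps data in a dict instead of a file."""

    def __init__(self) -> None:
        super().__init__(dest_type=DataDestination.MEMORY)
        self.store: Dict[str, Dict[str, Any]] = {}

    def validate_destination(self, destination: str) -> bool:
        # Any non-empty string is accepted as a storage key.
        return bool(destination)

    def write(self, data: Dict[str, Any], destination: str, **options: Any) -> None:
        if not self.validate_destination(destination):
            raise ValueError("destination must be a non-empty string")
        # Copy the mapping so later changes to `data` do not alter the store.
        self.store[destination] = dict(data)


writer = MemoryWriter()
writer.write({"time": [0, 1, 2]}, "run-1")
```

Instantiating the stand-in `DataWriter` directly raises `TypeError`, which is how the abstract base class enforces that subclasses supply both methods.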

class mdr.io.writers.FileWriter(encoding='utf-8', overwrite=False)

Bases: DataWriter

Base class for file-based data writers.

Parameters:

encoding (str)
overwrite (bool)

__init__(encoding='utf-8', overwrite=False)

Initialize the file writer.

Parameters:

encoding (str) – File encoding
overwrite (bool) – Whether to overwrite existing files

validate_destination(destination)

Check whether the file can be written to.

Parameters: destination (str) – File path
Returns: True if the file path is valid, False otherwise
Return type: bool
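
The documentation does not state which checks the file validation performs. One plausible reading, sketched below with the standard library only, is that the parent directory must exist and be writable, and that an existing file is accepted only when overwriting is enabled. The function name `can_write_file` and these rules are assumptions, not the library's behavior.

```python
import os
import tempfile


def can_write_file(path: str, overwrite: bool = False) -> bool:
    """Hypothetical check: is `path` usable as an output file?"""
    parent = os.path.dirname(os.path.abspath(path))
    # The containing directory must exist and be writable.
    if not os.path.isdir(parent) or not os.access(parent, os.W_OK):
        return False
    # A directory can never be used as an output file.
    if os.path.isdir(path):
        return False
    # An existing file is acceptable only when overwriting is allowed.
    if os.path.exists(path) and not overwrite:
        return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "output.csv")
    fresh = can_write_file(target)                    # nothing there yet
    open(target, "w").close()                         # create an empty file
    blocked = can_write_file(target)                  # exists, overwrite=False
    allowed = can_write_file(target, overwrite=True)  # exists, overwrite=True
    missing_dir = can_write_file(os.path.join(tmp, "nope", "output.csv"))
```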

class mdr.io.writers.CSVWriter(delimiter=',', quotechar='"', encoding='utf-8', overwrite=False)

Bases: FileWriter

Writer for CSV files.

Parameters:

delimiter (str)
quotechar (str)
encoding (str)
overwrite (bool)

__init__(delimiter=',', quotechar='"', encoding='utf-8', overwrite=False)

Initialize the CSV writer.

Parameters:

delimiter (str) – Field delimiter
quotechar (str) – Character for quoting fields
encoding (str) – File encoding
overwrite (bool) – Whether to overwrite existing files

write(data, destination, index=False, float_format='%.6f', date_format=None, **options)

Write data to a CSV file.

Parameters:

data (Dict[str, numpy.ndarray]) – Dictionary mapping column names to data arrays
destination (str) – File path
index (bool) – Whether to write row indices
float_format (str | None) – Format string for float values
date_format (str | None) – Format string for date values
**options – Additional pandas.DataFrame.to_csv options

Return type:

None
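
To make the effect of `delimiter`, `quotechar`, and `float_format` concrete, the following sketch renders a column dictionary as CSV text using only the standard library. It is a stand-in for illustration: `columns_to_csv` is a hypothetical name, and the real writer appears to delegate to pandas rather than to the `csv` module.

```python
import csv
import io
from typing import Dict, Optional, Sequence


def columns_to_csv(
    data: Dict[str, Sequence[object]],
    delimiter: str = ",",
    quotechar: str = '"',
    float_format: Optional[str] = "%.6f",
) -> str:
    """Render a column dictionary as CSV text (stand-in, not the mdr source)."""
    lengths = {len(column) for column in data.values()}
    if len(lengths) > 1:
        raise ValueError("all columns must have the same length")

    def fmt(value: object) -> object:
        # Apply the printf-style format to floats only.
        if float_format is not None and isinstance(value, float):
            return float_format % value
        return value

    buffer = io.StringIO()
    writer = csv.writer(
        buffer, delimiter=delimiter, quotechar=quotechar, lineterminator="\n"
    )
    writer.writerow(data.keys())
    # zip(*columns) turns the column-oriented mapping into rows.
    for row in zip(*data.values()):
        writer.writerow([fmt(value) for value in row])
    return buffer.getvalue()


text = columns_to_csv({"time": [0, 1], "temperature": [20.5, 21.3]}, delimiter=";")
quoted = columns_to_csv({"name": ["a,b"]})
print(text)
```

With the default `'%.6f'` format every float is written with six decimal places, which keeps columns aligned but can enlarge files; passing `None` in this sketch leaves floats in their default string form.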

class mdr.io.writers.JSONWriter(encoding='utf-8', overwrite=False)

Bases: FileWriter

Writer for JSON files.

Parameters:

encoding (str)
overwrite (bool)

write(data, destination, orient='columns', date_format='iso', indent=4, **options)

Write data to a JSON file.

Parameters:

data (Dict[str, numpy.ndarray]) – Dictionary mapping column names to data arrays
destination (str) – File path
orient (str) – JSON layout, one of 'columns', 'records', 'index', 'split', 'values'
date_format (str) – Format for date values
indent (int | None) – Number of spaces for indentation (None for no indentation)
**options – Additional pandas.DataFrame.to_json options

Return type:

None
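
The `orient` values match the names used by pandas, so the layouts presumably follow the pandas convention. Under that assumption, the sketch below shows what the 'records' and 'columns' layouts look like for a small column dictionary, built with the standard library only. The helper names `to_records` and `to_columns` are hypothetical.

```python
import json
from typing import Any, Dict, List, Sequence


def to_records(data: Dict[str, Sequence[Any]]) -> List[Dict[str, Any]]:
    """'records' layout: one JSON object per row."""
    names = list(data)
    return [dict(zip(names, row)) for row in zip(*data.values())]


def to_columns(data: Dict[str, Sequence[Any]]) -> Dict[str, Dict[str, Any]]:
    """'columns' layout: column name -> {row label -> value}."""
    return {
        name: {str(i): value for i, value in enumerate(values)}
        for name, values in data.items()
    }


data = {"time": [0, 1], "temperature": [20.5, 21.3]}
print(json.dumps(to_records(data), indent=4))
print(json.dumps(to_columns(data)))
```

The 'records' layout is usually the easier one to consume row by row, while 'columns' keeps each variable together.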

class mdr.io.writers.ExcelWriter(encoding='utf-8', overwrite=False)

Bases: FileWriter

Writer for Excel files.

Parameters:

encoding (str)
overwrite (bool)

write(data, destination, sheet_name='Sheet1', float_format='%.6f', freeze_panes=None, **options)

Write data to an Excel file.

Parameters:

data (Dict[str, numpy.ndarray]) – Dictionary mapping column names to data arrays
destination (str) – File path
sheet_name (str) – Name of the sheet
float_format (str | None) – Format string for float values
freeze_panes (Tuple[int, int] | None) – Tuple of (rows, cols) to freeze
**options – Additional pandas.DataFrame.to_excel options

Return type:

None

class mdr.io.writers.ParquetWriter(encoding='utf-8', overwrite=False)

Bases: FileWriter

Writer for Parquet files.

Parameters:

encoding (str)
overwrite (bool)

write(data, destination, compression='snappy', index=False, **options)

Write data to a Parquet file.

Parameters:

data (Dict[str, numpy.ndarray]) – Dictionary mapping column names to data arrays
destination (str) – File path
compression (str) – Compression method
index (bool) – Whether to include row indices
**options – Additional pandas.DataFrame.to_parquet options

Return type:

None

class mdr.io.writers.HDF5Writer(encoding='utf-8', overwrite=False)

Bases: FileWriter

Writer for HDF5 files.

Parameters:

encoding (str)
overwrite (bool)

write(data, destination, key, mode='a', complevel=9, complib='zlib', **options)

Write data to an HDF5 file.

Parameters:

data (Dict[str, numpy.ndarray]) – Dictionary mapping column names to data arrays
destination (str) – File path
key (str) – Group identifier in the HDF5 file
mode (str) – File open mode ('a' for append, 'w' for write)
complevel (int | None) – Compression level (0-9, 0 for no compression)
complib (str | None) – Compression library
**options – Additional pandas.DataFrame.to_hdf options

Return type:

None

mdr.io.writers.get_writer(file_type, **options)

Get a writer for the specified file type.

Parameters:

file_type (str) – Type of file ('csv', 'json', 'excel', 'parquet', 'hdf5')
**options – Additional options for the writer

Returns:

Appropriate DataWriter instance

Return type:

DataWriter
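
A factory like this is commonly built on a lookup table from type name to writer class. The sketch below shows that pattern with minimal stand-in classes so it runs on its own. The `_WRITERS` registry, the case-insensitive lookup, and the `ValueError` for unknown types are all assumptions, since the documentation does not say how unsupported types are handled.

```python
from typing import Any, Dict, Type


class FileWriter:
    """Minimal stand-in for the documented base class."""

    def __init__(self, encoding: str = "utf-8", overwrite: bool = False) -> None:
        self.encoding = encoding
        self.overwrite = overwrite


class CSVWriter(FileWriter):
    def __init__(self, delimiter: str = ",", quotechar: str = '"', **kwargs: Any) -> None:
        super().__init__(**kwargs)
        self.delimiter = delimiter
        self.quotechar = quotechar


class JSONWriter(FileWriter):
    pass


# Hypothetical registry; only two formats are included to keep the sketch short.
_WRITERS: Dict[str, Type[FileWriter]] = {"csv": CSVWriter, "json": JSONWriter}


def get_writer(file_type: str, **options: Any) -> FileWriter:
    """Return a writer instance for `file_type`, forwarding `options` to it."""
    try:
        writer_cls = _WRITERS[file_type.lower()]
    except KeyError:
        supported = ", ".join(sorted(_WRITERS))
        raise ValueError(
            f"unsupported file type {file_type!r}; expected one of: {supported}"
        ) from None
    return writer_cls(**options)


writer = get_writer("csv", delimiter=";", overwrite=True)
```

Keeping the mapping in one place means a new format only needs a new class and one registry entry.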

mdr.io.writers.write_csv(data, filepath, delimiter=',', float_format='%.6f', **options)

Write data to a CSV file.

Parameters:

data (Dict[str, numpy.ndarray]) – Dictionary mapping column names to data arrays
filepath (str) – Path to the CSV file
delimiter (str) – Field delimiter
float_format (str) – Format string for float values
**options – Additional writing options

Return type:

None

mdr.io.writers.write_json(data, filepath, orient='columns', **options)

Write data to a JSON file.

Parameters:

data (Dict[str, numpy.ndarray]) – Dictionary mapping column names to data arrays
filepath (str) – Path to the JSON file
orient (str) – JSON format
**options – Additional writing options

Return type:

None

mdr.io.writers.write_excel(data, filepath, sheet_name='Sheet1', **options)

Write data to an Excel file.

Parameters:

data (Dict[str, numpy.ndarray]) – Dictionary mapping column names to data arrays
filepath (str) – Path to the Excel file
sheet_name (str) – Name of the sheet
**options – Additional writing options

Return type:

None

mdr.io.writers.write_parquet(data, filepath, compression='snappy', **options)

Write data to a Parquet file.

Parameters:

data (Dict[str, numpy.ndarray]) – Dictionary mapping column names to data arrays
filepath (str) – Path to the Parquet file
compression (str) – Compression method
**options – Additional writing options

Return type:

None

mdr.io.writers.write_hdf5(data, filepath, key, **options)

Write data to an HDF5 file.

Parameters:

data (Dict[str, numpy.ndarray]) – Dictionary mapping column names to data arrays
filepath (str) – Path to the HDF5 file
key (str) – Group identifier in the HDF5 file
**options – Additional writing options

Return type:

None

Overview

The writers module provides writer classes and convenience functions for writing processed data to several file formats. They handle data serialization, formatting, and output to persistent storage.

Supported File Formats

The module supports writing data to the following formats:

CSV: Comma-separated values files
JSON: JavaScript Object Notation files
Excel: Microsoft Excel workbooks (.xlsx)
Parquet: Apache Parquet columnar storage files
HDF5: Hierarchical Data Format version 5 files

Core Functions

The convenience functions write_csv, write_json, write_excel, write_parquet, and write_hdf5 are documented in full in the reference above.

Usage Examples

Writing to a CSV file:

import numpy as np
from mdr.io.writers import write_csv

# Create a dictionary of data variables
data_dict = {
    "time": np.array([0, 1, 2, 3, 4]),
    "temperature": np.array([20.5, 21.3, 22.1, 21.7, 23.0]),
    "pressure": np.array([101.3, 101.4, 101.5, 101.2, 101.1])
}

# Write data to a CSV file
write_csv(data_dict, "path/to/output.csv")

Writing to multiple formats:

from mdr.io.writers import write_csv, write_json, write_excel

# Reuses data_dict from the CSV example above.
# Write to multiple formats for different use cases
write_csv(data_dict, "path/to/output.csv")  # For general use
write_json(data_dict, "path/to/output.json")  # For web applications
write_excel(data_dict, "path/to/output.xlsx")  # For spreadsheet analysis