Data Writers

Data writers for Macrodata Refinement (MDR).

This module provides functions and classes for writing macrodata to various file formats.

class mdr.io.writers.DataDestination(value)

Bases: Enum

Types of data destinations.

FILE = 1
DATABASE = 2
API = 3
MEMORY = 4

class mdr.io.writers.DataWriter(dest_type=DataDestination.FILE)

Bases: ABC

Abstract base class for data writers.

Parameters:

dest_type (DataDestination)

__init__(dest_type=DataDestination.FILE)

Initialize the data writer.

Parameters:

dest_type (DataDestination) – Type of data destination

abstract write(data, destination, **options)

Write data to the destination.

Parameters:
  • data (Dict[str, numpy.ndarray]) – Dictionary mapping variable names to data arrays

  • destination (str) – Destination identifier (file path, table name, etc.)

  • **options – Additional writing options

Return type:

None

abstract validate_destination(destination)

Validate if the destination can be written to.

Parameters:

destination (str) – Destination identifier

Returns:

True if the destination is valid, False otherwise

Return type:

bool
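
The two abstract methods above define the contract every writer has to satisfy. As a minimal sketch of that contract, the block below re-declares an equivalent abstract base locally and implements it with a toy in-memory writer. The names SketchWriter and InMemoryWriter are hypothetical and are not part of mdr; the real DataWriter may carry additional behavior.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Sequence


class SketchWriter(ABC):
    """Hypothetical stand-in for the documented abstract interface."""

    @abstractmethod
    def write(self, data: Dict[str, Sequence[Any]], destination: str, **options: Any) -> None:
        ...

    @abstractmethod
    def validate_destination(self, destination: str) -> bool:
        ...


class InMemoryWriter(SketchWriter):
    """Toy subclass that stores data in a dict keyed by destination name."""

    def __init__(self) -> None:
        self.store: Dict[str, Dict[str, list]] = {}

    def validate_destination(self, destination: str) -> bool:
        # Assumption for this sketch: any non-empty string is a valid key.
        return isinstance(destination, str) and bool(destination)

    def write(self, data, destination, **options):
        if not self.validate_destination(destination):
            raise ValueError(f"invalid destination: {destination!r}")
        # Copy each array-like into a plain list so later mutation is isolated.
        self.store[destination] = {name: list(values) for name, values in data.items()}


writer = InMemoryWriter()
writer.write({"time": [0, 1, 2]}, "run-1")
```

Because both methods are abstract, instantiating SketchWriter directly raises TypeError; only subclasses that implement both can be created.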

class mdr.io.writers.FileWriter(encoding='utf-8', overwrite=False)

Bases: DataWriter

Base class for file-based data writers.

Parameters:
  • encoding (str)

  • overwrite (bool)

__init__(encoding='utf-8', overwrite=False)

Initialize the file writer.

Parameters:
  • encoding (str) – File encoding

  • overwrite (bool) – Whether to overwrite existing files

validate_destination(destination)

Validate if the file can be written to.

Parameters:

destination (str) – File path

Returns:

True if the file is valid, False otherwise

Return type:

bool
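
The reference does not say which checks validate_destination performs. One plausible reading, given the overwrite flag, is sketched below with the standard library only. The helper name can_write_file and both of its rules (the parent directory must exist; an existing file is rejected unless overwrite is set) are assumptions made for illustration, not the documented behavior.

```python
import os
import tempfile


def can_write_file(path: str, overwrite: bool = False) -> bool:
    """Hypothetical check mirroring the FileWriter options described above."""
    parent = os.path.dirname(os.path.abspath(path))
    if not os.path.isdir(parent):
        return False  # nowhere to create the file
    if os.path.exists(path) and not overwrite:
        return False  # refuse to clobber an existing file
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "out.csv")
    fresh = can_write_file(target)  # file does not exist yet
    with open(target, "w", encoding="utf-8") as fh:
        fh.write("x\n")
    blocked = can_write_file(target)  # exists, overwrite=False
    allowed = can_write_file(target, overwrite=True)
    missing_dir = can_write_file(os.path.join(tmp, "nope", "out.csv"))
```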

class mdr.io.writers.CSVWriter(delimiter=',', quotechar='"', encoding='utf-8', overwrite=False)

Bases: FileWriter

Writer for CSV files.

Parameters:
  • delimiter (str)

  • quotechar (str)

  • encoding (str)

  • overwrite (bool)

__init__(delimiter=',', quotechar='"', encoding='utf-8', overwrite=False)

Initialize the CSV writer.

Parameters:
  • delimiter (str) – Field delimiter

  • quotechar (str) – Character for quoting fields

  • encoding (str) – File encoding

  • overwrite (bool) – Whether to overwrite existing files

write(data, destination, index=False, float_format='%.6f', date_format=None, **options)

Write data to a CSV file.

Parameters:
  • data (Dict[str, numpy.ndarray]) – Dictionary mapping column names to data arrays

  • destination (str) – File path

  • index (bool) – Whether to write row indices

  • float_format (str | None) – Format string for float values

  • date_format (str | None) – Format string for date values

  • **options – Additional pandas.DataFrame.to_csv options

Return type:

None
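
To make the effect of delimiter and float_format concrete, here is a small standard-library sketch that renders a dictionary of columns as CSV text. It is an illustration of the parameters, not the CSVWriter implementation; the function name columns_to_csv is hypothetical, and it assumes float_format is a printf-style pattern as the default '%.6f' suggests.

```python
import csv
import io


def columns_to_csv(data, delimiter=",", quotechar='"', float_format="%.6f"):
    """Render a dict of equal-length columns as CSV text (illustrative only)."""
    buffer = io.StringIO()
    writer = csv.writer(
        buffer, delimiter=delimiter, quotechar=quotechar, lineterminator="\n"
    )
    names = list(data)
    writer.writerow(names)  # header row: one field per column name
    for row in zip(*(data[name] for name in names)):
        writer.writerow(
            [
                float_format % value
                if float_format is not None and isinstance(value, float)
                else value
                for value in row
            ]
        )
    return buffer.getvalue()


text = columns_to_csv({"time": [0, 1], "temperature": [20.5, 21.3]})
# text is:
# time,temperature
# 0,20.500000
# 1,21.300000
```

Integers pass through unchanged, while floats are padded to six decimal places by the default pattern.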

class mdr.io.writers.JSONWriter(encoding='utf-8', overwrite=False)

Bases: FileWriter

Writer for JSON files.

Parameters:
  • encoding (str)

  • overwrite (bool)

write(data, destination, orient='columns', date_format='iso', indent=4, **options)

Write data to a JSON file.

Parameters:
  • data (Dict[str, numpy.ndarray]) – Dictionary mapping column names to data arrays

  • destination (str) – File path

  • orient (str) – JSON format, one of ['columns', 'records', 'index', 'split', 'values']

  • date_format (str) – Format for date values

  • indent (int | None) – Number of spaces for indentation (None for no indentation)

  • **options – Additional pandas.DataFrame.to_json options

Return type:

None
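
The orient values are easier to compare side by side. The sketch below arranges a dictionary of columns into each layout using plain Python and the json module. It assumes the values carry the same meaning as in pandas, which the reference to pandas options suggests but does not state outright; the helper name layout is hypothetical.

```python
import json


def layout(data, orient="columns"):
    """Arrange a dict of equal-length columns according to an orient value."""
    names = list(data)
    rows = list(zip(*(data[name] for name in names)))
    index = list(range(len(rows)))  # default 0..n-1 row labels
    if orient == "columns":
        return {name: dict(zip(index, data[name])) for name in names}
    if orient == "records":
        return [dict(zip(names, row)) for row in rows]
    if orient == "index":
        return {i: dict(zip(names, row)) for i, row in zip(index, rows)}
    if orient == "split":
        return {"columns": names, "index": index, "data": [list(r) for r in rows]}
    if orient == "values":
        return [list(row) for row in rows]
    raise ValueError(f"unknown orient: {orient!r}")


sample = {"time": [0, 1], "temperature": [20.5, 21.3]}
records_json = json.dumps(layout(sample, "records"))
columns_json = json.dumps(layout(sample, "columns"))
```

In this sketch 'records' yields one object per row, which suits line-oriented consumers, while 'columns' keeps each variable together under its name.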

class mdr.io.writers.ExcelWriter(encoding='utf-8', overwrite=False)

Bases: FileWriter

Writer for Excel files.

Parameters:
  • encoding (str)

  • overwrite (bool)

write(data, destination, sheet_name='Sheet1', float_format='%.6f', freeze_panes=None, **options)

Write data to an Excel file.

Parameters:
  • data (Dict[str, numpy.ndarray]) – Dictionary mapping column names to data arrays

  • destination (str) – File path

  • sheet_name (str) – Name of the sheet

  • float_format (str | None) – Format string for float values

  • freeze_panes (Tuple[int, int] | None) – Tuple of (rows, cols) to freeze

  • **options – Additional pandas.DataFrame.to_excel options

Return type:

None

class mdr.io.writers.ParquetWriter(encoding='utf-8', overwrite=False)

Bases: FileWriter

Writer for Parquet files.

Parameters:
  • encoding (str)

  • overwrite (bool)

write(data, destination, compression='snappy', index=False, **options)

Write data to a Parquet file.

Parameters:
  • data (Dict[str, numpy.ndarray]) – Dictionary mapping column names to data arrays

  • destination (str) – File path

  • compression (str) – Compression method

  • index (bool) – Whether to include row indices

  • **options – Additional pandas.DataFrame.to_parquet options

Return type:

None

class mdr.io.writers.HDF5Writer(encoding='utf-8', overwrite=False)

Bases: FileWriter

Writer for HDF5 files.

Parameters:
  • encoding (str)

  • overwrite (bool)

write(data, destination, key, mode='a', complevel=9, complib='zlib', **options)

Write data to an HDF5 file.

Parameters:
  • data (Dict[str, numpy.ndarray]) – Dictionary mapping column names to data arrays

  • destination (str) – File path

  • key (str) – Group identifier in the HDF5 file

  • mode (str) – File open mode ('a' for append, 'w' for write)

  • complevel (int | None) – Compression level (0-9, 0 for no compression)

  • complib (str | None) – Compression library

  • **options – Additional pandas.DataFrame.to_hdf options

Return type:

None

mdr.io.writers.get_writer(file_type, **options)

Get a writer for the specified file type.

Parameters:
  • file_type (str) – Type of file ('csv', 'json', 'excel', 'parquet', 'hdf5')

  • **options – Additional options for the writer

Returns:

Appropriate DataWriter instance

Return type:

DataWriter
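
A factory like this is commonly built as a lookup table from type names to classes. The sketch below shows that pattern with stub classes so it runs on its own. Everything in it is hypothetical: the stub classes, the name get_stub_writer, the case-insensitive lookup, and the ValueError for unknown types are assumptions, since the reference does not describe how get_writer handles those cases.

```python
from typing import Any, Callable, Dict


class _StubWriter:
    """Placeholder writer that only records its constructor options."""

    def __init__(self, **options: Any) -> None:
        self.options = options


class StubCSVWriter(_StubWriter):
    pass


class StubJSONWriter(_StubWriter):
    pass


# Hypothetical registry; the documented types also include
# 'excel', 'parquet', and 'hdf5'.
_REGISTRY: Dict[str, Callable[..., _StubWriter]] = {
    "csv": StubCSVWriter,
    "json": StubJSONWriter,
}


def get_stub_writer(file_type: str, **options: Any) -> _StubWriter:
    """Look up a writer class by type name and instantiate it with the options."""
    try:
        writer_cls = _REGISTRY[file_type.lower()]
    except KeyError:
        raise ValueError(f"unsupported file type: {file_type!r}") from None
    return writer_cls(**options)


csv_writer = get_stub_writer("csv", delimiter=";")
```

Keeping the mapping in one dictionary means a new format only needs a new entry rather than another branch in the factory.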

mdr.io.writers.write_csv(data, filepath, delimiter=',', float_format='%.6f', **options)

Write data to a CSV file.

Parameters:
  • data (Dict[str, numpy.ndarray]) – Dictionary mapping column names to data arrays

  • filepath (str) – Path to the CSV file

  • delimiter (str) – Field delimiter

  • float_format (str) – Format string for float values

  • **options – Additional writing options

Return type:

None

mdr.io.writers.write_json(data, filepath, orient='columns', **options)

Write data to a JSON file.

Parameters:
  • data (Dict[str, numpy.ndarray]) – Dictionary mapping column names to data arrays

  • filepath (str) – Path to the JSON file

  • orient (str) – JSON format

  • **options – Additional writing options

Return type:

None

mdr.io.writers.write_excel(data, filepath, sheet_name='Sheet1', **options)

Write data to an Excel file.

Parameters:
  • data (Dict[str, numpy.ndarray]) – Dictionary mapping column names to data arrays

  • filepath (str) – Path to the Excel file

  • sheet_name (str) – Name of the sheet

  • **options – Additional writing options

Return type:

None

mdr.io.writers.write_parquet(data, filepath, compression='snappy', **options)

Write data to a Parquet file.

Parameters:
  • data (Dict[str, numpy.ndarray]) – Dictionary mapping column names to data arrays

  • filepath (str) – Path to the Parquet file

  • compression (str) – Compression method

  • **options – Additional writing options

Return type:

None

mdr.io.writers.write_hdf5(data, filepath, key, **options)

Write data to an HDF5 file.

Parameters:
  • data (Dict[str, numpy.ndarray]) – Dictionary mapping column names to data arrays

  • filepath (str) – Path to the HDF5 file

  • key (str) – Group identifier in the HDF5 file

  • **options – Additional writing options

Return type:

None

Overview

The writers module provides functions and classes for writing processed data to various file formats. They handle serialization, formatting, and output to persistent storage.

Supported File Formats

The module supports writing data to the following formats:

  • CSV: Comma-separated values files

  • JSON: JavaScript Object Notation files

  • Excel: Microsoft Excel workbooks (.xlsx)

  • Parquet: Apache Parquet columnar storage files

  • HDF5: Hierarchical Data Format version 5 files
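
When the output format should follow the file name, a small helper can map extensions onto the file_type strings that get_writer accepts. The sketch below is one hedged way to do that. The helper infer_file_type and the extension table are hypothetical; only '.xlsx' is named in the list above, and the other extensions are common conventions rather than anything the module documents.

```python
from pathlib import Path

# Hypothetical mapping from file extension to the documented file_type strings.
EXTENSION_TO_TYPE = {
    ".csv": "csv",
    ".json": "json",
    ".xlsx": "excel",
    ".parquet": "parquet",
    ".h5": "hdf5",
    ".hdf5": "hdf5",
}


def infer_file_type(filepath: str) -> str:
    """Return the file_type string for a path, based on its extension."""
    suffix = Path(filepath).suffix.lower()
    if suffix not in EXTENSION_TO_TYPE:
        raise ValueError(f"no writer known for extension {suffix!r}")
    return EXTENSION_TO_TYPE[suffix]


inferred = [infer_file_type(p) for p in ("out.csv", "report.XLSX", "store.h5")]
```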

Usage Examples

Writing to a CSV file:

import numpy as np
from mdr.io.writers import write_csv

# Create a dictionary of data variables
data_dict = {
    "time": np.array([0, 1, 2, 3, 4]),
    "temperature": np.array([20.5, 21.3, 22.1, 21.7, 23.0]),
    "pressure": np.array([101.3, 101.4, 101.5, 101.2, 101.1])
}

# Write data to a CSV file
write_csv(data_dict, "path/to/output.csv")

Writing to multiple formats:

from mdr.io.writers import write_csv, write_json, write_excel

# Reuses data_dict from the CSV example above.
# Write to multiple formats for different use cases
write_csv(data_dict, "path/to/output.csv")  # For general use
write_json(data_dict, "path/to/output.json")  # For web applications
write_excel(data_dict, "path/to/output.xlsx")  # For spreadsheet analysis