h2integrate.core.file_utils
Functions
check_file_format_for_csv_generator | Check csv file format for the csv file used for the CSVGenerator generator. |
find_file | Find a filepath matching filename by searching several locations in a fixed order. |
get_path | Convert a string or Path object to an absolute Path object, prioritizing different locations. |
load_yaml | |
make_unique_case_name | Generate a filename that does not already exist in a user-defined folder. |
write_readable_yaml | Writes a dictionary to a YAML file using the yaml library. |
write_yaml | Writes a dictionary to a YAML file using the ruamel.yaml library. |
Classes
Loader |
Exceptions
DuplicateKeyError | Exception raised when a duplicate YAML key is found. |
- h2integrate.core.file_utils.get_path(path)
Convert a string or Path object to an absolute Path object, prioritizing different locations.
This function checks whether the path exists, trying the following locations in order:
1. As an absolute path.
2. Relative to the current working directory.
3. Relative to the H2Integrate package.
- Parameters:
path (str | Path) – The input path, either as a string or a Path object.
- Raises:
FileNotFoundError – If the path is not found in any of the locations.
- Returns:
Path – The absolute path to the file.
- Return type:
Path
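The lookup order described for get_path can be illustrated with a minimal standard-library sketch. This is not the library's implementation: `resolve_path_sketch` and its `package_root` argument are hypothetical names, with `package_root` standing in for the installed H2Integrate package directory, and the sketch assumes that an absolute path is checked only as given.

```python
import tempfile
from pathlib import Path


def resolve_path_sketch(path, package_root):
    """Illustrative stand-in for get_path (not the library's code).

    package_root is a hypothetical argument standing in for the installed
    H2Integrate package directory.
    """
    path = Path(path)
    if path.is_absolute():
        # 1. An absolute path is checked only as given (an assumption).
        candidates = [path]
    else:
        # 2. The current working directory, then 3. the package directory.
        candidates = [Path.cwd() / path, Path(package_root) / path]
    for candidate in candidates:
        if candidate.exists():
            return candidate.resolve()
    raise FileNotFoundError(f"{path} was not found in any searched location")


# Demo: a file that exists only under the assumed package root.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "sketch_only_inputs" / "plant.yaml"
    target.parent.mkdir()
    target.touch()
    found = resolve_path_sketch("sketch_only_inputs/plant.yaml", package_root=tmp)
    print(found == target.resolve())
```

Returning the first existing candidate means a file in the working directory shadows a same-named file shipped with the package, which is the practical consequence of the documented priority.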
- h2integrate.core.file_utils.find_file(filename, root_folder=None)
This function attempts to find a filepath matching filename from a variety of locations in the following order:
Relative to the root_folder (if provided)
Relative to the current working directory.
Relative to the H2Integrate package.
As an absolute path, if filename is already absolute.
- Parameters:
filename (str | Path) – Input filepath
root_folder (str | Path, optional) – Root directory to search for filename in. Defaults to None.
- Raises:
FileNotFoundError – If the path is not found in any of the locations.
- Returns:
Path – The absolute path to the file.
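The search order for find_file can be sketched in the same spirit. Again, this is an illustration under stated assumptions rather than the library's code: `find_file_sketch` and `package_root` are hypothetical names, and `package_root` stands in for the H2Integrate package directory.

```python
import tempfile
from pathlib import Path


def find_file_sketch(filename, root_folder=None, package_root=None):
    """Illustrative stand-in for find_file (not the library's code)."""
    filename = Path(filename)
    search_dirs = []
    if root_folder is not None:
        search_dirs.append(Path(root_folder))    # 1. root_folder, if provided
    search_dirs.append(Path.cwd())               # 2. current working directory
    if package_root is not None:
        search_dirs.append(Path(package_root))   # 3. assumed package directory
    if not filename.is_absolute():
        for directory in search_dirs:
            candidate = directory / filename
            if candidate.exists():
                return candidate.resolve()
    elif filename.exists():
        return filename.resolve()                # 4. already-absolute filename
    raise FileNotFoundError(f"{filename} was not found in any searched location")


# Demo: the same file name exists in two folders; root_folder takes priority.
with tempfile.TemporaryDirectory() as root, tempfile.TemporaryDirectory() as pkg:
    name = "sketch_only_case.yaml"
    (Path(root) / name).touch()
    (Path(pkg) / name).touch()
    hit = find_file_sketch(name, root_folder=root, package_root=pkg)
    print(hit.parent == Path(root).resolve())
```

Passing root_folder is the way to make a lookup deterministic when the same file name may also exist in the working directory or in the package.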
- exception h2integrate.core.file_utils.DuplicateKeyError(message)
Exception raised when a duplicate YAML key is found.
- Parameters:
message (str) – The duplicate key error message to be displayed.
- class h2integrate.core.file_utils.Loader(stream)
- include(node)
- compose_node(parent, index)
Custom implementation that records line numbers counting every line, including blank lines, so that they match the 1-based numbering users expect.
- construct_mapping(node, deep=False)
Hooks into the yaml.SafeLoader.construct_mapping routine to create line number mappings for all keys and values, which enables duplicate key error handling.
Two copies of node are created to avoid errors when run through the validation schema, as the __line__{key} and __line__ keys in the key and value nodes are not represented by the schema and would therefore raise an error during validation.
- check_duplicate_keys(numbered_node, node, deep=False)
Raises an error for duplicate keys and calls the SafeLoader.construct_mapping() routine to create the final dictionary mappings.
- yaml_constructors = {'!include': <function Loader.include>, 'tag:yaml.org,2002:binary': <function SafeConstructor.construct_yaml_binary>, 'tag:yaml.org,2002:bool': <function SafeConstructor.construct_yaml_bool>, 'tag:yaml.org,2002:float': <function SafeConstructor.construct_yaml_float>, 'tag:yaml.org,2002:int': <function SafeConstructor.construct_yaml_int>, 'tag:yaml.org,2002:map': <function SafeConstructor.construct_yaml_map>, 'tag:yaml.org,2002:null': <function SafeConstructor.construct_yaml_null>, 'tag:yaml.org,2002:omap': <function SafeConstructor.construct_yaml_omap>, 'tag:yaml.org,2002:pairs': <function SafeConstructor.construct_yaml_pairs>, 'tag:yaml.org,2002:seq': <function SafeConstructor.construct_yaml_seq>, 'tag:yaml.org,2002:set': <function SafeConstructor.construct_yaml_set>, 'tag:yaml.org,2002:str': <function SafeConstructor.construct_yaml_str>, 'tag:yaml.org,2002:timestamp': <function SafeConstructor.construct_yaml_timestamp>, None: <function SafeConstructor.construct_undefined>}
- h2integrate.core.file_utils.load_yaml(filename, loader=<class 'h2integrate.core.file_utils.Loader'>)
- Return type:
dict
- h2integrate.core.file_utils.write_yaml(instance, foutput, convert_np=True, check_formatting=False)
Writes a dictionary to a YAML file using the ruamel.yaml library.
- Parameters:
instance (dict) – Dictionary to be written to the YAML file.
foutput (str) – Path to the output YAML file.
convert_np (bool) – Whether to convert numpy objects to simple types. Defaults to True.
check_formatting (bool) – Whether to check formatting to convert numpy arrays to lists. Defaults to False.
- Returns:
None
- Return type:
None
- h2integrate.core.file_utils.write_readable_yaml(instance, foutput)
Writes a dictionary to a YAML file using the yaml library.
- Parameters:
instance (dict) – Dictionary to be written to the YAML file.
foutput (str | Path) – Path to the output YAML file.
- Returns:
None
- h2integrate.core.file_utils.make_unique_case_name(folder, proposed_fname, fext)
Generate a filename that does not already exist in a user-defined folder.
- Parameters:
folder (str | Path) – directory in which the file is expected to be created.
proposed_fname (str) – filename (with extension) to check for existence and to use as the base of a new, unique filename.
fext (str) – file extension, such as “.csv”, “.sql”, “.yaml”, etc.
- Returns:
str – unique filename that does not yet exist in folder.
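One way such a helper could work is sketched below using only the standard library. The numbering scheme (appending `_1`, `_2`, ... before the extension) is an assumption made for illustration, and `unique_name_sketch` is a hypothetical name; the documentation above does not state how make_unique_case_name actually builds the new name.

```python
import tempfile
from pathlib import Path


def unique_name_sketch(folder, proposed_fname, fext):
    """Illustrative stand-in for make_unique_case_name (not the library's code)."""
    folder = Path(folder)
    # Keep the proposed name when nothing in the folder clashes with it.
    if not (folder / proposed_fname).exists():
        return proposed_fname
    # Otherwise strip the extension and append a counter until the name is free.
    base = proposed_fname.removesuffix(fext)  # requires Python 3.9+
    counter = 1
    while (folder / f"{base}_{counter}{fext}").exists():
        counter += 1
    return f"{base}_{counter}{fext}"


# Demo: request the same name repeatedly while creating each returned file.
with tempfile.TemporaryDirectory() as tmp:
    names = []
    for _ in range(3):
        name = unique_name_sketch(tmp, "cases.csv", ".csv")
        (Path(tmp) / name).touch()
        names.append(name)
    print(names)
```

Like the documented function, the sketch returns only a filename string, so the caller still joins it with folder before writing.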
- h2integrate.core.file_utils.check_file_format_for_csv_generator(csv_fpath, driver_config, check_only=True, overwrite_file=False)
Check csv file format for the csv file used for the CSVGenerator generator.
Note
Future development could include further checking the values within the rows of the csv file and more rigorous checking of columns with empty headers.
- Parameters:
csv_fpath (str | Path) – filepath to csv file used for ‘csvgen’ generator.
driver_config (dict) – driver configuration dictionary
check_only (bool, optional) – If True, only check if file is error-free and return a boolean. If False, also create a valid csv file if errors are found in the original csv file. Defaults to True.
overwrite_file (bool, optional) – If True, overwrites the input csv file with possible errors removed. If False, writes a new csv file with a unique name. Only used if check_only is False. Defaults to False.
- Raises:
ValueError – If there are errors in the csv file beyond general formatting errors.
- Returns:
bool | Path – A boolean if check_only is True, or a Path object if check_only is False. If check_only is True, returns True if the file appears error-free or False if errors are found. If check_only is False, returns the filepath of the new csv file, which should not have errors.