
read_csv_folder

Hassan Syyid edited this page Feb 11, 2021 · 1 revision

Convenience method that reads a set of CSV files in a folder, based on pandas' read_csv(). It assumes the files were pulled from a stream and follow the naming convention specified by Singer, in which the stream (entity or table) name is the first word of the file name. For example, Account-20200811T121507.csv holds the data for an entity called Account.
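The naming convention can be illustrated with a small helper. This is only a sketch: entity_name_from_filename is a hypothetical name, not part of the library, and it assumes the entity name is everything before the first hyphen, as in the Account example above.

```python
from pathlib import Path


def entity_name_from_filename(filename: str) -> str:
    """Return the entity name encoded in a Singer-style CSV file name.

    Hypothetical helper: assumes the entity name is the text before the
    first hyphen, e.g. "Account" in "Account-20200811T121507.csv".
    """
    stem = Path(filename).stem      # drop the directory and the ".csv" suffix
    return stem.split("-", 1)[0]    # keep only the part before the first hyphen


print(entity_name_from_filename("Account-20200811T121507.csv"))  # Account
```

A file name without a hyphen would be returned whole (minus the extension) under this assumption.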

Definition

read_csv_folder(path, converters={}, index_cols={})
    :param path: path to the folder containing the CSV files
    :param converters: a dictionary keyed by entity name; each value holds the
    converters that are passed to read_csv for that entity.
    :param index_cols: a dictionary keyed by entity name; each value holds the
    index column(s) for that entity.
    :return: a dict of pandas.DataFrames, the keys of which are the entity names
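To make the parameters concrete, here is a rough sketch of how a function with this signature could behave. It is not the library's actual implementation: read_csv_folder_sketch is a hypothetical name, the hyphen-based file name parsing is an assumption taken from the naming example, and the handling of several files for one entity (the last file read wins) is a simplification.

```python
import os

import pandas as pd


def read_csv_folder_sketch(path, converters=None, index_cols=None):
    """Hypothetical sketch: read every CSV in `path` into a dict of DataFrames."""
    converters = converters or {}
    index_cols = index_cols or {}
    results = {}
    for filename in sorted(os.listdir(path)):
        if not filename.endswith(".csv"):
            continue  # ignore anything that is not a CSV file
        # Assumed convention: the entity name precedes the first hyphen.
        entity = os.path.splitext(filename)[0].split("-", 1)[0]
        results[entity] = pd.read_csv(
            os.path.join(path, filename),
            index_col=index_cols.get(entity),      # None if no index given
            converters=converters.get(entity),     # None if no converters given
        )
    return results
```

Entities with no entry in converters or index_cols are read with the pandas defaults, which is why the example below can still access entity_data['Account'] without configuring it.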

Example

import ast

# CSV_FOLDER_PATH is the folder that holds the CSV files
entity_data = read_csv_folder(CSV_FOLDER_PATH, index_cols={'Invoice': 'DocNumber'},
                              converters={'Invoice': {'Line': ast.literal_eval, 'CustomField': ast.literal_eval,
                                                      'Categories': ast.literal_eval}})
# The result is keyed by entity name
df = entity_data['Account']