read_csv_folder
Hassan Syyid edited this page Feb 11, 2021 · 1 revision
Convenience method to read a set of CSV files in a folder, based on pandas' read_csv(). This method assumes that the files are being pulled in a stream and follow a naming convention in which the stream / entity / table name is the first word of the file name (the format specified by Singer). For example, Account-20200811T121507.csv is for an entity called Account.
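As a rough illustration of the naming convention, the entity name can be recovered by taking everything before the first hyphen in the file name. The helper name `entity_name_from_file` is hypothetical and not part of the library, and the sketch assumes the hyphen separator seen in the example above; an entity name that itself contains a hyphen would be truncated.

```python
import os


def entity_name_from_file(file_name):
    """Hypothetical helper: return the first word of a stream-style file name."""
    # Drop any leading directories and the .csv extension.
    stem, _extension = os.path.splitext(os.path.basename(file_name))
    # Assumption: the entity name ends at the first hyphen.
    return stem.split('-', 1)[0]


print(entity_name_from_file('Account-20200811T121507.csv'))  # Account
```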
read_csv_folder(path, converters={}, index_cols={})
:param path: the directory containing the CSV files
:param converters: a dictionary, keyed by entity name, whose values are the converters
passed to read_csv for that entity.
:param index_cols: a dictionary, keyed by entity name, whose values are the
index column(s) to use for that entity.
:return: a dict of pandas.DataFrames, keyed by entity name
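To make the parameters concrete, here is a minimal sketch of how a function with this contract could be written on top of pandas. It is not the library's actual implementation: the name `read_csv_folder_sketch` is hypothetical, the entity name is assumed to end at the first hyphen, and when several files map to the same entity the sketch simply keeps the last one in sorted order, which is also an assumption.

```python
import os

import pandas as pd


def read_csv_folder_sketch(path, converters=None, index_cols=None):
    """Hypothetical sketch: read every CSV in `path` into a dict of DataFrames."""
    converters = converters or {}
    index_cols = index_cols or {}
    results = {}
    for file_name in sorted(os.listdir(path)):
        if not file_name.lower().endswith('.csv'):
            continue  # skip anything that is not a CSV file
        # Assumption: the entity name is the part before the first hyphen.
        stem = os.path.splitext(file_name)[0]
        entity = stem.split('-', 1)[0]
        # Look up the per-entity options; None means "use the pandas defaults".
        results[entity] = pd.read_csv(
            os.path.join(path, file_name),
            index_col=index_cols.get(entity),
            converters=converters.get(entity),
        )
    return results
```

Entities with no entry in `converters` or `index_cols` fall back to the default behaviour of `pd.read_csv`, which is why both dictionaries can be left empty.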
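import ast

# Index Invoice rows by DocNumber and parse its stringified list/dict columns
entity_data = read_csv_folder(CSV_FOLDER_PATH, index_cols={'Invoice': 'DocNumber'},
    converters={'Invoice': {'Line': ast.literal_eval, 'CustomField': ast.literal_eval,
                            'Categories': ast.literal_eval}})
# Each entity is available under its name; Account uses the read_csv defaults
df = entity_data['Account']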
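The converters in the example above matter because nested values are stored in a CSV cell as plain text. The following sketch shows the effect of `ast.literal_eval` as a pandas converter on a single in-memory file; the column names follow the example, while the row values are made up purely for illustration.

```python
import ast
import io

import pandas as pd

# Illustrative data only: the Line column holds a Python-literal list of dicts.
csv_text = '''DocNumber,Line
1001,"[{'Amount': 10.0}, {'Amount': 5.5}]"
1002,"[]"
'''

invoices = pd.read_csv(
    io.StringIO(csv_text),
    index_col='DocNumber',
    converters={'Line': ast.literal_eval},  # parse each cell into a list
)

first_line_items = invoices.loc[1001, 'Line']
print(type(first_line_items).__name__)  # list
```

Without the converter, the same cell would be read as a string and would need to be parsed before its items could be accessed.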