maad.util.date_parser

maad.util.date_parser(datadir, dateformat='%Y%m%d_%H%M%S', extension='.wav', prefix='', verbose=False)

Extracts dates from filenames in a given folder and subfolders.

Parameters:
datadir : str

Path to the folder to search for files.

dateformat : str, optional

Format string specifying the datetime pattern to extract from each filename. The default is ‘%Y%m%d_%H%M%S’. For more information about the format codes, refer to the strftime format documentation.

extension : str, optional

File extension to filter files by (e.g., ‘.wav’, ‘.mp3’). The default is ‘.wav’.

prefix : str, optional

Prefix of the filenames to match. The default is ‘’.

verbose : bool, optional

If True, print the filenames as they are processed. The default is False.

Returns:
pandas.DataFrame

DataFrame containing the extracted dates as the index ‘Date’, and the full file paths in a ‘file’ column.

Raises:
ValueError

If dateformat is invalid or does not match the filenames.
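
For reference, the standard library's datetime.strptime raises ValueError in both situations listed above: when a format code is unknown, and when the text does not fit the format. Whether date_parser relies on strptime internally is an assumption; the sketch below only illustrates the standard-library behavior, and parse_stamp is a hypothetical helper, not part of maad.

```python
from datetime import datetime


def parse_stamp(text, dateformat="%Y%m%d_%H%M%S"):
    """Parse `text` with `dateformat`; return None instead of raising.

    Hypothetical helper for illustration only (not part of maad).
    """
    try:
        return datetime.strptime(text, dateformat)
    except ValueError:
        # Raised for an unknown format code or for text that does not
        # match the requested pattern.
        return None
```

With the default format, parse_stamp("20190522_001500") returns a datetime for 22 May 2019 at 00:15:00, while parse_stamp("2019-05-22") returns None because the text does not match the pattern.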

Notes

This function searches the specified folder and its subfolders for files that have the given extension and whose names start with the given prefix. It extracts the date from each filename using the provided dateformat.

The extracted dates are set as the index of the resulting DataFrame. The ‘file’ column contains the full file paths.
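
The search-and-extract steps described in these notes can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the maad implementation: find_dated_files and format_to_regex are hypothetical names, only the six format codes used by the default dateformat are supported, and the result is a sorted list of (datetime, path) pairs rather than a pandas.DataFrame.

```python
import re
from datetime import datetime
from pathlib import Path

# Regex fragments for the few strftime codes this sketch supports (assumption:
# only the codes that appear in the default dateformat are needed).
_CODE_PATTERNS = {
    "%Y": r"\d{4}", "%m": r"\d{2}", "%d": r"\d{2}",
    "%H": r"\d{2}", "%M": r"\d{2}", "%S": r"\d{2}",
}


def format_to_regex(dateformat):
    """Translate a strftime-style format into a regex matching the same text."""
    pieces = []
    for part in re.split(r"(%[A-Za-z])", dateformat):
        if part in _CODE_PATTERNS:
            pieces.append(_CODE_PATTERNS[part])
        elif len(part) == 2 and part.startswith("%"):
            raise ValueError(f"unsupported format code: {part}")
        else:
            pieces.append(re.escape(part))  # literal text such as "_"
    return re.compile("".join(pieces))


def find_dated_files(datadir, dateformat="%Y%m%d_%H%M%S",
                     extension=".wav", prefix=""):
    """Return sorted (datetime, path) pairs for matching files under datadir."""
    pattern = format_to_regex(dateformat)
    rows = []
    # rglob walks the folder and all of its subfolders.
    for path in Path(datadir).rglob(f"{prefix}*{extension}"):
        if not path.is_file():
            continue
        match = pattern.search(path.stem)
        if match is None:
            raise ValueError(f"no date matching {dateformat!r} in {path.name}")
        rows.append((datetime.strptime(match.group(0), dateformat), str(path)))
    return sorted(rows)
```

Locating the timestamp with a regex first lets the sketch ignore surrounding text such as a recorder serial number before handing the matched substring to strptime. The resulting pairs could then be loaded into a DataFrame indexed by date, which is the shape date_parser returns.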

Examples

>>> import maad.util
>>> folder_path = '../../data/indices/'
>>> ext = '.wav'
>>> datetime_format = '%Y%m%d_%H%M%S'
>>> df = maad.util.date_parser(datadir=folder_path, dateformat=datetime_format, extension=ext)
>>> df
                                                                file
Date        
2019-05-22 00:00:00 ../../data/indices/S4A03895_20190522_000000.wav
2019-05-22 00:15:00 ../../data/indices/S4A03895_20190522_001500.wav
2019-05-22 00:30:00 ../../data/indices/S4A03895_20190522_003000.wav
2019-05-22 00:45:00 ../../data/indices/S4A03895_20190522_004500.wav
2019-05-22 01:00:00 ../../data/indices/S4A03895_20190522_010000.wav
                ...                                             ...
2019-05-22 22:45:00 ../../data/indices/S4A03895_20190522_224500.wav
2019-05-22 23:00:00 ../../data/indices/S4A03895_20190522_230000.wav
2019-05-22 23:15:00 ../../data/indices/S4A03895_20190522_231500.wav
2019-05-22 23:30:00 ../../data/indices/S4A03895_20190522_233000.wav
2019-05-22 23:45:00 ../../data/indices/S4A03895_20190522_234500.wav
>>> df = maad.util.date_parser("../../data/indices/", dateformat='SM4', verbose=False)
>>> list(df)
['file']
>>> df
                                                                file
Date        
2019-05-22 00:00:00 ../../data/indices/S4A03895_20190522_000000.wav
2019-05-22 00:15:00 ../../data/indices/S4A03895_20190522_001500.wav
2019-05-22 00:30:00 ../../data/indices/S4A03895_20190522_003000.wav
2019-05-22 00:45:00 ../../data/indices/S4A03895_20190522_004500.wav
2019-05-22 01:00:00 ../../data/indices/S4A03895_20190522_010000.wav
                ...                                             ...
2019-05-22 22:45:00 ../../data/indices/S4A03895_20190522_224500.wav
2019-05-22 23:00:00 ../../data/indices/S4A03895_20190522_230000.wav
2019-05-22 23:15:00 ../../data/indices/S4A03895_20190522_231500.wav
2019-05-22 23:30:00 ../../data/indices/S4A03895_20190522_233000.wav
2019-05-22 23:45:00 ../../data/indices/S4A03895_20190522_234500.wav