maad.util.date_parser
- maad.util.date_parser(datadir, dateformat='%Y%m%d_%H%M%S', extension='.wav', prefix='', verbose=False)
Extracts dates from filenames in a given folder and subfolders.
- Parameters:
- datadir : str
Path to the folder to search for files.
- dateformat : str, optional
Format string specifying the datetime pattern to extract from each filename. The default is ‘%Y%m%d_%H%M%S’. For more information about the format codes, refer to the strftime format documentation.
- extension : str, optional
File extension to filter files by (e.g., ‘.wav’, ‘.mp3’). The default is ‘.wav’.
- prefix : str, optional
Prefix of the filenames to match. The default is ‘’.
- verbose : bool, optional
If True, print the filenames as they are processed. The default is False.
- Returns:
- pandas.DataFrame
DataFrame containing the extracted dates as the index ‘Date’, and the full file paths in a ‘file’ column.
- Raises:
- ValueError
If dateformat is invalid or does not match the filenames.
Notes
This function searches the specified folder and its subfolders for files that have the given extension and whose names start with the given prefix. It extracts the date from each filename using the pattern given in dateformat.
The extracted dates are set as the index of the resulting DataFrame. The ‘file’ column contains the full file paths.
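The search-and-extract behaviour described above can be illustrated with a small standalone sketch. This is not the scikit-maad implementation: the names `CODE_TO_REGEX`, `format_to_regex` and `find_dated_files` are hypothetical, only six strftime codes are handled, and the result is a plain list of (datetime, path) pairs standing in for the ‘Date’ index and ‘file’ column.

```python
import re
from datetime import datetime
from pathlib import Path

# Hypothetical mapping from a few strftime codes to regex fragments.
# Any other code is treated as literal text, so it will not match.
CODE_TO_REGEX = {
    "%Y": r"\d{4}", "%m": r"\d{2}", "%d": r"\d{2}",
    "%H": r"\d{2}", "%M": r"\d{2}", "%S": r"\d{2}",
}


def format_to_regex(dateformat):
    """Build a regex that locates a strftime-style pattern inside a filename."""
    pattern = ""
    i = 0
    while i < len(dateformat):
        code = dateformat[i:i + 2]
        if code in CODE_TO_REGEX:
            pattern += CODE_TO_REGEX[code]
            i += 2
        else:
            pattern += re.escape(dateformat[i])
            i += 1
    return re.compile(pattern)


def find_dated_files(datadir, dateformat="%Y%m%d_%H%M%S",
                     extension=".wav", prefix=""):
    """Return (datetime, path) pairs, sorted by date, for files under datadir."""
    regex = format_to_regex(dateformat)
    pairs = []
    # rglob walks the folder and all of its subfolders.
    for path in Path(datadir).rglob(f"{prefix}*{extension}"):
        if not path.is_file():
            continue
        match = regex.search(path.name)
        if match is None:
            raise ValueError(
                f"no date matching {dateformat!r} found in {path.name!r}"
            )
        pairs.append((datetime.strptime(match.group(0), dateformat), str(path)))
    return sorted(pairs)
```

In this sketch a filename without a recognisable date raises ValueError, mirroring the documented error condition; whether the library fails on the first such file or handles it differently is not stated here.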
Examples
>>> folder_path = '../../data/indices/'
>>> ext = '.wav'
>>> datetime_format = '%Y%m%d_%H%M%S'
>>> df = maad.util.date_parser(datadir=folder_path, dateformat=datetime_format, extension=ext)
>>> df
                                                                file
Date
2019-05-22 00:00:00  ../../data/indices/S4A03895_20190522_000000.wav
2019-05-22 00:15:00  ../../data/indices/S4A03895_20190522_001500.wav
2019-05-22 00:30:00  ../../data/indices/S4A03895_20190522_003000.wav
2019-05-22 00:45:00  ../../data/indices/S4A03895_20190522_004500.wav
2019-05-22 01:00:00  ../../data/indices/S4A03895_20190522_010000.wav
...                                                              ...
2019-05-22 22:45:00  ../../data/indices/S4A03895_20190522_224500.wav
2019-05-22 23:00:00  ../../data/indices/S4A03895_20190522_230000.wav
2019-05-22 23:15:00  ../../data/indices/S4A03895_20190522_231500.wav
2019-05-22 23:30:00  ../../data/indices/S4A03895_20190522_233000.wav
2019-05-22 23:45:00  ../../data/indices/S4A03895_20190522_234500.wav
>>> df = maad.util.date_parser("../../data/indices/", dateformat='SM4', verbose=False)
>>> list(df)
>>> df
                                                                file
Date
2019-05-22 00:00:00  ../../data/indices/S4A03895_20190522_000000.wav
2019-05-22 00:15:00  ../../data/indices/S4A03895_20190522_001500.wav
2019-05-22 00:30:00  ../../data/indices/S4A03895_20190522_003000.wav
2019-05-22 00:45:00  ../../data/indices/S4A03895_20190522_004500.wav
2019-05-22 01:00:00  ../../data/indices/S4A03895_20190522_010000.wav
...                                                              ...
2019-05-22 22:45:00  ../../data/indices/S4A03895_20190522_224500.wav
2019-05-22 23:00:00  ../../data/indices/S4A03895_20190522_230000.wav
2019-05-22 23:15:00  ../../data/indices/S4A03895_20190522_231500.wav
2019-05-22 23:30:00  ../../data/indices/S4A03895_20190522_233000.wav
2019-05-22 23:45:00  ../../data/indices/S4A03895_20190522_234500.wav
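Because the returned DataFrame is indexed by date, the usual pandas time-based selections should apply to it. The sketch below assumes the index is a pandas DatetimeIndex named ‘Date’ and builds a small stand-in DataFrame by hand from the first five paths of the example output, so it runs without the audio files; the variable names are illustrative only.

```python
import pandas as pd

# Stand-in for the result of date_parser, built by hand (assumption:
# a DatetimeIndex named 'Date' and a single 'file' column).
stamps = ["20190522_000000", "20190522_001500", "20190522_003000",
          "20190522_004500", "20190522_010000"]
files = [f"../../data/indices/S4A03895_{s}.wav" for s in stamps]
dates = pd.to_datetime(stamps, format="%Y%m%d_%H%M%S")
df = pd.DataFrame({"file": files}, index=pd.DatetimeIndex(dates, name="Date"))

# Select recordings within a datetime range (both ends inclusive).
first_half_hour = df.loc["2019-05-22 00:00":"2019-05-22 00:30"]

# Select recordings by time of day, regardless of the date.
quarter_to = df.between_time("00:15", "00:45")

# Count recordings per hour of the day.
per_hour = df.groupby(df.index.hour).size()
```

Selections like these make it straightforward to pass a subset of file paths, for example `first_half_hour["file"]`, to later processing steps.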