herbie.fast.FastHerbie
- class herbie.fast.FastHerbie(DATES: datetime | Timestamp | str | list[datetime | Timestamp | str], fxx: int | list[int] = [0], *, max_threads: int = 50, **kwargs)
Create many Herbie objects quickly.
- __init__(DATES: datetime | Timestamp | str | list[datetime | Timestamp | str], fxx: int | list[int] = [0], *, max_threads: int = 50, **kwargs)
Create many Herbie objects with methods to download or read with xarray.
Uses multithreading.
Note
Currently, Herbie objects are looped by run datetime (date) and forecast lead time (fxx).
- Parameters:
DATES (pandas-parsable datetime string or list of datetimes)
fxx (int or list of forecast lead times)
max_threads (int) – Maximum number of threads to use.
kwargs – Remaining keywords for Herbie object (e.g., model, product, priority, verbose, etc.)
Benchmark
---------
Creating 48 Herbie objects:
1 thread took 16 s
2 threads took 8 s
5 threads took 3.3 s
10 threads took 1.7 s
50 threads took 0.5 s
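As a minimal sketch of the constructor call described above, the following builds the DATES and fxx arguments with the standard library and passes them to FastHerbie. The helper names build_request and make_fast_herbie are hypothetical, and the model="hrrr" and product="sfc" values are only example keywords forwarded through **kwargs. The object count assumes one Herbie object per (run datetime, lead time) pair, which is how the note above reads.

```python
from datetime import datetime, timedelta


def build_request(start: datetime, n_runs: int, lead_hours: list[int]):
    """Return (DATES, fxx) for n_runs consecutive hourly model runs."""
    dates = [start + timedelta(hours=i) for i in range(n_runs)]
    return dates, list(lead_hours)


def make_fast_herbie(dates, fxx, max_threads: int = 10):
    """Hypothetical wrapper; needs the herbie package and network access."""
    # Imported lazily so build_request() works without herbie installed.
    from herbie.fast import FastHerbie

    # model and product are example keywords passed on to each Herbie object.
    return FastHerbie(
        dates, fxx=fxx, max_threads=max_threads, model="hrrr", product="sfc"
    )


DATES, FXX = build_request(datetime(2022, 3, 1, 0), n_runs=2, lead_hours=[0, 1, 2])

# Assumption: one Herbie object is created per (run datetime, lead time) pair.
N_OBJECTS = len(DATES) * len(FXX)
print(N_OBJECTS)
```

Calling make_fast_herbie(DATES, FXX) would then create the objects; lowering max_threads trades creation speed for fewer simultaneous requests.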
Methods
__init__(DATES[, fxx, max_threads]) – Create many Herbie objects with methods to download or read with xarray.
df() – Organize Herbie objects into a DataFrame.
download([search, max_threads]) – Download many Herbie objects.
inventory([search]) – Get combined inventory DataFrame.
xarray(search, *[, max_threads]) – Read many Herbie objects into an xarray Dataset.
- df() -> DataFrame
Organize Herbie objects into a DataFrame.
Note: displaying the DataFrame can take several seconds because the __str__ method does a lot of work.
- download(search: str | None = None, *, max_threads: int = 20, **download_kwargs) -> list[Path]
Download many Herbie objects.
Uses multithreading.
- Parameters:
search (string) – Regular expression string to specify which GRIB messages to download.
**download_kwargs – Any kwarg for Herbie’s download method.
Benchmark
---------
Downloading 48 files with 1 variable (TMP):
1 thread took 1 min 17 s
2 threads took 36 s
5 threads took 28 s
10 threads took 25 s
50 threads took 23 s
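The download timings above suggest diminishing returns as threads are added. The short calculation below turns those figures into speedup factors relative to a single thread; it uses only the numbers listed in the benchmark, and the speedups helper is a name introduced here for illustration.

```python
# Download timings from the benchmark above, in seconds, keyed by thread count.
# "1 min 17 s" is 77 seconds.
timings = {1: 77, 2: 36, 5: 28, 10: 25, 50: 23}


def speedups(timings: dict[int, float]) -> dict[int, float]:
    """Return the speedup of each thread count relative to one thread."""
    base = timings[1]
    return {n: round(base / t, 2) for n, t in timings.items()}


for n_threads, factor in speedups(timings).items():
    print(f"{n_threads:>2} threads: {factor}x faster than 1 thread")
```

By these figures, most of the gain arrives by about 5 threads, so a call such as FH.download(search, max_threads=5) may be a reasonable starting point; the best value depends on the network and the data source.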
- inventory(search: str | None = None)
Get combined inventory DataFrame.
Useful for data discovery and checking your search before doing a download.
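To illustrate checking a search string before downloading, here is a small sketch that applies a regular expression to a few inventory-style descriptors using only the standard library. The sample lines are made up for illustration, and the sketch assumes that search is matched anywhere within each GRIB message descriptor; it does not call Herbie itself.

```python
import re

# Illustrative descriptors in a "variable:level:forecast" form (hypothetical samples).
sample_lines = [
    ":TMP:2 m above ground:anl",
    ":TMP:500 mb:anl",
    ":UGRD:10 m above ground:anl",
    ":VGRD:10 m above ground:anl",
]


def matching(search: str, lines: list[str]) -> list[str]:
    """Return the lines in which the regular expression matches anywhere."""
    pattern = re.compile(search)
    return [line for line in lines if pattern.search(line)]


# A narrow pattern selects a single message.
print(matching(r"TMP:2 m", sample_lines))

# A character class selects both wind components at 10 m.
print(matching(r":[UV]GRD:10 m", sample_lines))
```

With a real FastHerbie object, passing the same string to inventory(search) and inspecting the returned DataFrame serves the same purpose before committing to a download.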
- xarray(search: str | None, *, max_threads: int | None = None, **xarray_kwargs) -> Dataset | list[Dataset]
Read many Herbie objects into an xarray Dataset.
TODO: Running this sometimes crashes the Jupyter cell.
TODO: Investigate the error "fatal flex scanner internal error--end of buffer missed".
Uses multithreading (or multiprocessing). This would likely benefit from multiprocessing instead.
- Parameters:
max_threads (int) – Control the maximum number of threads to use. If you use too many threads, you may run into memory limits.
Benchmark
---------
Opening 48 files with 1 variable (TMP):
1 thread took 1 min 45 s
2 threads took 55 s
5 threads took 39 s
10 threads took 39 s
50 threads took 37 s
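The max_threads parameter caps how many files are read at once, which is what limits peak memory use. The following is a generic sketch of that pattern with the standard library, not FastHerbie's actual implementation; open_one is a hypothetical stand-in that sleeps instead of reading a GRIB file.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def open_one(path: str) -> str:
    """Hypothetical stand-in for opening one GRIB file; sleeps to mimic I/O."""
    time.sleep(0.01)
    return f"dataset from {path}"


def open_many(paths: list[str], max_threads: int = 5) -> list[str]:
    """Open all paths using at most max_threads worker threads."""
    # The pool size bounds how many files are being read at the same time.
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        # map() returns results in the same order as the input paths.
        return list(pool.map(open_one, paths))


results = open_many([f"file_{i}.grib2" for i in range(8)], max_threads=4)
print(len(results))
```

The benchmark above shows little improvement beyond 5 threads, so a small max_threads value keeps memory use down at almost no cost in speed.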