herbie.core.Herbie

class herbie.core.Herbie(date=None, *, valid_date=None, model='hrrr', fxx=0, product=None, priority=None, save_dir=PosixPath('/home/docs/data'), overwrite=False, verbose=True, **kwargs)

Locate GRIB2 file at one of the archive sources.

Parameters:
  • date (pandas-parsable datetime) – Model initialization datetime. If None, valid_date must be set.

  • valid_date (pandas-parsable datetime) – Model valid datetime. Required when date is None.

  • fxx (int or pandas-parsable timedelta (e.g. "6h")) – Forecast lead time in hours. Available lead times depend on the model type and model version.

  • model ({'hrrr', 'hrrrak', 'rap', 'gfs', 'ecmwf', etc.}) – Model name as defined in the models template folder. CASE INSENSITIVE; e.g., “HRRR” is the same as “hrrr”.

  • product ({'sfc', 'prs', 'nat', 'subh', etc.}) – Output variable product file type. CASE SENSITIVE. If not specified, the first product in the model template file is used. For example, the HRRR model has these products:

    - 'sfc' surface fields
    - 'prs' pressure fields
    - 'nat' native fields
    - 'subh' subhourly fields

  • member (None or int) – Ensemble member. Some ensemble models (e.g., the future RRFS) require an ensemble member to be specified.

  • priority (list or str) – List of model sources to try, in order of download priority. CASE INSENSITIVE. Some example data sources and the default priority order are listed below.

    - 'aws' Amazon Web Services (Big Data Program)
    - 'nomads' NOAA’s NOMADS server
    - 'google' Google Cloud Platform (Big Data Program)
    - 'azure' Microsoft Azure (Big Data Program)
    - 'pando' University of Utah Pando Archive (gateway 1)
    - 'pando2' University of Utah Pando Archive (gateway 2)

  • save_dir (str or pathlib.Path) – Location to save GRIB2 files locally. Default save directory is set in ~/.config/herbie/config.cfg.

  • overwrite (bool) – If True, look for GRIB2 files on remote servers even if a local copy exists. If False (default), use the local GRIB2 copy if it exists. Note: Herbie will still look for the idx file on the remote, or try to generate the idx file if wgrib2 is installed.

  • **kwargs – Any other parameter needed to satisfy the conditions in the model template file (e.g., nest=2, other_label='run2').
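
The relationship between date, valid_date, and fxx described above can be sketched with the standard library alone. This is a minimal illustration, not Herbie's code: it assumes the valid time is the initialization time plus the lead time, and the helper name resolve_init_time is hypothetical.

```python
from datetime import datetime, timedelta


def resolve_init_time(date=None, valid_date=None, fxx=0):
    """Return (init_time, valid_time) for a lead time of fxx hours.

    Hypothetical helper: assumes valid_time = init_time + fxx hours.
    """
    lead = timedelta(hours=fxx)
    if date is not None:
        return date, date + lead
    if valid_date is not None:
        return valid_date - lead, valid_date
    raise ValueError("Set either date or valid_date.")


# A 6-hour forecast valid at 12:00 was initialized at 06:00.
init, valid = resolve_init_time(valid_date=datetime(2023, 1, 1, 12), fxx=6)
print(init)   # 2023-01-01 06:00:00
print(valid)  # 2023-01-01 12:00:00
```

Under that assumption, passing valid_date with fxx=6 is equivalent to passing the date six hours earlier with the same fxx.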

__init__(date=None, *, valid_date=None, model='hrrr', fxx=0, product=None, priority=None, save_dir=PosixPath('/home/docs/data'), overwrite=False, verbose=True, **kwargs)

Specify model output and find GRIB2 file at one of the sources.

Methods

__init__([date, valid_date, model, fxx, ...])

Specify model output and find GRIB2 file at one of the sources.

download([search, searchString, source, ...])

Download file from source.

find_grib()

Find a GRIB file from the archive sources.

find_idx()

Find an index file for the GRIB file.

get_localFilePath([search, searchString])

Get full path to the local file.

help()

Print help message if available.

inventory([search, searchString, verbose])

Inspect the GRIB2 file contents by reading the index file.

tell_me_everything()

Print all the attributes of the Herbie object.

terrain([water_masked])

Shortcut method to return model terrain as an xarray.Dataset.

xarray([search, searchString, ...])

Open GRIB2 data as an xarray Dataset.

Attributes

config

get_localFileName

Predict the local file name.

get_remoteFileName

Predict remote file name (assumes all sources are named the same).

index_as_dataframe

Read and cache the full index file.


download(search=None, *, searchString=None, source=None, save_dir=None, overwrite=None, verbose=None, errors='warn')

Download file from source.

TODO: When we download a full file, the values of self.grib and self.grib_source should change to represent the local file.

Subsetting by variable follows the same principles described here: https://www.cpc.ncep.noaa.gov/products/wesley/fast_downloading_grib.html

Parameters:
  • search (str) – If None, download the full file. Else, use regex to subset the file by specific variables and levels. Read more in the user guide: https://herbie.readthedocs.io/en/latest/user_guide/search.html

  • source ({'nomads', 'aws', 'google', 'azure', 'pando', 'pando2'}) – If None, download the GRIB2 file from self.grib, the first location where the file was found in the priority list when this class was initialized. Otherwise, specify a source to force downloading from a different location.

  • save_dir (str or pathlib.Path) – Location to save the model output files. If None, uses the default or the path specified in __init__. Otherwise, files are saved to this path.

  • overwrite (bool) – If True, overwrite existing files. By default, the download is skipped if the full file exists. Not applicable when search is not None because file subsets might be unique.

  • errors ({'warn', 'raise'}) – When an error occurs, either issue a warning or raise a ValueError.
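
The variable subsetting mentioned above relies on the index file: each index line records the byte at which a GRIB message starts, so a message can be fetched with an HTTP range request ending one byte before the next message starts. The sketch below illustrates that idea only; the index lines and byte offsets are made up, and the helper names byte_ranges and range_header are hypothetical, not part of Herbie.

```python
import re

# Illustrative wgrib2-style index lines (byte offsets are made up).
# Fields: message number : start byte : date : variable : level : forecast time
INDEX = """\
1:0:d=2023010100:REFC:entire atmosphere:anl:
2:1000:d=2023010100:TMP:2 m above ground:anl:
3:2500:d=2023010100:UGRD:10 m above ground:anl:
4:4000:d=2023010100:VGRD:10 m above ground:anl:
"""


def byte_ranges(index_text, search):
    """Return (start, end) byte ranges of messages whose index line matches."""
    lines = index_text.strip().splitlines()
    starts = [int(line.split(":")[1]) for line in lines]
    ranges = []
    for i, line in enumerate(lines):
        if re.search(search, line):
            # A message ends one byte before the next one starts;
            # the last message runs to the end of the file (None).
            end = starts[i + 1] - 1 if i + 1 < len(lines) else None
            ranges.append((starts[i], end))
    return ranges


def range_header(start, end):
    """Format a byte range as an HTTP Range header value."""
    return f"bytes={start}-{'' if end is None else end}"


ranges = byte_ranges(INDEX, r":(?:U|V)GRD:10 m above ground")
print(ranges)                              # [(2500, 3999), (4000, None)]
print([range_header(*r) for r in ranges])  # ['bytes=2500-3999', 'bytes=4000-']
```

Because each subset depends on the search pattern, two different searches against the same remote file generally produce different local files.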

find_grib()

Find a GRIB file from the archive sources.

Returns:

  • 1) The URL or pathlib.Path of the GRIB2 file that exists

  • 2) The source of the GRIB2 file
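
A priority-ordered lookup of this kind can be sketched as a first-match loop. This is an illustration under stated assumptions, not Herbie's implementation: the function first_available, the candidates mapping, the exists callback, and the (None, None) result when nothing is found are all hypothetical, and the URLs are placeholders.

```python
def first_available(priority, candidates, exists):
    """Return (location, source) for the first source whose file exists.

    priority   -- source names in the order they should be tried
    candidates -- mapping of lowercase source name to URL or path
    exists     -- callable that reports whether a location is reachable
    """
    for source in priority:
        name = source.lower()  # source names are treated case-insensitively
        location = candidates.get(name)
        if location is not None and exists(location):
            return location, name
    return None, None  # assumption: nothing found at any source


# Placeholder URLs, for illustration only.
candidates = {
    "aws": "https://example.com/aws/file.grib2",
    "nomads": "https://example.com/nomads/file.grib2",
}
# Pretend only the NOMADS copy exists.
available = {"https://example.com/nomads/file.grib2"}

grib, source = first_available(["AWS", "nomads"], candidates, available.__contains__)
print(grib, source)
```

Passing the existence check in as a callable keeps the sketch free of network access; a real check would have to query the remote server or the local file system.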

find_idx()

Find an index file for the GRIB file.

property get_localFileName

Predict the local file name.

get_localFilePath(search=None, *, searchString=None)

Get full path to the local file.

property get_remoteFileName

Predict remote file name (assumes all sources are named the same).

help()

Print help message if available.

property index_as_dataframe

Read and cache the full index file.

inventory(search=None, *, searchString=None, verbose=None)

Inspect the GRIB2 file contents by reading the index file.

This reads index files created with the wgrib2 utility.

Parameters:
  • search (str) –

    Filter dataframe by a search regular expression. Searches for strings in the index file lines, specifically the variable, level, and forecast_time columns. Execute _search_help() for examples of a good search.

    Read more in the user guide at https://herbie.readthedocs.io/en/latest/user_guide/search.html

  • verbose (None, bool) – If True, print a help message when no messages are found. If False, do not print a help message. If None (default), use the verbose value set in Herbie.__init__.

Return type:

A Pandas DataFrame of the index file.
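
The search behaviour described above can be sketched without Herbie or pandas: join the variable, level, and forecast_time fields of each index row into one string and keep the rows where the regular expression matches. The rows below are illustrative, and filter_inventory is a hypothetical helper; the exact string Herbie searches may differ.

```python
import re

# Illustrative index rows; real inventories have more columns.
ROWS = [
    {"variable": "TMP", "level": "2 m above ground", "forecast_time": "anl"},
    {"variable": "TMP", "level": "500 mb", "forecast_time": "anl"},
    {"variable": "UGRD", "level": "10 m above ground", "forecast_time": "anl"},
    {"variable": "APCP", "level": "surface", "forecast_time": "0-1 hour acc fcst"},
]


def filter_inventory(rows, search=None):
    """Return rows whose ':variable:level:forecast_time' string matches search."""
    if search is None:
        return list(rows)  # no filter: return the whole inventory
    pattern = re.compile(search)
    return [
        row
        for row in rows
        if pattern.search(f":{row['variable']}:{row['level']}:{row['forecast_time']}")
    ]


# All temperature fields, at any level.
print([r["level"] for r in filter_inventory(ROWS, r":TMP:")])  # ['2 m above ground', '500 mb']
# 2 m temperature or 10 m u-wind.
print(len(filter_inventory(ROWS, r":TMP:2 m|:UGRD:10 m")))     # 2
# Any field at the surface.
print(len(filter_inventory(ROWS, r":surface:")))               # 1
```

Anchoring a pattern with the colon separators, as in ":TMP:", avoids accidental matches on variables whose names merely contain the same letters.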

tell_me_everything()

Print all the attributes of the Herbie object.

terrain(water_masked=True)

Shortcut method to return model terrain as an xarray.Dataset.

xarray(search=None, *, searchString=None, backend_kwargs={}, remove_grib=True, **download_kwargs)

Open GRIB2 data as an xarray Dataset.

Parameters:
  • search (str) – Variables to read into xarray Dataset

  • remove_grib (bool) – If True, grib file will be removed ONLY IF it didn’t exist before we downloaded it.