gpm.dataset package#
Subpackages#
- gpm.dataset.decoding package
- Submodules
- gpm.dataset.decoding.cf module
- gpm.dataset.decoding.coordinates module
- gpm.dataset.decoding.dataarray_attrs module
- gpm.dataset.decoding.decode_1c_pmw module
- gpm.dataset.decoding.decode_2a_pmw module
decode_airmassLiftIndex()
decode_cloudWaterPath()
decode_iceWaterPath()
decode_pixelStatus()
decode_precip1stTertial()
decode_precip2ndTertial()
decode_precipitationYesNoFlag()
decode_product()
decode_qualityFlag()
decode_rainWaterPath()
decode_sunGlintAngle()
decode_surfacePrecipitation()
decode_surfaceTypeIndex()
- gpm.dataset.decoding.decode_2a_radar module
decode_attenuationNP()
decode_binBBBottom()
decode_binBBPeak()
decode_binBBTop()
decode_binDFRmMLBottom()
decode_binDFRmMLTop()
decode_binHeavyIcePrecipBottom()
decode_binHeavyIcePrecipTop()
decode_flagAnvil()
decode_flagBB()
decode_flagGraupelHail()
decode_flagHail()
decode_flagHeavyIcePrecip()
decode_flagPrecip()
decode_flagShallowRain()
decode_flagSurfaceSnowfall()
decode_heightBB()
decode_landSurfaceType()
decode_phase()
decode_phaseNearSurface()
decode_product()
decode_qualityTypePrecip()
decode_widthBB()
decode_zFactorMeasured()
- gpm.dataset.decoding.decode_imerg module
decode_HQobservationTime()
decode_HQprecipSource()
decode_HQprecipitation()
decode_IRinfluence()
decode_IRkalmanFilterWeight()
decode_IRprecipitation()
decode_MWobservationTime()
decode_MWprecipSource()
decode_MWprecipitation()
decode_precipitation()
decode_precipitationCal()
decode_precipitationQualityIndex()
decode_precipitationUncal()
decode_probabilityLiquidPrecipitation()
decode_product()
decode_randomError()
- gpm.dataset.decoding.routines module
- gpm.dataset.decoding.utils module
- Module contents
Submodules#
gpm.dataset.attrs module#
This module contains functions to parse GPM granule attributes.
- gpm.dataset.attrs.decode_attrs(attrs)[source]#
Decode GPM nested dictionary attributes from an xarray object.
gpm.dataset.conventions module#
This module contains functions to enforce CF conventions on GPM-API objects.
gpm.dataset.coords module#
This module contains functions to extract the coordinates from GPM files.
gpm.dataset.crs module#
This module contains functions to define and create CF-compliant CRS.
- gpm.dataset.crs.get_pyproj_crs(xr_obj)[source]#
Return pyproj.crs.CoordinateSystem from CRS coordinate(s).
If both a geographic and a projected CRS are present, it returns the projected one.
- Parameters:
xr_obj (xarray.Dataset or xarray.DataArray) –
- Returns:
proj_crs
- Return type:
CoordinateSystem
- gpm.dataset.crs.get_pyresample_area(xr_obj)[source]#
Define pyresample area from CF-compliant xarray object.
To be used by the pyresample accessor: ds.pyresample.area
- gpm.dataset.crs.get_pyresample_projection(xr_obj)[source]#
Get pyresample AreaDefinition from CF-compliant xarray object.
- gpm.dataset.crs.get_pyresample_swath(xr_obj)[source]#
Get pyresample SwathDefinition from CF-compliant xarray object.
- gpm.dataset.crs.set_dataset_crs(ds, crs, grid_mapping_name='spatial_ref', inplace=False)[source]#
Add CF-compliant CRS information to an xr.Dataset.
It assumes that all dataset variables share the same CRS. For projected CRS, it expects that the CRS dimension coordinates are specified. For swath datasets, it expects that the geographic coordinates are specified.
For projected CRS, if 2D latitude/longitude arrays are specified, it assumes they refer to the WGS84 CRS.
- Parameters:
ds (xarray.Dataset) –
crs (CoordinateSystem) – CRS information to be added to the xr.Dataset.
grid_mapping_name (str) – Name of the grid_mapping coordinate that stores the CRS information. The default is spatial_ref. Other common names are grid_mapping and crs.
- Returns:
ds – Dataset with CF-compliant CRS information.
- Return type:
xarray.Dataset
- gpm.dataset.crs.set_dataset_single_crs(ds, crs, grid_mapping_name='spatial_ref', inplace=False)[source]#
Add CF-compliant CRS information to an xr.Dataset.
It assumes that all dataset variables share the same CRS. For projected CRS, it expects that the CRS dimension coordinates are specified. For swath datasets, it expects that the geographic coordinates are specified.
- Parameters:
ds (xarray.Dataset) –
crs (CoordinateSystem) – CRS information to be added to the xr.Dataset.
grid_mapping_name (str) – Name of the grid_mapping coordinate that stores the CRS information. The default is spatial_ref. Other common names are grid_mapping and crs.
- Returns:
ds – Dataset with CF-compliant CRS information.
- Return type:
xarray.Dataset
gpm.dataset.dataset module#
This module contains functions to read files into a GPM-API Dataset.
- gpm.dataset.dataset.open_dataset(product, start_time, end_time, variables=None, groups=None, scan_mode=None, version=None, product_type='RS', chunks={}, decode_cf=True, parallel=False, prefix_group=False, verbose=False)[source]#
Lazily map HDF5 data into xarray.Dataset with relevant GPM data and attributes.
Note:
- gpm.open_dataset does not load GPM granules with the FileHeader flag 'EmptyGranule' != 'NOT_EMPTY'.
- The group ScanStatus provides relevant data flags for Swath products.
- The variable dataQuality provides an overall quality flag status. If dataQuality = 0, no issues have been detected.
- The variable SCorientation provides the orientation of the sensor from the forward track of the satellite.
- Parameters:
product (str) – GPM product acronym.
start_time (datetime.datetime, datetime.date, np.datetime64 or str) – Start time. If a string, it must follow the ISO format YYYY-MM-DD hh:mm:ss.
end_time (datetime.datetime, datetime.date, np.datetime64 or str) – End time. If a string, it must follow the ISO format YYYY-MM-DD hh:mm:ss.
variables (list, str, optional) – Variables to read from the HDF5 file. The default is None (all variables).
groups (list, str, optional) – HDF5 groups from which to read all variables. The default is None (all groups).
scan_mode (str, optional) – Scan mode of the GPM product. The default is None. Use gpm.available_scan_modes(product, version) to get the available scan modes for a specific product. The radar products have the following scan modes:
- 'FS': Full Scan. For Ku, Ka and DPR (since version 7 products).
- 'NS': Normal Scan. For Ku band and DPR (till version 6 products).
- 'MS': Matched Scan. For Ka band and DPR (till version 6 products).
- 'HS': High-sensitivity Scan. For Ka band and DPR.
product_type (str, optional) – GPM product type. Either 'RS' (Research) or 'NRT' (Near-Real-Time). The default is 'RS'.
version (int, optional) – GPM version of the data to retrieve if product_type = "RS". GPM data readers currently support versions 4, 5, 6 and 7.
chunks (int, dict, 'auto' or None, optional) – Chunk size for dask array:
- chunks=-1 loads the dataset with dask using a single chunk for all arrays.
- chunks={} loads the dataset with dask using the file chunks.
- chunks='auto' uses dask auto chunking, taking into account the file chunks.
If you want to load data in memory directly, specify chunks=None. The default is {}.
Hint: xarray’s lazy loading of remote or on-disk datasets is often but not always desirable. Before performing computationally intense operations, load the dataset entirely into memory by invoking ds.compute().
decode_cf (bool, optional) – Whether to decode the dataset. The default is True.
prefix_group (bool, optional) – Whether to add the group as a prefix to the variable names. If you aim to save the Dataset to disk as netCDF or Zarr, you need to set prefix_group=False or later remove the prefix before writing the dataset. The default is False.
parallel (bool) – If True, the datasets are opened in parallel using dask.delayed. If parallel=True, chunks cannot be None. The underlying data must be dask.Array. The default is False.
- Return type:
xarray.Dataset
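The constraints listed in the parameter descriptions above can be checked before any file is touched. The following is a minimal sketch using only the standard library. The helper names parse_time and build_open_dataset_kwargs are hypothetical and not part of gpm, the product acronym "2A-DPR" is only an example value, and np.datetime64 inputs are left out to avoid a NumPy dependency.

```python
from datetime import date, datetime

# Values taken from the parameter descriptions above.
SUPPORTED_VERSIONS = (4, 5, 6, 7)
PRODUCT_TYPES = ("RS", "NRT")


def parse_time(value):
    """Normalize a str, date or datetime into a datetime (hypothetical helper)."""
    # datetime is a subclass of date, so it must be tested first.
    if isinstance(value, datetime):
        return value
    if isinstance(value, date):
        return datetime(value.year, value.month, value.day)
    if isinstance(value, str):
        # Documented string format: YYYY-MM-DD hh:mm:ss
        return datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
    raise TypeError(f"Unsupported time type: {type(value).__name__}")


def build_open_dataset_kwargs(product, start_time, end_time, **options):
    """Assemble and validate keyword arguments (hypothetical helper)."""
    # Defaults mirror the documented signature.
    kwargs = {"version": None, "product_type": "RS", "chunks": {}, "parallel": False}
    kwargs.update(options)

    if kwargs["product_type"] not in PRODUCT_TYPES:
        raise ValueError(f"product_type must be one of {PRODUCT_TYPES}")
    if kwargs["version"] is not None and kwargs["version"] not in SUPPORTED_VERSIONS:
        raise ValueError(f"version must be one of {SUPPORTED_VERSIONS}")
    # Documented rule: with parallel=True, chunks cannot be None.
    if kwargs["parallel"] and kwargs["chunks"] is None:
        raise ValueError("parallel=True requires chunks to be something other than None")

    kwargs.update(
        product=product,
        start_time=parse_time(start_time),
        end_time=parse_time(end_time),
    )
    return kwargs


kwargs = build_open_dataset_kwargs(
    "2A-DPR", "2020-07-05 02:00:00", "2020-07-05 06:00:00", version=7
)
# With gpm installed and the granules available locally, the call would be:
# ds = gpm.open_dataset(**kwargs)
```

Validating up front keeps errors such as an unsupported version or an invalid parallel/chunks combination close to the caller instead of surfacing inside the reader.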
gpm.dataset.datatree module#
This module contains functions to read a GPM granule into a DataTree object.
- gpm.dataset.datatree.check_non_empty_granule(dt, filepath)[source]#
Check that the datatree (or dataset) is not empty.
- gpm.dataset.datatree.check_valid_granule(filepath)[source]#
Raise an explanatory error if the GPM granule is not readable.
- gpm.dataset.datatree.open_datatree(filepath, chunks={}, decode_cf=False, use_api_defaults=True)[source]#
Open HDF5 in datatree object.
Notes on the chunks argument:
- chunks={}: lazy map to dask.array. Waiting for pydata/xarray#7948; an "auto" option that defaults to the full shape may need to be implemented manually.
- chunks="auto": datatree fails because the size of object dtype cannot be estimated.
- chunks=None: lazy map to numpy.array.
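Because chunks="auto" is reported above to fail with datatree, a caller may prefer to reject it before opening a file. The guard below is a hypothetical sketch and not part of gpm. It returns the array backend that the notes above associate with each value.

```python
def check_datatree_chunks(chunks):
    """Return the backend implied by chunks, rejecting 'auto' (hypothetical guard)."""
    if chunks == "auto":
        # Per the notes above, datatree cannot estimate the size of object dtype.
        raise ValueError("chunks='auto' is not usable with open_datatree; use {} or None")
    # None maps lazily to numpy arrays; other values (such as {}) map to dask arrays.
    return "numpy" if chunks is None else "dask"


backend = check_datatree_chunks({})
```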
gpm.dataset.dimensions module#
This module contains functions to retrieve the dimensions associated with each GPM variable.
gpm.dataset.granule module#
This module contains functions to read a single file into a GPM-API Dataset.
- gpm.dataset.granule.get_variables_dims(ds)[source]#
Retrieve the dimensions used by the xr.Dataset variables.
- gpm.dataset.granule.open_granule(filepath, scan_mode=None, groups=None, variables=None, decode_cf=True, chunks={}, prefix_group=False)[source]#
Create a lazy xarray.Dataset with relevant GPM data and attributes for a specific granule.
- Parameters:
filepath (str) – Filepath of the GPM granule.
scan_mode (str, optional) – Scan mode of the GPM product. The default is None. Use gpm.available_scan_modes(product, version) to get the available scan modes for a specific product. The radar products have the following scan modes:
- 'FS': Full Scan. For Ku, Ka and DPR (since version 7 products).
- 'NS': Normal Scan. For Ku band and DPR (till version 6 products).
- 'MS': Matched Scan. For Ka band and DPR (till version 6 products).
- 'HS': High-sensitivity Scan. For Ka band and DPR.
variables (list, str, optional) – Variables to read from the HDF5 file. The default is None (all variables).
groups (list, str, optional) – HDF5 groups from which to read all variables. The default is None (all groups).
chunks (int, dict, 'auto' or None, optional) – Chunk size for dask array:
- chunks=-1 loads the dataset with dask using a single chunk for all arrays.
- chunks={} loads the dataset with dask using the file chunks.
- chunks='auto' uses dask auto chunking, taking into account the file chunks.
If you want to load data in memory directly, specify chunks=None. The default is {}.
Hint: xarray’s lazy loading of remote or on-disk datasets is often but not always desirable. Before performing computationally intense operations, load the dataset entirely into memory by invoking ds.compute().
decode_cf (bool, optional) – Whether to decode the dataset. The default is True.
prefix_group (bool, optional) – Whether to add the group as a prefix to the variable names. The default is False.
- Returns:
ds
- Return type:
xarray.Dataset
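The radar scan-mode list in the scan_mode description can be read as a small lookup table. The sketch below encodes that list literally. The name radar_scan_modes is hypothetical, and gpm.available_scan_modes(product, version) remains the authoritative source for a given product.

```python
# mode -> (sensors, first version, last version); None means no bound is stated.
RADAR_SCAN_MODES = {
    "FS": ({"Ku", "Ka", "DPR"}, 7, None),   # since version 7
    "NS": ({"Ku", "DPR"}, None, 6),         # till version 6
    "MS": ({"Ka", "DPR"}, None, 6),         # till version 6
    "HS": ({"Ka", "DPR"}, None, None),      # no version bound given in the text
}


def radar_scan_modes(sensor, version):
    """List the scan modes the description above gives for a sensor and version."""
    modes = []
    for mode, (sensors, first, last) in RADAR_SCAN_MODES.items():
        if sensor not in sensors:
            continue
        if first is not None and version < first:
            continue
        if last is not None and version > last:
            continue
        modes.append(mode)
    return modes


ku_v7 = radar_scan_modes("Ku", 7)
dpr_v6 = radar_scan_modes("DPR", 6)
```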
gpm.dataset.groups_variables module#
This module contains functions to read GPM file groups, sub-groups and variables.
Module contents#
This directory defines the GPM-API datasets.