PMW 1B and 1C products#

First, let’s import the packages required in this tutorial.

[1]:
import datetime

import gpm

Let’s have a look at the available PMW products:

[2]:
gpm.available_products(product_categories="PMW", product_levels="1B")
[2]:
['1B-GMI', '1B-TMI']
[3]:
gpm.available_products(product_categories="PMW", product_levels="1C")
[3]:
['1C-AMSR2-GCOMW1',
 '1C-AMSRE-AQUA',
 '1C-AMSUB-NOAA15',
 '1C-AMSUB-NOAA16',
 '1C-AMSUB-NOAA17',
 '1C-ATMS-NOAA20',
 '1C-ATMS-NOAA21',
 '1C-ATMS-NPP',
 '1C-GMI',
 '1C-GMI-R',
 '1C-MHS-METOPA',
 '1C-MHS-METOPB',
 '1C-MHS-METOPC',
 '1C-MHS-NOAA18',
 '1C-MHS-NOAA19',
 '1C-SAPHIR-MT1',
 '1C-SSMI-F08',
 '1C-SSMI-F10',
 '1C-SSMI-F11',
 '1C-SSMI-F13',
 '1C-SSMI-F14',
 '1C-SSMI-F15',
 '1C-SSMIS-F16',
 '1C-SSMIS-F17',
 '1C-SSMIS-F18',
 '1C-SSMIS-F19',
 '1C-TMI']
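Note the naming convention of these products: `1C-<sensor>-<platform>` (some products, such as 1C-GMI and 1C-TMI, omit the platform). As a plain-Python illustration (the `parse_product` helper below is hypothetical, not part of GPM-API, and does not handle variant suffixes such as `-R`), a product name can be split into its parts like this:

```python
def parse_product(name):
    """Split a product name like '1C-MHS-METOPB' into (level, sensor, platform)."""
    parts = name.split("-")
    level, sensor = parts[0], parts[1]
    platform = parts[2] if len(parts) > 2 else None
    return level, sensor, platform

print(parse_product("1C-MHS-METOPB"))  # → ('1C', 'MHS', 'METOPB')
print(parse_product("1C-GMI"))         # → ('1C', 'GMI', None)
```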

1. Data Download#

Now let’s download the 1C PMW product files covering a one-day period.

[4]:
# Specify the time period you are interested in
start_time = datetime.datetime.strptime("2020-08-02 12:00:00", "%Y-%m-%d %H:%M:%S")
end_time = datetime.datetime.strptime("2020-08-03 12:00:00", "%Y-%m-%d %H:%M:%S")
# Specify the product and product type
product = "1C-MHS-METOPB" # 1C-GMI, 1C-SSMIS-F17, ...
product_type = "RS"
# Specify the version
version = 7
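The `strptime` calls above parse the time strings into `datetime` objects; equivalently, you can construct them directly or use the standard-library `fromisoformat` (a stylistic alternative, shown here as a sketch):

```python
import datetime

# Equivalent to strptime("2020-08-02 12:00:00", "%Y-%m-%d %H:%M:%S")
start_time = datetime.datetime(2020, 8, 2, 12, 0, 0)
# fromisoformat also accepts a space-separated date and time
end_time = datetime.datetime.fromisoformat("2020-08-03 12:00:00")
```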
[5]:
# Download the data
gpm.download(
    product=product,
    product_type=product_type,
    version=version,
    start_time=start_time,
    end_time=end_time,
    force_download=False,
    verbose=True,
    progress_bar=True,
    check_integrity=False,
)

All the available GPM 1C-MHS-METOPB product files are already on disk.

Once the data are downloaded on disk, let’s load the 1C product and look at the dataset structure.

2. Data Loading#

[6]:
# Load the 1C dataset
# - If scan_mode is not specified, one is automatically selected!
ds = gpm.open_dataset(
    product=product,
    product_type=product_type,
    version=version,
    start_time=start_time,
    end_time=end_time,
)
ds
/home/ghiggi/Python_Packages/gpm/gpm/dataset/conventions.py:43: GPM_Warning: 'The dataset start at 2020-08-02 12:00:01, although the specified start_time is 2020-08-02 12:00:00.'
  warnings.warn(msg, GPM_Warning)
[6]:
<xarray.Dataset>
Dimensions:                  (cross_track: 90, along_track: 32401, nchUIA1: 1,
                              pmw_frequency: 5)
Coordinates:
    SCorientation            (along_track) float32 dask.array<chunksize=(2045,), meta=np.ndarray>
    lon                      (cross_track, along_track) float32 ...
    lat                      (cross_track, along_track) float32 ...
    time                     (along_track) datetime64[ns] 2020-08-02T12:00:01...
    gpm_id                   (along_track) <U10 ...
    gpm_granule_id           (along_track) int64 ...
    gpm_cross_track_id       (cross_track) int64 ...
    gpm_along_track_id       (along_track) int64 ...
  * pmw_frequency            (pmw_frequency) <U8 '89.0V' '157.0V' ... '190.31V'
    crsWGS84                 int64 0
Dimensions without coordinates: cross_track, along_track, nchUIA1
Data variables:
    sunLocalTime             (cross_track, along_track) float32 dask.array<chunksize=(90, 2045), meta=np.ndarray>
    Quality                  (cross_track, along_track) float32 dask.array<chunksize=(90, 2045), meta=np.ndarray>
    incidenceAngle           (cross_track, along_track, nchUIA1) float32 dask.array<chunksize=(90, 2045, 1), meta=np.ndarray>
    sunGlintAngle            (cross_track, along_track, nchUIA1) float32 dask.array<chunksize=(90, 2045, 1), meta=np.ndarray>
    incidenceAngleIndex      (along_track, pmw_frequency) float32 dask.array<chunksize=(2045, 5), meta=np.ndarray>
    Tc                       (cross_track, along_track, pmw_frequency) float32 dask.array<chunksize=(90, 2045, 5), meta=np.ndarray>
    SClatitude               (along_track) float32 dask.array<chunksize=(2045,), meta=np.ndarray>
    SClongitude              (along_track) float32 dask.array<chunksize=(2045,), meta=np.ndarray>
    SCaltitude               (along_track) float32 dask.array<chunksize=(2045,), meta=np.ndarray>
    FractionalGranuleNumber  (along_track) float64 dask.array<chunksize=(2045,), meta=np.ndarray>
Attributes: (12/17)
    FileName:           1C.METOPB.MHS.XCAL2016-V.20200802-S114932-E133053.040...
    EphemerisFileName:
    AttitudeFileName:
    MissingData:        0
    DOI:                10.5067/GPM/MHS/METOPB/1C/07
    DOIauthority:       http://dx.doi.org/
    ...                 ...
    ProcessingSystem:   PPS
    DataFormatVersion:  7e
    MetadataVersion:    7e
    ScanMode:           S1
    history:            Created by ghiggi/gpm_api software on 2023-08-24 12:4...
    gpm_api_product:    1C-MHS-METOPB

If you want to load another scan_mode, first have a look at the available ones:

[7]:
gpm.available_scan_modes(product=product, version=version)
[7]:
['S1']

and then specify the scan_mode argument in open_dataset:

[8]:
ds = gpm.open_dataset(
    product=product,
    product_type=product_type,
    version=version,
    start_time=start_time,
    end_time=end_time,
    scan_mode="S1",
)
ds
/home/ghiggi/Python_Packages/gpm/gpm/dataset/conventions.py:43: GPM_Warning: 'The dataset start at 2020-08-02 12:00:01, although the specified start_time is 2020-08-02 12:00:00.'
  warnings.warn(msg, GPM_Warning)
[8]:
<xarray.Dataset>
Dimensions:                  (cross_track: 90, along_track: 32401, nchUIA1: 1,
                              pmw_frequency: 5)
Coordinates:
    SCorientation            (along_track) float32 dask.array<chunksize=(2045,), meta=np.ndarray>
    lon                      (cross_track, along_track) float32 ...
    lat                      (cross_track, along_track) float32 ...
    time                     (along_track) datetime64[ns] 2020-08-02T12:00:01...
    gpm_id                   (along_track) <U10 ...
    gpm_granule_id           (along_track) int64 ...
    gpm_cross_track_id       (cross_track) int64 ...
    gpm_along_track_id       (along_track) int64 ...
  * pmw_frequency            (pmw_frequency) <U8 '89.0V' '157.0V' ... '190.31V'
    crsWGS84                 int64 0
Dimensions without coordinates: cross_track, along_track, nchUIA1
Data variables:
    sunLocalTime             (cross_track, along_track) float32 dask.array<chunksize=(90, 2045), meta=np.ndarray>
    Quality                  (cross_track, along_track) float32 dask.array<chunksize=(90, 2045), meta=np.ndarray>
    incidenceAngle           (cross_track, along_track, nchUIA1) float32 dask.array<chunksize=(90, 2045, 1), meta=np.ndarray>
    sunGlintAngle            (cross_track, along_track, nchUIA1) float32 dask.array<chunksize=(90, 2045, 1), meta=np.ndarray>
    incidenceAngleIndex      (along_track, pmw_frequency) float32 dask.array<chunksize=(2045, 5), meta=np.ndarray>
    Tc                       (cross_track, along_track, pmw_frequency) float32 dask.array<chunksize=(90, 2045, 5), meta=np.ndarray>
    SClatitude               (along_track) float32 dask.array<chunksize=(2045,), meta=np.ndarray>
    SClongitude              (along_track) float32 dask.array<chunksize=(2045,), meta=np.ndarray>
    SCaltitude               (along_track) float32 dask.array<chunksize=(2045,), meta=np.ndarray>
    FractionalGranuleNumber  (along_track) float64 dask.array<chunksize=(2045,), meta=np.ndarray>
Attributes: (12/17)
    FileName:           1C.METOPB.MHS.XCAL2016-V.20200802-S114932-E133053.040...
    EphemerisFileName:
    AttitudeFileName:
    MissingData:        0
    DOI:                10.5067/GPM/MHS/METOPB/1C/07
    DOIauthority:       http://dx.doi.org/
    ...                 ...
    ProcessingSystem:   PPS
    DataFormatVersion:  7e
    MetadataVersion:    7e
    ScanMode:           S1
    history:            Created by ghiggi/gpm_api software on 2023-08-24 12:4...
    gpm_api_product:    1C-MHS-METOPB

You can list variables, coordinates and dimensions with the following methods:

[9]:
# Available variables
variables = list(ds.data_vars)
print("Available variables: ", variables)
# Available coordinates
coords = list(ds.coords)
print("Available coordinates: ", coords)
# Available dimensions
dims = list(ds.dims)
print("Available dimensions: ", dims)
Available variables:  ['sunLocalTime', 'Quality', 'incidenceAngle', 'sunGlintAngle', 'incidenceAngleIndex', 'Tc', 'SClatitude', 'SClongitude', 'SCaltitude', 'FractionalGranuleNumber']
Available coordinates:  ['SCorientation', 'lon', 'lat', 'time', 'gpm_id', 'gpm_granule_id', 'gpm_cross_track_id', 'gpm_along_track_id', 'pmw_frequency', 'crsWGS84']
Available dimensions:  ['cross_track', 'along_track', 'nchUIA1', 'pmw_frequency']

As you can see, every variable has a prefix indicating the group of the original HDF file in which it is stored. You can remove the prefix when opening the dataset by specifying prefix_group=False. You can also load only a subset of variables by specifying the variables argument.

[10]:
# List some variables of interest
variables = [
   "Tc", # "Tb" if PMW 1B product
]
# Load the dataset
ds = gpm.open_dataset(
    product=product,
    product_type=product_type,
    version=version,
    start_time=start_time,
    end_time=end_time,
    variables=variables,
    prefix_group=False,
)
ds
/home/ghiggi/Python_Packages/gpm/gpm/dataset/conventions.py:43: GPM_Warning: 'The dataset start at 2020-08-02 12:00:01, although the specified start_time is 2020-08-02 12:00:00.'
  warnings.warn(msg, GPM_Warning)
[10]:
<xarray.Dataset>
Dimensions:             (cross_track: 90, along_track: 32401, pmw_frequency: 5)
Coordinates:
    SCorientation       (along_track) float32 dask.array<chunksize=(2045,), meta=np.ndarray>
    lon                 (cross_track, along_track) float32 ...
    lat                 (cross_track, along_track) float32 ...
    time                (along_track) datetime64[ns] 2020-08-02T12:00:01 ... ...
    gpm_id              (along_track) <U10 ...
    gpm_granule_id      (along_track) int64 ...
    gpm_cross_track_id  (cross_track) int64 ...
    gpm_along_track_id  (along_track) int64 ...
  * pmw_frequency       (pmw_frequency) <U8 '89.0V' '157.0V' ... '190.31V'
    crsWGS84            int64 0
Dimensions without coordinates: cross_track, along_track
Data variables:
    Tc                  (cross_track, along_track, pmw_frequency) float32 dask.array<chunksize=(90, 2045, 5), meta=np.ndarray>
Attributes: (12/17)
    FileName:           1C.METOPB.MHS.XCAL2016-V.20200802-S114932-E133053.040...
    EphemerisFileName:
    AttitudeFileName:
    MissingData:        0
    DOI:                10.5067/GPM/MHS/METOPB/1C/07
    DOIauthority:       http://dx.doi.org/
    ...                 ...
    ProcessingSystem:   PPS
    DataFormatVersion:  7e
    MetadataVersion:    7e
    ScanMode:           S1
    history:            Created by ghiggi/gpm_api software on 2023-08-24 12:4...
    gpm_api_product:    1C-MHS-METOPB

To select the DataArray corresponding to a single variable:

[11]:
variable = "Tc" # "Tb" if PMW 1B product
da = ds[variable]
da
[11]:
<xarray.DataArray 'Tc' (cross_track: 90, along_track: 32401, pmw_frequency: 5)>
dask.array<getitem, shape=(90, 32401, 5), dtype=float32, chunksize=(90, 2281, 5), chunktype=numpy.ndarray>
Coordinates:
    SCorientation       (along_track) float32 dask.array<chunksize=(2045,), meta=np.ndarray>
    lon                 (cross_track, along_track) float32 ...
    lat                 (cross_track, along_track) float32 ...
    time                (along_track) datetime64[ns] 2020-08-02T12:00:01 ... ...
    gpm_id              (along_track) <U10 ...
    gpm_granule_id      (along_track) int64 ...
    gpm_cross_track_id  (cross_track) int64 ...
    gpm_along_track_id  (along_track) int64 ...
  * pmw_frequency       (pmw_frequency) <U8 '89.0V' '157.0V' ... '190.31V'
    crsWGS84            int64 0
Dimensions without coordinates: cross_track, along_track
Attributes:
    units:            K
    LongName:         \nIntercalibrated Tb for channels 1) 89.0 GHz V-Pol 2) ...
    gpm_api_product:  1C-MHS-METOPB
    grid_mapping:     crsWGS84
    coordinates:      lat lon

To extract the numerical array from the DataArray, you can use:

[12]:
print("Data type of numerical array: ", type(da.data))
da.data
Data type of numerical array:  <class 'dask.array.core.Array'>
[12]:
            Array           Chunk
Bytes       55.62 MiB       3.92 MiB
Shape       (90, 32401, 5)  (90, 2281, 5)
Dask graph: 15 chunks in 35 graph layers
Data type:  float32 (numpy.ndarray)

If the numerical array is a dask.Array, the data are not yet loaded into memory. To load them into memory, you need to call the compute method, either on the xarray object or on the underlying numerical array.
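The same lazy/eager distinction can be illustrated with a tiny standalone dask array (a generic sketch, unrelated to GPM data):

```python
import dask.array

lazy = dask.array.arange(10, chunks=5)  # lazy: the computation is described, not executed
print(type(lazy))                       # <class 'dask.array.core.Array'>

arr = lazy.compute()                    # materializes the values in memory
print(type(arr))                        # <class 'numpy.ndarray'>
print(arr.sum())                        # 45
```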

[13]:
# Option 1
da_opt1 = da.compute()
print("Data type of numerical array: ", type(da_opt1.data))
da_opt1.data
Data type of numerical array:  <class 'numpy.ndarray'>
[13]:
array([[[214.94, 230.33, 243.32, 248.03, 246.82],
        [210.01, 229.88, 244.05, 249.7 , 250.23],
        [207.04, 229.82, 244.76, 251.55, 253.29],
        ...,
        [277.  , 281.53, 236.9 , 253.38, 265.27],
        [277.22, 282.24, 239.59, 254.93, 266.19],
        [276.82, 282.21, 240.9 , 255.71, 267.39]],

       [[205.08, 229.87, 247.87, 254.35, 256.64],
        [203.07, 229.52, 247.62, 254.84, 257.51],
        [202.37, 229.82, 247.15, 254.28, 258.39],
        ...,
        [275.78, 282.66, 240.  , 256.19, 267.54],
        [275.84, 283.14, 241.05, 256.57, 268.34],
        [275.88, 283.  , 241.88, 257.25, 268.34]],

       [[200.94, 227.47, 248.23, 256.18, 258.28],
        [201.14, 227.29, 247.4 , 254.88, 258.79],
        [202.95, 230.54, 246.09, 254.21, 258.41],
        ...,
        [274.31, 282.88, 239.82, 255.52, 267.54],
        [274.46, 283.47, 240.13, 256.74, 268.46],
        [274.21, 283.84, 241.36, 256.88, 268.61]],

       ...,

       [[226.14, 264.22, 237.55, 252.64, 263.16],
        [226.93, 265.19, 239.15, 252.87, 264.03],
        [228.75, 265.48, 237.63, 253.35, 264.27],
        ...,
        [273.85, 283.84, 243.85, 258.09, 268.26],
        [272.54, 284.37, 244.62, 258.08, 268.3 ],
        [271.32, 284.33, 245.41, 258.45, 268.31]],

       [[226.31, 263.8 , 238.65, 253.15, 263.93],
        [228.33, 264.85, 238.98, 253.75, 263.6 ],
        [228.11, 264.6 , 237.03, 253.29, 264.3 ],
        ...,
        [274.46, 282.93, 243.96, 257.79, 267.57],
        [273.93, 283.07, 244.86, 258.11, 268.59],
        [273.24, 283.64, 245.74, 258.08, 269.15]],

       [[227.9 , 265.49, 238.44, 254.16, 264.49],
        [228.42, 265.47, 238.99, 254.58, 265.3 ],
        [226.47, 263.9 , 238.34, 254.37, 265.23],
        ...,
        [275.33, 282.74, 244.92, 257.74, 267.81],
        [274.09, 282.71, 245.35, 258.84, 268.25],
        [275.02, 283.21, 245.13, 257.64, 268.31]]], dtype=float32)
[14]:
# Option 2
print("Data type of numerical array: ", type(da.data.compute()))
da.data.compute()
Data type of numerical array:  <class 'numpy.ndarray'>
[14]:
array([[[214.94, 230.33, 243.32, 248.03, 246.82],
        [210.01, 229.88, 244.05, 249.7 , 250.23],
        [207.04, 229.82, 244.76, 251.55, 253.29],
        ...,
        [277.  , 281.53, 236.9 , 253.38, 265.27],
        [277.22, 282.24, 239.59, 254.93, 266.19],
        [276.82, 282.21, 240.9 , 255.71, 267.39]],

       [[205.08, 229.87, 247.87, 254.35, 256.64],
        [203.07, 229.52, 247.62, 254.84, 257.51],
        [202.37, 229.82, 247.15, 254.28, 258.39],
        ...,
        [275.78, 282.66, 240.  , 256.19, 267.54],
        [275.84, 283.14, 241.05, 256.57, 268.34],
        [275.88, 283.  , 241.88, 257.25, 268.34]],

       [[200.94, 227.47, 248.23, 256.18, 258.28],
        [201.14, 227.29, 247.4 , 254.88, 258.79],
        [202.95, 230.54, 246.09, 254.21, 258.41],
        ...,
        [274.31, 282.88, 239.82, 255.52, 267.54],
        [274.46, 283.47, 240.13, 256.74, 268.46],
        [274.21, 283.84, 241.36, 256.88, 268.61]],

       ...,

       [[226.14, 264.22, 237.55, 252.64, 263.16],
        [226.93, 265.19, 239.15, 252.87, 264.03],
        [228.75, 265.48, 237.63, 253.35, 264.27],
        ...,
        [273.85, 283.84, 243.85, 258.09, 268.26],
        [272.54, 284.37, 244.62, 258.08, 268.3 ],
        [271.32, 284.33, 245.41, 258.45, 268.31]],

       [[226.31, 263.8 , 238.65, 253.15, 263.93],
        [228.33, 264.85, 238.98, 253.75, 263.6 ],
        [228.11, 264.6 , 237.03, 253.29, 264.3 ],
        ...,
        [274.46, 282.93, 243.96, 257.79, 267.57],
        [273.93, 283.07, 244.86, 258.11, 268.59],
        [273.24, 283.64, 245.74, 258.08, 269.15]],

       [[227.9 , 265.49, 238.44, 254.16, 264.49],
        [228.42, 265.47, 238.99, 254.58, 265.3 ],
        [226.47, 263.9 , 238.34, 254.37, 265.23],
        ...,
        [275.33, 282.74, 244.92, 257.74, 267.81],
        [274.09, 282.71, 245.35, 258.84, 268.25],
        [275.02, 283.21, 245.13, 257.64, 268.31]]], dtype=float32)

3. Dataset Manipulations#

Now, let’s first have a look at the methods provided by GPM-API:

[15]:
variable = "Tc" # "Tb" if PMW 1B product
da = ds[variable]
print("xr.Dataset gpm methods:", dir(ds.gpm))
print("")
print("xr.DataArray gpm methods:", dir(da.gpm))

xr.Dataset gpm methods: ['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_obj', 'available_retrievals', 'collocate', 'crop', 'crop_by_continent', 'crop_by_country', 'define_transect_slices', 'end_time', 'extent', 'frequency_variables', 'get_crop_slices_by_continent', 'get_crop_slices_by_country', 'get_crop_slices_by_extent', 'get_height_at_bin', 'get_slices_contiguous_granules', 'get_slices_contiguous_scans', 'get_slices_regular', 'get_slices_regular_time', 'get_slices_valid_geolocation', 'get_variable_at_bin', 'has_contiguous_scans', 'has_missing_granules', 'has_regular_time', 'has_valid_geolocation', 'is_grid', 'is_orbit', 'is_regular', 'is_spatial_2d', 'is_spatial_3d', 'plot_image', 'plot_map', 'plot_map_mesh', 'plot_map_mesh_centroids', 'plot_swath', 'plot_swath_lines', 'plot_transect', 'plot_transect_line', 'pyresample_area', 'remap_on', 'retrieve', 'select_spatial_2d_variables', 'select_spatial_3d_variables', 'select_transect', 'slice_range_at_height', 'slice_range_at_max_value', 'slice_range_at_min_value', 'slice_range_at_temperature', 'slice_range_at_value', 'slice_range_where_values', 'slice_range_with_valid_data', 'spatial_2d_variables', 'spatial_3d_variables', 'spatial_dimensions', 'start_time', 'subset_by_time', 'subset_by_time_slice', 'title', 'variables', 'vertical_dimension']

xr.DataArray gpm methods: ['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_obj', 'collocate', 'crop', 'crop_by_continent', 'crop_by_country', 'define_transect_slices', 'end_time', 'extent', 'get_crop_slices_by_continent', 'get_crop_slices_by_country', 'get_crop_slices_by_extent', 'get_height_at_bin', 'get_slices_contiguous_granules', 'get_slices_contiguous_scans', 'get_slices_regular', 'get_slices_regular_time', 'get_slices_valid_geolocation', 'get_slices_var_between', 'get_slices_var_equals', 'get_variable_at_bin', 'has_contiguous_scans', 'has_missing_granules', 'has_regular_time', 'has_valid_geolocation', 'integrate_profile_concentration', 'is_grid', 'is_orbit', 'is_regular', 'is_spatial_2d', 'is_spatial_3d', 'plot_image', 'plot_map', 'plot_map_mesh', 'plot_map_mesh_centroids', 'plot_swath', 'plot_swath_lines', 'plot_transect', 'plot_transect_line', 'pyresample_area', 'remap_on', 'select_transect', 'slice_range_at_height', 'slice_range_at_max_value', 'slice_range_at_min_value', 'slice_range_at_value', 'slice_range_where_values', 'slice_range_with_valid_data', 'spatial_dimensions', 'start_time', 'subset_by_time', 'subset_by_time_slice', 'title', 'vertical_dimension']
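The dir() listings above are cluttered by dunder internals. A plain-Python trick (illustrated on a dummy class, not on the actual gpm accessor) filters them out so only the public methods remain:

```python
class DummyAccessor:
    """Stand-in for an accessor object, for illustration only."""
    def plot_map(self): ...
    def crop(self): ...
    def is_orbit(self): ...

# Keep only names that do not start with an underscore (dir() returns them sorted)
public_methods = [name for name in dir(DummyAccessor) if not name.startswith("_")]
print(public_methods)  # → ['crop', 'is_orbit', 'plot_map']
```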

You can also select a specific PMW channel with the isel method:

[16]:
ds.isel(pmw_frequency=0)
[16]:
<xarray.Dataset>
Dimensions:             (cross_track: 90, along_track: 32401)
Coordinates:
    SCorientation       (along_track) float32 dask.array<chunksize=(2045,), meta=np.ndarray>
    lon                 (cross_track, along_track) float32 ...
    lat                 (cross_track, along_track) float32 ...
    time                (along_track) datetime64[ns] 2020-08-02T12:00:01 ... ...
    gpm_id              (along_track) <U10 ...
    gpm_granule_id      (along_track) int64 ...
    gpm_cross_track_id  (cross_track) int64 ...
    gpm_along_track_id  (along_track) int64 ...
    pmw_frequency       <U8 '89.0V'
    crsWGS84            int64 0
Dimensions without coordinates: cross_track, along_track
Data variables:
    Tc                  (cross_track, along_track) float32 dask.array<chunksize=(90, 2045), meta=np.ndarray>
Attributes: (12/17)
    FileName:           1C.METOPB.MHS.XCAL2016-V.20200802-S114932-E133053.040...
    EphemerisFileName:
    AttitudeFileName:
    MissingData:        0
    DOI:                10.5067/GPM/MHS/METOPB/1C/07
    DOIauthority:       http://dx.doi.org/
    ...                 ...
    ProcessingSystem:   PPS
    DataFormatVersion:  7e
    MetadataVersion:    7e
    ScanMode:           S1
    history:            Created by ghiggi/gpm_api software on 2023-08-24 12:4...
    gpm_api_product:    1C-MHS-METOPB

The GPM products are either ORBIT-based (i.e. PMW and RADAR) or GRID-based (i.e. IMERG). You can check the type of the data with the methods is_grid and is_orbit.

[17]:
print("Is GPM ORBIT data?: ", ds.gpm.is_orbit)
print("Is GPM GRID data?: ", ds.gpm.is_grid)
Is GPM ORBIT data?:  True
Is GPM GRID data?:  False

To check whether the loaded GPM PMW product has contiguous scans, you can use:

[18]:
print(ds.gpm.has_contiguous_scans)
print(ds.gpm.is_regular)
True
True

If there are non-contiguous scans, you can retrieve the along-track slices over which the dataset is regular:

[19]:
list_slices = ds.gpm.get_slices_contiguous_scans()
print(list_slices)
[slice(0, 32401, None)]

You can then select a regular portion of the dataset with:

[20]:
slc = list_slices[0]
print(slc)
slice(0, 32401, None)
[21]:
ds_regular = ds.isel(along_track=slc)
ds_regular.gpm.is_regular
[21]:
True

To check instead whether the dataset is a two-dimensional spatial field, you can use:

[22]:
ds.gpm.is_spatial_2d  # False, because the xr.Dataset also contains the channel dimension!
[22]:
False
[23]:
ds.isel(pmw_frequency=0).gpm.is_spatial_2d
[23]:
True

4. Product Visualization#

The GPM-API provides two ways of displaying the data:

- The plot_map method plots the data in a geographic projection using the Cartopy pcolormesh method
- The plot_image method plots the data as an image using the Matplotlib imshow method

Let’s start by plotting the PMW scan in geographic space:

[24]:
da = ds[variable].isel(pmw_frequency=0).isel(along_track=slice(0, 8000))
da.gpm.plot_map()
[24]:
<cartopy.mpl.geocollection.GeoQuadMesh at 0x7fb4fc445a50>
../_images/tutorials_tutorial_02_PMW_1C_48_1.png

and now as an image, in “swath” view:

[25]:
da.gpm.plot_image()
[25]:
<matplotlib.image.AxesImage at 0x7fb4f02bf400>
../_images/tutorials_tutorial_02_PMW_1C_50_1.png

To facilitate the creation of a figure title, GPM-API also provides a title method:

[26]:
# Title for a single-timestep dataset
print(ds[variable].gpm.title(add_timestep=True))
print(ds[variable].gpm.title(add_timestep=False))
1C-MHS-METOPB Tc (2020-08-03 00:00)
1C-MHS-METOPB Tc

To zoom instead on a specific region of a plot_map figure, you can use the axis method set_extent. Note that rendering the image with this approach can be quite slow, because plot_map plots all the data and only then restricts the figure extent to the area of interest. For a more efficient approach, see section 5. Dataset Cropping.

[27]:
from gpm.utils.geospatial import get_country_extent

title = ds.gpm.title(add_timestep=False)
extent = get_country_extent("United States")
print("Extent: ", extent)
da = ds[variable].isel(pmw_frequency=0, along_track=slice(0, 8000))
p = da.gpm.plot_map()
_ = p.axes.set_extent(extent)
_ = p.axes.set_title(label=title)
Extent:  (-171.99111060299998, -66.76465999999999, 18.71619, 71.5577635769)
../_images/tutorials_tutorial_02_PMW_1C_54_1.png

You can also customize the geographic projection by specifying the desired Cartopy projection. The available projections are listed here

[28]:
import cartopy.crs as ccrs
import matplotlib.pyplot as plt

from gpm.visualization.plot import plot_cartopy_background

# Define some figure options
dpi = 100
figsize = (12, 10)

# Example of polar Cartopy projection
crs_proj = ccrs.Orthographic(180, -90)

# Subset the data for fast rendering and select single channel
da = ds[variable].isel(pmw_frequency=0, along_track=slice(0, 8000))

# Create the map
fig, ax = plt.subplots(subplot_kw={"projection": crs_proj}, figsize=figsize, dpi=dpi)
plot_cartopy_background(ax)
da.gpm.plot_map(ax=ax)
ax.set_global()
../_images/tutorials_tutorial_02_PMW_1C_56_0.png

It is possible to further customize these figures in multiple ways, for example by specifying a custom colormap:

[29]:
da.gpm.plot_map(cmap="turbo", vmin=150, vmax=300)
[29]:
<cartopy.mpl.geocollection.GeoQuadMesh at 0x7fb4d217e0e0>
../_images/tutorials_tutorial_02_PMW_1C_58_1.png

5. Dataset Cropping#

GPM-API provides methods to easily subset orbits spatially by extent, country or continent. Note, however, that an area can be crossed by multiple orbits; in other words, multiple orbit slices in the along-track direction can intersect the area of interest. The methods get_crop_slices_by_extent, get_crop_slices_by_country and get_crop_slices_by_continent enable you to retrieve the orbit portions intersecting the area of interest.

[30]:
# Subset the data for fast rendering and select single channel
da = ds[variable].isel(pmw_frequency=0, along_track=slice(0, 8000))

# Crop by extent
extent = (-172, -67, 19, 72) # (xmin, xmax, ymin, ymax)
list_isel_dict = da.gpm.get_crop_slices_by_extent(extent)
print(list_isel_dict)
for isel_dict in list_isel_dict:
    da_subset = da.isel(isel_dict)
    slice_title = da_subset.gpm.title(add_timestep=True)
    p = da_subset.gpm.plot_map()
    p.axes.set_extent(extent)
    p.axes.set_title(label=slice_title)

[{'along_track': slice(3300, 3645, None)}, {'along_track': slice(5562, 5926, None)}, {'along_track': slice(7809, 8000, None)}]
../_images/tutorials_tutorial_02_PMW_1C_61_1.png
../_images/tutorials_tutorial_02_PMW_1C_61_2.png
../_images/tutorials_tutorial_02_PMW_1C_61_3.png
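The extent convention used above is (xmin, xmax, ymin, ymax), in degrees. A tiny helper (hypothetical, for illustration only, and ignoring antimeridian crossings) shows how a lon/lat point relates to such an extent:

```python
def point_in_extent(lon, lat, extent):
    """Check whether a (lon, lat) point lies inside an (xmin, xmax, ymin, ymax) extent."""
    xmin, xmax, ymin, ymax = extent
    return xmin <= lon <= xmax and ymin <= lat <= ymax

extent = (-172, -67, 19, 72)               # the extent used in the example above
print(point_in_extent(-100, 40, extent))   # → True
print(point_in_extent(10, 40, extent))     # → False
```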
[31]:
# Crop by country
# - Option 1
list_isel_dict = da.gpm.get_crop_slices_by_country("United States")
print(list_isel_dict)
# - Option 2
from gpm.utils.geospatial import get_country_extent

extent = get_country_extent("United States")
list_isel_dict = da.gpm.get_crop_slices_by_extent(extent)
print(list_isel_dict)
# - Plot the swath crossing the country
for isel_dict in list_isel_dict:
    da_subset = da.isel(isel_dict)
    slice_title = da_subset.gpm.title(add_timestep=True)
    p = da_subset.gpm.plot_map()
    p.axes.set_extent(extent)
    p.axes.set_title(label=slice_title)
[{'along_track': slice(3303, 3647, None)}, {'along_track': slice(5565, 5928, None)}, {'along_track': slice(7809, 8000, None)}]
[{'along_track': slice(3303, 3647, None)}, {'along_track': slice(5565, 5928, None)}, {'along_track': slice(7809, 8000, None)}]
../_images/tutorials_tutorial_02_PMW_1C_62_1.png
../_images/tutorials_tutorial_02_PMW_1C_62_2.png
../_images/tutorials_tutorial_02_PMW_1C_62_3.png
[32]:
# Crop by continent
# - Option 1
list_isel_dict = da.gpm.get_crop_slices_by_continent("South America")
print(list_isel_dict)
# - Option 2
from gpm.utils.geospatial import get_continent_extent

extent = get_continent_extent("South America")
list_isel_dict = da.gpm.get_crop_slices_by_extent(extent)
print(list_isel_dict)
# - Plot the swath crossing the continent
for isel_dict in list_isel_dict:
    da_subset = da.isel(isel_dict)
    slice_title = da_subset.gpm.title(add_timestep=True)
    p = da_subset.gpm.plot_map()
    p.axes.set_extent(extent)
    p.axes.set_title(label=slice_title)

[{'along_track': slice(1364, 1876, None)}, {'along_track': slice(3644, 4131, None)}]
[{'along_track': slice(1364, 1876, None)}, {'along_track': slice(3644, 4131, None)}]
../_images/tutorials_tutorial_02_PMW_1C_63_1.png
../_images/tutorials_tutorial_02_PMW_1C_63_2.png