FWI Visualization #11

Open
paridhi-parajuli opened this issue Feb 15, 2023 · 7 comments

Comments

@paridhi-parajuli

  1. Use Panel parameterized objects (above) to optimize existing FWI dashboards.
  2. Use existing fire detections and plot the corresponding FWI time series and chiclet plots (a sketch of this workflow follows below).
    • Read this using geopandas: s3://veda-data-store-staging/EIS/other/feds-output-conus/latest/perim-large.fgb – done
    • Pick a fire from here (just select one row).
    • Draw a 5 km buffer around this fire (using geopandas).
    • Use that buffer as input to the FWI time series and chiclet plot.
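A minimal sketch of this selection-and-buffer step, assuming S3 read access is configured for geopandas; the UTM reprojection is just one way to get a true 5 km buffer:

```python
import geopandas as gpd

# Read the FEDS fire perimeters (FlatGeobuf) from S3. Reading s3:// paths
# directly requires GDAL/vsis3 or fsspec+s3fs with credentials for the bucket.
perims = gpd.read_file(
    "s3://veda-data-store-staging/EIS/other/feds-output-conus/latest/perim-large.fgb"
)

# Pick one fire -- here simply the first row; any single-row selection works.
fire = perims.iloc[[0]]

# Buffers are computed in the units of the CRS, so reproject to a metric CRS
# before buffering by 5000 m, then return to the original lat/lon CRS.
fire_utm = fire.to_crs(fire.estimate_utm_crs())
buffered = fire_utm.buffer(5000).to_crs(perims.crs)
```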

paridhi-parajuli commented Feb 17, 2023

Feb 17, 2023

  1. Panel Parameterized objects created.
  2. Having issues with the file upload functionality.
    Punt for now. Create a separate static notebook where user enters file name as a variable at the top.
  3. Working to do this without dataframe creation. Will it make it faster?
    Probably! Keep working on this.
  4. Created buffer polygon around centroid.
    Good. Let’s try just a buffer around the geometry itself (not the centroid).
  5. Having issues with the distance to lat-lon conversion.
    This is just a warning. Buffer is in units of the projection – for lat-lon data, the unit is “degrees” which is not a true distance
    unit. But, e.g., 0.5 degree buffer is fine.
  6. No variable in lis-tws-trend data.
    This is a DataArray already. Can be accessed with data.values
    E.g., data[0,0,0:3,0:3].values to retrieve the first few pixels.
    …but there are issues with the S3 read, not your code. Investigate with Slesa/Iksha/etc.
    More generally, stackstac produces Xarray DataArrays. They are accessible via data.values, subset via data[...], etc.
  7. Need to work for multipolygons.
    Simple workaround: .geometry.convex_hull (recall – need to do .geometry.convex_hull.exterior.coords)
    Ideally, need a better solution – maybe some kind of union (see the sketch below)
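A small sketch of the buffer and multipolygon points above; `fire` stands in for the one-row GeoDataFrame selected earlier, and the local file name is a placeholder:

```python
import geopandas as gpd

fire = gpd.read_file("perim-large.fgb").iloc[[0]]  # placeholder local copy

# Buffer the geometry itself rather than its centroid. In EPSG:4326 the buffer
# distance is in degrees, so e.g. a 0.5-degree buffer (the warning is expected).
buffered = fire.geometry.buffer(0.5)

# MultiPolygon workaround: the convex hull is a single Polygon, so
# .exterior.coords works on it.
hull_coords = list(fire.geometry.convex_hull.iloc[0].exterior.coords)

# A possibly better option: dissolve all parts into one geometry first.
merged = fire.geometry.unary_union   # single shapely geometry covering every part
```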

@paridhi-parajuli

Feb 20, 2023

  1. FWI analysis
  • Example FWI dashboard using (buffered) fire perimeter
  • Look for ways to optimize FWI dashboard – e.g., skip dataframe creation?
  2. STAC
  • Work with VEDA team to get basic STAC visualization/analysis example
  • Consider a different dataset – not TWS anomaly. Maybe HLS. Maybe NO2 or CO2 data.
  • Slesa now has a STAC entry for two Zarr datasets (SPL3SMP – SMAP; OCO2 L3). Try creating a notebook that reads one of these Zarr datasets from the STAC catalog entry and does a basic plot (a sketch follows below).
  • Get STAC catalog entry for these datasets from Slesa.
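A rough sketch of what that notebook might look like; the STAC endpoint, collection ID, and asset key below are placeholders, so the real values need to come from Slesa's catalog entry:

```python
import xarray as xr
from pystac_client import Client

# Placeholder endpoint and collection -- replace with the real STAC entry.
catalog = Client.open("https://example-stac-api.org")
collection = catalog.get_collection("OCO2_L3_zarr")

# Assume the Zarr store is exposed as an asset on the collection's item(s).
item = next(collection.get_items())
zarr_href = item.assets["zarr"].href   # asset key is an assumption

ds = xr.open_zarr(zarr_href, consolidated=True)
var = list(ds.data_vars)[0]
ds[var].isel(time=0).plot()            # basic map for the first time step
```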
    


paridhi-parajuli commented Feb 27, 2023

2023-02-24

  1. FWI analysis
  • All null values issue – needed to add the all_touched=True argument. Input polygons were too small, so no data were being selected.
  • Also recommend drop=True to throw an informative error (see the clip sketch below).
  2. STAC
  • Zarr visualizations worked.
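If the selection is done with rioxarray's clip, the fix looks roughly like this; the file names and variable name are placeholders for the FWI stack and the buffered fire perimeter:

```python
import geopandas as gpd
import rioxarray  # registers the .rio accessor on xarray objects
import xarray as xr

# Placeholder inputs (names assumed): the FWI DataArray and the buffered polygon.
fwi = xr.open_dataset("fwi.nc")["fwi"].rio.write_crs("EPSG:4326")
buffered = gpd.read_file("buffered_fire.geojson")

clipped = fwi.rio.clip(
    buffered.geometry,
    buffered.crs,
    all_touched=True,   # keep every pixel the (small) polygon touches
    drop=True,          # raises NoDataInBounds instead of silently returning all-NaN
)
```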

For next week

  1. FWI analysis
  • Get a complete static (non-dashboard) example working and push to GitHub. --> done
  • Clean up and add documentation to the static example. Once this is pushed, post a link in the EIS-FEDS Slack for Tess, etc. to review. --> done
  • Get the panel dashboard working and push to GitHub. --> done
  2. STAC
  • Once Slesa has debugged things, test it out. --> Pending
  • Giovanni Zarr store
  • Look for Zarr store in s3://prod-giovanni-cache/zarr/
  • Try to open and do a basic plot (map for one step, time series for one pixel) of each of these in Xarray. And report back – what works, what doesn’t. --> works with xr.open_zarr(), not with xr.open_dataset(), and only for GPM_3IMERGHH_06_precipitationCal (see the sketch below)
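A sketch of the read that worked, assuming s3fs is available; the exact store key under the bucket and the lat/lon coordinate names are assumptions:

```python
import s3fs
import xarray as xr

# Map the Zarr store in the Giovanni cache bucket; the exact key is assumed.
fs = s3fs.S3FileSystem()
store = s3fs.S3Map("prod-giovanni-cache/zarr/GPM_3IMERGHH_06_precipitationCal", s3=fs)

ds = xr.open_zarr(store)          # xr.open_dataset() on the same store did not work
var = list(ds.data_vars)[0]

ds[var].isel(time=0).plot()                                # map for one time step
ds[var].sel(lat=35.0, lon=-90.0, method="nearest").plot()  # time series for one pixel
```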


paridhi-parajuli commented Mar 3, 2023

For next week

  1. Email Katrina Sharonin (GSFC-DK000, intern on EIS-Fire) to set up a meeting, discuss ongoing tasks, and identify tasks where you can help. Report back to me.

  2. Analysis of ESDIS metrics
    Alexey is in the process of downloading the data from the (closed) ESDIS metrics service. Look for the data provided in the doc.
    Several datasets:

  • archive-size-by-product-totals – Total volume of each data product
  • data-products — Additional information on each data product, including science discipline, etc. Useful for merging with other datasets.
  • total-distribution-by-product-2022 — Total data downloads by product and distribution mechanism in 2022. Note that some products have multiple distribution mechanisms.
  • total-distribution-by-user-2022 — Total data downloads by user and product in 2022. Since this contains some mildly sensitive user information (emails), I password protected it – see my Slack message for the password.
  • Some questions to address:
  • Total archive data volume by science discipline
  • Distribution, and cumulative distribution, of data downloads by volume. E.g., How many datasets account for the top 95% of data downloads?
  • What were the top 100 data products distributed in 2022 by volume? By number of unique users? What are the similarities and differences between these top 100 lists – e.g., which products appear in these lists regardless of how you count? Are there any products that are especially popular in terms of number of users but not in terms of data volume? Vice versa?
  • Which providers (DAACs) distributed these datasets? What data formats are these datasets distributed in? What data services are available for these datasets (this may require some separate browsing of the dataset websites)?
  • What is the distribution of data users? E.g., How many users account for the top 95% of data downloads?
  • Download volume by user discipline?
  • What were the most popular data download mechanisms by volume in 2022?
  • Archive size and distribution volume by product level (Level 1, Level 2, Level 3, etc.).
  • Report all of these results in a Jupyter notebook shared via GitHub (a starting sketch follows below).
  3. STAC – once Slesa figures this out, come back to this task.
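A starting sketch for two of the questions above; the file extensions and column names are assumptions, so check the actual headers in the downloaded files:

```python
import pandas as pd

archive = pd.read_csv("archive-size-by-product-totals.csv")
products = pd.read_csv("data-products.csv")
dist = pd.read_csv("total-distribution-by-product-2022.csv")

# Total archive volume by science discipline (merge on an assumed product key).
merged = archive.merge(products, on="product")
volume_by_discipline = merged.groupby("discipline")["archive_size"].sum()

# Cumulative distribution of downloads: how many products cover 95% of volume?
by_product = dist.groupby("product")["download_volume"].sum().sort_values(ascending=False)
cum_frac = by_product.cumsum() / by_product.sum()
n_top = int((cum_frac < 0.95).sum()) + 1
print(f"{n_top} products account for 95% of 2022 download volume")
```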

@paridhi-parajuli

For next week

  1. Metrics data analysis
  • Finalize some additional analyses of distribution and product volume
  • Clean up data analysis report – remove commented code; add descriptions of underlying datasets and analyses.
  2. New project: Working with HDF-EOS data

@paridhi-parajuli

Newly added to notebook:

  • Analysis with respect to user domain
  • Changed all sorting to be by volume
  • Products with good volume and products with a good number of users
  • Formats of the top 30 products by volume
  • Cleaned the product-level field and did level-wise analysis
  • Analysis with respect to provider
  • Merged products and archive data for further analysis
  • Manipulated the discipline field for discipline-wise analysis
  • Download volume by user discipline? --> done

@paridhi-parajuli

For next week:

  1. Modify code to create parquet files using geopandas directly.
  • Create pandas data frame.
  • Add the observation timestep (date + time) as a time column to the dataset. Ideally, we want the exact timestep of each pixel…but only if we can find it.
  • Convert to a geopandas GeoDataFrame with the argument for converting lat/lon columns to geometry (and set CRS to EPSG:4326).
  2. Try working with the new geoparquet files in geopandas:
  • Read (gpd.read_parquet).
  • Subset by arbitrary polygon – (1) identify an arbitrary polygon that’s inside the MODIS image; (2) create it as a geopandas / shapely object; (3) crop the MODIS geodataframe to the object from (2).
  • Create parquet files for 3-5 adjacent MODIS tiles. Try reading and subsetting multiple parquet files at once using geopandas.
  • NetCDF analog – xr.open_mfdataset("dat_*.nc"). Trying to do something similar with Parquet.
  • GOAL: Try to work with 3-5 adjacent MODIS tiles as one continuous dataset.
  3. Try doing some basic subsetting of files using Arrow (Reading and Writing the Apache Parquet Format — Apache Arrow v11.0.0).
  • E.g., try grabbing all pixels with reflectance above a certain value.
  • Look for ways to do spatial subsetting with Arrow (a combined sketch follows below).
  4. Chat with Denis Tuesday 1pm CT / 2pm ET about new activity.
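A combined sketch of the geoparquet and Arrow steps above; the MODIS column names, tile file names, and bounding boxes are all placeholders:

```python
import glob

import geopandas as gpd
import pandas as pd
import pyarrow.dataset as pads
from shapely.geometry import box

# 1. pandas frame of MODIS pixels -> GeoDataFrame -> (geo)parquet.
df = pd.DataFrame({
    "lat": [34.1, 34.2], "lon": [-91.5, -91.4],
    "time": pd.to_datetime(["2023-02-01T10:30", "2023-02-01T10:30"]),
    "reflectance": [0.21, 0.35],
})  # stand-in for the real MODIS pixel table
gdf = gpd.GeoDataFrame(
    df, geometry=gpd.points_from_xy(df["lon"], df["lat"]), crs="EPSG:4326"
)
gdf.to_parquet("modis_tile_h10v05.parquet")   # tile name is hypothetical

# 2. Read back and crop to an arbitrary polygon inside the tile.
tile = gpd.read_parquet("modis_tile_h10v05.parquet")
aoi = box(-92.0, 34.0, -91.0, 35.0)
subset = tile[tile.intersects(aoi)]

# Several adjacent tiles as one dataset (rough analog of xr.open_mfdataset).
tiles = pd.concat(gpd.read_parquet(p) for p in glob.glob("modis_tile_*.parquet"))

# 3. Arrow-level subsetting without loading everything into memory.
modis = pads.dataset(glob.glob("modis_tile_*.parquet"), format="parquet")
bright = modis.to_table(filter=pads.field("reflectance") > 0.3)
bbox_filter = (
    (pads.field("lon") > -92.0) & (pads.field("lon") < -91.0)
    & (pads.field("lat") > 34.0) & (pads.field("lat") < 35.0)
)   # Arrow has no geometry type, so spatial subsetting here is plain lat/lon predicates
spatial = modis.to_table(filter=bbox_filter)
```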
