refactor/feat: decouple solar profile generation from Grid object, add individual calculation #247

Merged
merged 6 commits into from
Dec 18, 2021

Conversation

danielolsen

@danielolsen danielolsen commented Dec 14, 2021

Pull Request doc

Purpose

This partially addresses #213 by decoupling solar profile creation from the Grid object and its network-specific mappings. The existing blended-state-attributes function now accepts either a Grid object (for convenience) or a dataframe of solar plants plus a mapping of interconnections to state abbreviations (for maximum decoupling). A few unrelated changes also decouple other modules from the usa_tamu grid model.

This also introduces a new solar profile generation function which uses the individual attributes of each solar plant, since we have this information available if we start from the raw EIA data. The pre-existing function has been renamed from retrieve_data to retrieve_data_blended, and the new function is named retrieve_data_individual. I'm not married to any of these names, in case anyone else has better ideas.

A few more helper functions have been added to the prereise.gather.solardata.nsrdb.sam module to serve the shared logic of these two functions.

The underlying NREL data downloading module has been updated to include local caching of the downloaded NREL data files.

What the code is doing

Within nrel_api.py: we add an optional cache_dir input. When provided, it is used as a location to check for pre-downloaded files and to store newly downloaded files. A new helper function, _build_filename, translates queries to filenames, similar to the existing function that translates queries to URLs.
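
As a rough illustration of the check-then-store caching pattern described above, here is a minimal sketch. It is not the code in nrel_api.py: build_filename, fetch_with_cache, the filename format, and fake_download are all hypothetical stand-ins, and the real _build_filename may use different inputs and naming.

```python
import os
import tempfile


def build_filename(lat, lon, year):
    """Hypothetical stand-in for _build_filename: map a query to a stable filename."""
    return f"nsrdb_{lat:.2f}_{lon:.2f}_{year}.csv"


def fetch_with_cache(lat, lon, year, download, cache_dir=None):
    """Return the raw text for a query, consulting cache_dir first if one is given."""
    if cache_dir is None:
        # No cache requested: always download
        return download(lat, lon, year)
    path = os.path.join(cache_dir, build_filename(lat, lon, year))
    if os.path.exists(path):
        # Cache hit: reuse the previously downloaded file
        with open(path) as f:
            return f.read()
    # Cache miss: download, then store for next time
    text = download(lat, lon, year)
    os.makedirs(cache_dir, exist_ok=True)
    with open(path, "w") as f:
        f.write(text)
    return text


# Demonstration with a fake downloader that records how often it is called
calls = []


def fake_download(lat, lon, year):
    calls.append((lat, lon, year))
    return "header\n1,2,3\n"


with tempfile.TemporaryDirectory() as tmp:
    first = fetch_with_cache(35.0, -106.5, 2020, fake_download, cache_dir=tmp)
    second = fetch_with_cache(35.0, -106.5, 2020, fake_download, cache_dir=tmp)
```

In this sketch the second call is served from disk, so the downloader runs only once for a repeated query.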

Within sam.py:

  • There are new generalized helper functions generate_timestamps_without_leap_day and calculate_power to support the two new solar profile calculation functions.
  • The existing retrieve_data function has been renamed to retrieve_data_blended and refactored to be more flexible with its inputs. If the user passes a Grid object, then all relevant information (including zone mappings) comes from this object. Otherwise, the user must pass a solar_plant dataframe with enough columns to derive relationships between zone IDs and states/interconnects, and a mapping of interconnect names to state abbreviation sets (interconnect_to_state_abvs). This mapping can't be reliably derived from the solar_plant dataframe, since there may be states without solar plants in the given dataframe, but these states should still be considered in interconnect averages when looking at the EIA data table.
  • retrieve_data_blended (formerly retrieve_data) has also been refactored so that it no longer needs the to_reise data reshaping step: it constructs a dictionary of arrays as it goes and converts these to a dataframe as the final step of the function.
  • A new function retrieve_data_individual has been added, which expects extra data columns to be present in solar_plant, and uses these extra data columns to produce more precise profiles for each plant, considering its specific tracking type, inverter loading ratio (ILR), and tilt angle (for fixed-tilt systems).
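
To make two of the points above concrete, here is a minimal sketch of building hourly timestamps with the leap day removed, then collecting per-plant arrays in a dictionary and converting them to a dataframe in one step. This is not the code in sam.py: the function name timestamps_without_leap_day, the plant IDs, and the zero-filled arrays are placeholders, and the real generate_timestamps_without_leap_day may differ in signature and resolution.

```python
import numpy as np
import pandas as pd


def timestamps_without_leap_day(year):
    """Hourly timestamps covering `year`, with February 29 dropped if present."""
    ts = pd.date_range(
        start=f"{year}-01-01 00:00",
        end=f"{year}-12-31 23:00",
        freq=pd.Timedelta(hours=1),
    )
    is_leap_day = (ts.month == 2) & (ts.day == 29)
    return ts[~is_leap_day]


index = timestamps_without_leap_day(2020)

# Accumulate one array per plant, then build the dataframe once at the end
profiles = {}
for plant_id in ["101_PV1", "102_PV1"]:  # hypothetical plant IDs
    profiles[plant_id] = np.zeros(len(index))  # placeholder for computed power

frame = pd.DataFrame(profiles, index=index)
```

Building the dataframe once from a dictionary avoids repeated column insertion and yields a timestamps-by-plants layout directly, which is the kind of shape a separate reshaping step would otherwise have to produce.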

Within all other files: miscellaneous updates to constants or docstrings, no changes to functionality.

Testing

Unit tests seem to be failing sporadically because of a 502 Bad Gateway when trying to map buses within prereise.gather.demanddata.eia.map_ba, which I didn't touch. Not sure what's going on there.

Both functions have been tested manually on the new HIFLD grid, with both invocation methods of the blended profile generator.

Usage Example/Visuals

The second function needs extra information from EIA Form 860 Table 3.3 (see #242 (comment)), so it uses the CSVs created by the latest HIFLD-grid-generating code in the daniel/solar_pv_only_plus_index branch (message me and I'll point you to the files for testing), plus the PowerSimData branch which enables the HIFLD grid (Breakthrough-Energy/PowerSimData#566). For clarity, all usage examples use this same HIFLD grid model.

Generating blended profiles using just a Grid object:

from powersimdata import Grid
from prereise.gather.solardata.nsrdb import sam
grid = Grid("USA", "hifld")
data = sam.retrieve_data_blended(
    YOUR_EMAIL,
    YOUR_NREL_API_KEY,
    grid=grid,
    year=2020,
    cache_dir=YOUR_CACHE_DIR,
)

Generating blended profiles using a dataframe and dictionary (although both are derived from a Grid object in this example):

from powersimdata import Grid
from prereise.gather.solardata.nsrdb import sam
grid = Grid("USA", "hifld")
mi = grid.model_immutables
solar_plant = grid.plant.query("type == 'solar'").copy()
solar_plant["state_abv"] = solar_plant["zone_id"].map(mi.zones["id2abv"])
interconnect_to_state_abvs = mi.zones["interconnect2abv"]
data = sam.retrieve_data_blended(
    YOUR_EMAIL,
    YOUR_NREL_API_KEY,
    solar_plant=solar_plant,
    interconnect_to_state_abvs=interconnect_to_state_abvs,
    year=2020,
    cache_dir=YOUR_CACHE_DIR,
)
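
The description above notes that interconnect_to_state_abvs can't be reliably derived from the solar_plant dataframe. The toy sketch below shows why, under assumed data: the "interconnect" column name and the state lists are made up for illustration and are not taken from any grid model.

```python
import pandas as pd

# Toy solar plant table: no plants in Wyoming
solar_plant = pd.DataFrame(
    {
        "interconnect": ["Western", "Western", "Texas"],
        "state_abv": ["CA", "AZ", "TX"],
    }
)

# Mapping derived only from the plants that happen to exist
derived = solar_plant.groupby("interconnect")["state_abv"].apply(set).to_dict()

# Full mapping supplied by the user (illustrative subset of states)
full = {"Western": {"CA", "AZ", "WY"}, "Texas": {"TX"}}

# States that the derived mapping would silently leave out
missing = {name: states - derived.get(name, set()) for name, states in full.items()}
```

Here the derived mapping omits WY, so an interconnect-wide average computed from it would leave out a state that should still be counted.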

Generating individual profiles using additional information within EIA's Form 860 Table 3.3:

import pandas as pd
from powersimdata import Grid
from prereise.gather.solardata.nsrdb import sam
grid = Grid("USA", "hifld")
solar_plant = grid.plant.query("type == 'solar'")

# Add a helper function to coerce column types
def floatify(x):
    try:
        return float(x)
    except ValueError:
        return float("nan")

# Read additional data table, and pre-process data
extra_solar_data = pd.read_csv("3_3_Solar_Y2019_Operable.csv")
boolean_columns = ["Single-Axis Tracking?", "Dual-Axis Tracking?", "Fixed Tilt?"]
float_columns = ["DC Net Capacity (MW)", "Nameplate Capacity (MW)", "Tilt Angle"]
for col in float_columns:
    extra_solar_data[col] = extra_solar_data[col].map(floatify)

extra_solar_data.index = extra_solar_data.apply(
    lambda x: f"{x['Plant Code']}_{x['Generator ID']}", axis=1
)
for col in boolean_columns:
    # 'Y' becomes True, anything else ('N', blank, etc) becomes False
    extra_solar_data[col] = extra_solar_data[col] == "Y"

# If zero or more than one column is True, assume Fixed Tilt for the purpose of this example
bad_booleans = extra_solar_data.index[extra_solar_data[boolean_columns].sum(axis=1) != 1]
extra_solar_data.loc[bad_booleans, boolean_columns] = False
extra_solar_data.loc[bad_booleans, "Fixed Tilt?"] = True

# Join frame with additional data to the original dataframe, and process
joined = solar_plant.join(extra_solar_data, rsuffix="_extra")
data = sam.retrieve_data_individual(
    YOUR_EMAIL,
    YOUR_NREL_API_KEY,
    solar_plant=joined,
    year=2020,
    cache_dir=YOUR_CACHE_DIR,
)
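
For anyone who wants to sanity-check the pre-processing without the EIA file, here is a small sketch that runs the same steps on a made-up table. The rows are invented, and pd.to_numeric with errors="coerce" is swapped in for the floatify helper; it gives the same NaN result for blank or non-numeric strings.

```python
import pandas as pd

# Invented rows mimicking the relevant columns of the EIA table
raw = pd.DataFrame(
    {
        "Plant Code": [1, 2, 3],
        "Generator ID": ["PV1", "PV1", "A"],
        "Tilt Angle": ["25", " ", "10.5"],
        "Single-Axis Tracking?": ["Y", "N", "Y"],
        "Dual-Axis Tracking?": ["N", "N", "N"],
        "Fixed Tilt?": ["N", " ", "Y"],
    }
)
boolean_columns = ["Single-Axis Tracking?", "Dual-Axis Tracking?", "Fixed Tilt?"]

# Unparseable entries (e.g. blanks) become NaN
raw["Tilt Angle"] = pd.to_numeric(raw["Tilt Angle"], errors="coerce")

# Build a "<Plant Code>_<Generator ID>" index
raw.index = raw["Plant Code"].astype(str) + "_" + raw["Generator ID"].astype(str)

# 'Y' becomes True, anything else becomes False
raw[boolean_columns] = raw[boolean_columns].eq("Y")

# Rows with zero or several True flags fall back to Fixed Tilt
bad_booleans = raw.index[raw[boolean_columns].sum(axis=1) != 1]
raw.loc[bad_booleans, boolean_columns] = False
raw.loc[bad_booleans, "Fixed Tilt?"] = True
```

After this runs, every row has exactly one tracking flag set: plant 2 (no flags) and plant 3 (two flags) are both treated as fixed tilt.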

Time estimate

1 hour or more.

@danielolsen danielolsen self-assigned this Dec 14, 2021
@jenhagg

jenhagg commented Dec 17, 2021

Since retrieve_data is renamed, we should update the usage here. There is also some mocking in test_solar_data.py, and a notebook which calls this. Alternatively, we could keep the name and just add the retrieve_data_individual function as you have now. I'm good either way.

@danielolsen danielolsen force-pushed the daniel/decouple_solar_from_grid branch 4 times, most recently from 8990fc1 to e6c4dc1 Compare December 17, 2021 22:36
@danielolsen

Since retrieve_data is renamed, we should update the usage here. There is also some mocking in test_solar_data.py, and a notebook which calls this. Alternatively, we could keep the name and just add the retrieve_data_individual function as you have now. I'm good either way.

I've updated the function name and behavior in the two places you mentioned, plus the demo notebook.


@jenhagg jenhagg left a comment

This looks reasonable and safe to merge. I think it could be nice to have a demo notebook or some docs regarding the required inputs for the individual calculation (an issue or future PR or whatever is fine) since there is a bit of setup needed before being able to call it.

I'd also consider disabling the failing tests so we keep develop branch in a passing state.
