
Resource Preparation

Scripts for preparing renewable resources, power plants, and demand profiles.


build_cutout

Create cutouts with `atlite <https://atlite.readthedocs.io/en/latest/>`_.

For this rule to work you must have

  • installed the `Copernicus Climate Data Store <https://cds.climate.copernicus.eu>`_ ``cdsapi`` package (install with ``pip``) and
  • registered and set up your CDS API key as described `on their website <https://cds.climate.copernicus.eu/api-how-to>`_. The CDS API allows this script to download the required files automatically.

.. seealso:: For details on the weather data, read the `atlite documentation <https://atlite.readthedocs.io/en/latest/>`_. If you need help specifically with creating cutouts, the `corresponding section in the atlite documentation <https://atlite.readthedocs.io/en/latest/examples/create_cutout.html>`_ should be helpful.

Relevant Settings

.. code:: yaml

    atlite:
        nprocesses:
        cutouts:
            {cutout}:

.. seealso:: Documentation of the configuration file config.yaml at :ref:`atlite_cf`

Inputs

None

Outputs

  • cutouts/{cutout}: weather data from either the `ERA5 <https://www.ecmwf.int/en/forecasts/datasets/reanalysis-datasets/era5>`_ reanalysis weather dataset or `SARAH-2 <https://wui.cmsaf.eu/safira/action/viewProduktSearch>`_ satellite-based historic weather data with the following structure:

ERA5 cutout:

===================  ==========  ==========  =========================================================
Field                Dimensions  Unit        Description
===================  ==========  ==========  =========================================================
pressure             time, y, x  Pa          Surface pressure
-------------------  ----------  ----------  ---------------------------------------------------------
temperature          time, y, x  K           Air temperature 2 meters above the surface.
-------------------  ----------  ----------  ---------------------------------------------------------
soil temperature     time, y, x  K           Soil temperature between 1 meter and 3 meters
                                             depth (layer 4).
-------------------  ----------  ----------  ---------------------------------------------------------
influx_toa           time, y, x  Wm**-2      Top of Earth's atmosphere (TOA) incident solar radiation
-------------------  ----------  ----------  ---------------------------------------------------------
influx_direct        time, y, x  Wm**-2      Total sky direct solar radiation at surface
-------------------  ----------  ----------  ---------------------------------------------------------
runoff               time, y, x  m           `Runoff <https://en.wikipedia.org/wiki/Surface_runoff>`_
                                             (volume per area)
-------------------  ----------  ----------  ---------------------------------------------------------
roughness            y, x        m           Forecast surface roughness
                                             (`roughness length <https://en.wikipedia.org/wiki/Roughness_length>`_)
-------------------  ----------  ----------  ---------------------------------------------------------
height               y, x        m           Surface elevation above sea level
-------------------  ----------  ----------  ---------------------------------------------------------
albedo               time, y, x  --          `Albedo <https://en.wikipedia.org/wiki/Albedo>`_
                                             measure of diffuse reflection of solar radiation.
                                             Calculated from relation between surface solar radiation
                                             downwards (Jm**-2) and surface net solar radiation
                                             (Jm**-2). Takes values between 0 and 1.
-------------------  ----------  ----------  ---------------------------------------------------------
influx_diffuse       time, y, x  Wm**-2      Diffuse solar radiation at surface.
                                             Surface solar radiation downwards minus
                                             direct solar radiation.
-------------------  ----------  ----------  ---------------------------------------------------------
wnd100m              time, y, x  ms**-1      Wind speeds at 100 meters (regardless of direction)
===================  ==========  ==========  =========================================================

.. image:: /img/era5.png
    :width: 40 %

A SARAH-2 cutout can be used to amend the fields temperature, influx_toa, influx_direct, albedo, influx_diffuse of ERA5 using satellite-based radiation observations.

.. image:: /img/sarah.png
    :width: 40 %

Description

build_natura_raster

Converts vector data, also known as shapefiles (as used by geopandas/shapely), to our cutout rasters. The `Protected Planet Data <https://www.protectedplanet.net/en/thematic-areas/wdpa?tab=WDPA>`_ on protected areas is aggregated to all cutout regions.

Relevant Settings

.. code:: yaml

    renewable:
        {technology}:
            cutout:

.. seealso:: Documentation of the configuration file config.yaml at :ref:`renewable_cf`

Inputs

  • data/landcover/world_protected_areas/*.shp: shapefiles representing the world protected areas, such as the `World Database of Protected Areas (WDPA) <https://www.protectedplanet.net/en/thematic-areas/wdpa?tab=WDPA>`_.

    .. image:: /img/natura.png
        :width: 33 %

Outputs

  • resources/natura/natura.tiff: Rasterized version of the world protected areas, such as `WDPA <https://www.protectedplanet.net/en/thematic-areas/wdpa?tab=WDPA>`_ natural protection areas, to reduce computation times.

    .. image:: /img/natura.png
        :width: 33 %

Description

To operate the script you need all input files.

This script collects all shapefiles available in the folder data/landcover/* describing protected areas, merges them into one shapefile, and creates a rasterized version that covers the region described by the cutout. The output is a raster file named natura.tiff in the folder resources/natura/.

get_relevant_regions(country_shapes, offshore_shapes, natura_crs, buffer)

Merge the country_shapes and the offshore_shapes into one GeoDataFrame. Additionally add a buffer to ensure all relevant regions are included.

Returns

regions : GeoDataFrame with a unified "multipolygon"

get_fileshapes(list_paths, accepted_formats=('.shp',))

Parses the list of paths so that shape files contained in folders, if any, are included as well.

determine_region_xXyY(cutout_name, regions, natura_size, out_logging)

Determine the bounds of the analyzed regions depending on the natura_size parameter. "global" includes the entire world, "cutout" the extent of the cutout, and "countries" only includes the bounds of the requested countries and their offshore regions.

Returns

cutout_xXyY : List including the bounds

decide_bigtiff_flag(out_shape, dtype='uint8', safety_factor=1.1)

Decide whether BIGTIFF should be "YES" or "NO" based on the raster shape. BIGTIFF is required for file sizes larger than 4 GB.

Returns

str: "YES" if the estimated size is larger than 4 GB, else "NO".
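
The size check described for decide_bigtiff_flag can be illustrated with a minimal sketch. It is not the project's implementation: the function name ``sketch_bigtiff_flag``, the bytes-per-pixel lookup, and the reading of 4 GB as 4 * 1024**3 bytes are assumptions made for illustration only.

```python
import math

# Assumption: "4 GB" is read as 4 * 1024**3 bytes in this sketch.
FOUR_GB = 4 * 1024**3

# Assumption: bytes per pixel for a few common raster dtypes.
BYTES_PER_PIXEL = {"uint8": 1, "int16": 2, "uint16": 2, "float32": 4, "float64": 8}


def sketch_bigtiff_flag(out_shape, dtype="uint8", safety_factor=1.1):
    """Return "YES" if the estimated raster size exceeds 4 GB, else "NO"."""
    # Number of pixels is the product of all raster dimensions.
    n_pixels = math.prod(out_shape)
    # Inflate the raw size by the safety factor to leave headroom for overhead.
    estimated_bytes = n_pixels * BYTES_PER_PIXEL[dtype] * safety_factor
    return "YES" if estimated_bytes > FOUR_GB else "NO"
```

The safety factor errs on the side of enabling BIGTIFF for rasters whose raw size is just below the limit.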

build_renewable_profiles

Calculates for each network node the (i) installable capacity (based on land use), (ii) the available generation time series (based on weather data), and (iii) the average distance from the node for onshore wind, AC-connected offshore wind, DC-connected offshore wind and solar PV generators. For hydro generators, it calculates the expected inflows. In addition, for offshore wind it calculates the fraction of the grid connection which is under water.

Relevant settings

.. code:: yaml

    snapshots:

    atlite:
        nprocesses:

    renewable:
        {technology}:
            cutout:
            copernicus:
                grid_codes:
                distance:
                distance_grid_codes:
            natura:
            max_depth:
            max_shore_distance:
            min_shore_distance:
            capacity_per_sqkm:
            correction_factor:
            potential:
            min_p_max_pu:
            clip_p_max_pu:
            resource:
            clip_min_inflow:

.. seealso:: Documentation of the configuration file config.yaml at :ref:`snapshots_cf`, :ref:`atlite_cf`, :ref:`renewable_cf`

Inputs

  • data/copernicus/PROBAV_LC100_global_v3.0.1_2019-nrt_Discrete-Classification-map_EPSG-4326.tif: `Copernicus Land Service <https://land.copernicus.eu/global/products/lc>`_ inventory on 23 land use classes (e.g. forests, arable land, industrial, urban areas) based on UN-FAO classification. See Table 4 in the `PUM <https://land.copernicus.eu/global/sites/cgls.vito.be/files/products/CGLOPS1_PUM_LC100m-V3_I3.4.pdf>`_ for a list of all classes.

    .. image:: /img/copernicus.png
        :width: 33 %

  • data/gebco/GEBCO_2021_TID.nc: A `bathymetric <https://en.wikipedia.org/wiki/Bathymetry>`_ data set with a global terrain model for ocean and land at 15 arc-second intervals by the `General Bathymetric Chart of the Oceans (GEBCO) <https://www.gebco.net/data_and_products/gridded_bathymetry_data/>`_.

    .. image:: /img/gebco_2021_grid_image.jpg
        :width: 50 %

    Source: `GEBCO <https://www.gebco.net/data_and_products/images/gebco_2019_grid_image.jpg>`_

  • resources/natura.tiff: confer :ref:`natura`

  • resources/offshore_shapes.geojson: confer :ref:`shapes`
  • resources/.geojson: (if not offshore wind), confer :ref:`busregions`
  • resources/regions_offshore.geojson: (if offshore wind), confer :ref:`busregions`
  • "cutouts/" + config["renewable"][{technology}]['cutout']: confer :ref:`cutout`
  • networks/base.nc: confer :ref:`base`

Outputs

  • resources/profile_{technology}.nc, except hydro technology, with the following structure

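    ===================  ==========  =========================================================
    Field                Dimensions  Description
    ===================  ==========  =========================================================
    profile              bus, time   the per unit hourly availability factors for each node
    -------------------  ----------  ---------------------------------------------------------
    weight               bus         sum of the layout weighting for each node
    -------------------  ----------  ---------------------------------------------------------
    p_nom_max            bus         maximal installable capacity at the node (in MW)
    -------------------  ----------  ---------------------------------------------------------
    potential            y, x        layout of generator units at cutout grid cells inside the
                                     Voronoi cell (maximal installable capacity at each grid
                                     cell multiplied by capacity factor)
    -------------------  ----------  ---------------------------------------------------------
    average_distance     bus         average distance of units in the Voronoi cell to the
                                     grid node (in km)
    -------------------  ----------  ---------------------------------------------------------
    underwater_fraction  bus         fraction of the average connection distance which is
                                     under water (only for offshore)
    ===================  ==========  =========================================================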

  • resources/profile_hydro.nc for the hydro technology

    ===================  ================  ========================================================
    Field                Dimensions        Description
    ===================  ================  ========================================================
    inflow               plant, time       Inflow to the state of charge (in MW),
                                           e.g. due to river inflow in hydro reservoir.
    ===================  ================  ========================================================

    • profile

    .. image:: /img/profile_ts.png
        :width: 33 %
        :align: center

    • p_nom_max

    .. image:: /img/p_nom_max_hist.png
        :width: 33 %
        :align: center

    • potential

    .. image:: /img/potential_heatmap.png
        :width: 33 %
        :align: center

    • average_distance

    .. image:: /img/distance_hist.png
        :width: 33 %
        :align: center

    • underwater_fraction

    .. image:: /img/underwater_hist.png
        :width: 33 %
        :align: center

Description

This script uses atlite functions to derive hourly time series for an entire year for solar, wind (onshore and offshore), and hydro data.

This script operates at two main spatial resolutions: the resolution of the network nodes and their `Voronoi cells <https://en.wikipedia.org/wiki/Voronoi_diagram>`_, and the resolution of the cutout grid cells for the weather data. Typically the weather data grid is finer than the network nodes, so we have to work out the distribution of generators across the grid cells within each Voronoi cell. This is done by taking into account a combination of the available land at each grid cell and the capacity factor there.

The available land is determined using the Copernicus land use data, Natura2000 nature reserves and GEBCO bathymetry data.

.. image:: /img/eligibility.png
    :width: 50 %
    :align: center

To compute the layout of generators in each node's Voronoi cell, the installable potential in each grid cell is multiplied by the capacity factor at that grid cell. This is done because we assume that more generators are installed at cells with a higher capacity factor.

.. image:: /img/offwinddc-gridcell.png
    :width: 50 %
    :align: center

.. image:: /img/offwindac-gridcell.png
    :width: 50 %
    :align: center

.. image:: /img/onwind-gridcell.png
    :width: 50 %
    :align: center

.. image:: /img/solar-gridcell.png
    :width: 50 %
    :align: center

This layout is then used to compute the generation availability time series from the weather data cutout from atlite.

Two methods are available to compute the maximal installable potential for the node (p_nom_max): simple and conservative:

  • simple adds up the installable potentials of the individual grid cells. If the model comes close to this limit, then the time series may slightly overestimate production since it is assumed the geographical distribution is proportional to capacity factor.

  • conservative ascertains the nodal limit by increasing capacities proportional to the layout until the limit of an individual grid cell is reached.
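
The difference between the two methods can be made concrete with a small sketch. It is not the script's implementation: the function names are hypothetical, the inputs are plain lists of per-grid-cell values for a single node, and the layout is assumed to be the installable potential multiplied by the capacity factor, as described above.

```python
def p_nom_max_simple(potential):
    """Sum the installable potentials (MW) of all grid cells of a node."""
    return sum(potential)


def p_nom_max_conservative(potential, capacity_factor):
    """Scale the layout up until the first grid cell reaches its own limit."""
    # Assumed layout: installable potential weighted by the capacity factor.
    layout = [p * cf for p, cf in zip(potential, capacity_factor)]
    # Largest common factor by which each cell's layout may grow.
    ratios = [p / l for p, l in zip(potential, layout) if l > 0]
    if not ratios:
        return 0.0
    # The most restrictive cell sets the scale for the whole layout.
    scale = min(ratios)
    return scale * sum(layout)


# Made-up values for three grid cells belonging to one node.
potential = [100.0, 50.0, 200.0]
capacity_factor = [0.5, 0.25, 0.4]

simple_limit = p_nom_max_simple(potential)
conservative_limit = p_nom_max_conservative(potential, capacity_factor)
```

In this sketch the conservative limit can never exceed the simple one, because the scaled layout stays at or below the installable potential in every grid cell.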

get_irena_annual_hydro_generation(fn, countries)

Load annual renewable hydropower generation data from the IRENA Country sheet. Convert ISO3 country codes to ISO2 and annual generation from GWh to MWh.

Original source: https://www.irena.org/-/media/Files/IRENA/Agency/Publication/2025/Jul/IRENA_Statistics_Extract_2025H2.xlsx

Note

IRENA energy statistics dataset is available for non-commercial use only. Users are responsible for ensuring compliance with the dataset’s licensing terms.
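
The two conversions performed by get_irena_annual_hydro_generation (ISO3 to ISO2 country codes, GWh to MWh) can be sketched as follows. This is a simplified illustration, not the project's code: the function name, the dictionary-based input, and the hand-written code lookup are assumptions, and no IRENA data is included.

```python
# Assumption: a tiny hand-written ISO3 -> ISO2 lookup for illustration;
# a real script would more likely rely on a country-code library.
ISO3_TO_ISO2 = {"NGA": "NG", "BOL": "BO", "FRA": "FR"}


def sketch_convert_hydro_stats(generation_gwh_by_iso3, countries):
    """Return annual generation in MWh keyed by ISO2 code.

    generation_gwh_by_iso3 maps ISO3 codes to annual generation in GWh;
    countries is the list of ISO2 codes to keep.
    """
    result = {}
    for iso3, gwh in generation_gwh_by_iso3.items():
        iso2 = ISO3_TO_ISO2.get(iso3)
        # Skip countries that are unknown or not requested.
        if iso2 in countries:
            result[iso2] = gwh * 1_000  # 1 GWh = 1000 MWh
    return result
```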

check_cutout_completness(cf)

Check if a cutout contains missing values.

This may happen due to issues with the accessibility of ERA5 data. See https://confluence.ecmwf.int/display/CUSF/Missing+data+in+ERA5T for details.

Returns the share of cutout cells with missing data.
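
The idea of reporting a share of cells with missing data can be sketched on a plain 2D grid. This is only an illustration: the function name is hypothetical, and the nested list stands in for whatever array-like object the real function receives as ``cf``.

```python
import math


def sketch_missing_share(grid):
    """Return the fraction of cells in a 2D list of floats that are NaN."""
    # Flatten the grid into a single list of cell values.
    cells = [value for row in grid for value in row]
    if not cells:
        return 0.0
    missing = sum(1 for value in cells if math.isnan(value))
    return missing / len(cells)
```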

estimate_bus_loss(data_column, tech)

Calculates the share of buses with data loss due to flaws in the cutout data.

Returns the share of buses with missing data.

filter_cutout_region(cutout, regions)

Filter the cutout to focus on the region of interest.

rescale_hydro(plants, runoff, normalize_using_yearly, normalization_year)

Function used to rescale the inflows of the hydro capacities to match country statistics.

Parameters

plants : DataFrame
    Run-of-river plants or dams with lon, lat, countries, installed_hydro columns. The countries and installed_hydro columns are only used with normalize_using_yearly. The installed_hydro column shall be a boolean vector specifying whether that plant is currently installed and used to normalize the inflows.

runoff : xarray object
    Runoff at each bus.

normalize_using_yearly : DataFrame
    Dataframe that specifies for every country the total hydro production.

normalization_year : int
    Year used for normalization.

check_flag(d, field)

Check if a string is contained in keys of a dictionary and is either True or non-boolean
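
One way to read this description is sketched below. It is an interpretation of the sentence above rather than the project's code, and the name ``sketch_check_flag`` is hypothetical.

```python
def sketch_check_flag(d, field):
    """Return True if field is a key of d and its value is True or not a bool."""
    if field not in d:
        return False
    value = d[field]
    # True passes; False fails; any non-boolean value (e.g. a string) passes.
    return value is True or not isinstance(value, bool)
```

Under this reading, an option can be switched off with ``False``, switched on with ``True``, or switched on by supplying a non-boolean value such as a string.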

build_powerplants

Retrieves conventional powerplant capacities and locations from `powerplantmatching <https://github.com/FRESNA/powerplantmatching>`_, assigns these to buses and creates a .csv file. It is possible to amend the powerplant database with custom entries provided in data/custom_powerplants.csv.

Relevant Settings

.. code:: yaml

    electricity:
      powerplants_filter:
      custom_powerplants:

.. seealso:: Documentation of the configuration file config.yaml at :ref:`electricity`

Inputs

  • networks/base.nc: confer :ref:`base`.
  • data/custom_powerplants.csv: custom powerplants in the same format as provided by `powerplantmatching <https://github.com/FRESNA/powerplantmatching>`_ or as generated by the OSM extractor

Outputs

  • resource/powerplants.csv: A list of conventional power plants (i.e. neither wind nor solar) with fields for name, fuel type, technology, country, capacity in MW, duration, commissioning year, retrofit year, latitude, longitude, and dam information as documented in the `powerplantmatching README <https://github.com/FRESNA/powerplantmatching/blob/master/README.md>`_; additionally it includes information on the closest substation/bus in networks/base.nc.

    .. image:: /img/powerplantmatching.png
        :width: 30 %

    Source: `powerplantmatching on GitHub <https://github.com/FRESNA/powerplantmatching>`_

Description

The configuration options electricity: powerplants_filter and electricity: custom_powerplants control whether data is retrieved from the original powerplants database, from custom amendments, or from both. Both options accept `pandas.query <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.query.html>`_ expressions.

  1. Adding all powerplants from custom:

     .. code:: yaml

         powerplants_filter: false
         custom_powerplants: true

  2. Replacing powerplants in e.g. Germany by custom data:

     .. code:: yaml

         powerplants_filter: Country not in ['Germany']
         custom_powerplants: true

     or

     .. code:: yaml

         powerplants_filter: Country not in ['Germany']
         custom_powerplants: Country in ['Germany']

  3. Adding additional built year constraints:

     .. code:: yaml

         powerplants_filter: Country not in ['Germany'] and YearCommissioned <= 2015
         custom_powerplants: YearCommissioned <= 2015
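
How the two options might combine can be sketched with pandas. This is a hedged illustration, not the script's code: the function name and the two small tables are made up, and the handling of non-string values (a non-string filter leaves the database untouched, a false custom option adds nothing, ``true`` adds all custom entries) is an assumption inferred from the examples above.

```python
import pandas as pd


def sketch_combine_powerplants(database, custom, powerplants_filter, custom_powerplants):
    """Filter the database and append (optionally filtered) custom entries."""
    # A string is treated as a DataFrame.query expression on the database.
    if isinstance(powerplants_filter, str):
        database = database.query(powerplants_filter)
    # A false-like custom option means no custom plants are added.
    if not custom_powerplants:
        return database.reset_index(drop=True)
    # A string restricts the custom entries; True keeps all of them.
    if isinstance(custom_powerplants, str):
        custom = custom.query(custom_powerplants)
    return pd.concat([database, custom], ignore_index=True)


# Made-up example tables with only the columns needed for the queries.
database = pd.DataFrame(
    {"Name": ["A", "B"], "Country": ["Germany", "France"], "YearCommissioned": [2010, 2014]}
)
custom = pd.DataFrame(
    {"Name": ["C", "D"], "Country": ["Germany", "Germany"], "YearCommissioned": [2012, 2020]}
)

combined = sketch_combine_powerplants(
    database,
    custom,
    powerplants_filter="Country not in ['Germany'] and YearCommissioned <= 2015",
    custom_powerplants="YearCommissioned <= 2015",
)
```

With the third configuration above, the German database entry is dropped and only custom plants commissioned up to 2015 are appended.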
    

The custom_powerplants.csv file should follow the powerplantmatching format, with some additional considerations:

Columns required: [id, Name, Fueltype, Technology, Set, Country, Capacity, Efficiency, DateIn, DateRetrofit, DateOut, lat, lon, Duration, Volume_Mm3, DamHeight_m, StorageCapacity_MWh, EIC, projectID]

Tagging considerations for columns in the file:

  • FuelType: 'Natural Gas' has to be tagged either as 'OCGT' or 'CCGT'
  • Technology: 'Reservoir' has to be set as 'ror' if hydro powerplants are to be considered as 'Generators' and not 'StorageUnits'
  • Country: Country name has to be defined with its alpha2 code ('NG' for Nigeria, 'BO' for Bolivia, 'FR' for France, etc.)

The following assumptions were made to map custom OSM-extracted power plants to the powerplantmatching format.

  1. The benchmark PPM key values were taken as follows:

     'Fueltype': ['Hydro', 'Hard Coal', 'Natural Gas', 'Lignite', 'Nuclear', 'Oil', 'Bioenergy', 'Wind', 'Geothermal', 'Solar', 'Waste', 'Other']

     'Technology': ['Reservoir', 'Pumped Storage', 'Run-Of-River', 'Steam Turbine', 'CCGT', 'OCGT', 'Pv', 'CCGT, Thermal', 'Offshore', 'Storage Technologies']

     'Set': ['Store', 'PP', 'CHP']

  2. OSM-extracted features were mapped into PPM ones using a (quite arbitrary) set of rules:

     'coal': 'Hard Coal', 'wind_turbine': 'Onshore', 'horizontal_axis': 'Onshore', 'vertical_axis': 'Offshore', 'nuclear': 'Steam Turbine'

  3. All hydro OSM-extracted objects were interpreted as generation technologies, although ["Run-Of-River", "Pumped Storage", "Reservoir"] in PPM can belong to 'Storage Technologies', too.
  4. OSM extraction is assumed to ignore non-generation features like CHP and Natural Gas storage (in contrast to PPM).

replace_natural_gas_technology(df)

Maps gas technologies in powerplants.csv onto model-compliant carriers and replaces them accordingly.

build_demand_profiles

Creates electric demand profile csv.

Relevant Settings

.. code:: yaml

    load:
        scale:
        ssp:
        weather_year:
        prediction_year:
        region_load:

Inputs

  • networks/base.nc: confer :ref:`base`, a base PyPSA Network
  • resources/bus_regions/regions_onshore.geojson: confer :mod:`build_bus_regions`
  • load_data_paths: paths to load profiles, e.g. hourly country load profiles produced by GEGIS
  • resources/shapes/gadm_shapes.geojson: confer :ref:`shapes`, file containing the GADM shapes

Outputs

  • resources/demand_profiles.csv: the electric demand profile of each bus. The file has the snapshots as rows and the buses of the network as columns.

Description

The rule :mod:`build_demand` creates demand profiles for the buses of the network. It first builds the load paths for GEGIS outputs by combining the countries, weather year, prediction year, and SSP scenario given as input parameters. Then a function that takes the PyPSA network "base.nc", the region and GADM shape data, the countries of interest, a scale factor, and the snapshots writes a csv file called "demand_profiles.csv", which allocates the load to the buses of the network according to GDP and population.
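
The allocation of a country load to buses according to GDP and population can be sketched as a weighted split. This is an illustration under stated assumptions, not the rule's implementation: the function name, the dictionary inputs, and in particular the weighting parameter ``gdp_weight`` with its placeholder default of 0.5 are hypothetical.

```python
def sketch_allocate_load(country_load, gdp, population, gdp_weight=0.5, scale=1.0):
    """Split a country load time series across buses.

    country_load is a list of hourly values; gdp and population map each
    bus to its GDP and population. gdp_weight is a placeholder assumption.
    """
    total_gdp = sum(gdp.values())
    total_pop = sum(population.values())
    pop_weight = 1.0 - gdp_weight
    # Each bus receives a share built from its GDP and population fractions.
    shares = {
        bus: gdp_weight * gdp[bus] / total_gdp + pop_weight * population[bus] / total_pop
        for bus in gdp
    }
    # Apply the share and the global scale factor to every snapshot.
    return {
        bus: [scale * share * value for value in country_load]
        for bus, share in shares.items()
    }
```

Because the shares sum to one, the bus profiles add up to the scaled country load at every snapshot.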

get_gegis_regions(countries)

Get the GEGIS region from the config file.

Parameters

region : str The region of the bus

Returns

str The GEGIS region

get_load_paths_gegis(ssp_parentfolder, config)

Create load paths for GEGIS outputs.

The paths are created automatically from the included countries, weather year, prediction year, and SSP scenario.

Example

["/data/ssp2-2.6/2030/era5_2013/Africa.nc", "/data/ssp2-2.6/2030/era5_2013/Africa.nc"]
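
The path pattern visible in the example can be reproduced with a short sketch. The function name and its explicit arguments are hypothetical (the documented function takes a parent folder and the config instead), and the directory layout is inferred only from the example path above.

```python
from pathlib import PurePosixPath


def sketch_gegis_load_paths(parent, ssp, prediction_year, weather_year, regions):
    """Build one load file path per region from the scenario settings."""
    # Assumed layout: <parent>/<ssp>/<prediction_year>/era5_<weather_year>/<region>.nc
    base = PurePosixPath(parent) / ssp / str(prediction_year) / f"era5_{weather_year}"
    return [str(base / f"{region}.nc") for region in regions]
```

Calling it with ``"/data"``, ``"ssp2-2.6"``, ``2030``, ``2013`` and the region ``"Africa"`` reproduces the first path of the example.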

shapes_to_shapes(orig, dest)

Adopted from vresutils.transfer.Shapes2Shapes()

compose_gegis_load(load_paths, countries)

Read and merge GEGIS electricity demand data from multiple input files.

Parameters

load_paths : str or list[str]
    Paths to demand input files.

countries : str or list[str]
    Region codes used to look for the demand data.

Returns

gegis_load : pd.DataFrame
    Electricity load with a time index, containing the columns region_code, region_name, and Electricity demand.

read_demcast_load(load_paths, weather_year, countries)

Load electricity demand data from the DemandCast dataset for selected countries and a given weather year.

Parameters

load_paths : str
    Path to the parquet file with DemandCast demand data.

weather_year : int
    Weather year for which the demand profile should be extracted.

countries : str or list
    Country name or list of country names used to subset the demand dataset.

Returns

demcast_load : pd.DataFrame
    Electricity load with a time index, containing the columns region_code, region_name, and Electricity demand.

References

Kevin Steijn, Vamsi Priya Goli, Enrico Antonini (2025) "DemandCast: Global hourly electricity demand forecasting" https://arxiv.org/abs/2510.08000

build_demand_profiles(n, load_source, load_paths, regions, admin_shapes, countries, scale, weather_year, start_date, end_date, out_path)

Create csv file of electric demand time series.

Parameters

n : pypsa network

load_source : str
    Type of data source to be used for electricity demand.

load_paths : paths of the load files

regions : .geojson
    Contains bus_id of low voltage substations and bus region shapes (Voronoi cells).

admin_shapes : .geojson
    Contains subregional GDP, population and shape data.

countries : list
    List of countries taken from the config input.

scale : float
    The scale factor is multiplied with the load (1.3 = 30% more load).

start_date : parameter
    The first hour of the first day of the snapshots.

end_date : parameter
    The last hour of the last day of the snapshots.

Returns

demand_profiles.csv : csv file containing the electric demand time series