Custom Data Integration

PyPSA-Earth allows users to extend the model with custom data to better reflect local or specialized scenarios. This section explains how to integrate custom datasets so that they load cleanly and model runs remain reproducible.

Overview

Custom data can be used to replace or supplement the default datasets provided by the model. Supported types include:

Power grids and lines
Power plants
Heat demand
Industry demand and databases
Transport demand
Water costs
Hydrogen underground storage
Gas networks
Export ports
Airports

Note

All custom data can remain private if desired. Users are not required to share their data publicly.

Configuration

The config.default.yaml file controls which custom data options are enabled. Each option is a boolean and can be set to true or false. When an option is set to true, the model expects the corresponding files in the specified directories.

custom_data:
  heat_demand: false
  industry_demand: false
  industry_database: false
  transport_demand: false
  water_costs: false
  h2_underground: false
  add_existing: false
  custom_sectors: false
  gas_network: false
  export_ports: false
  airports: false
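
To see at a glance which options are switched on, the custom_data block can be inspected once the configuration has been loaded. The following is a minimal sketch and not part of PyPSA-Earth: it assumes the configuration has already been parsed into a Python dictionary (for example with a YAML loader), and the helper name `enabled_custom_data` is hypothetical. Two options are set to true purely for illustration.

```python
# Sketch only: `enabled_custom_data` is a hypothetical helper, not a PyPSA-Earth function.
# The dictionary mirrors the custom_data block above; gas_network and airports
# are switched on here for illustration.
config = {
    "custom_data": {
        "heat_demand": False,
        "industry_demand": False,
        "industry_database": False,
        "transport_demand": False,
        "water_costs": False,
        "h2_underground": False,
        "add_existing": False,
        "custom_sectors": False,
        "gas_network": True,
        "export_ports": False,
        "airports": True,
    }
}


def enabled_custom_data(config):
    """Return the names of all custom_data options set to true, sorted by name."""
    options = config.get("custom_data", {})
    # Compare against True explicitly so that stray strings such as "false"
    # are not mistaken for an enabled option.
    return sorted(name for name, enabled in options.items() if enabled is True)


print(enabled_custom_data(config))
```

With the dictionary above, the call returns `['airports', 'gas_network']`. A missing custom_data block yields an empty list rather than an error.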

Required File Locations and Formats

Custom Data Type	Required File Path
Gas network	`resources/custom_data/pipelines.csv`
Export ports	`data/custom/export_ports.csv`
Airports	`data/custom/airports.csv`
Power plants	`data/custom_powerplants.csv`
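
Before starting a run, it can help to confirm that each enabled option has its file in place. The sketch below is one possible way to do that, under stated assumptions: the pairing of custom_data option names with the paths in the table above is assumed for illustration, and `missing_custom_files` is a hypothetical helper rather than a PyPSA-Earth function. The power plants file is left out because the configuration block shown earlier does not say which option enables it.

```python
from pathlib import Path

# Paths copied from the table above. Pairing them with these custom_data
# option names is an assumption made for this sketch.
REQUIRED_FILES = {
    "gas_network": "resources/custom_data/pipelines.csv",
    "export_ports": "data/custom/export_ports.csv",
    "airports": "data/custom/airports.csv",
}


def missing_custom_files(custom_data, root="."):
    """Return the required paths of enabled options that do not exist under root."""
    root = Path(root)
    missing = []
    for option, rel_path in REQUIRED_FILES.items():
        # Only options that are enabled need their file to be present.
        if custom_data.get(option) and not (root / rel_path).is_file():
            missing.append(rel_path)
    return missing


if __name__ == "__main__":
    example = {"gas_network": True, "airports": False}
    for path in missing_custom_files(example):
        print(f"missing: {path}")
```

Here `root` is meant to be the repository directory from which the relative paths are resolved. An empty result means every enabled option in the mapping has its file.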

Note

Custom datasets should follow the filename conventions specified by PyPSA-Earth to ensure proper integration. See the demand section for details.

Reference Data Sources

For guidance on sourcing data, refer to the following table:

Name	Link	Sector	Global	API
IEA	https://www.iea.org/countries/	All	Yes	?
WRI	https://www.wri.org/	All	Yes	?
OECD	https://stats.oecd.org/	All	Yes	?

Note

This table is continuously updated to include new global and country-level datasets.

The PyPSA-Earth Status stream is also a valuable resource for sourcing and validating custom data. It provides up-to-date information on available datasets and can be used to cross-check custom inputs against known reference values.

Best Practices

Keep custom datasets in the recommended directories to avoid conflicts.
Maintain the same format and prescribed filenames as the default CSV/NetCDF files for seamless integration.
Document any assumptions or modifications made in custom data for future reproducibility.
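
One lightweight way to document assumptions is to store a small provenance record alongside each custom file. The sketch below illustrates the idea and is not a PyPSA-Earth feature: the helper `write_provenance`, the `.meta.json` suffix, and the record fields are all hypothetical. The SHA-256 checksum makes it possible to tell later whether the data file changed after it was documented.

```python
import hashlib
import json
from datetime import date
from pathlib import Path


def write_provenance(data_path, source, assumptions):
    """Write <file>.meta.json next to a custom data file and return its path.

    Hypothetical helper: the sidecar name and fields are illustrative only.
    """
    data_path = Path(data_path)
    record = {
        "file": data_path.name,
        # Checksum of the exact bytes that were documented.
        "sha256": hashlib.sha256(data_path.read_bytes()).hexdigest(),
        "source": source,
        "assumptions": list(assumptions),
        "recorded_on": date.today().isoformat(),
    }
    meta_path = data_path.with_name(data_path.name + ".meta.json")
    meta_path.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return meta_path
```

A call such as `write_provenance("data/custom/airports.csv", "national aviation registry", ["heliports removed"])` would create `data/custom/airports.csv.meta.json`; the source and assumption strings here are invented examples. If sidecar files in the data directory are undesirable, the same record can be kept in a separate notes folder instead.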

Additional Notes

If using GADM clustering, ensure at least one bus per administrative region. Missing buses can be added using a custom CSV created with centroids matching the substation GeoJSON format.
Private datasets do not need to be shared publicly.
Users are encouraged to contribute improvements back to the repository following contribution guidelines. See the how to contribute guide for details.

Usage Instructions

Enable the desired options in config.default.yaml.
Place required custom CSV/NetCDF files in the specified directories.
Integrate demand/renewable time series following the instructions.
Run PyPSA-Earth; the model will automatically use the custom datasets.