Data Sources and Supply Chain#
The data supply chain is built mostly from global data sources; local data sources (local governments/authorities) are used where no suitable global data source exists. A few of the data sources do not have an API or static URL and/or require form input from the user. This guideline gives the user an overall idea of the data, its sources and the download process. The use-case entries describe how the data is used in this tool and analysis only.
CODERS#
For Canadian studies.
Create a coders_api.yaml config file with the structure below:
Default_user: <your_username>  # or <other_username>
api_keys:
  <your_username>: <your_api_key>
  <other_user1>: <other_api_key1>
  <other_user2>: <other_api_key2>
  .....
  <other_userN>: <other_api_keyN>
Save it at directory:
data/downloaded_data/CODERS
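The config file described above can also be generated with a few lines of Python. The following is a minimal sketch using only the standard library; the helper names render_coders_config and write_coders_config are hypothetical, and the sketch assumes that the user entries are nested under api_keys as shown in the structure.

```python
from pathlib import Path


def render_coders_config(default_user: str, api_keys: dict) -> str:
    """Return the text of coders_api.yaml (hypothetical helper).

    Assumes the layout shown above: a Default_user entry followed by
    an api_keys mapping of username -> API key.
    """
    if default_user not in api_keys:
        raise ValueError(f"No API key supplied for default user {default_user!r}")
    lines = [f"Default_user: {default_user}", "api_keys:"]
    # One indented "user: key" line per entry, in insertion order.
    lines += [f"  {user}: {key}" for user, key in api_keys.items()]
    return "\n".join(lines) + "\n"


def write_coders_config(default_user, api_keys,
                        directory="data/downloaded_data/CODERS") -> Path:
    """Write coders_api.yaml into the directory named in this guideline."""
    target_dir = Path(directory)
    target_dir.mkdir(parents=True, exist_ok=True)
    path = target_dir / "coders_api.yaml"
    path.write_text(render_coders_config(default_user, api_keys), encoding="utf-8")
    return path
```

Calling write_coders_config("<your_username>", {"<your_username>": "<your_api_key>"}) from the project root would place the file where the tool expects it.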
1. Demographics#
1.1 - Population#
Tag: Local
Authority: Statistics Canada
License : Data obtained through this application is distributed under the Canadian Open Government License.
In-short : worldwide, royalty-free, perpetual, non-exclusive licence to Copy, modify, publish, translate, adapt, distribute or otherwise use the Information in any medium, mode or format for any lawful purpose
Data: Population projection 2021-2046
Resolution: Annual population for regional districts (sub-provincial).
Description: Historical data up to 2023 and projection for 2024-2046.
Use-case : To mimic the load-centers in Canada at sub-provincial level (regional districts of province)
Supply_chain_mode : Manual Download from the portal
Instruction: Download manually from the portal, following the steps given in data_sources.yml.
2. Climate and Weather Data#
2.1 Cutout from ERA5#
Tag: Global
Authority: Copernicus Climate Change Service (C3S), ECMWF, EU.
License : free of charge, worldwide, non-exclusive, royalty free and perpetual.
Caution: Attribution to C3S must be included.
Description: Solar influx, wind speed (u and v components at 100 m), and land elevation (height) time-series data for weather years.
Resolution: Hourly time-series on a 0.25 arc degree (~30 km) grid.
Use-case :
A cutout is one of the basis for this work and associated calculations.
We use atlite both to create the cutout and to download the ERA5 data for it. The cutout is saved as a NetCDF (.nc) file. NetCDF is a file format widely used for storing large scientific datasets, often involving time-series data, especially in climate and weather research. Please check this resource for more about cutout preparation and customization.
In this analysis, we download ERA5 data on demand for a specified region, e.g. a BC region cutout. Atlite also works with other data sources, e.g. SARAH-2 for a high-resolution solar dataset.
NREL provides datasets with higher spatio-temporal resolution for renewable resources, but they do not cover all global regions. Atlite currently does not support NREL's NSRDB for solar or WRDB for wind. Users can follow this thread for updates.
Atlite does not support ERA5 forecast data yet. Users can follow this thread for updates.
Please go through this documentation and example usage of cutout to learn further.
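As an illustration of the cutout workflow described above, the sketch below prepares the arguments for an ERA5 cutout and hands them to atlite. The helper names, the output file name and the bounding box are assumptions made for this example (the coordinates are only a rough envelope around BC, not the values used by this tool); atlite is imported lazily so that the argument helper can be used without it.

```python
def era5_cutout_kwargs(path, lon_bounds, lat_bounds, time):
    """Collect keyword arguments for atlite.Cutout (hypothetical helper).

    lon_bounds / lat_bounds are (min, max) pairs in degrees; they are
    sorted here so the slices always run from low to high.
    """
    west, east = sorted(lon_bounds)
    south, north = sorted(lat_bounds)
    return {
        "path": path,            # NetCDF file the cutout is saved to
        "module": "era5",        # download from ERA5 via the CDS API
        "x": slice(west, east),  # longitude range
        "y": slice(south, north),  # latitude range
        "time": time,            # e.g. a weather year such as "2021"
    }


def build_cutout(**kwargs):
    """Create and prepare the cutout; needs atlite and a configured CDS API key."""
    import atlite  # imported lazily, only when a cutout is actually built

    cutout = atlite.Cutout(**kwargs)
    cutout.prepare()  # triggers the ERA5 download and writes the .nc file
    return cutout


# Illustrative values only: approximate BC envelope and a single weather year.
bc_kwargs = era5_cutout_kwargs(
    path="bc_era5_2021.nc",
    lon_bounds=(-139.1, -114.0),
    lat_bounds=(48.3, 60.0),
    time="2021",
)
```

Running build_cutout(**bc_kwargs) then submits the request to the CDS servers, which can take a while for a full year of hourly data.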
Supply_chain_mode : Automated via cdsapi (current version is cds-beta)
Note: From Sep 26, 2024 onwards the ERA5 dataset will only be supplied via cds-beta or ads-beta (source)
Before the data can be downloaded from ERA5, it has to be processed by the CDS servers; this may take a while depending on the volume of data requested. This only works if you have registered and set up your CDS API key beforehand, as described.
For Linux users, please proceed as follows:
Steps to install the Copernicus Climate Data Store cdsapi package on your local Linux/WSL machine:
Step 1: Set up the CDS API personal access token.
Step 2: Install the CDS API client. Note: atlite currently supports cdsapi <=0.7.2.
Your data pipeline for creating the ERA5 cutout is now set up.
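The token setup in the first step usually amounts to a small .cdsapirc file in the home directory containing a url line and a key line. The sketch below writes such a file with the standard library; the helper names are hypothetical, and the API URL must be copied from your CDS account page rather than from this example. The client itself can then be installed with pip, pinned to cdsapi<=0.7.2 as noted above.

```python
from pathlib import Path


def render_cdsapirc(url: str, key: str) -> str:
    """Return the two-line content of a .cdsapirc file (hypothetical helper)."""
    if not url or not key:
        raise ValueError("Both the API url and the personal access token are required")
    return f"url: {url}\nkey: {key}\n"


def write_cdsapirc(url: str, key: str, path=None) -> Path:
    """Write the credentials file, by default to ~/.cdsapirc."""
    target = Path(path) if path is not None else Path.home() / ".cdsapirc"
    target.write_text(render_cdsapirc(url, key), encoding="utf-8")
    target.chmod(0o600)  # the file holds a secret, so restrict it to the owner
    return target
```

For example, write_cdsapirc("<api_url_from_your_CDS_profile>", "<your_personal_access_token>") creates the file that cdsapi reads on its next request.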
2. Geospatial Raster/Vectors#
2.1 Boundaries from GADM#
Tag: Global
This data can also be sourced locally, e.g. for Canada from the Canadian open dataset.
Other global data sources:
OpenStreetMap via the pyrosm library.
World Administrative Boundaries - Countries and Territories by opendatasoft (https://public.opendatasoft.com/explore/dataset/world-administrative-boundaries/export)
License : freely available for academic use and other non-commercial use
Authority: University of California, Berkeley, Museum of Vertebrate Zoology and the International Rice Research Institute (2012)
Description: GADM, the Database of Global Administrative Areas, is a high-resolution database of country administrative areas, with a goal of "all countries, at all levels, at any time period."
Use-case : This boundary data is processed at admin level 2 (i.e. sub-provincial) to extract the geospatial boundaries of the Regional Districts (RDs), e.g. the 28 RDs inside BC, Canada. The boundaries are primarily used for mapping spatial grid cells/points, for regional overlay visuals, and for clipping points of interest at the regional level during clustering.
Supply_chain_mode : Automated via pygadm library [supports GADM data V4.1]
2.2 Conservation and Protected Lands#
Tag: Local
GAEZ also provides similar global data under its Land Resources (LR) theme, as raster data with 7 classes. We use this data as a mandatory filter in the process. However, the local (pan-Canadian) data contains more detailed information on local government and Indigenous protected areas. In both cases, the user can control which classes are excluded and can apply a buffer around the excluded areas.
License : Data obtained through this application is distributed under the Canadian Open Government License.
In-short : worldwide, royalty-free, perpetual, non-exclusive licence to Copy, modify, publish, translate, adapt, distribute or otherwise use the Information in any medium, mode or format for any lawful purpose
Authority: Environment and Climate Change Canada (ECCC)
Data: Canadian Protected and Conserved Areas Database (CPCAD) | 2023-12-31
downloadble_source_url: https://data-donnees.az.ec.gc.ca/api/file?path=%2Fspecies%2Fprotectrestore%2Fcanadian-protected-conserved-areas-database%2FDatabases%2FProtectedConservedArea_2022.gdb.zip
Resolution: Spatial boundaries vector data
Description: CPCAD is the authoritative source of data on protected and conserved areas in Canada. The database consists of the most up-to-date spatial and attribute data on marine and terrestrial protected areas in all governance categories recognized by the International Union for Conservation of Nature (IUCN), as well as other effective area-based conservation measures (OECMs, or conserved areas) across the country. Indigenous Protected and Conserved Areas (IPCAs) are also included if they are recognized as protected or conserved areas. CPCAD adheres to national reporting standards and is available to the public.
Use-case : These specific areas (raster cells/vectors) are excluded from the analysis when considering sites. The modeller can also apply a buffer around the exclusion areas.
Supply_chain_mode : Automated via specific url download. Has dependency on source_url.
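A download of this kind can be sketched with the standard library alone. In the example below, the URL is the one listed above, while the helper names and the idea of skipping the download when the file already exists are assumptions for illustration, not a description of how the tool does it. Because the remote file name sits inside the URL's path query parameter, it is parsed out before saving.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import parse_qs, urlparse
from urllib.request import urlretrieve

CPCAD_URL = (
    "https://data-donnees.az.ec.gc.ca/api/file?path=%2Fspecies%2Fprotectrestore"
    "%2Fcanadian-protected-conserved-areas-database%2FDatabases"
    "%2FProtectedConservedArea_2022.gdb.zip"
)


def filename_from_url(url: str) -> str:
    """Derive the local file name from a download URL (hypothetical helper).

    Prefers the 'path' query parameter (as used by the CPCAD URL) and
    falls back to the URL path itself.
    """
    parsed = urlparse(url)
    query_paths = parse_qs(parsed.query).get("path")  # values are percent-decoded
    remote_path = query_paths[0] if query_paths else parsed.path
    name = PurePosixPath(remote_path).name
    if not name:
        raise ValueError(f"Cannot derive a file name from {url!r}")
    return name


def download_once(url: str, directory) -> Path:
    """Download the file into directory unless it is already there."""
    target_dir = Path(directory)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / filename_from_url(url)
    if not target.exists():
        urlretrieve(url, target)  # fails if the source URL has moved
    return target
```

Since the supply chain depends on the source URL, a failed download is the first thing to check when the file is missing.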
Energy and Emission (exogenous)#
Community Energy and Emissions Inventory (CEEI)#
Tag: Local
License : Data obtained through this application is distributed under the Canadian Open Government License.
Authority: Community Energy and Emissions Inventory (CEEI) (https://www2.gov.bc.ca/gov/content/environment/climate-change/data/ceei)
Data: CEEI data up to 2021
Resolution: Annual total for Regional Districts, for different sectors and different end-use demands.
Description: The Community Energy and Emissions Inventory (CEEI) provides community-level greenhouse gas (GHG) emissions and energy consumption estimates for communities across BC. The data covers the buildings, municipal solid waste, and on-road transportation sectors for 161 municipalities, 28 regional districts, and 1 region (Stikine).
Buildings : The data is provided by utility companies and includes the amount of electricity and natural gas used by residential, commercial and some industrial buildings.
Transportation : Community-level data on greenhouse gas emissions from on-road transportation.
Waste : Estimates of community greenhouse gas emissions based on historical annual tonnes of waste disposed of at regional district landfills.
More about data methods and inputs
Use-case : Used for load-center estimation at the regional district level. Also used to estimate Battery Energy Storage System (BESS) size and the required discharge hours.
Supply_chain_mode : Automated via specific url download. Check config file for specific url dependencies.
Information Template#
Tag:
License :
Authority:
Data: title
Resolution:
Description:
Use-case :
Supply_chain_mode :
Instruction: