NClimGrid Importer
Executive Summary
The analysis-ready NClimGrid data made available through the CISESS ARC project and the NOAA Open Data Dissemination program allow for parallel, performant access. Importing these data in a way that leverages the features of the underlying storage format means that only the requested data are loaded into memory for analysis.
Getting Started
To get started, first check out the code available here.
This repository contains a to-be-published Python package with hooks to the NODD AWS NClimGrid repository that facilitate fast access to the data.
We will primarily be using the load_nclimgrid_data function, located under the nclimgrid_importer directory in the repository.
This function lets us specify the time periods we want to examine, the spatial resolution we want for our data, and the specific spatial areas we want to analyze.
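To make those three selections concrete before calling the importer, here is a minimal sketch that bundles them into one object and turns them into row predicates. It does not show the actual signature of load_nclimgrid_data, which is not documented here. The class NClimGridRequest, its field names, the scale label "county", the area codes, and the column names "date" and "area" are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple


@dataclass(frozen=True)
class NClimGridRequest:
    """Hypothetical bundle of a time period, a spatial scale, and areas."""

    start: date              # first day of the period of interest
    end: date                # last day of the period of interest (inclusive)
    spatial_scale: str       # hypothetical label; would pick which table to read
    areas: Tuple[str, ...]   # hypothetical area identifiers

    def __post_init__(self) -> None:
        # Reject requests that could never match any rows.
        if self.start > self.end:
            raise ValueError("start must not be after end")
        if not self.areas:
            raise ValueError("at least one area is required")

    def to_filters(self) -> List[tuple]:
        """Express the request as (column, operator, value) predicates.

        The column names "date" and "area" are placeholders, not the
        real schema of the NClimGrid files.
        """
        return [
            ("date", ">=", self.start),
            ("date", "<=", self.end),
            ("area", "in", list(self.areas)),
        ]


request = NClimGridRequest(
    start=date(2020, 1, 1),
    end=date(2020, 12, 31),
    spatial_scale="county",
    areas=("NC", "SC"),
)
filters = request.to_filters()
print(filters)
```

A list of (column, operator, value) tuples is the shape that some Parquet readers accept for filtering rows at read time, which is how a request like this can translate into loading only the data that was asked for.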
First, set up your environment. If you are working in this cloned repo, consult the Makefile for easy setup:
If not, we recommend setting up an isolated Python environment with the dependencies listed in the pyproject.toml file.
Dependencies
This project uses Poetry as its environment and package manager. Please install it globally if you have not done so already. The NClimGrid data are available most rapidly in Parquet format. To read this format, you may need to install some system libraries. On most systems, running
poetry install
should take care of it.
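After installation, a quick check that the environment can see the modules you expect may save some debugging. The sketch below uses only the standard library. The module names passed in the example call ("pyarrow" and "pandas") are assumptions about what a Parquet workflow typically needs, not a list taken from this project's pyproject.toml.

```python
from importlib.util import find_spec
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


# Assumed module names; adjust to match the project's actual dependencies.
missing = missing_modules(["pyarrow", "pandas"])
if missing:
    print("Not importable in this environment:", ", ".join(missing))
else:
    print("All checked modules are importable.")
```

If anything is reported as missing, make sure the command is run inside the environment that Poetry created.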