Pyleoclim Utilities (pyleoclim.utils)

Pyleoclim makes extensive use of functions from NumPy, pandas, SciPy, and scikit-learn. Note that some default parameter values of these functions have been changed to values better suited to paleoclimate datasets.

Causality

granger_causality

Estimate Granger causality

liang_causality

Estimate Liang causality

Correlation

corr_sig

Estimates Pearson's correlation and its associated significance between two time series, applicable to cases where the standard assumption of independence breaks down. Three methods are currently implemented: a t-test with a heuristic correction for the degrees of freedom, Monte Carlo simulations based on an AR(1) model (isopersist), and phase randomization (isospectral).
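To make the isospectral idea concrete, here is a minimal NumPy sketch of a phase-randomization significance test. It is not Pyleoclim's implementation: the names `phase_randomize` and `isospectral_corr_test`, the `nsim` and `seed` parameters, and the two-sided p-value convention are assumptions made for illustration, and the series are assumed to be evenly spaced.

```python
import numpy as np

def phase_randomize(y, rng):
    """Surrogate with the same amplitude spectrum as y but random phases."""
    y = np.asarray(y, dtype=float)
    n = y.size
    spec = np.fft.rfft(y - y.mean())
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spec.shape)
    phases[0] = 0.0              # keep the zero-frequency term real
    if n % 2 == 0:
        phases[-1] = 0.0         # the Nyquist term must stay real for even n
    surrogate = np.fft.irfft(np.abs(spec) * np.exp(1j * phases), n=n)
    return surrogate + y.mean()

def isospectral_corr_test(x, y, nsim=1000, seed=0):
    """Pearson r and a two-sided p-value from phase-randomized surrogates of y."""
    rng = np.random.default_rng(seed)
    r_obs = np.corrcoef(x, y)[0, 1]
    r_null = np.array(
        [np.corrcoef(x, phase_randomize(y, rng))[0, 1] for _ in range(nsim)]
    )
    # Fraction of surrogate correlations at least as extreme as the observed one
    p_value = np.mean(np.abs(r_null) >= np.abs(r_obs))
    return r_obs, p_value
```

Because each surrogate keeps the power spectrum (and hence the autocorrelation) of the original series, the null distribution reflects the persistence that makes the standard t-test too liberal.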

fdr

False Discovery Rate, as per the method of Benjamini and Hochberg [1995]
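The Benjamini-Hochberg step-up procedure can be sketched in a few lines of NumPy. This is an illustration of the published procedure, not Pyleoclim's code; the function name `benjamini_hochberg` and the parameter `q` (the target false discovery rate) are hypothetical.

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """Boolean mask of the hypotheses rejected at false discovery rate q."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    # The i-th smallest p-value is compared with q * i / m
    thresholds = q * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])   # largest rank that passes
        reject[order[: k + 1]] = True      # reject it and every smaller p-value
    return reject
```

Note the step-up behaviour: once the largest passing rank is found, all smaller p-values are rejected as well, even if some of them individually exceed their own threshold.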

Decomposition

Methods used for decomposing timeseries into orthogonal components.

Filter

Filtering functions

Savitzky-Golay filter

Smooth (and optionally differentiate) data with a Savitzky-Golay filter

Butterworth filter

Applies a Butterworth filter with cutoff frequency fc, with optional padding

Mapping

This module contains mapping functions based on cartopy (https://scitools.org.uk/cartopy/docs/latest/)

map

Maps records according to some criteria (e.g., proxy type, interpretation)

Plotting

The functions contained in this module rely heavily on matplotlib (https://matplotlib.org). See here for details. If you plan to plot without using the functions in the ui module, we recommend using matplotlib directly.

However, the following functions can be used to manipulate the default style and save settings.

showfig

Shows the figure

savefig

Saves the figure to a user specified path

set_style

Modifies the visualization style

Spectral

This module contains several spectral methods applicable to paleoclimate data

welch

Estimate power spectral density using Welch’s method

periodogram

Estimate power spectral density using the periodogram method
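For orientation, a basic one-sided periodogram for evenly spaced data can be written directly with NumPy's FFT. This is a sketch only: the name `simple_periodogram`, the `dt` parameter (sampling interval), and the normalization (density scaling, so that the PSD integrates to the variance) are assumptions, and no windowing or detrending beyond mean removal is applied.

```python
import numpy as np

def simple_periodogram(y, dt=1.0):
    """One-sided periodogram of an evenly spaced series sampled every dt."""
    y = np.asarray(y, dtype=float)
    n = y.size
    spec = np.fft.rfft(y - y.mean())
    freq = np.fft.rfftfreq(n, d=dt)
    psd = (np.abs(spec) ** 2) * dt / n
    # Fold negative frequencies onto positive ones: double every bin
    # except zero frequency and, for even n, the Nyquist bin
    psd[1:] *= 2.0
    if n % 2 == 0:
        psd[-1] /= 2.0
    return freq, psd
```

With this scaling, `psd.sum() / (n * dt)` recovers the variance of the series, which is a convenient sanity check.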

mtm

Estimate power spectral density using the multi-taper method

lomb_scargle

Estimate power spectral density using the Lomb-Scargle method

wwz_psd

Estimate power spectral density using the weighted wavelet Z-transform (WWZ) method

Tsmodel

This module generates simulated time series that can be used for significance testing.

ar1_sim

Produces p realizations of an AR(1) process of length n with lag-1 autocorrelation g calculated from y and (if provided) t
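As an illustration of the evenly spaced case, the sketch below estimates the lag-1 autocorrelation from y and generates p surrogates with matching variance. It is not Pyleoclim's implementation: the names `lag1_autocorr` and `ar1_surrogates`, the `seed` parameter, and the Gaussian innovations are assumptions, and the unevenly spaced case (when t is provided) is not handled.

```python
import numpy as np

def lag1_autocorr(y):
    """Sample lag-1 autocorrelation of an evenly spaced series."""
    y = np.asarray(y, dtype=float)
    y = y - y.mean()
    return np.sum(y[1:] * y[:-1]) / np.sum(y * y)

def ar1_surrogates(y, p=100, seed=0):
    """Return an (n, p) array of AR(1) series matching y's lag-1 autocorrelation."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    n = y.size
    g = lag1_autocorr(y)
    sigma = y.std()
    # Innovation standard deviation chosen so the process variance equals sigma**2
    innov_sd = sigma * np.sqrt(1.0 - g**2)
    out = np.empty((n, p))
    out[0] = rng.normal(0.0, sigma, size=p)     # start from the stationary distribution
    for i in range(1, n):
        out[i] = g * out[i - 1] + rng.normal(0.0, innov_sd, size=p)
    return out
```

Each column is one realization; correlating a target series against every column gives the kind of null distribution used by the isopersist method.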

colored_noise

Generate colored noise with a given scaling factor alpha

colored_noise_2regimes

Generate colored noise with two regimes, given scaling factors alpha1 and alpha2

Wavelet

Functions for wavelet analysis. Includes some pre-processing and post-processing functions for spectral and wavelet analysis described here.

wwz

Weighted wavelet amplitude (WWA) for unevenly-spaced data

cwt

Continuous wavelet transform for evenly spaced data

xwc

Cross-wavelet analysis for unevenly-spaced data.

Tsutils

This module contains pre-processing functions for time series analysis.

simple_stats

Computes the mean, median, min, max, standard deviation and interquartile range of a timeseries

bin

Bin the values into evenly-spaced bins
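A simple way to picture binning is to average all values whose time falls within each fixed-width interval. The sketch below assumes bin means as the statistic and starts the first bin at the earliest time; the name `bin_means` and the `bin_size` parameter are hypothetical and do not reflect Pyleoclim's signature.

```python
import numpy as np

def bin_means(t, y, bin_size):
    """Average y within consecutive bins of width bin_size along t."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    t0 = t.min()
    nbins = max(1, int(np.ceil((t.max() - t0) / bin_size)))
    # Bin index of each sample; the last bin also includes its right edge
    idx = np.clip(((t - t0) // bin_size).astype(int), 0, nbins - 1)
    counts = np.bincount(idx, minlength=nbins)
    sums = np.bincount(idx, weights=y, minlength=nbins)
    means = np.full(nbins, np.nan)          # empty bins stay NaN
    filled = counts > 0
    means[filled] = sums[filled] / counts[filled]
    centers = t0 + bin_size * (np.arange(nbins) + 0.5)
    return centers, means
```

Empty bins are returned as NaN here so that gaps in the record remain visible rather than being silently interpolated.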

gkernel

Coarsen time resolution using a Gaussian Kernel

grid_properties

Establishes the grid properties of a numerical array

interp

Interpolation function based on scipy.interpolate.interp1d (https://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.interp1d.html)

on_common_axis

Places two timeseries on a common time axis

standardize

Standardizes a timeseries

ts2segments

Chop a timeseries into several segments based on gap detection
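One plausible gap-detection rule is to split wherever a time step is much larger than the typical step. The sketch below uses a multiple of the median step as the threshold; the name `split_on_gaps`, the `factor` parameter, and the choice of the median are assumptions for illustration, and the input is assumed to be sorted in time.

```python
import numpy as np

def split_on_gaps(t, y, factor=10.0):
    """Split (t, y) into segments wherever the time step exceeds factor * median step."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    dt = np.diff(t)
    # Index of the first sample after each detected gap
    breaks = np.nonzero(dt > factor * np.median(dt))[0] + 1
    return list(zip(np.split(t, breaks), np.split(y, breaks)))
```

Each returned pair is one contiguous segment, which can then be analysed separately with methods that require near-regular sampling.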

clean_ts

Remove NaNs in the time series and sort it in ascending time order

dropna

Remove NaNs

sort_ts

Sort time values in ascending order

reduce_duplicated_timestamps

Reduce duplicated timestamps in a timeseries by averaging the values
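Averaging over duplicated timestamps can be sketched with `np.unique` and `np.bincount`. The function name `average_duplicates` is hypothetical, and the output is sorted by time as a side effect of `np.unique`.

```python
import numpy as np

def average_duplicates(t, y):
    """Collapse repeated timestamps by averaging their values."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    # inverse maps every sample to the index of its unique timestamp
    t_unique, inverse = np.unique(t, return_inverse=True)
    sums = np.bincount(inverse, weights=y)
    counts = np.bincount(inverse)
    return t_unique, sums / counts
```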

annualize

Annualize a time series whose time resolution is finer than 1 year

gaussianize

Maps a (proxy) timeseries to a Gaussian distribution
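One common way to map a series to a Gaussian distribution is a rank-based inverse normal transform: replace each value by the standard normal quantile of its rank. The sketch below illustrates that idea only; it may differ from the method Pyleoclim uses, and the name `rank_gaussianize` and the `rank / (n + 1)` plotting position are assumptions.

```python
import numpy as np
from statistics import NormalDist

def rank_gaussianize(y):
    """Map y to standard normal quantiles while preserving rank order."""
    y = np.asarray(y, dtype=float)
    n = y.size
    ranks = np.argsort(np.argsort(y)) + 1    # ranks 1..n (ties broken by position)
    cdf = ranks / (n + 1.0)                  # strictly inside (0, 1)
    inv_cdf = NormalDist().inv_cdf           # standard normal quantile function
    return np.array([inv_cdf(c) for c in cdf])
```

The transform is monotonic, so the ordering of the values is unchanged while skewness and heavy tails are removed.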

gaussianize_single

Transforms a single (proxy) timeseries to a Gaussian distribution

detrend

Applies linear, constant, low-pass filter, or decomposition-based detrending

detect_outliers

Detect outliers in a timeseries

remove_outliers

Remove outliers in a timeseries

is_evenly_spaced

Detect whether a timeseries is evenly spaced in time
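A check of this kind typically compares the time increments against their mean within some tolerance. The sketch below uses a relative tolerance; the name `evenly_spaced` and the `tol` parameter and its default are hypothetical.

```python
import numpy as np

def evenly_spaced(t, tol=1e-4):
    """True if all time steps agree with their mean to within a relative tolerance."""
    dt = np.diff(np.asarray(t, dtype=float))
    if dt.size == 0:
        return True                          # fewer than two points: nothing to compare
    mean_dt = dt.mean()
    return bool(np.all(np.abs(dt - mean_dt) <= tol * abs(mean_dt)))
```

A tolerance is needed because time axes stored as floating-point numbers are rarely exactly regular.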

Lipdutils

This module contains functions to manipulate LiPD files and automate data transformation whenever possible. These functions are used throughout Pyleoclim but are not meant to be called directly. A list of these functions can be found here.

The most relevant functions concern querying the LinkedEarth wiki. The first five functions below can be used to get relevant query terms.

whatArchives

Query the names of all ArchiveTypes from the LinkedEarth Ontology

whatProxyObservations

Query the names of all ProxyObservations from the LinkedEarth Ontology

whatProxySensors

Query the names of all ProxySensors from the LinkedEarth Ontology

whatInferredVariables

Query the names of all InferredVariables from the LinkedEarth Ontology

whatInterpretations

Query the names of all Interpretations from the LinkedEarth Ontology.

queryLinkedEarth

Query the LinkedEarth wiki for datasets.

jsonutils

This module converts Pyleoclim objects to and from JSON files. This is useful for obtaining human-readable output and for keeping the results of an analysis.

PyleoObj_to_json

Saves a Pyleoclim object (e.g., Series, PSD, Scalogram) to a JSON file

json_to_Series

Load a Pyleoclim Series object from a JSON file

json_to_PSD

Load a Pyleoclim PSD object from a JSON file

json_to_Scalogram

Load a Pyleoclim Scalogram object from a JSON file