Description of the Public API

Here we document the Public API.

pairstat.vsf_props(pos_a, pos_b, *args, val_a=<object object>, val_b=<object object>, vel_a=<object object>, vel_b=<object object>, dist_bin_edges=<object object>, weights_a=None, weights_b=None, stat_kw_pairs=[('variance', {})], longitudinal=False, nproc=1, force_sequential=False, postprocess_stat=True)

Calculates properties pertaining to the vector structure function for pairs of points. It’s commonly used for the velocity structure function in particular.

If you set both pos_b and val_b to None, then the structure function properties will only be computed for unique pairs of the points specified by pos_a and val_a.
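The distinction between the two modes can be sketched in pure Python. This is only an illustration of the pair counting, not pairstat's implementation: the helper names `auto_pairs` and `cross_pairs` are hypothetical, and the assumption that supplying pos_b pairs every point of set a with every point of set b is inferred from the paragraph above rather than stated by it.

```python
from itertools import combinations, product


def auto_pairs(n_a):
    """Index pairs (i, j) with i < j, so each unordered pair appears once."""
    return list(combinations(range(n_a), 2))


def cross_pairs(n_a, n_b):
    """Index pairs (i, j) matching every point of set a with every point of set b.

    Assumption: this is what happens when pos_b and val_b are provided.
    """
    return list(product(range(n_a), range(n_b)))


print(len(auto_pairs(4)))      # 4 * 3 / 2 = 6 unique pairs
print(len(cross_pairs(4, 3)))  # 4 * 3 = 12 cross pairs
```

For N points, the unique-pair mode visits N(N-1)/2 pairs, which is why the cost of these calculations grows quickly with the number of points.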

Parameters:

pos_a, pos_b (array_like): 2D arrays holding the positions of each point. The length of axis 0 is the number of spatial dimensions and must be the same for both arrays. The length of axis 1 can differ between the arrays.
val_a, val_b (array_like): 2D arrays holding the vector values at each point. The shape of val_a should match pos_a and the shape of val_b should match pos_b.
dist_bin_edges (array_like): 1D array of monotonically increasing values that represent edges for distance bins. A distance x lies in bin i if it lies in the interval dist_bin_edges[i] <= x < dist_bin_edges[i+1].
weights_a, weights_b (array_like, optional): 1D arrays that can be used to specify a weight for each point. When specified, the size of weights_a should match np.shape(pos_a)[1] and the size of weights_b should match np.shape(pos_b)[1]. It is an error to specify weights when the specified statistics won’t use them.
stat_kw_pairs (sequence of (str, dict) tuples): Each entry is a tuple holding the name of a statistic to compute and a dictionary of kwargs needed to compute that statistic. The valid statistics are described in the Notes below. Unless we explicitly state otherwise, an empty dict should be passed for the kwargs.
nproc (int, optional): Number of processes to use for parallelizing this calculation. Default is 1. If the problem is small enough, the program may ignore this argument and use fewer processes.
force_sequential (bool, optional): False by default. When True, this forces the code to run with a single process (regardless of the value of nproc). However, the data is still partitioned as though it were using nproc processes. Thus, floating point results should be bitwise identical to those of an otherwise identical function call where this is False. (This is primarily provided for debugging purposes.)
postprocess_stat (bool, optional): Users directly employing this function should almost always leave this kwarg set to True (the default). This option is only provided to simplify the process of consolidating results from multiple calls to vsf_props.
vel_a, vel_b (array_like): Deprecated parameters; use val_a and val_b instead.
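The half-open bin rule given for dist_bin_edges can be illustrated with the standard library. This is a minimal sketch, not code from pairstat: `find_bin` is a hypothetical helper, and returning None for a distance outside every bin is a choice made for this example (the rule above only implies that such a distance belongs to no bin).

```python
from bisect import bisect_right


def find_bin(dist_bin_edges, x):
    """Return i such that edges[i] <= x < edges[i + 1], or None if no bin holds x."""
    if x < dist_bin_edges[0] or x >= dist_bin_edges[-1]:
        return None
    # bisect_right counts the edges that are <= x; the bin index is one less.
    return bisect_right(dist_bin_edges, x) - 1


edges = [0.0, 1.0, 2.0, 4.0]   # three bins: [0, 1), [1, 2), [2, 4)
print(find_bin(edges, 1.0))    # 1: a left edge belongs to the bin it opens
print(find_bin(edges, 4.0))    # None: the last right edge is excluded
```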

Notes

Currently recognized unweighted statistic names include:

'mean' : calculate the 1st order structure function.
'variance' : calculate the 1st order structure function and the variance (while variance is related to the 2nd order structure function, it is NOT the same)
'omoment2' : calculate the 1st and 2nd order structure functions.
'omoment3' : calculate the 1st, 2nd, and 3rd order structure functions
'omoment4' : calculate the 1st, 2nd, 3rd, and 4th order structure functions
'histogram' : this constructs a 2D histogram. The bin edges along axis 0 are given by the dist_bin_edges argument. The magnitudes of the vector differences are binned along axis 1. The 'val_bin_edges' keyword must be specified alongside this statistic name (to specify the bin edges along axis 1). Its value should be a 1D monotonic array.
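To make the difference between 'variance' and the 2nd order structure function concrete, here is a brute-force sketch for a single distance bin. It is not pairstat's implementation. It assumes that the structure functions are moments of the magnitude of the vector difference for each pair (the quantity the 'histogram' entry says is binned) and that 'variance' uses the usual n - 1 denominator; `pair_diff_magnitudes` is a hypothetical helper.

```python
import math
from itertools import combinations
from statistics import fmean, variance


def pair_diff_magnitudes(pos, val, r_min, r_max):
    """Collect |val_i - val_j| for unique pairs with r_min <= separation < r_max.

    pos and val are lists of rows with shape (ndim, npoints), mirroring the
    layout described for pos_a and val_a.
    """
    npoints = len(pos[0])
    mags = []
    for i, j in combinations(range(npoints), 2):
        dist = math.sqrt(sum((axis[i] - axis[j]) ** 2 for axis in pos))
        if r_min <= dist < r_max:
            mags.append(math.sqrt(sum((comp[i] - comp[j]) ** 2 for comp in val)))
    return mags


# Three points along the x-axis of a 2D domain, each carrying a 2D vector.
pos = [[0.0, 1.0, 2.0],   # x coordinates
       [0.0, 0.0, 0.0]]   # y coordinates
val = [[0.0, 3.0, 0.0],   # x components
       [0.0, 4.0, 4.0]]   # y components

mags = pair_diff_magnitudes(pos, val, 0.5, 1.5)   # pairs (0, 1) and (1, 2)
first_order = fmean(mags)                         # 4.0
second_order = fmean([m ** 2 for m in mags])      # 17.0
sample_var = variance(mags)                       # 2.0 (n - 1 denominator)
print(first_order, second_order, sample_var)
```

In this example the 2nd order moment (17.0) and the variance (2.0) differ substantially, because the moment is taken about the origin while the variance is taken about the mean.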

Weighted versions of each of these statistics are also available. To access these, prepend "weighted" to the statistic name (so "weightedmean" instead of "mean" or "weightedhistogram" instead of "histogram").

BE AWARE that, unlike 'variance', 'weightedvariance' does NOT attempt to make any corrections to get an unbiased estimate of variance.
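The practical effect of the missing correction can be seen with a small sketch. The exact formula pairstat uses for 'weightedvariance', and how per-point weights are combined into a weight for each pair, are not stated here, so the code below simply uses the textbook uncorrected weighted variance; both helper names are hypothetical.

```python
from statistics import pvariance, variance


def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


def biased_weighted_variance(values, weights):
    """Weighted mean squared deviation from the weighted mean (no bias correction)."""
    mu = weighted_mean(values, weights)
    return sum(w * (v - mu) ** 2 for v, w in zip(values, weights)) / sum(weights)


values = [5.0, 3.0]
weights = [1.0, 1.0]

# With equal weights the uncorrected estimate equals the population variance
# (denominator n), which is smaller than the sample variance (denominator n - 1).
print(biased_weighted_variance(values, weights))  # 1.0
print(pvariance(values))                          # 1.0
print(variance(values))                           # 2.0
```

The gap between the two estimates shrinks as the number of pairs in a bin grows, so the distinction matters most for sparsely populated bins.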

pairstat.twopoint_correlation(pos_a, pos_b, val_a, val_b, dist_bin_edges, *, stat_kw_pairs=[('mean', {})], longitudinal=False, nproc=1, force_sequential=False)

Calculates the 2pcf (two-point correlation function) for pairs of points.

If you set both pos_b and val_b to None, then the two-point correlation function will only be computed for unique pairs of the points specified by pos_a and val_a.

Parameters:

pos_a, pos_b (array_like): 2D arrays holding the positions of each point. The length of axis 0 is the number of spatial dimensions and must be the same for both arrays. The length of axis 1 can differ between the arrays.
val_a, val_b (array_like): These can be 1D arrays holding the velocities at each point. In that case, the size of val_a should match the length of pos_a along axis 0 and the size of val_b should match the length of pos_b. Alternatively, these can be 2D arrays. In this case, the shape of val_a should match pos_a and the shape of val_b should match pos_b.
dist_bin_edges (array_like): 1D array of monotonically increasing values that represent edges for distance bins. A distance x lies in bin i if it lies in the interval dist_bin_edges[i] <= x < dist_bin_edges[i+1].
stat_kw_pairs (sequence of (str, dict) tuples, optional): The default choice is the most meaningful one for the 2pcf. In practice, this can accept the same statistics (other than the weighted ones) accepted by vsf_props().
nproc (int, optional): Number of processes to use for parallelizing this calculation. Default is 1. If the problem is small enough, the program may ignore this argument and use fewer processes.
force_sequential (bool, optional): False by default. When True, this forces the code to run with a single process (regardless of the value of nproc). However, the data is still partitioned as though it were using nproc processes. Thus, floating point results should be bitwise identical to those of an otherwise identical function call where this is False. (This is primarily provided for debugging purposes.)
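The reason force_sequential keeps the nproc partitioning is that floating-point addition is not associative, so the way data is split into partial sums can change the result. The sketch below demonstrates this with hypothetical helpers; it says nothing about how pairstat actually partitions its work.

```python
def naive_sum(values):
    """Add values strictly left to right."""
    total = 0.0
    for v in values:
        total += v
    return total


def partitioned_sum(values, nparts):
    """Sum contiguous chunks separately, then add the partial sums."""
    size = -(-len(values) // nparts)  # ceiling division
    partials = [naive_sum(values[k:k + size]) for k in range(0, len(values), size)]
    return naive_sum(partials)


values = [1.0, 1e17, -1e17, 1.0]

one_part = partitioned_sum(values, 1)    # ((1 + 1e17) - 1e17) + 1
two_parts = partitioned_sum(values, 2)   # (1 + 1e17) + (-1e17 + 1)
print(one_part, two_parts)               # 1.0 0.0

# Repeating the calculation with the same partitioning always reproduces the
# same bits, no matter whether the chunks are processed in parallel or in turn.
print(partitioned_sum(values, 2) == two_parts)  # True
```

Because the result depends on the partitioning rather than on how many processes execute it, a run with force_sequential=True can be compared bit for bit against a parallel run that uses the same nproc.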

Notes

Currently recognized statistic names include:

'mean': the typical correlation function
'variance': the variance of the products of the pairs of scalar values is computed for all pairs of values in a given distance bin (in addition to 'mean').
'omoment2': calculates the 2nd order moment about the origin for all pairs of points in a given distance bin (in addition to 'mean').
'omoment3': calculates the 3rd order moment about the origin for all pairs of points in a given distance bin (in addition to 'mean' and 'omoment2').
'omoment4': calculates the 4th order moment about the origin for all pairs of points in a given distance bin (in addition to 'mean', 'omoment2', and 'omoment3').
'histogram': this constructs a 2D histogram. The bin edges along axis 0 are given by the dist_bin_edges argument. The products of the pairs of scalar values are binned along axis 1. The 'val_bin_edges' keyword must be specified alongside this statistic name (to specify the bin edges along axis 1). Its value should be a 1D monotonic array.
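As a reference point for what the 'mean' statistic represents, the following brute-force sketch averages the products of scalar values over unique pairs in each distance bin. It is a pure-Python illustration under stated assumptions, not pairstat's implementation: `binned_mean_product` is a hypothetical helper, and reporting None for an empty bin is a choice made for this example only.

```python
import math
from itertools import combinations


def binned_mean_product(pos, val, dist_bin_edges):
    """Mean of val[i] * val[j] over unique pairs, for each distance bin.

    pos is a list of rows with shape (ndim, npoints); val is a 1D list of
    scalars with one entry per point. Empty bins are reported as None.
    """
    nbins = len(dist_bin_edges) - 1
    sums = [0.0] * nbins
    counts = [0] * nbins
    for i, j in combinations(range(len(val)), 2):
        dist = math.sqrt(sum((axis[i] - axis[j]) ** 2 for axis in pos))
        for b in range(nbins):
            # Half-open bins: edges[b] <= dist < edges[b + 1]
            if dist_bin_edges[b] <= dist < dist_bin_edges[b + 1]:
                sums[b] += val[i] * val[j]
                counts[b] += 1
                break
    return [s / c if c else None for s, c in zip(sums, counts)]


pos = [[0.0, 1.0, 2.0]]   # one spatial dimension, three points
val = [1.0, 2.0, 3.0]
print(binned_mean_product(pos, val, [0.5, 1.5, 2.5]))  # [4.0, 3.0]
```

Here the first bin holds the pairs separated by 1 (products 2 and 6, mean 4) and the second bin holds the single pair separated by 2 (product 3).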

BE AWARE that, unlike 'variance', 'weightedvariance' does NOT attempt to make any corrections to get an unbiased estimate of variance.