Description of the Public API

Here we document the Public API.

pairstat.vsf_props(pos_a, pos_b, *args, val_a=<object object>, val_b=<object object>, vel_a=<object object>, vel_b=<object object>, dist_bin_edges=<object object>, weights_a=None, weights_b=None, stat_kw_pairs=[('variance', {})], longitudinal=False, nproc=1, force_sequential=False, postprocess_stat=True)

Calculates properties pertaining to the vector structure function for pairs of points. It’s commonly used for the velocity structure function in particular.

If you set both pos_b and val_b to None, then the structure function properties will only be computed for unique pairs of the points specified by pos_a and val_a.
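To make the difference between the two pairing modes concrete, the sketch below enumerates pair indices with plain NumPy. It only illustrates the bookkeeping and is not pairstat's implementation; both helper names are hypothetical.

```python
import numpy as np

def unique_pair_indices(n_points):
    """Index pairs (i, j) with i < j, so each unordered pair appears once."""
    return np.triu_indices(n_points, k=1)

def cross_pair_indices(n_a, n_b):
    """Index pairs (i, j) matching every point of set a with every point of set b."""
    i, j = np.meshgrid(np.arange(n_a), np.arange(n_b), indexing="ij")
    return i.ravel(), j.ravel()

# One set of 4 points: 4 * 3 / 2 unique pairs.
i_auto, j_auto = unique_pair_indices(4)

# Two sets of 4 and 3 points: 4 * 3 cross pairs.
i_cross, j_cross = cross_pair_indices(4, 3)
```

With a single set of N points there are N(N-1)/2 unique pairs, while two sets of N and M points give N*M pairs.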

Parameters:
pos_a, pos_b : array_like

2D arrays holding the positions of each point. The length of axis 0 is the number of spatial dimensions and must be the same for both arrays. The length of axis 1 (the number of points) can differ between the arrays.

val_a, val_b : array_like

2D arrays holding the vector values at each point. The shape of val_a should match pos_a and the shape of val_b should match pos_b.
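Data is often stored with one point per row, which is the transpose of the layout described above. The following sketch shows one way to rearrange such data with NumPy. The array names and sizes are made up for illustration, and the use of np.ascontiguousarray is a defensive choice here, not a documented requirement.

```python
import numpy as np

rng = np.random.default_rng(0)
n_points, n_dim = 5, 3

# Hypothetical input stored one point per row: shape (n_points, n_dim).
pos_rows = rng.random((n_points, n_dim))
vel_rows = rng.random((n_points, n_dim))

# Transpose so that axis 0 indexes the spatial dimension
# and axis 1 indexes the point.
pos_a = np.ascontiguousarray(pos_rows.T)
val_a = np.ascontiguousarray(vel_rows.T)
```

After the transpose, pos_a and val_a both have shape (n_dim, n_points), so their shapes match as required.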

dist_bin_edges : array_like

1D array of monotonically increasing values that represent edges for distance bins. A distance x lies in bin i if it lies in the interval dist_bin_edges[i] <= x < dist_bin_edges[i+1].
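The half-open interval rule can be reproduced with np.searchsorted. This is a sketch of the stated rule using example values; it is not taken from the library's source.

```python
import numpy as np

dist_bin_edges = np.array([0.0, 1.0, 2.0, 4.0])
distances = np.array([0.0, 0.5, 1.0, 3.9, 4.0, 7.0])

# side="right" places x in bin i when edges[i] <= x < edges[i+1].
bin_index = np.searchsorted(dist_bin_edges, distances, side="right") - 1

# A distance outside [edges[0], edges[-1]) satisfies none of the intervals.
n_bins = len(dist_bin_edges) - 1
in_range = (bin_index >= 0) & (bin_index < n_bins)
```

Note that a distance exactly equal to an interior edge (1.0 here) falls in the bin that starts at that edge, and a distance equal to the last edge (4.0) falls in no bin.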

weights_a, weights_b : array_like, optional

Optional 1D arrays that specify a weight for each point. When specified, the size of weights_a should match np.shape(pos_a)[1] and the size of weights_b should match np.shape(pos_b)[1]. It is an error to specify weights when the specified statistics won’t use them.

stat_kw_pairs : sequence of (str, dict) tuples

Each entry is a tuple holding the name of a statistic to compute and a dictionary of kwargs needed to compute that statistic. The valid statistics are described below. Unless we explicitly state otherwise, an empty dict should be passed for the kwargs.
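As an illustration of the expected structure, the sketch below builds a stat_kw_pairs list by hand. The bin-edge values are arbitrary, and whether a particular combination of statistics is accepted in a single call is not stated here, so treat the pairing of 'variance' and 'histogram' as an assumption.

```python
import numpy as np

stat_kw_pairs = [
    # Statistics that need no extra options are paired with an empty dict.
    ("variance", {}),
    # 'histogram' requires 'val_bin_edges': a 1D monotonic array of bin edges.
    ("histogram", {"val_bin_edges": np.linspace(0.0, 10.0, 11)}),
]
```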

nproc : int, optional

Number of processes to use for parallelizing this calculation. Default is 1. If the problem is small enough, the program may ignore this argument and use fewer processes.

force_sequential : bool, optional

False by default. When True, this forces the code to run with a single process (regardless of the value of nproc). However, the data is still partitioned as though it were using nproc processes. Thus, floating point results should be bitwise identical to those of the same function call with force_sequential set to False. (This is primarily provided for debugging purposes.)
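The reason partitioning matters is that floating point addition is not associative, so the way data is split and recombined determines the rounding. The sketch below demonstrates the idea with a toy partitioned sum. It uses threads rather than processes for brevity, and the partitioning scheme is hypothetical, not the one the library uses.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def partitioned_sum(values, n_parts, parallel):
    """Sum `values` by splitting it into n_parts chunks, then combining in order."""
    chunks = np.array_split(values, n_parts)
    if parallel:
        with ThreadPoolExecutor(max_workers=n_parts) as pool:
            # pool.map returns results in chunk order, whatever the finish order.
            partials = list(pool.map(np.sum, chunks))
    else:
        partials = [np.sum(chunk) for chunk in chunks]
    total = 0.0
    for partial in partials:  # fixed combination order
        total += partial
    return total

rng = np.random.default_rng(1)
values = rng.normal(size=100_001)

sequential_result = partitioned_sum(values, 4, parallel=False)
parallel_result = partitioned_sum(values, 4, parallel=True)
```

Because both calls use the same four chunks and combine the partial sums in the same order, the two results agree bit for bit. Changing the number of chunks may change the rounding of the final value.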

postprocess_stat : bool, optional

Users directly employing this function should almost always set this kwarg to True (the default). This option is only provided to simplify the process of consolidating results from multiple calls to vsf_props.

vel_a, vel_b : array_like

Parameters that are deprecated in favor of val_a and val_b.

Notes

Currently recognized unweighted statistic names include:
  • 'mean' : calculate the 1st order structure function.

  • 'variance' : calculate the 1st order structure function and the variance (while variance is related to the 2nd order structure function, it is NOT the same).

  • 'omoment2' : calculate the 1st and 2nd order structure functions.

  • 'omoment3' : calculate the 1st, 2nd, and 3rd order structure functions.

  • 'omoment4' : calculate the 1st, 2nd, 3rd, and 4th order structure functions.

  • 'histogram' : this constructs a 2D histogram. The bin edges along axis 0 are given by the dist_bin_edges argument. The magnitudes of the vector differences are binned along axis 1. The 'val_bin_edges' keyword must be specified alongside this statistic name (to specify the bin edges along axis 1). It should be associated with a 1D monotonic array.
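For intuition about the 'histogram' statistic, here is a brute-force NumPy sketch that bins unique pairs by separation (axis 0) and by the magnitude of the vector difference (axis 1). It is a reference illustration with random data, not the library's algorithm, and all array sizes and bin edges are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
pos = rng.random((3, 50))        # (n_dim, n_points)
val = rng.normal(size=(3, 50))   # same shape as pos

i, j = np.triu_indices(pos.shape[1], k=1)              # unique pairs
dist = np.linalg.norm(pos[:, i] - pos[:, j], axis=0)   # pair separations
dval = np.linalg.norm(val[:, i] - val[:, j], axis=0)   # |v_i - v_j|

dist_bin_edges = np.array([0.0, 0.25, 0.5, 1.0])
val_bin_edges = np.linspace(0.0, 4.0, 9)

# Caveat: np.histogram2d treats the last bin along each axis as closed on
# the right, which differs slightly from a strictly half-open rule.
hist, _, _ = np.histogram2d(dist, dval, bins=(dist_bin_edges, val_bin_edges))
```

The result has one row per distance bin and one column per value bin. Pairs that fall outside either set of edges are not counted.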

Weighted versions of each of these statistics are also available. To access these, you should prepend "weighted" to the start of the string (so "weightedmean" instead of "mean" or "weightedhistogram" instead of "histogram").

BE AWARE that, unlike 'variance', 'weightedvariance' does NOT attempt to make any corrections to get an unbiased estimate of variance.
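The distinction between the variance and the 2nd order structure function, and between a corrected and an uncorrected variance, can be checked with a brute-force sketch. It assumes that the p-th order structure function is the per-bin mean of |v_a - v_b|^p and that the unbiased correction is the usual N-1 denominator. Both are assumptions made for illustration, and the dictionary keys are not the library's output format.

```python
import numpy as np

rng = np.random.default_rng(3)
pos = rng.random((3, 40))        # (n_dim, n_points)
val = rng.normal(size=(3, 40))

i, j = np.triu_indices(pos.shape[1], k=1)              # unique pairs
dist = np.linalg.norm(pos[:, i] - pos[:, j], axis=0)
dval = np.linalg.norm(val[:, i] - val[:, j], axis=0)

dist_bin_edges = np.array([0.0, 0.3, 0.6, 0.9])
bin_index = np.searchsorted(dist_bin_edges, dist, side="right") - 1

stats = []
for b in range(len(dist_bin_edges) - 1):
    x = dval[bin_index == b]
    stats.append({
        "mean": x.mean(),             # assumed 1st order structure function
        "omoment2": np.mean(x**2),    # assumed 2nd order structure function
        "variance": x.var(ddof=1),    # corrected (N - 1 denominator)
        "biased_variance": x.var(),   # uncorrected (N denominator)
    })
```

In each bin, the second moment about the origin equals the uncorrected variance plus the square of the mean, which is why the variance and the 2nd order structure function differ whenever the mean is nonzero.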

pairstat.twopoint_correlation(pos_a, pos_b, val_a, val_b, dist_bin_edges, *, stat_kw_pairs=[('mean', {})], longitudinal=False, nproc=1, force_sequential=False)

Calculates the 2pcf (two-point correlation function) for pairs of points.

If you set both pos_b and val_b to None, then the two-point correlation function will only be computed for unique pairs of the points specified by pos_a and val_a.

Parameters:
pos_a, pos_b : array_like

2D arrays holding the positions of each point. The length of axis 0 is the number of spatial dimensions and must be the same for both arrays. The length of axis 1 (the number of points) can differ between the arrays.

val_a, val_b : array_like

These can be 1D arrays holding the scalar values at each point. In that case, the size of val_a should match the number of points in pos_a (its length along axis 1) and the size of val_b should match the number of points in pos_b. Alternatively, these can be 2D arrays. In this case, the shape of val_a should match pos_a and the shape of val_b should match pos_b.

dist_bin_edges : array_like

1D array of monotonically increasing values that represent edges for distance bins. A distance x lies in bin i if it lies in the interval dist_bin_edges[i] <= x < dist_bin_edges[i+1].

stat_kw_pairs : sequence of (str, dict) tuples, optional

The default choice is most meaningful for the 2pcf. In practice, this can accept the same arguments (other than the weighted arguments) accepted by vsf_props().

nproc : int, optional

Number of processes to use for parallelizing this calculation. Default is 1. If the problem is small enough, the program may ignore this argument and use fewer processes.

force_sequential : bool, optional

False by default. When True, this forces the code to run with a single process (regardless of the value of nproc). However, the data is still partitioned as though it were using nproc processes. Thus, floating point results should be bitwise identical to those of the same function call with force_sequential set to False. (This is primarily provided for debugging purposes.)

Notes

Currently recognized statistic names include:
  • 'mean': the typical correlation function

  • 'variance': the variance of the products of the pairs of scalar values is computed for all pairs of values in a given distance bin (in addition to 'mean').

  • 'omoment2': calculates the 2nd order moment about the origin for all pairs of points in a given distance bin (in addition to 'mean').

  • 'omoment3': calculates the 3rd order moment about the origin for all pairs of points in a given distance bin (in addition to 'mean' and 'omoment2').

  • 'omoment4': calculates the 4th order moment about the origin for all pairs of points in a given distance bin (in addition to 'mean', 'omoment2', and 'omoment3').

  • 'histogram': this constructs a 2D histogram. The bin edges along axis 0 are given by the dist_bin_edges argument. The products of the pairs of scalar values are binned along axis 1. The 'val_bin_edges' keyword must be specified alongside this statistic name (to specify the bin edges along axis 1). It should be associated with a 1D monotonic array.

Weighted versions of each of these statistics are also available. To access these, you should prepend "weighted" to the start of the string (so "weightedmean" instead of "mean" or "weightedhistogram" instead of "histogram").

BE AWARE that, unlike 'variance', 'weightedvariance' does NOT attempt to make any corrections to get an unbiased estimate of variance.
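As a cross-check for the 'mean' statistic, the following brute-force sketch computes the per-bin mean of the products of scalar values over unique pairs. It assumes one scalar value per point and is meant as a small reference calculation with random data, not as a description of how the library computes the result.

```python
import numpy as np

rng = np.random.default_rng(4)
pos = rng.random((3, 40))     # (n_dim, n_points)
val = rng.normal(size=40)     # assumed: one scalar value per point

i, j = np.triu_indices(pos.shape[1], k=1)              # unique pairs
dist = np.linalg.norm(pos[:, i] - pos[:, j], axis=0)   # pair separations
prod = val[i] * val[j]                                 # products of pair values

dist_bin_edges = np.array([0.0, 0.3, 0.6, 0.9])
n_bins = len(dist_bin_edges) - 1
bin_index = np.searchsorted(dist_bin_edges, dist, side="right") - 1
in_range = (bin_index >= 0) & (bin_index < n_bins)

# Accumulate per-bin pair counts and per-bin sums of products.
counts = np.bincount(bin_index[in_range], minlength=n_bins)
sums = np.bincount(bin_index[in_range], weights=prod[in_range], minlength=n_bins)
corr = sums / counts
```

Keeping the counts and sums separate until the final division also makes it straightforward to combine partial results computed over different subsets of pairs.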