features (array-like, shape=[…, n_features, n_vertices]) – Descriptors of the shape, where n_features is the number of features extracted by the feature extractor.
features (array-like, shape=[…, n_features, n_vertices]) – Descriptors of the shape, where n_features is the number of features extracted by the feature extractor.
Computes a 352-dimensional (by default) local shape signature per vertex
by accumulating normal-alignment statistics of neighbours into a spatially
structured histogram. The algorithm is translated directly from the
reference C++ implementation by Tombari, Salti and Di Stefano (ECCV 2010).
Parameters:
radius (float, optional) – Support sphere radius. If None (default), the radius is set
automatically to 0.05×bounding-boxdiagonal of the input mesh,
which gives a neighbourhood of roughly 50–200 vertices on typical
normalised human-body meshes (FAUST, SMAL, …).
n_bins (int) – Number of histogram bins per spatial volume (default 10).
Descriptor dimensionality = 32 × (n_bins + 1) = 352 for n_bins=10.
min_neighbors (int) – Minimum number of neighbours required to compute a descriptor for a
vertex. Vertices with fewer neighbours receive an all-zero row.
This class provides an index into a set of k-dimensional points
which can be used to rapidly look up the nearest neighbors of any
point.
Note
cKDTree is functionally identical to KDTree. Prior to SciPy
v1.6.0, cKDTree had better performance and slightly different
functionality but now the two names exist only for
backward-compatibility reasons. If compatibility with SciPy < 1.6 is not
a concern, prefer KDTree.
Parameters:
data (array_like, shape (n,m)) – The n data points of dimension m to be indexed. This array is
not copied unless this is necessary to produce a contiguous
array of doubles, and so modifying this data will result in
bogus results. The data are also copied if the kd-tree is built
with copy_data=True.
leafsize (positive int, optional) – The number of points at which the algorithm switches over to
brute-force. Default: 16.
compact_nodes (bool, optional) – If True, the kd-tree is built to shrink the hyperrectangles to
the actual data range. This usually gives a more compact tree that
is robust against degenerated input data and gives faster queries
at the expense of longer build time. Default: True.
copy_data (bool, optional) – If True the data is always copied to protect the kd-tree against
data corruption. Default: False.
balanced_tree (bool, optional) – If True, the median is used to split the hyperrectangles instead of
the midpoint. This usually gives a more compact tree and
faster queries at the expense of longer build time. Default: True.
boxsize (array_like or scalar, optional) – Apply an m-d toroidal topology to the KDTree. The topology is generated
by \(x_i + n_i L_i\) where \(n_i\) are integers and \(L_i\)
is the boxsize along i-th dimension. The input data shall be wrapped
into \([0, L_i)\). A ValueError is raised if any of the data is
outside of this bound.
Notes
The algorithm used is described in [1]_.
The general idea is that the kd-tree is a binary tree, each of whose
nodes represents an axis-aligned hyperrectangle. Each node specifies
an axis and splits the set of points based on whether their coordinate
along that axis is greater than or less than a particular value.
During construction, the axis and splitting point are chosen by the
“sliding midpoint” rule, which ensures that the cells do not all
become long and thin.
The tree can be queried for the r closest neighbors of any given point
(optionally returning only those within some maximum distance of the
point). It can also be queried, with a substantial gain in efficiency,
for the r approximate closest neighbors.
For large dimensions (20 is already large) do not expect this to run
significantly faster than brute force. High-dimensional nearest-neighbor
queries are a substantial open problem in computer science.
The n data points of dimension m to be indexed. This array is
not copied unless this is necessary to produce a contiguous
array of doubles. The data are also copied if the kd-tree is built
with copy_data=True.
This attribute exposes a Python view of the root node in the cKDTree
object. A full Python view of the kd-tree is created dynamically
on the first access. This attribute allows you to create your own
query functions in Python.
Count the number of pairs (x1,x2) can be formed, with x1 drawn
from self and x2 drawn from other, and where
distance(x1,x2,p)<=r.
Data points on self and other are optionally weighted by the
weights argument. (See below)
This is adapted from the “two-point correlation” algorithm described by
Gray and Moore [1]_. See notes for further discussion.
Parameters:
other (cKDTree instance) – The other tree to draw points from, can be the same tree as self.
r (float or one-dimensional array of floats) – The radius to produce a count for. Multiple radii are searched with
a single tree traversal.
If the count is non-cumulative (cumulative=False), r
defines the edges of the bins, and must be non-decreasing.
p (float, optional) – 1<=p<=infinity.
Which Minkowski p-norm to use.
Default 2.0.
A finite large p may cause a ValueError if overflow can occur.
weights (tuple, array_like, or None, optional) – If None, the pair-counting is unweighted.
If given as a tuple, weights[0] is the weights of points in self, and
weights[1] is the weights of points in other; either can be None to
indicate the points are unweighted.
If given as an array_like, weights is the weights of points in self
and other. For this to make sense, self and other must be the
same tree. If self and other are two different trees, a ValueError
is raised.
Default: None
cumulative (bool, optional) – Whether the returned counts are cumulative. When cumulative is set to False
the algorithm is optimized to work with a large number of bins (>10) specified
by r. When cumulative is set to True, the algorithm is optimized to work
with a small number of r. Default: True
Returns:
result (scalar or 1-D array) – The number of pairs. For unweighted counts, the result is integer.
For weighted counts, the result is float.
If cumulative is False, result[i] contains the counts with
(-infifi==0elser[i-1])<R<=r[i]
Notes
Pair-counting is the basic operation used to calculate the two point
correlation functions from a data set composed of position of objects.
Two point correlation function measures the clustering of objects and
is widely used in cosmology to quantify the large scale structure
in our Universe, but it may be useful for data analysis in other fields
where self-similar assembly of objects also occur.
The Landy-Szalay estimator for the two point correlation function of
D measures the clustering signal in D. [2]
For example, given the position of two sets of objects,
objects D (data) contains the clustering signal, and
where the brackets represents counting pairs between two data sets
in a finite bin around r (distance), corresponding to setting
cumulative=False, and f=float(len(D))/float(len(R)) is the
ratio between number of objects from data and random.
The algorithm implemented here is loosely based on the dual-tree
algorithm described in [1]_. We switch between two different
pair-cumulation scheme depending on the setting of cumulative.
The computing time of the method we use when for
cumulative==False does not scale with the total number of bins.
The algorithm for cumulative==True scales linearly with the
number of bins, though it is slightly faster when only
1 or 2 bins are used. [5].
As an extension to the naive pair-counting,
weighted pair-counting counts the product of weights instead
of number of pairs.
Weighted pair-counting is used to estimate marked correlation functions
([3], section 2.2),
or to properly calculate the average of data per distance bin
(e.g. [4], section 2.1 on redshift).
Examples
You can count neighbors number between two kd-trees within a distance:
x (array_like, last dimension self.m) – An array of points to query.
k (list of integer or integer) – The list of k-th nearest neighbors to return. If k is an
integer it is treated as a list of [1, … k] (range(1, k+1)).
Note that the counting starts from 1.
eps (non-negative float) – Return approximate nearest neighbors; the k-th returned value
is guaranteed to be no further than (1+eps) times the
distance to the real k-th nearest neighbor.
p (float, 1<=p<=infinity) – Which Minkowski p-norm to use.
1 is the sum-of-absolute-values “Manhattan” distance
2 is the usual Euclidean distance
infinity is the maximum-coordinate-difference distance
A finite large p may cause a ValueError if overflow can occur.
distance_upper_bound (nonnegative float) – Return only neighbors within this distance. This is used to prune
tree searches, so if you are doing a series of nearest-neighbor
queries, it may help to supply the distance to the nearest neighbor
of the most recent point.
workers (int, optional) – Number of workers to use for parallel processing. If -1 is given
all CPU threads are used. Default: 1.
Changed in version 1.9.0: The “n_jobs” argument was renamed “workers”. The old name
“n_jobs” was deprecated in SciPy 1.6.0 and was removed in
SciPy 1.9.0.
Returns:
d (array of floats) – The distances to the nearest neighbors.
If x has shape tuple+(self.m,), then d has shape tuple+(k,).
When k == 1, the last dimension of the output is squeezed.
Missing neighbors are indicated with infinite distances.
i (ndarray of ints) – The index of each neighbor in self.data.
If x has shape tuple+(self.m,), then i has shape tuple+(k,).
When k == 1, the last dimension of the output is squeezed.
Missing neighbors are indicated with self.n.
Notes
If the KD-Tree is periodic, the position x is wrapped into the
box.
When the input k is a list, a query for arange(max(k)) is performed, but
only columns that store the requested values of k are preserved. This is
implemented in a manner that reduces memory usage.
x (array_like, shape tuple + (self.m,)) – The point or points to search for neighbors of.
r (array_like, float) – The radius of points to return, shall broadcast to the length of x.
p (float, optional) – Which Minkowski p-norm to use. Should be in the range [1, inf].
A finite large p may cause a ValueError if overflow can occur.
eps (nonnegative float, optional) – Approximate search. Branches of the tree are not explored if their
nearest points are further than r/(1+eps), and branches are
added in bulk if their furthest points are nearer than
r*(1+eps).
workers (int, optional) – Number of jobs to schedule for parallel processing. If -1 is given
all processors are used. Default: 1.
Changed in version 1.9.0: The “n_jobs” argument was renamed “workers”. The old name
“n_jobs” was deprecated in SciPy 1.6.0 and was removed in
SciPy 1.9.0.
return_sorted (bool, optional) – Sorts returned indices if True and does not sort them if False. If
None, does not sort single point queries, but does sort
multi-point queries which was the behavior before this option
was added.
Added in version 1.2.0.
return_length (bool, optional) – Return the number of points inside the radius instead of a list
of the indices.
.. versionadded:: 1.3.0
Returns:
results (list or array of lists) – If x is a single point, returns a list of the indices of the
neighbors of x. If x is an array of points, returns an object
array of shape tuple containing lists of neighbors.
Notes
If you have many points whose neighbors you want to find, you may save
substantial amounts of time by putting them in a cKDTree and using
query_ball_tree.
Find all pairs of points between self and other whose distance is at most r
Parameters:
other (cKDTree instance) – The tree containing points to search against.
r (float) – The maximum distance, has to be positive.
p (float, optional) – Which Minkowski norm to use. p has to meet the condition
1<=p<=infinity.
A finite large p may cause a ValueError if overflow can occur.
eps (float, optional) – Approximate search. Branches of the tree are not explored
if their nearest points are further than r/(1+eps), and
branches are added in bulk if their furthest points are nearer
than r*(1+eps). eps has to be non-negative.
Returns:
results (list of lists) – For each element self.data[i] of this tree, results[i] is a
list of the indices of its neighbors in other.data.
Examples
You can search all pairs of points between two kd-trees within a distance:
Find all pairs of points in self whose distance is at most r.
Parameters:
r (positive float) – The maximum distance.
p (float, optional) – Which Minkowski norm to use. p has to meet the condition
1<=p<=infinity.
A finite large p may cause a ValueError if overflow can occur.
eps (float, optional) – Approximate search. Branches of the tree are not explored
if their nearest points are further than r/(1+eps), and
branches are added in bulk if their furthest points are nearer
than r*(1+eps). eps has to be non-negative.
output_type (string, optional) – Choose the output container, ‘set’ or ‘ndarray’. Default: ‘set’
Returns:
results (set or ndarray) – Set of pairs (i,j), with i<j, for which the corresponding
positions are close. If output_type is ‘ndarray’, an ndarray is
returned instead of a set.
Examples
You can search all pairs of points in a kd-tree within a distance:
Computes a distance matrix between two cKDTrees, leaving as zero
any distance greater than max_distance.
Parameters:
other (cKDTree)
max_distance (positive float)
p (float, 1<=p<=infinity) – Which Minkowski p-norm to use.
A finite large p may cause a ValueError if overflow can occur.
output_type (string, optional) – Which container to use for output data. Options: ‘dok_matrix’,
‘coo_matrix’, ‘dict’, or ‘ndarray’. Default: ‘dok_matrix’.
Returns:
result (dok_matrix, coo_matrix, dict or ndarray) – Sparse matrix representing the results in “dictionary of keys”
format. If a dict is returned the keys are (i,j) tuples of indices.
If output_type is ‘ndarray’ a record array with fields ‘i’, ‘j’,
and ‘v’ is returned,
Examples
You can compute a sparse distance matrix between two kd-trees:
scale (bool, optional) – Whether to make the descriptor scale-invariant, by default True.
n_domain (int, optional) – Number of frequency domains, by default 3.
domain (list, optional) – List of frequency domain limits, by default None.
k (int, optional) – Number of eigenvalues and eigenfunctions to use, by default None.
References
[BK2010]
M. M. Bronstein and I. Kokkinos.
“Scale-invariant heat kernel signatures for non-rigid shape recognition.”
2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition,
San Francisco, CA, USA, 2010, pp. 1704-1711.
https://doi.org/10.1109/CVPR.2010.5539838.
Descriptors Module. This module contains various shape descriptors used in Geomfum. Including spectral descriptors, distance from landmarks, and feature extractors.
Computes a 352-dimensional (by default) local shape signature per vertex
by accumulating normal-alignment statistics of neighbours into a spatially
structured histogram. The algorithm is translated directly from the
reference C++ implementation by Tombari, Salti and Di Stefano (ECCV 2010).
Parameters:
radius (float, optional) – Support sphere radius. If None (default), the radius is set
automatically to 0.05×bounding-boxdiagonal of the input mesh,
which gives a neighbourhood of roughly 50–200 vertices on typical
normalised human-body meshes (FAUST, SMAL, …).
n_bins (int) – Number of histogram bins per spatial volume (default 10).
Descriptor dimensionality = 32 × (n_bins + 1) = 352 for n_bins=10.
min_neighbors (int) – Minimum number of neighbours required to compute a descriptor for a
vertex. Vertices with fewer neighbours receive an all-zero row.