# hilbertsfc.torch
PyTorch API for HilbertSFC.

This subpackage provides 2D/3D Hilbert and Morton encode/decode functions that operate on integer `torch.Tensor` inputs.
## hilbert_encode_2d

```python
hilbert_encode_2d(
    x: Tensor,
    y: Tensor,
    *,
    nbits: int | None = None,
    out: Tensor | None = None,
    lut_cache: TorchCacheMode = "device",
    cpu_parallel: bool | None = None,
    cpu_backend: CPUBackend = "auto",
    gpu_backend: GPUBackend = "auto",
    triton_tuning: TritonTuningMode = "heuristic",
) -> Tensor
```
Encode 2D integer coordinates to Hilbert indices.

This function provides a PyTorch equivalent of hilbert_encode_2d. It accepts integer `torch.Tensor` inputs of arbitrary shape on any device and dispatches to a backend-specific implementation based on the device and backend settings.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `Tensor` | Integer coordinate tensor to encode. Must have the same shape and device as `y`. | required |
| `y` | `Tensor` | Integer coordinate tensor to encode. Must have the same shape and device as `x`. | required |
| `nbits` | `int \| None` | Number of bits per coordinate axis, defining a coordinate domain of `[0, 2**nbits)`. If `None`, it is inferred from the coordinate values. For best performance and tighter output dtypes, pass the smallest value that covers the input coordinate range. | `None` |
| `out` | `Tensor \| None` | Optional output tensor. Must have the same shape and device as the input coordinates. | `None` |
| `lut_cache` | `TorchCacheMode` | Cache mode for the look-up tables (LUTs) used by the Torch/Triton kernels. Ignored by the CPU Numba path. | `'device'` |
| `cpu_parallel` | `bool \| None` | Whether the CPU Numba kernel may execute in parallel. Only applies when dispatching to the CPU Numba backend and the input is not a scalar tensor. If `None`, parallelism is chosen heuristically. | `None` |
| `cpu_backend` | `CPUBackend` | CPU backend selection. | `'auto'` |
| `gpu_backend` | `GPUBackend` | GPU (accelerator) backend selection. | `'auto'` |
| `triton_tuning` | `TritonTuningMode` | Triton launch config selection policy. Only applies when the Triton backend is used. | `'heuristic'` |
Returns:

| Type | Description |
|---|---|
| `Tensor` | Hilbert indices. |
Raises:

| Type | Description |
|---|---|
| `TypeError` | If a non-integer tensor is provided. |
| `ValueError` | If inputs are on different devices, have mismatched shapes, or if `nbits` or `out` is invalid. |
| `RuntimeError` | If the selected backend is unavailable. |
Notes

When using this function with `torch.compile`, call `precache_compile_luts` before compilation. This avoids materialization of LUTs inside the compiled region, which causes graph breaks, extra overhead, and failure with `fullgraph=True`.
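A minimal usage sketch (the coordinate range and `nbits` value are illustrative, and sorting by the returned index is just one common application):

```python
import torch
from hilbertsfc.torch import hilbert_encode_2d

# Coordinates in [0, 2**10); nbits=10 is the smallest value covering them.
x = torch.randint(0, 1024, (4096,), dtype=torch.int64)
y = torch.randint(0, 1024, (4096,), dtype=torch.int64)

idx = hilbert_encode_2d(x, y, nbits=10)

# Sorting by Hilbert index orders points along the curve, so neighbors
# in the sorted order tend to be spatially close.
order = torch.argsort(idx)
```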
## hilbert_decode_2d

```python
hilbert_decode_2d(
    index: Tensor,
    *,
    nbits: int | None = None,
    out_x: Tensor | None = None,
    out_y: Tensor | None = None,
    lut_cache: TorchCacheMode = "device",
    cpu_parallel: bool | None = None,
    cpu_backend: CPUBackend = "auto",
    gpu_backend: GPUBackend = "auto",
    triton_tuning: TritonTuningMode = "heuristic",
) -> tuple[Tensor, Tensor]
```
Decode Hilbert indices to 2D integer coordinates.

This function provides a PyTorch equivalent of hilbert_decode_2d. It accepts integer `torch.Tensor` inputs of arbitrary shape on any device and dispatches to a backend-specific implementation based on the device and backend settings.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `index` | `Tensor` | Integer Hilbert index tensor to decode. | required |
| `nbits` | `int \| None` | Number of bits per coordinate axis, defining a coordinate domain of `[0, 2**nbits)`. If `None`, it is inferred from the index values. For best performance and tighter output dtypes, pass the smallest value that covers the input index range. | `None` |
| `out_x` | `Tensor \| None` | Optional output coordinate tensor. Provide both `out_x` and `out_y`, or neither. Must have the same shape and device as `index`. | `None` |
| `out_y` | `Tensor \| None` | Optional output coordinate tensor. Provide both `out_x` and `out_y`, or neither. Must have the same shape and device as `index`. | `None` |
| `lut_cache` | `TorchCacheMode` | Cache mode for the look-up tables (LUTs) used by the Torch/Triton kernels. Ignored by the CPU Numba path. | `'device'` |
| `cpu_parallel` | `bool \| None` | Whether the CPU Numba kernel may execute in parallel. Only applies when dispatching to the CPU Numba backend and the input is not a scalar tensor. If `None`, parallelism is chosen heuristically. | `None` |
| `cpu_backend` | `CPUBackend` | CPU backend selection. | `'auto'` |
| `gpu_backend` | `GPUBackend` | GPU (accelerator) backend selection. | `'auto'` |
| `triton_tuning` | `TritonTuningMode` | Triton launch config selection policy. Only applies when the Triton backend is used. | `'heuristic'` |
Returns:

| Type | Description |
|---|---|
| `tuple[Tensor, Tensor]` | Decoded `(x, y)` coordinates. |
Raises:

| Type | Description |
|---|---|
| `TypeError` | If a non-integer tensor is provided. |
| `ValueError` | If `nbits` or the output tensors are invalid. |
| `RuntimeError` | If the selected backend is unavailable. |
Notes

When using this function with `torch.compile`, call `precache_compile_luts` before compilation. This avoids materialization of LUTs inside the compiled region, which causes graph breaks, extra overhead, and failure with `fullgraph=True`.
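A round-trip sketch, assuming `nbits` matches the value used for encoding:

```python
import torch
from hilbertsfc.torch import hilbert_encode_2d, hilbert_decode_2d

x = torch.randint(0, 256, (1024,), dtype=torch.int64)
y = torch.randint(0, 256, (1024,), dtype=torch.int64)

idx = hilbert_encode_2d(x, y, nbits=8)
dx, dy = hilbert_decode_2d(idx, nbits=8)

# Encode and decode are inverses over the coordinate domain.
assert torch.equal(dx, x) and torch.equal(dy, y)
```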
## hilbert_encode_3d

```python
hilbert_encode_3d(
    x: Tensor,
    y: Tensor,
    z: Tensor,
    *,
    nbits: int | None = None,
    out: Tensor | None = None,
    lut_cache: TorchCacheMode = "device",
    cpu_parallel: bool | None = None,
    cpu_backend: CPUBackend = "auto",
    gpu_backend: GPUBackend = "auto",
    triton_tuning: TritonTuningMode = "heuristic",
) -> Tensor
```
Encode 3D integer coordinates to Hilbert indices.

This function provides a PyTorch equivalent of hilbert_encode_3d. It accepts integer `torch.Tensor` inputs of arbitrary shape on any device and dispatches to a backend-specific implementation based on the device and backend settings.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `Tensor` | Integer coordinate tensor to encode. Must have the same shape and device as the other coordinate tensors. | required |
| `y` | `Tensor` | Integer coordinate tensor to encode. Must have the same shape and device as the other coordinate tensors. | required |
| `z` | `Tensor` | Integer coordinate tensor to encode. Must have the same shape and device as the other coordinate tensors. | required |
| `nbits` | `int \| None` | Number of bits per coordinate axis, defining a coordinate domain of `[0, 2**nbits)`. If `None`, it is inferred from the coordinate values. For best performance and tighter output dtypes, pass the smallest value that covers the input coordinate range. | `None` |
| `out` | `Tensor \| None` | Optional output tensor. Must have the same shape and device as the input coordinates. | `None` |
| `lut_cache` | `TorchCacheMode` | Cache mode for the look-up tables (LUTs) used by the Torch/Triton kernels. Ignored by the CPU Numba path. | `'device'` |
| `cpu_parallel` | `bool \| None` | Whether the CPU Numba kernel may execute in parallel. Only applies when dispatching to the CPU Numba backend and the input is not a scalar tensor. If `None`, parallelism is chosen heuristically. | `None` |
| `cpu_backend` | `CPUBackend` | CPU backend selection. | `'auto'` |
| `gpu_backend` | `GPUBackend` | GPU (accelerator) backend selection. | `'auto'` |
| `triton_tuning` | `TritonTuningMode` | Triton launch config selection policy. Only applies when the Triton backend is used. | `'heuristic'` |
Returns:

| Type | Description |
|---|---|
| `Tensor` | Hilbert indices. |
Raises:

| Type | Description |
|---|---|
| `TypeError` | If a non-integer tensor is provided. |
| `ValueError` | If inputs are on different devices, have mismatched shapes, or if `nbits` or `out` is invalid. |
| `RuntimeError` | If the selected backend is unavailable. |
Notes

When using this function with `torch.compile`, call `precache_compile_luts` before compilation. This avoids materialization of LUTs inside the compiled region, which causes graph breaks, extra overhead, and failure with `fullgraph=True`.
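A sketch of reusing a preallocated buffer via `out`; the int64 buffer dtype is an illustrative assumption, and any output dtype the function accepts works the same way:

```python
import torch
from hilbertsfc.torch import hilbert_encode_3d

x, y, z = (torch.randint(0, 1024, (4096,), dtype=torch.int64) for _ in range(3))

# Reuse one buffer across repeated calls to avoid a fresh allocation each time.
out = torch.empty(4096, dtype=torch.int64)
idx = hilbert_encode_3d(x, y, z, nbits=10, out=out)
```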
## hilbert_decode_3d

```python
hilbert_decode_3d(
    index: Tensor,
    *,
    nbits: int | None = None,
    out_x: Tensor | None = None,
    out_y: Tensor | None = None,
    out_z: Tensor | None = None,
    lut_cache: TorchCacheMode = "device",
    cpu_parallel: bool | None = None,
    cpu_backend: CPUBackend = "auto",
    gpu_backend: GPUBackend = "auto",
    triton_tuning: TritonTuningMode = "heuristic",
) -> tuple[Tensor, Tensor, Tensor]
```
Decode Hilbert indices to 3D integer coordinates.

This function provides a PyTorch equivalent of hilbert_decode_3d. It accepts integer `torch.Tensor` inputs of arbitrary shape on any device and dispatches to a backend-specific implementation based on the device and backend settings.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `index` | `Tensor` | Integer Hilbert index tensor to decode. | required |
| `nbits` | `int \| None` | Number of bits per coordinate axis, defining a coordinate domain of `[0, 2**nbits)`. If `None`, it is inferred from the index values. For best performance and tighter output dtypes, pass the smallest value that covers the input index range. | `None` |
| `out_x` | `Tensor \| None` | Optional output coordinate tensor. Provide all three of `out_x`, `out_y`, and `out_z`, or none. Must have the same shape and device as `index`. | `None` |
| `out_y` | `Tensor \| None` | Optional output coordinate tensor. Provide all three of `out_x`, `out_y`, and `out_z`, or none. Must have the same shape and device as `index`. | `None` |
| `out_z` | `Tensor \| None` | Optional output coordinate tensor. Provide all three of `out_x`, `out_y`, and `out_z`, or none. Must have the same shape and device as `index`. | `None` |
| `lut_cache` | `TorchCacheMode` | Cache mode for the look-up tables (LUTs) used by the Torch/Triton kernels. Ignored by the CPU Numba path. | `'device'` |
| `cpu_parallel` | `bool \| None` | Whether the CPU Numba kernel may execute in parallel. Only applies when dispatching to the CPU Numba backend and the input is not a scalar tensor. If `None`, parallelism is chosen heuristically. | `None` |
| `cpu_backend` | `CPUBackend` | CPU backend selection. | `'auto'` |
| `gpu_backend` | `GPUBackend` | GPU (accelerator) backend selection. | `'auto'` |
| `triton_tuning` | `TritonTuningMode` | Triton launch config selection policy. Only applies when the Triton backend is used. | `'heuristic'` |
Returns:

| Type | Description |
|---|---|
| `tuple[Tensor, Tensor, Tensor]` | Decoded `(x, y, z)` coordinates. |
Raises:

| Type | Description |
|---|---|
| `TypeError` | If a non-integer tensor is provided. |
| `ValueError` | If `nbits` or the output tensors are invalid. |
| `RuntimeError` | If the selected backend is unavailable. |
Notes

When using this function with `torch.compile`, call `precache_compile_luts` before compilation. This avoids materialization of LUTs inside the compiled region, which causes graph breaks, extra overhead, and failure with `fullgraph=True`.
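A 3D round-trip sketch; the coordinate range and `nbits` value are illustrative:

```python
import torch
from hilbertsfc.torch import hilbert_encode_3d, hilbert_decode_3d

x, y, z = (torch.randint(0, 512, (2048,), dtype=torch.int64) for _ in range(3))

idx = hilbert_encode_3d(x, y, z, nbits=9)
dx, dy, dz = hilbert_decode_3d(idx, nbits=9)

# The decoded coordinates match the encoded inputs.
assert all(torch.equal(a, b) for a, b in [(dx, x), (dy, y), (dz, z)])
```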
## morton_encode_2d

```python
morton_encode_2d(
    x: Tensor,
    y: Tensor,
    *,
    nbits: int | None = None,
    out: Tensor | None = None,
    cpu_parallel: bool | None = None,
    cpu_backend: CPUBackend = "auto",
    gpu_backend: GPUBackend = "auto",
    triton_tuning: TritonTuningMode = "heuristic",
) -> Tensor
```

Encode 2D integer coordinate tensors to Morton (Z-order) indices.

API semantics for parameters, returns, and errors match hilbert_encode_2d, except that Morton kernels do not use lookup tables and therefore do not accept `lut_cache`.
## morton_decode_2d

```python
morton_decode_2d(
    index: Tensor,
    *,
    nbits: int | None = None,
    out_x: Tensor | None = None,
    out_y: Tensor | None = None,
    cpu_parallel: bool | None = None,
    cpu_backend: CPUBackend = "auto",
    gpu_backend: GPUBackend = "auto",
    triton_tuning: TritonTuningMode = "heuristic",
) -> tuple[Tensor, Tensor]
```

Decode Morton (Z-order) index tensors to 2D integer coordinates.

API semantics for parameters, returns, and errors match hilbert_decode_2d, except that Morton kernels do not use lookup tables and therefore do not accept `lut_cache`.
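A Morton round-trip sketch, mirroring the Hilbert examples above:

```python
import torch
from hilbertsfc.torch import morton_encode_2d, morton_decode_2d

x = torch.randint(0, 1024, (4096,), dtype=torch.int64)
y = torch.randint(0, 1024, (4096,), dtype=torch.int64)

# Morton kernels are LUT-free, so there is no lut_cache argument here.
idx = morton_encode_2d(x, y, nbits=10)
dx, dy = morton_decode_2d(idx, nbits=10)
assert torch.equal(dx, x) and torch.equal(dy, y)
```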
## morton_encode_3d

```python
morton_encode_3d(
    x: Tensor,
    y: Tensor,
    z: Tensor,
    *,
    nbits: int | None = None,
    out: Tensor | None = None,
    cpu_parallel: bool | None = None,
    cpu_backend: CPUBackend = "auto",
    gpu_backend: GPUBackend = "auto",
    triton_tuning: TritonTuningMode = "heuristic",
) -> Tensor
```

Encode 3D integer coordinate tensors to Morton (Z-order) indices.

API semantics for parameters, returns, and errors match hilbert_encode_3d, except that Morton kernels do not use lookup tables and therefore do not accept `lut_cache`.
## morton_decode_3d

```python
morton_decode_3d(
    index: Tensor,
    *,
    nbits: int | None = None,
    out_x: Tensor | None = None,
    out_y: Tensor | None = None,
    out_z: Tensor | None = None,
    cpu_parallel: bool | None = None,
    cpu_backend: CPUBackend = "auto",
    gpu_backend: GPUBackend = "auto",
    triton_tuning: TritonTuningMode = "heuristic",
) -> tuple[Tensor, Tensor, Tensor]
```

Decode Morton (Z-order) index tensors to 3D integer coordinates.

API semantics for parameters, returns, and errors match hilbert_decode_3d, except that Morton kernels do not use lookup tables and therefore do not accept `lut_cache`.
## precache_compile_luts

```python
precache_compile_luts(
    device: TorchDeviceLike = None,
    *,
    op: TorchHilbertOp = "all",
) -> None
```

Pre-cache Torch LUT tensors for use with `torch.compile`.

When using HilbertSFC Torch functions with `torch.compile`, call this before compilation to avoid materializing LUT tensors inside the compiled region, which can cause graph breaks, extra overhead, and failure with `fullgraph=True`.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `device` | `TorchDeviceLike` | Device for which to cache LUT tensors. | `None` |
| `op` | `TorchHilbertOp` | Operation used to select which LUT tensors are pre-cached. | `'all'` |
Notes

It is generally not useful to pre-cache LUT tensors with this function when not using `torch.compile`, as it materializes LUT tensors that may never be used outside compiled regions.
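A sketch of the intended `torch.compile` workflow; the device string (assumed to be a valid `TorchDeviceLike`) and the `nbits` value are illustrative:

```python
import torch
from hilbertsfc.torch import hilbert_encode_2d, precache_compile_luts

# Materialize LUT tensors up front so the compiled region stays free of
# graph breaks, even under fullgraph=True.
precache_compile_luts("cuda", op="all")

@torch.compile(fullgraph=True)
def curve_sort(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    # x and y are expected to be integer CUDA tensors here.
    return torch.argsort(hilbert_encode_2d(x, y, nbits=16))
```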
## clear_torch_lut_caches

```python
clear_torch_lut_caches(
    device: TorchDeviceLike = None,
    *,
    op: TorchHilbertOp = "all",
) -> None
```

Clear Torch-side LUT caches.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `device` | `TorchDeviceLike` | Device whose cached LUT tensors should be cleared. If `None`, cached tensors for all devices are cleared. | `None` |
| `op` | `TorchHilbertOp` | Operation used to filter which cached LUT tensors are cleared. | `'all'` |
Notes

This does not clear the root process-wide LUT cache. Use `clear_lut_caches` for that.
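A minimal sketch; the device string is illustrative:

```python
from hilbertsfc.torch import clear_torch_lut_caches

# Drop cached LUT tensors for one accelerator, e.g. before releasing its memory.
clear_torch_lut_caches("cuda:0")
```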