[ACCEPTED]-Mapping N-dimensional value to a point on Hilbert curve-dimension-reduction
I finally broke down and shelled out some 42 money. The AIP (American Institute of Physics) has 41 a nice, short article with source code in 40 C. "Programming the Hilbert curve" by 39 John Skilling (from the AIP Conf. Proc. 707, 381 38 (2004)) has an appendix with code for mappings 37 in both directions. It works for any number 36 of dimensions > 1, is not recursive, does 35 not use state-transition lookup tables that 34 gobble up huge amounts of memory, and mostly 33 uses bit operations. Thus it is reasonably 32 fast and has a good memory footprint.
If 31 you choose to purchase the article, I discovered 30 an error in the source code.
The following 29 line of code (found in the function TransposetoAxes) has 28 the error:
for( i = n-1; i >= 0; i-- ) X[i] ^= X[i-1];
The 27 correction is to change the greater than 26 or equal (>=) to a greater than (>). Without 25 this correction, the X array is accessed 24 using a negative index when the variable 23 “i” becomes zero, causing the program to 22 fail.
I recommend reading the article (which 21 is seven pages long, including the code), as 20 it explains how the algorithm works, which 19 is far from obvious.
I translated his code 18 into C# for my own use. The code follows. Skilling 17 performs the transformation in place, overwriting 16 the vector that you pass in. I chose to 15 make a clone of the input vector and return 14 a new copy. Also, I implemented the methods 13 as extension methods.
Skilling's code represents 12 the Hilbert index as a transpose, stored 11 as an array. I find it more convenient to 10 interleave the bits and form a single BigInteger 9 (more useful in Dictionaries, easier to 8 iterate over in loops, etc), but I optimized 7 that operation and its inverse with magic 6 numbers, bit operations and the like, and 5 the code is lengthy, so I have omitted it.
namespace HilbertExtensions
{
/// <summary>
/// Convert between Hilbert index and N-dimensional points.
///
/// The Hilbert index is expressed as an array of transposed bits.
///
/// Example: 5 bits for each of n=3 coordinates.
/// 15-bit Hilbert integer = A B C D E F G H I J K L M N O is stored
/// as its Transpose ^
/// X[0] = A D G J M X[2]| 7
/// X[1] = B E H K N <-------> | /X[1]
/// X[2] = C F I L O axes |/
/// high low 0------> X[0]
///
/// NOTE: This algorithm is derived from work done by John Skilling and published in "Programming the Hilbert curve".
/// (c) 2004 American Institute of Physics.
///
/// </summary>
public static class HilbertCurveTransform
{
/// <summary>
/// Convert the Hilbert index into an N-dimensional point expressed as a vector of uints.
///
/// Note: In Skilling's paper, this function is named TransposetoAxes.
/// </summary>
/// <param name="transposedIndex">The Hilbert index stored in transposed form.</param>
/// <param name="bits">Number of bits per coordinate.</param>
/// <returns>Coordinate vector.</returns>
public static uint[] HilbertAxes(this uint[] transposedIndex, int bits)
{
var X = (uint[])transposedIndex.Clone();
int n = X.Length; // n: Number of dimensions
uint N = 2U << (bits - 1), P, Q, t;
int i;
// Gray decode by H ^ (H/2)
t = X[n - 1] >> 1;
// Corrected error in Skilling's paper on the following line. The appendix had i >= 0 leading to negative array index.
for (i = n - 1; i > 0; i--)
X[i] ^= X[i - 1];
X[0] ^= t;
// Undo excess work
for (Q = 2; Q != N; Q <<= 1)
{
P = Q - 1;
for (i = n - 1; i >= 0; i--)
if ((X[i] & Q) != 0U)
X[0] ^= P; // invert
else
{
t = (X[0] ^ X[i]) & P;
X[0] ^= t;
X[i] ^= t;
}
} // exchange
return X;
}
/// <summary>
/// Given the axes (coordinates) of a point in N-Dimensional space, find the distance to that point along the Hilbert curve.
/// That distance will be transposed; broken into pieces and distributed into an array.
///
/// The number of dimensions is the length of the hilbertAxes array.
///
/// Note: In Skilling's paper, this function is called AxestoTranspose.
/// </summary>
/// <param name="hilbertAxes">Point in N-space.</param>
/// <param name="bits">Depth of the Hilbert curve. If bits is one, this is the top-level Hilbert curve.</param>
/// <returns>The Hilbert distance (or index) as a transposed Hilbert index.</returns>
public static uint[] HilbertIndexTransposed(this uint[] hilbertAxes, int bits)
{
var X = (uint[])hilbertAxes.Clone();
var n = hilbertAxes.Length; // n: Number of dimensions
uint M = 1U << (bits - 1), P, Q, t;
int i;
// Inverse undo
for (Q = M; Q > 1; Q >>= 1)
{
P = Q - 1;
for (i = 0; i < n; i++)
if ((X[i] & Q) != 0)
X[0] ^= P; // invert
else
{
t = (X[0] ^ X[i]) & P;
X[0] ^= t;
X[i] ^= t;
}
} // exchange
// Gray encode
for (i = 1; i < n; i++)
X[i] ^= X[i - 1];
t = 0;
for (Q = M; Q > 1; Q >>= 1)
if ((X[n - 1] & Q)!=0)
t ^= Q - 1;
for (i = 0; i < n; i++)
X[i] ^= t;
return X;
}
}
}
I 4 have posted working code in C# to github.
See 3 https://github.com/paulchernoch/HilbertTransformation
UPDATE: I just published (Fall 2019) a Rust 2 crate on crates.io called "hilbert". It 1 also uses Skilling's algorithm. See https://crates.io/crates/hilbert
Algorithm for mapping from n->1 and 1->n 5 given here "Calculation of Mappings Between One and n-dimensional Values Using the Hilbert Space-filling Curve" J K Lawder
If you Google for "SFC module 4 and Kademlia overlay" youl find a group 3 who claim to use it as part of their system. If 2 you view the source you can probably extract 1 the relevant function.
I spent a little time translating Paul Chernoch's 10 code to Java and cleaning it up. It's possible 9 that there is a bug in my code, especially 8 because I don't have access to the paper 7 that it's originally from. However, it passes 6 what unit tests I was able to write. It 5 is below.
Note that I have evaluated both 4 Z-Order and Hilbert curves for spatial indexing 3 on largish data sets. I have to say that 2 Z-Order provides far better quality. But 1 feel free to try for yourself.
/**
* Convert the Hilbert index into an N-dimensional point expressed as a vector of uints.
*
* Note: In Skilling's paper, this function is named TransposetoAxes.
* @param transposedIndex The Hilbert index stored in transposed form.
* @param bits Number of bits per coordinate.
* @return Point in N-space.
*/
static long[] HilbertAxes(final long[] transposedIndex, final int bits) {
final long[] result = transposedIndex.clone();
final int dims = result.length;
grayDecode(result, dims);
undoExcessWork(result, dims, bits);
return result;
}
static void grayDecode(final long[] result, final int dims) {
final long swap = result[dims - 1] >>> 1;
// Corrected error in Skilling's paper on the following line. The appendix had i >= 0 leading to negative array index.
for (int i = dims - 1; i > 0; i--)
result[i] ^= result[i - 1];
result[0] ^= swap;
}
static void undoExcessWork(final long[] result, final int dims, final int bits) {
for (long bit = 2, n = 1; n != bits; bit <<= 1, ++n) {
final long mask = bit - 1;
for (int i = dims - 1; i >= 0; i--)
if ((result[i] & bit) != 0)
result[0] ^= mask; // invert
else
swapBits(result, mask, i);
}
}
/**
* Given the axes (coordinates) of a point in N-Dimensional space, find the distance to that point along the Hilbert curve.
* That distance will be transposed; broken into pieces and distributed into an array.
*
* The number of dimensions is the length of the hilbertAxes array.
*
* Note: In Skilling's paper, this function is called AxestoTranspose.
* @param hilbertAxes Point in N-space.
* @param bits Depth of the Hilbert curve. If bits is one, this is the top-level Hilbert curve.
* @return The Hilbert distance (or index) as a transposed Hilbert index.
*/
static long[] HilbertIndexTransposed(final long[] hilbertAxes, final int bits) {
final long[] result = hilbertAxes.clone();
final int dims = hilbertAxes.length;
final long maxBit = 1L << (bits - 1);
inverseUndo(result, dims, maxBit);
grayEncode(result, dims, maxBit);
return result;
}
static void inverseUndo(final long[] result, final int dims, final long maxBit) {
for (long bit = maxBit; bit != 0; bit >>>= 1) {
final long mask = bit - 1;
for (int i = 0; i < dims; i++)
if ((result[i] & bit) != 0)
result[0] ^= mask; // invert
else
swapBits(result, mask, i);
} // exchange
}
static void grayEncode(final long[] result, final int dims, final long maxBit) {
for (int i = 1; i < dims; i++)
result[i] ^= result[i - 1];
long mask = 0;
for (long bit = maxBit; bit != 0; bit >>>= 1)
if ((result[dims - 1] & bit) != 0)
mask ^= bit - 1;
for (int i = 0; i < dims; i++)
result[i] ^= mask;
}
static void swapBits(final long[] array, final long mask, final int index) {
final long swap = (array[0] ^ array[index]) & mask;
array[0] ^= swap;
array[index] ^= swap;
}
It's not clear to me how this will do what 19 you want. Consider this trival 3D case:
001 ------ 101
|\ |\
| \ | \
| 011 ------ 111
| | | |
| | | |
000 -|---- 100 |
\ | \ |
\ | \ |
010 ------ 110
which 18 can be "Hilbertized" by the following path:
001 -----> 101
\ \
\ \
011 111
^ |
| |
000 | 100 |
\ | \ |
\ | \ V
010 110
into 17 the 1D order:
000 -> 010 -> 011 -> 001 -> 101 -> 111 -> 110 -> 100
Here's the nasty bit. Consider 16 the list of pairs and 1D distances below:
000 : 100 -> 7
010 : 110 -> 5
011 : 111 -> 3
001 : 101 -> 1
In 15 all cases, the left- and right-hand values 14 are the same 3D distance from each other 13 (+/- 1 in the first position), which seem 12 to imply similar "spatial locality". But 11 linearizing by any choice of dimensional 10 ordering (y, then z, then z, in the above 9 example) breaks that locality.
Another way 8 of saying this is that taking a starting 7 point and ordering the remaining points 6 by their distance from that starting point 5 will provide significantly different results. Taking 4 000
as the start, for example:
1D ordering : distance 3D ordering : distance
---------------------- ----------------------
010 : 1 001,010,100 : 1
011,101,110 : sqrt(2)
111 : sqrt(3)
011 : 2
001 : 3
101 : 4
111 : 5
110 : 6
100 : 7
This effect grows 3 exponentially with the number of dimensions 2 (assuming that each dimension has the same 1 "size").
Another possibility would be to build a 5 kd-tree on your data, and then to an in-order traversal 4 of the tree to get the ordering. Constructing 3 the kd-tree only requires you to have a 2 good median-finding algorithm, of which 1 there are many.
I don't see how you can use a Hilbert curve in one dimension.
If you are interested in mapping points 18 to a lower dimension while preserving distances 17 (with minimum error) then you can look into 16 "Multidimensional Scaling" algorithms.
Simulated 15 annealing is one approach.
Edit: Thanks for 14 the comment. I see what you meant by the 13 Hilbert Curve approach now. However, this 12 is a hard problem, and given N=100 and 10 11 million data points I don't think any approach 10 will preserve locality well and run in a 9 reasonable amount of time. I don't think 8 kd-trees will work here.
If finding a total 7 ordering is not important to you, then you 6 can look into locality-based hashing and 5 other approximate nearest neighbor schemes. Hierarchical 4 multidimensional scaling with buckets of 3 points to reduce the input size might give 2 you a good ordering, but again it's doubtful 1 in such a high dimension.
Here is John Skilling's original C code 14 for encode/decode of Hilbert coordinates 13 in arbitrary dimensions. This is from the 12 paper cited by Paul Chernoch above, Programming 11 the Hilbert curve" by John Skilling 10 (from the AIP Conf. Proc. 707, 381 (2004)).
This 9 code has the bug fix applied.
I also extended 8 main() to show both encoding and decoding. I 7 also added functions interleaveBits() and 6 uninterleaveBits() that convert the Hilbert-transposed 5 coordinates into a single code and back, which 4 is what most people are interested in.
Skilling's 3 code works for arbitrary dimensions, but 2 my interleaveBits() functions are specific 1 to three dimensions. Easy to extend, though.
//+++++++++++++++++++++++++++ PUBLIC-DOMAIN SOFTWARE ++++++++++++++++++++++++++
// Functions: TransposetoAxes AxestoTranspose
// Purpose: Transform in-place between Hilbert transpose and geometrical axes
// Example: b=5 bits for each of n=3 coordinates.
// 15-bit Hilbert integer = A B C D E F G H I J K L M N O is stored
// as its Transpose
// X[0] = A D G J M X[2]|
// X[1] = B E H K N <-------> | /X[1]
// X[2] = C F I L O axes |/
// high low 0------ X[0]
// Axes are stored conventially as b-bit integers.
// Author: John Skilling 20 Apr 2001 to 11 Oct 2003
//-----------------------------------------------------------------------------
#include <cstdio>
typedef unsigned int coord_t; // char,short,int for up to 8,16,32 bits per word
void TransposetoAxes(coord_t* X, int b, int n) // Position, #bits, dimension
{
coord_t N = 2 << (b - 1), P, Q, t;
// Gray decode by H ^ (H/2)
t = X[n - 1] >> 1;
// Corrected error in Skilling's paper on the following line. The appendix had i >= 0 leading to negative array index.
for (int i = n - 1; i > 0; i--) X[i] ^= X[i - 1];
X[0] ^= t;
// Undo excess work
for (Q = 2; Q != N; Q <<= 1) {
P = Q - 1;
for (int i = n - 1; i >= 0; i--)
if (X[i] & Q) // Invert
X[0] ^= P;
else { // Exchange
t = (X[0] ^ X[i]) & P;
X[0] ^= t;
X[i] ^= t;
}
}
}
void AxestoTranspose(coord_t* X, int b, int n) // Position, #bits, dimension
{
coord_t M = 1 << (b - 1), P, Q, t;
// Inverse undo
for (Q = M; Q > 1; Q >>= 1) {
P = Q - 1;
for (int i = 0; i < n; i++)
if (X[i] & Q) // Invert
X[0] ^= P;
else { // Exchange
t = (X[0] ^ X[i]) & P;
X[0] ^= t;
X[i] ^= t;
}
}
// Gray encode
for (int i = 1; i < n; i++) X[i] ^= X[i - 1];
t = 0;
for (Q = M; Q > 1; Q >>= 1)
if (X[n - 1] & Q) t ^= Q - 1;
for (int i = 0; i < n; i++) X[i] ^= t;
}
int interleaveBits(coord_t* X, int b, int n) // Position, #bits, dimension
{
unsigned int codex = 0, codey = 0, codez = 0;
const int nbits2 = 2 * b;
for (int i = 0, andbit = 1; i < nbits2; i += 2, andbit <<= 1) {
codex |= (unsigned int)(X[0] & andbit) << i;
codey |= (unsigned int)(X[1] & andbit) << i;
codez |= (unsigned int)(X[2] & andbit) << i;
}
return (codex << 2) | (codey << 1) | codez;
}
// From https://github.com/Forceflow/libmorton/blob/main/include/libmorton/morton3D.h
void uninterleaveBits(coord_t* X, int b, int n, unsigned int code) // Position, #bits, dimension
{
X[0] = X[1] = X[2] = 0;
for (unsigned int i = 0; i <= b; ++i) {
unsigned int selector = 1;
unsigned int shift_selector = 3 * i;
unsigned int shiftback = 2 * i;
X[2] |= (code & (selector << shift_selector)) >> (shiftback);
X[1] |= (code & (selector << (shift_selector + 1))) >> (shiftback + 1);
X[0] |= (code & (selector << (shift_selector + 2))) >> (shiftback + 2);
}
}
int main()
{
coord_t X[3] = {5, 10, 20}; // Any position in 32x32x32 cube
printf("Input coords = %d,%d,%d\n", X[0], X[1], X[2]);
AxestoTranspose(X, 5, 3); // Hilbert transpose for 5 bits and 3 dimensions
printf("Hilbert coords = %d,%d,%d\n", X[0], X[1], X[2]);
unsigned int code = interleaveBits(X, 5, 3);
printf("Hilbert integer = %d = %d%d%d %d%d%d %d%d%d %d%d%d %d%d%d = 7865 check\n", code,
X[0] >> 4 & 1, X[1] >> 4 & 1, X[2] >> 4 & 1,
X[0] >> 3 & 1, X[1] >> 3 & 1, X[2] >> 3 & 1,
X[0] >> 2 & 1, X[1] >> 2 & 1, X[2] >> 2 & 1,
X[0] >> 1 & 1, X[1] >> 1 & 1, X[2] >> 1 & 1,
X[0] >> 0 & 1, X[1] >> 0 & 1, X[2] >> 0 & 1);
uninterleaveBits(X, 5, 3, code);
printf("Reconstructed Hilbert coords = %d,%d,%d\n", X[0], X[1], X[2]);
TransposetoAxes(X, 5, 3);
printf("Orig coords = %d,%d,%d\n", X[0], X[1], X[2]);
return 0;
}
More Related questions
We use cookies to improve the performance of the site. By staying on our site, you agree to the terms of use of cookies.