How to generate an 8-byte unique ID from a GUID? (.NET)

Accepted answer
Score: 15

No, it won't. As highlighted many times on Raymond Chen's blog, a GUID is designed to be unique as a whole; if you cut out just a piece of it (e.g. taking only 64 bits out of its 128), it loses its (pseudo-)uniqueness guarantees.


Here it is:

A customer needed to generate an 8-byte unique value, and their initial idea was to generate a GUID and throw away the second half, keeping the first eight bytes. They wanted to know if this was a good idea.

No, it's not a good idea. (...) Once you see how it all works, it's clear that you can't just throw away part of the GUID since all the parts (well, except for the fixed parts) work together to establish the uniqueness. If you take any of the three parts away, the algorithm falls apart. In particular, keeping just the first eight bytes (64 bits) gives you the timestamp and four constant bits; in other words, all you have is a timestamp, not a GUID.

Since it's just a timestamp, you can have collisions. If two computers generate one of these "truncated GUIDs" at the same time, they will generate the same result. Or if the system clock goes backward in time due to a clock reset, you'll start regenerating GUIDs that you had generated the first time it was that time.


I'm trying to use a long as a unique ID within our C# application (not global, and only for one session) for our events. Do you know whether the following will generate a unique long ID?

Why don't you just use a counter?
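A counter of that kind could be sketched as follows. This is a minimal illustration, not code from the answer: the class name SessionId and the method Next are hypothetical, and it assumes the IDs only need to be unique within a single process lifetime.

```csharp
using System.Threading;

public static class SessionId
{
    // Last value handed out; starts at 0, so the first ID returned is 1.
    private static long _last;

    // Atomically increments the counter, so concurrent callers never
    // receive the same value within this process.
    public static long Next() => Interlocked.Increment(ref _last);
}
```

Calling `SessionId.Next()` for each event yields 1, 2, 3, and so on. Nothing is persisted, so the sequence starts over when the application restarts, which matches the "only for one session" requirement.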

Score: 5

You cannot distill a 16-byte value down to an 8-byte value while still retaining the same degree of uniqueness. If uniqueness is critical, don't "roll your own" anything. Stick with GUIDs unless you really know what you're doing.

If a relatively naive implementation of uniqueness is sufficient then it's still better to generate your own IDs rather than derive them from GUIDs. The following code snippet is extracted from a "Locally Unique Identifier" class I find myself using fairly often. It makes it easy to define both the length and the range of characters output.

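using System.Security.Cryptography;
using System.Text;

public class LUID
{
    // Note: RNGCryptoServiceProvider is marked obsolete in .NET 6+; RandomNumberGenerator.Create() is the usual replacement.
    private static readonly RNGCryptoServiceProvider RandomGenerator = new RNGCryptoServiceProvider();
    // Excludes 1, I, 0 and O to keep the output unambiguous for human readers.
    private static readonly char[] ValidCharacters = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789".ToCharArray();
    public const int DefaultLength = 6;
    // Running index into ValidCharacters; shared mutable state, so Generate is not thread-safe as written.
    private static int counter = 0;

    public static string Generate(int length = DefaultLength)
    {
        var randomData = new byte[length];
        RandomGenerator.GetNonZeroBytes(randomData);

        var result = new StringBuilder(length);
        foreach (var value in randomData)
        {
            // Take the modulo over the full alphabet length so the last character ('9') can also be selected.
            counter = (counter + value) % ValidCharacters.Length;
            result.Append(ValidCharacters[counter]);
        }
        return result.ToString();
    }
}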

In this instance it excludes 1 (one), I (i), 0 (zero) and O (o) for the sake of unambiguous human-readable output.

To determine just how effectively 'unique' your particular combination of valid characters and ID length is, the math is simple enough, but it's still nice to have a 'code proof' of sorts (xUnit):

    // Requires: using System.Collections.Generic; using Xunit;
    [Fact]
    public void Does_not_generate_collisions_within_reasonable_number_of_iterations()
    {
        var ids = new HashSet<string>();
        var minimumAcceptableIterations = 10000;
        for (int i = 0; i < minimumAcceptableIterations; i++)
        {
            var result = LUID.Generate();
            // HashSet.Add returns false if the ID was already present.
            Assert.True(ids.Add(result), $"Collision on run {i} with ID '{result}'");
        }
    }
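
The "simple enough" math mentioned above is the birthday approximation. The sketch below is an illustration only: the class and method names are hypothetical, and it assumes IDs are drawn uniformly at random from all alphabetSize^length combinations, which the generator above only approximates.

```csharp
using System;

public static class CollisionMath
{
    // Approximate probability of at least one collision when drawing
    // `samples` IDs uniformly from alphabetSize^length possibilities:
    // p ≈ 1 - exp(-n(n-1) / (2N)).
    public static double CollisionProbability(int alphabetSize, int length, long samples)
    {
        double space = Math.Pow(alphabetSize, length); // N: size of the ID space
        double n = samples;
        return 1.0 - Math.Exp(-n * (n - 1) / (2.0 * space));
    }
}
```

For the defaults used here (32 characters, length 6, 10,000 IDs), `CollisionMath.CollisionProbability(32, 6, 10000)` comes out at roughly 4 to 5 percent, so a collision within 10,000 iterations is uncommon but not impossible, and a longer ID may be worth considering if that is too high.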
Score: 2

No, it won't. A GUID is 128 bits long, while a long is only 64 bits. You are missing 64 bits of information, which allows two different GUIDs to produce the same long representation. While the chance is pretty slim, it is there.

Score: 2

Per the Guid.NewGuid MSDN page,

The chance that the value of the new Guid will be all zeros or equal to any other Guid is very low.

So, your method may produce a unique ID, but it's not guaranteed.

Score: 1

Yes, this will most likely be unique, but since the number of bits is smaller than in a GUID, the chance of a duplicate is higher than with a GUID, although still negligible.

Anyway, a GUID itself does not guarantee uniqueness.

Score: 1
var s = Guid.NewGuid().ToString();
var h1 = s.Substring(0, s.Length / 2).GetHashCode(); // hash of the first half of the GUID string
var h2 = s.Substring(s.Length / 2).GetHashCode(); // hash of the second half of the GUID string
// Combine the two 32-bit hashes into one 64-bit value. It is 8 bytes long, but collisions are still possible.
var result = (uint) h1 | (ulong) h2 << 32;
var bytes = BitConverter.GetBytes(result);

P.S. It's very good, guys, that you are chatting with the topic starter here. But what about the answers that other users, like me, need?

Score: 0

Like a few others have said, taking only part of the GUID is a good way to ruin its uniqueness. Try something like this:

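// Requires: using System; using System.Security.Cryptography;
var bytes = new byte[8];
using (var rng = new RNGCryptoServiceProvider())
{
    rng.GetBytes(bytes); // fill the buffer with 8 cryptographically strong random bytes
}

Console.WriteLine(BitConverter.ToInt64(bytes, 0)); // interpret the 8 bytes as a signed 64-bit integer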
Score: 0

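Generates an 8-byte Ascii85 identifier based on the current timestamp in seconds. IDs generated in different seconds are guaranteed to differ. Within the same second, uniqueness rests on the 2 random bytes (65,536 possible values), so for 5 IDs generated within the same second the chance of no collision is roughly 99.98%.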

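// Note: System.Random is not thread-safe; guard calls if this is used from multiple threads.
private static readonly Random Random = new Random();
public static string GenerateIdentifier()
{
    // Seconds since the Unix epoch; UTC avoids repeats when local time shifts (e.g. daylight saving).
    var seconds = (int) DateTime.UtcNow.Subtract(new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc)).TotalSeconds;
    var timeBytes = BitConverter.GetBytes(seconds);
    var randomBytes = new byte[2];
    Random.NextBytes(randomBytes);
    // Concatenate the 4 timestamp bytes and the 2 random bytes.
    var bytes = new byte[timeBytes.Length + randomBytes.Length];
    System.Buffer.BlockCopy(timeBytes, 0, bytes, 0, timeBytes.Length);
    System.Buffer.BlockCopy(randomBytes, 0, bytes, timeBytes.Length, randomBytes.Length);
    // Ascii85 is not part of the .NET base class library; a separate Ascii85 encoder class is assumed to be available.
    return Ascii85.Encode(bytes);
}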
Score: 0

As already said in most of the other answers: No, you cannot just take a part of a GUID without losing the uniqueness.

If you need something that's shorter and still unique, read this blog post by Jeff Atwood:
Equipping our ASCII Armor

He shows multiple ways to shorten a GUID without losing information. The shortest is 20 characters (with ASCII85 encoding).

Yes, this is much longer than the 8 bytes you wanted, but it's a "real" unique GUID, while all attempts to cram something into 8 bytes most likely won't be truly unique.

Score: 0

In most cases a bitwise XOR of both halves together is enough.
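
A minimal sketch of that XOR folding, assuming the two halves are the first and last 8 bytes of Guid.ToByteArray(). The class and method names are hypothetical. The result is still only 64 bits wide, so it is less collision-resistant than the full GUID.

```csharp
using System;

public static class GuidFolding
{
    // XORs the first 8 bytes of the GUID with the last 8 bytes.
    public static long FoldToInt64(Guid guid)
    {
        byte[] bytes = guid.ToByteArray();          // always 16 bytes
        long first = BitConverter.ToInt64(bytes, 0);
        long second = BitConverter.ToInt64(bytes, 8);
        return first ^ second;
    }
}
```

Usage would be `long id = GuidFolding.FoldToInt64(Guid.NewGuid());`. Unlike truncation, every bit of the GUID contributes to the result, but two GUIDs can still fold to the same value.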

Score: 0

Everyone in here is making this way more complicated than it needs to be. This is a terrible idea.

GUID 1: AAAA-BBBB-CCCC-DDDD
GUID 2: AAAA-BBBB-EEEE-FFFF

Throw away the second half of each GUID, and now you have a duplicate identifier. GUIDs are not guaranteed to be unique, and it's extremely awful to assume otherwise. You shouldn't rely on any guarantee about what's generated, and it's not hard to get around this. If you need unique identifiers for an object, entity, or whatever (let's take a database as an example, since it is the most common case), you should generate an ID, see if it already exists, and insert it only if it doesn't. This is fast in databases, since most tables are indexed on the ID ("most"). If you have some kind of small object list in memory, or wherever, you'd probably store the entity in a hash table of some kind, in which you could just look up whether that generated GUID already exists.

All in all, it depends on what your use case really is. With a database, look up the GUID first, and regenerate as needed until you can insert the new item. This really only matters in relational databases that don't automatically generate IDs for items in the tables. NoSQL databases usually generate a unique identifier.
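
For the in-memory case, the generate, check, and retry loop described above could be sketched like this. It is an illustration under assumptions: the class name CheckedIdGenerator is hypothetical, a HashSet stands in for the hash table, and 8 random bytes are used as the candidate ID.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;

public sealed class CheckedIdGenerator
{
    // Every ID handed out so far; plays the role of the hash table lookup.
    private readonly HashSet<long> _issued = new HashSet<long>();
    private readonly object _gate = new object();

    public long Next()
    {
        var buffer = new byte[8];
        using (var rng = RandomNumberGenerator.Create())
        {
            lock (_gate)
            {
                while (true)
                {
                    rng.GetBytes(buffer);
                    long candidate = BitConverter.ToInt64(buffer, 0);
                    // HashSet.Add returns false when the value is already present,
                    // in which case the loop simply generates a new candidate.
                    if (_issued.Add(candidate)) return candidate;
                }
            }
        }
    }
}
```

The set grows with every issued ID, so this suits small in-memory collections; with a database, the equivalent check would typically be a unique index on the ID column plus a retry on conflict.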
