Testing string equality using hashCode()

Accepted answer
Score: 40

Let me give you a counterexample. Try this:

public class HashCodeCounterexample {
    public static void main(String[] args) {
        String str1 = "0-42L";
        String str2 = "0-43-";

        // Two different strings that nevertheless share a hash code.
        System.out.println("String equality: " + str1.equals(str2));
        System.out.println("HashCode equality: " + (str1.hashCode() == str2.hashCode()));
    }
}

The result on my Java:

String equality: false
HashCode equality: true

Score: 39

Because the hashCodes of two objects must be equal if the objects are equal; if two objects are unequal, however, their hashCodes can still be equal.

(modified after comment)

Score: 16

As many have said, hashCode does not guarantee uniqueness. In fact, it cannot, for a very simple reason.

hashCode returns an int, which means there are 2^32 possible values (around 4,300,000,000), but there are surely more than 2^32 possible strings, which means at least two strings have the same hashCode value.

This is called the pigeonhole principle.
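A collision does not need long or exotic strings. The sketch below uses the well-known pair "Aa" and "BB" and relies on the documented String.hashCode() formula (s[0]*31^(n-1) + ... + s[n-1]). The class name PigeonholeDemo and the helper collides are made up for illustration.

```java
// Minimal sketch: two distinct two-character strings with the same hashCode().
final class PigeonholeDemo {

    // True when the strings differ but their hash codes are identical.
    static boolean collides(String a, String b) {
        return !a.equals(b) && a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        // 'A' * 31 + 'a' = 65 * 31 + 97 = 2112
        // 'B' * 31 + 'B' = 66 * 31 + 66 = 2112
        System.out.println("Aa".hashCode());       // 2112
        System.out.println("BB".hashCode());       // 2112
        System.out.println(collides("Aa", "BB"));  // true
    }
}
```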

Score: 8

Others have pointed out why it won't work. So I'll just add the addendum that the gain would be minimal anyway.

When you compare two strings in Java, the String equals function first checks if they are two references to the same object. If so, it immediately returns true. Then it checks if the lengths are equal. If not, it returns false. Only then does it start comparing character-by-character.

If you're manipulating data in memory, the same-object compare may quickly handle the "same" case, and that's a quick, umm, 4-byte integer compare I think. (Someone correct me if I have the length of an object handle wrong.)

For most unequal strings, I'd bet the length compare quickly finds them not equal. If you're comparing two names of things (customers, cities, products, whatever) they'll usually have unequal length. So a simple int compare quickly disposes of them.

The worst case for performance is going to be two long, identical strings that are not the same object. Then it has to do the object handle compare: false, keep checking. The length compare: true, keep checking. Then character by character through the entire length of the string to verify that yes, indeed, they are equal all the way to the end.
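The order of checks described in this answer can be sketched roughly as follows. This is an illustration of the idea only, not the JDK source; the class EqualsSketch and the method sketchEquals are hypothetical names, and the real String.equals differs in its details.

```java
// Rough sketch of the check order: identity, then length, then characters.
final class EqualsSketch {

    static boolean sketchEquals(String a, String b) {
        if (a == b) {
            return true;                 // same object: one reference compare
        }
        if (a == null || b == null) {
            return false;                // exactly one side is null
        }
        if (a.length() != b.length()) {
            return false;                // different lengths: one int compare
        }
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) != b.charAt(i)) {
                return false;            // first mismatch ends the scan
            }
        }
        return true;                     // worst case: scanned the whole string
    }

    public static void main(String[] args) {
        String s = "customer";
        System.out.println(sketchEquals(s, s));                      // identity path
        System.out.println(sketchEquals(s, "city"));                 // length path
        System.out.println(sketchEquals(s, new String("customer"))); // full scan
    }
}
```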

Score: 4

You can get the effect you want using String.intern() (which is implemented using a hash table).

You can compare the return values of intern() using the == operator. If they refer to the same string then the original strings were equivalent (i.e. equals() would have returned true), and it requires only a pointer comparison (which has the same cost as an int comparison).

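String a = "Hello";
// "Hel" + "lo" would be folded into the literal "Hello" at compile time,
// so build b at run time to get a distinct object.
String part = "Hel";
String b = part + "lo";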

System.out.println(a.equals(b));
System.out.println(a == b);

String a2 = a.intern();
String b2 = b.intern();

System.out.println(a2.equals(b2));
System.out.println(a2 == b2);

Output:

true
false
true
true

Score: 1

The hashCode value isn't unique, which means the Strings may not actually match. To improve performance, implementations of equals will often perform a hashCode check before performing more laborious checks.
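A minimal sketch of that pattern, assuming a hypothetical wrapper class HashedName that caches its hash code once: a hash mismatch proves the values differ, while a hash match proves nothing and must fall through to the full comparison.

```java
// Sketch: use the cached hash only as a fast "definitely not equal" filter.
final class HashedName {
    private final String value;
    private final int hash;   // computed once, reused by every equals() call

    HashedName(String value) {
        this.value = value;
        this.hash = value.hashCode();
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof HashedName)) {
            return false;
        }
        HashedName other = (HashedName) o;
        if (hash != other.hash) {
            return false;                     // different hashes: cannot be equal
        }
        return value.equals(other.value);     // same hash: still must compare
    }

    @Override
    public int hashCode() {
        return hash;
    }

    public static void main(String[] args) {
        // "0-42L" and "0-43-" share a hash code but are different strings.
        System.out.println(new HashedName("0-42L").equals(new HashedName("0-43-")));
        System.out.println(new HashedName("0-42L").equals(new HashedName("0-42L")));
    }
}
```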

Score: 1

Very simple reason: risk of collisions... A hash code will have a lot fewer possible values than a string. It depends a bit on the kind of hash you generate, but let's take a very simple example, where you would add the ordinal values of the letters, each multiplied by its position: a=1, b=2, etc. Thus, 'hello' would translate to: h: 8x1=8, e: 5x2=10, l: 12x3=36, l: 12x4=48, o: 15x5=75. 8+10+36+48+75=177.

Are there other string values that could end up hashed as 177? Of course! Plenty of options. Feel free to calculate a few.
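For anyone who would rather not calculate by hand, here is a small sketch of this toy hash. The class and method names are invented for illustration, and it assumes lowercase a-z input only. "jdllo" is one collision with "hello": j: 10x1=10 and d: 4x2=8 add up to the same 18 as h and e.

```java
// Toy hash from the example above: sum of (letter ordinal * 1-based position).
final class ToyHash {

    // Assumes s contains only lowercase letters a-z.
    static int toyHash(String s) {
        int sum = 0;
        for (int i = 0; i < s.length(); i++) {
            int ordinal = s.charAt(i) - 'a' + 1;   // a=1, b=2, ...
            sum += ordinal * (i + 1);              // positions start at 1
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(toyHash("hello"));   // 177
        System.out.println(toyHash("jdllo"));   // 177, a different string
    }
}
```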

Still, this example used a very simple hashing method. Java and .NET use a more complex hashing algorithm with a much smaller chance of such collisions. But there's still a chance that two different strings will result in the same hash value, so this method is less reliable than comparing the strings themselves.

Score: 0

Two different Strings can produce either the same hash code or different hash codes, so if you want an equality test, the hash code won't give a unique result. When you concatenate with the String class, each result is a new String with its own hash code. A StringBuffer does not override hashCode, so the same buffer object keeps the same hash code however much is appended to it.
