[ACCEPTED]-The Most Efficient Algorithm to Find First Prefix-Match From a Sorted String Array?-search

Accepted answer
Score: 21

If you only want to do this once, use binary search, if 5 on the other hand you need to do it for 4 many different prefixes but on the same 3 string array, building a radix tree can be a good 2 idea, after you've built the tree each look 1 up will be very fast.

Score: 3

This is just a modified bisection search:

  • Only check as many characters in each element as are in the search string; and
  • If you find a match, keep searching backwards (either linearly or by further bisection searches) until you find a non-matching result and then return the index of the last matching result.

0

Score: 3

It can be done in linear time using a Suffix Tree. Building 1 the suffix tree takes linear time.

Score: 0

The FreeBSD kernel use a Radix tree for its routing 1 table, you should check that.

Score: 0

My current solution in mind is, instead 6 of to find the "prefix", try to find a "virtual 5 prefix".

For example, prefix is “abd", try 4 to find a virtual-prefix “abc(255)". (255) just 3 represents the max char number. After locating 2 the "abc(255)". The next word should be 1 the first word matching "abd" if any.

Score: 0

Are you in the position to precalculate 5 all possible prefixes?

If so, you can do 4 that, then use a binary search to find the 3 prefix in the precalculated table. Store 2 the subscript to the desired value with 1 the prefix.

Score: 0

My solution: Used binary search.

private static int search(String[] words, String searchPrefix) {

        if (words == null || words.length == 0) {
            return -1;
        }
        int low = 0;
        int high = words.length - 1;
        int searchPrefixLength = searchPrefix.length();

        while (low <= high) {
            int mid = low + (high - low) / 2;

            String word = words[mid];
            int compare = -1;

            if (searchPrefixLength <= word.length()) {
                compare = word.substring(0, searchPrefixLength).compareTo(searchPrefix);
            }

            if (compare == 0) {
                return mid;
            } else if (compare > 0) {
                high = mid - 1;
            } else {
                low = mid + 1;
            }

        }
        return -1;
    }

0

Score: 0

Here is a possible solution (in Python), which 9 has O(k.log(n)) time complexity and O(1) additional space 8 complexity (considering n strings and k 7 prefix length).

The rationale behind it to 6 perform a binary search which only considers 5 a given character index of the strings. If 4 these are present, continue to the next 3 character index. If any of the prefix characters 2 cannot be found in any string, it returns 1 immediately.

from typing import List

def first(items: List[str], prefix: str, i: int, c: str, left: int, right: int):
    result = -1

    while left <= right:
        mid = left + ((right - left) // 2)
        if ( i >= len(items[mid]) ):
            left = mid + 1
        elif (c < items[mid][i]):
            right = mid - 1
        elif (c > items[mid][i]):
            left = mid + 1
        else:
            result = mid
            right = mid - 1

    return result

def last(items: List[str], prefix: str, i: int, c: str, left: int, right: int):
    result = -1

    while left <= right:
        mid = left + ((right - left) // 2)
        if ( i >= len(items[mid]) ):
            left = mid + 1
        elif (c < items[mid][i]):
            right = mid - 1
        elif (c > items[mid][i]):
            left = mid + 1
        else:
            result = mid
            left = mid + 1

    return result

def is_prefix(items: List[str], prefix: str):
    left = 0
    right = len(items) - 1
    for i in range(len(prefix)):
        c = prefix[i]
        left = first(items, prefix, i, c, left, right)
        right = last(items, prefix, i, c, left, right)

        if (left == -1 or right == -1):
            return False

    return True

# Test cases
a = ['ab', 'abjsiohjd', 'abikshdiu', 'ashdi','abcde Aasioudhf', 'abcdefgOAJ', 'aa', 'aaap', 'aas', 'asd', 'bbbbb', 'bsadiojh', 'iod', '0asdn', 'asdjd', 'bqw', 'ba']
a.sort()
print(a)
print(is_prefix(a, 'abcdf'))
print(is_prefix(a, 'abcde'))
print(is_prefix(a, 'abcdef'))
print(is_prefix(a, 'abcdefg'))
print(is_prefix(a, 'abcdefgh'))
print(is_prefix(a, 'abcde Aa'))
print(is_prefix(a, 'iod'))
print(is_prefix(a, 'ZZZZZZiod'))

This gist is available at https://gist.github.com/lopespm/9790d60492aff25ea0960fe9ed389c0f

More Related questions