[ACCEPTED]-O(nlogn) Algorithm - Find three evenly spaced ones within binary string-big-o
Finally! Following up leads in sdcvvc's answer, we have 90 it: the O(n log n) algorithm for the problem! It 89 is simple too, after you understand it. Those 88 who guessed FFT were right.
The problem: we 87 are given a binary string S
of length n, and 86 we want to find three evenly spaced 1s in 85 it. For example, S
may be 110110010
, where n=9. It 84 has evenly spaced 1s at positions 2, 5, and 83 8.
Scan
S
left to right, and make a listL
of 82 positions of 1. For theS=110110010
above, we have 81 the list L = [1, 2, 4, 5, 8]. This step 80 is O(n). The problem is now to find an arithmetic progression of length 3 in 79L
, i.e. to find distinct a, b, c inL
such that b-a = c-b, or 78 equivalently a+c=2b. For the example above, we 77 want to find the progression (2, 5, 8).Make 76 a polynomial
p
with terms xk for each k inL
. For the example 75 above, we make the polynomial p(x) = (x + x2 + x4 + x5+x8). This step 74 is O(n).Find the polynomial
q
= p2, using the 73 Fast Fourier Transform. For the example above, we get the polynomial 72 q(x) = x16 + 2x13 + 2x12 + 3x10 + 4x9 + x8 + 2x7 + 4x6 + 2x5 + x4 + 2x3 + x2. This step is O(n log n).Ignore all terms except those corresponding 71 to x2k for some k in
L
. For the example above, we 70 get the terms x16, 3x10, x8, x4, x2. This step is O(n), if you 69 choose to do it at all.
Here's the crucial 68 point: the coefficient of any x2b for b in L
is 67 precisely the number of pairs (a,c) in L
such that a+c=2b. [CLRS, Ex. 30.1-7] One 66 such pair is (b,b) always (so the coefficient 65 is at least 1), but if there exists any 64 other pair (a,c), then the coefficient is at 63 least 3, from (a,c) and (c,a). For the example above, we 62 have the coefficient of x10 to be 3 precisely 61 because of the AP (2,5,8). (These coefficients 60 x2b will always be odd numbers, for the reasons 59 above. And all other coefficients in q will 58 always be even.)
So then, the algorithm is 57 to look at the coefficients of these terms 56 x2b, and see if any of them is greater than 55 1. If there is none, then there are no evenly 54 spaced 1s. If there is a b in L
for which the 53 coefficient of x2b is greater than 1, then 52 we know that there is some pair (a,c) — other 51 than (b,b) — for which a+c=2b. To find the actual pair, we 50 simply try each a in L
(the corresponding 49 c would be 2b-a) and see if there is a 1 at position 48 2b-a in S
. This step is O(n).
That's all, folks.
One 47 might ask: do we need to use FFT? Many answers, such 46 as beta's, flybywire's, and rsp's, suggest that the approach that 45 checks each pair of 1s and sees if there 44 is a 1 at the "third" position, might 43 work in O(n log n), based on the intuition 42 that if there are too many 1s, we would 41 find a triple easily, and if there are too 40 few 1s, checking all pairs takes little 39 time. Unfortunately, while this intuition 38 is correct and the simple approach is better 37 than O(n2), it is not significantly better. As 36 in sdcvvc's answer, we can take the "Cantor-like set" of 35 strings of length n=3k, with 1s at the positions 34 whose ternary representation has only 0s 33 and 2s (no 1s) in it. Such a string has 32 2k = n(log 2)/(log 3) ≈ n0.63 ones in it and no evenly spaced 1s, so 31 checking all pairs would be of the order 30 of the square of the number of 1s in it: that's 29 4k ≈ n1.26 which unfortunately is asymptotically much 28 larger than (n log n). In fact, the worst 27 case is even worse: Leo Moser in 1953 constructed (effectively) such 26 strings which have n1-c/√(log n) 1s in them but no evenly 25 spaced 1s, which means that on such strings, the 24 simple approach would take Θ(n2-2c/√(log n)) — only a tiny bit 23 better than Θ(n2), surprisingly!
About the maximum 22 number of 1s in a string of length n with 21 no 3 evenly spaced ones (which we saw above 20 was at least n0.63 from the easy Cantor-like 19 construction, and at least n1-c/√(log n) with Moser's 18 construction) — this is OEIS A003002. It can also be 17 calculated directly from OEIS A065825 as the k such 16 that A065825(k) ≤ n < A065825(k+1). I 15 wrote a program to find these, and it turns 14 out that the greedy algorithm does not give 13 the longest such string. For example, for 12 n=9, we can get 5 1s (110100011) but the 11 greedy gives only 4 (110110000), for n=26 10 we can get 11 1s (11001010001000010110001101) but 9 the greedy gives only 8 (11011000011011000000000000), and 8 for n=74 we can get 22 1s (11000010110001000001011010001000000000000000010001011010000010001101000011) but 7 the greedy gives only 16 (11011000011011000000000000011011000011011000000000000000000000000000000000). They 6 do agree at quite a few places until 50 5 (e.g. all of 38 to 50), though. As the OEIS 4 references say, it seems that Jaroslaw Wroblewski 3 is interested in this question, and he maintains 2 a website on these non-averaging sets. The exact numbers are 1 known only up to 194.
Your problem is called AVERAGE in this paper (1999):
A 45 problem is 3SUM-hard if there is a sub-quadratic 44 reduction from the problem 3SUM: Given a 43 set A of n integers, are there elements 42 a,b,c in A such that a+b+c = 0? It is not 41 known whether AVERAGE is 3SUM-hard. However, there 40 is a simple linear-time reduction from AVERAGE 39 to 3SUM, whose description we omit.
When 38 the integers are in the range [−u ... u], 3SUM 37 can be solved in time O(n + u lg u) by representing 36 S as a bit vector and performing a convolution 35 using FFT.
This is enough to solve your problem 34 :).
What is very important is that O(n log n) is 33 complexity in terms of number of zeroes 32 and ones, not the count of ones (which could 31 be given as an array, like [1,5,9,15]). Checking 30 if a set has an arithmetic progression, terms 29 of number of 1's, is hard, and according 28 to that paper as of 1999 no faster algorithm 27 than O(n2) is known, and is conjectured that 26 it doesn't exist. Everybody who doesn't take this into account is attempting to solve an open problem.
Other interesting info, mostly 25 irrevelant:
Lower bound:
An easy lower bound 24 is Cantor-like set (numbers 1..3^n-1 not 23 containing 1 in their ternary expansion) - its 22 density is n^(log_3 2) (circa 0.631). So 21 any checking if the set isn't too large, and 20 then checking all pairs is not enough to 19 get O(n log n). You have to investigate 18 the sequence smarter. A better lower bound 17 is quoted here - it's n1-c/(log(n))^(1/2). This means Cantor set 16 is not optimal.
Upper bound - my old algorithm:
It 15 is known that for large n, a subset of {1,2,...,n} not 14 containing arithmetic progression has at 13 most n/(log n)^(1/20) elements. The paper 12 On triples in arithmetic progression proves more: the set cannot contain more 11 than n * 228 * (log log n / log n)1/2 elements. So 10 you could check if that bound is achieved 9 and if not, naively check pairs. This is 8 O(n2 * log log n / log n) algorithm, faster 7 than O(n2). Unfortunately "On triples..." is 6 on Springer - but the first page is available, and 5 Ben Green's exposition is available here, page 4 28, theorem 24.
By the way, the papers are 3 from 1999 - the same year as the first one 2 I mentioned, so that's probably why the 1 first one doesn't mention that result.
This is not a solution, but a similar line 14 of thought to what Olexiy was thinking
I was playing around 13 with creating sequences with maximum number 12 of ones, and they are all quite interesting, I 11 got up to 125 digits and here are the first 10 3 numbers it found by attempting to insert 9 as many '1' bits as possible:
- 11011000011011000000000000001101100001101100000000000000000000000000000000000000000110110000110110000000000000011011000011011
- 10110100010110100000000000010110100010110100000000000000000000000000000000000000000101101000101101000000000000101101000101101
- 10011001010011001000000000010011001010011001000000000000000000000000000000000000010011001010011001000000000010011001010011001
Notice they 8 are all fractals (not too surprising given the constraints). There 7 may be something in thinking backwards, perhaps 6 if the string is not a fractal of with a characteristic, then 5 it must have a repeating pattern?
Thanks 4 to beta for the better term to describe 3 these numbers.
Update: Alas it looks like the pattern 2 breaks down when starting with a large enough 1 initial string, such as: 10000000000001:
100000000000011
10000000000001101
100000000000011011
10000000000001101100001
100000000000011011000011
10000000000001101100001101
100000000000011011000011010000000001
100000000000011011000011010000000001001
1000000000000110110000110100000000010011
1000000000000110110000110100000000010011001
10000000000001101100001101000000000100110010000000001
10000000000001101100001101000000000100110010000000001000001
1000000000000110110000110100000000010011001000000000100000100000000000001
10000000000001101100001101000000000100110010000000001000001000000000000011
1000000000000110110000110100000000010011001000000000100000100000000000001101
100000000000011011000011010000000001001100100000000010000010000000000000110100001
100000000000011011000011010000000001001100100000000010000010000000000000110100001001
100000000000011011000011010000000001001100100000000010000010000000000000110100001001000001
1000000000000110110000110100000000010011001000000000100000100000000000001101000010010000010000001
10000000000001101100001101000000000100110010000000001000001000000000000011010000100100000100000011
100000000000011011000011010000000001001100100000000010000010000000000000110100001001000001000000110001
100000000000011011000011010000000001001100100000000010000010000000000000110100001001000001000000110001000000001
10000000000001101100001101000000000100110010000000001000001000000000000011010000100100000100000011000100000000100000000000000000000000000000000000000001
100000000000011011000011010000000001001100100000000010000010000000000000110100001001000001000000110001000000001000000000000000000000000000000000000000010000001
100000000000011011000011010000000001001100100000000010000010000000000000110100001001000001000000110001000000001000000000000000000000000000000000000000010000001000000000000001
1000000000000110110000110100000000010011001000000000100000100000000000001101000010010000010000001100010000000010000000000000000000000000000000000000000100000010000000000000011
1000000000000110110000110100000000010011001000000000100000100000000000001101000010010000010000001100010000000010000000000000000000000000000000000000000100000010000000000000011000000001
10000000000001101100001101000000000100110010000000001000001000000000000011010000100100000100000011000100000000100000000000000000000000000000000000000001000000100000000000000110000000011
10000000000001101100001101000000000100110010000000001000001000000000000011010000100100000100000011000100000000100000000000000000000000000000000000000001000000100000000000000110000000011001
10000000000001101100001101000000000100110010000000001000001000000000000011010000100100000100000011000100000000100000000000000000000000000000000000000001000000100000000000000110000000011001000000001
10000000000001101100001101000000000100110010000000001000001000000000000011010000100100000100000011000100000000100000000000000000000000000000000000000001000000100000000000000110000000011001000000001001
100000000000011011000011010000000001001100100000000010000010000000000000110100001001000001000000110001000000001000000000000000000000000000000000000000010000001000000000000001100000000110010000000010010000000000001
100000000000011011000011010000000001001100100000000010000010000000000000110100001001000001000000110001000000001000000000000000000000000000000000000000010000001000000000000001100000000110010000000010010000000000001000000001
10000000000001101100001101000000000100110010000000001000001000000000000011010000100100000100000011000100000000100000000000000000000000000000000000000001000000100000000000000110000000011001000000001001000000000000100000000100001
10000000000001101100001101000000000100110010000000001000001000000000000011010000100100000100000011000100000000100000000000000000000000000000000000000001000000100000000000000110000000011001000000001001000000000000100000000100001000001
10000000000001101100001101000000000100110010000000001000001000000000000011010000100100000100000011000100000000100000000000000000000000000000000000000001000000100000000000000110000000011001000000001001000000000000100000000100001000001001
100000000000011011000011010000000001001100100000000010000010000000000000110100001001000001000000110001000000001000000000000000000000000000000000000000010000001000000000000001100000000110010000000010010000000000001000000001000010000010010001
100000000000011011000011010000000001001100100000000010000010000000000000110100001001000001000000110001000000001000000000000000000000000000000000000000010000001000000000000001100000000110010000000010010000000000001000000001000010000010010001001
100000000000011011000011010000000001001100100000000010000010000000000000110100001001000001000000110001000000001000000000000000000000000000000000000000010000001000000000000001100000000110010000000010010000000000001000000001000010000010010001001000001
10000000000001101100001101000000000100110010000000001000001000000000000011010000100100000100000011000100000000100000000000000000000000000000000000000001000000100000000000000110000000011001000000001001000000000000100000000100001000001001000100100000100000000000001
100000000000011011000011010000000001001100100000000010000010000000000000110100001001000001000000110001000000001000000000000000000000000000000000000000010000001000000000000001100000000110010000000010010000000000001000000001000010000010010001001000001000000000000010000000000000000000000000000000000000000000000000000000000000000000000000000000000001
10000000000001101100001101000000000100110010000000001000001000000000000011010000100100000100000011000100000000100000000000000000000000000000000000000001000000100000000000000110000000011001000000001001000000000000100000000100001000001001000100100000100000000000001000000000000000000000000000000000000000000000000000000000000000000000000000000000000100000000000000001
100000000000011011000011010000000001001100100000000010000010000000000000110100001001000001000000110001000000001000000000000000000000000000000000000000010000001000000000000001100000000110010000000010010000000000001000000001000010000010010001001000001000000000000010000000000000000000000000000000000000000000000000000000000000000000000000000000000001000000000000000011
100000000000011011000011010000000001001100100000000010000010000000000000110100001001000001000000110001000000001000000000000000000000000000000000000000010000001000000000000001100000000110010000000010010000000000001000000001000010000010010001001000001000000000000010000000000000000000000000000000000000000000000000000000000000000000000000000000000001000000000000000011000001
1000000000000110110000110100000000010011001000000000100000100000000000001101000010010000010000001100010000000010000000000000000000000000000000000000000100000010000000000000011000000001100100000000100100000000000010000000010000100000100100010010000010000000000000100000000000000000000000000000000000000000000000000000000000000000000000000000000000010000000000000000110000010000000000000000000001
1000000000000110110000110100000000010011001000000000100000100000000000001101000010010000010000001100010000000010000000000000000000000000000000000000000100000010000000000000011000000001100100000000100100000000000010000000010000100000100100010010000010000000000000100000000000000000000000000000000000000000000000000000000000000000000000000000000000010000000000000000110000010000000000000000000001001
10000000000001101100001101000000000100110010000000001000001000000000000011010000100100000100000011000100000000100000000000000000000000000000000000000001000000100000000000000110000000011001000000001001000000000000100000000100001000001001000100100000100000000000001000000000000000000000000000000000000000000000000000000000000000000000000000000000000100000000000000001100000100000000000000000000010010000000000000000000000000000000000001
100000000000011011000011010000000001001100100000000010000010000000000000110100001001000001000000110001000000001000000000000000000000000000000000000000010000001000000000000001100000000110010000000010010000000000001000000001000010000010010001001000001000000000000010000000000000000000000000000000000000000000000000000000000000000000000000000000000001000000000000000011000001000000000000000000000100100000000000000000000000000000000000011
100000000000011011000011010000000001001100100000000010000010000000000000110100001001000001000000110001000000001000000000000000000000000000000000000000010000001000000000000001100000000110010000000010010000000000001000000001000010000010010001001000001000000000000010000000000000000000000000000000000000000000000000000000000000000000000000000000000001000000000000000011000001000000000000000000000100100000000000000000000000000000000000011001
10000000000001101100001101000000000100110010000000001000001000000000000011010000100100000100000011000100000000100000000000000000000000000000000000000001000000100000000000000110000000011001000000001001000000000000100000000100001000001001000100100000100000000000001000000000000000000000000000000000000000000000000000000000000000000000000000000000000100000000000000001100000100000000000000000000010010000000000000000000000000000000000001100100000000000000000000001
10000000000001101100001101000000000100110010000000001000001000000000000011010000100100000100000011000100000000100000000000000000000000000000000000000001000000100000000000000110000000011001000000001001000000000000100000000100001000001001000100100000100000000000001000000000000000000000000000000000000000000000000000000000000000000000000000000000000100000000000000001100000100000000000000000000010010000000000000000000000000000000000001100100000000000000000000001001
10000000000001101100001101000000000100110010000000001000001000000000000011010000100100000100000011000100000000100000000000000000000000000000000000000001000000100000000000000110000000011001000000001001000000000000100000000100001000001001000100100000100000000000001000000000000000000000000000000000000000000000000000000000000000000000000000000000000100000000000000001100000100000000000000000000010010000000000000000000000000000000000001100100000000000000000000001001000001
100000000000011011000011010000000001001100100000000010000010000000000000110100001001000001000000110001000000001000000000000000000000000000000000000000010000001000000000000001100000000110010000000010010000000000001000000001000010000010010001001000001000000000000010000000000000000000000000000000000000000000000000000000000000000000000000000000000001000000000000000011000001000000000000000000000100100000000000000000000000000000000000011001000000000000000000000010010000010000001
1000000000000110110000110100000000010011001000000000100000100000000000001101000010010000010000001100010000000010000000000000000000000000000000000000000100000010000000000000011000000001100100000000100100000000000010000000010000100000100100010010000010000000000000100000000000000000000000000000000000000000000000000000000000000000000000000000000000010000000000000000110000010000000000000000000001001000000000000000000000000000000000000110010000000000000000000000100100000100000011
10000000000001101100001101000000000100110010000000001000001000000000000011010000100100000100000011000100000000100000000000000000000000000000000000000001000000100000000000000110000000011001000000001001000000000000100000000100001000001001000100100000100000000000001000000000000000000000000000000000000000000000000000000000000000000000000000000000000100000000000000001100000100000000000000000000010010000000000000000000000000000000000001100100000000000000000000001001000001000000110000000000001
I suspect that a simple approach that looks 14 like O(n^2) will actually yield something 13 better, like O(n ln(n)). The sequences that 12 take the longest to test (for any given 11 n) are the ones that contain no trios, and 10 that puts severe restrictions on the number 9 of 1's that can be in the sequence.
I've 8 come up with some hand-waving arguments, but 7 I haven't been able to find a tidy proof. I'm 6 going to take a stab in the dark: the answer 5 is a very clever idea that the professor 4 has known for so long that it's come to 3 seem obvious, but it's much too hard for 2 the students. (Either that or you slept 1 through the lecture that covered it.)
Revision: 2009-10-17 23:00
I've run this 62 on large numbers (like, strings of 20 million) and 61 I now believe this algorithm is not O(n 60 logn). Notwithstanding that, it's a cool 59 enough implementation and contains a number 58 of optimizations that makes it run really 57 fast. It evaluates all the arrangements 56 of binary strings 24 or fewer digits in 55 under 25 seconds.
I've updated the code to 54 include the 0 <= L < M < U <= X-1
observation from earlier today.
Original
This 53 is, in concept, similar to another question I answered. That code also 52 looked at three values in a series and determined 51 if a triplet satisfied a condition. Here 50 is C# code adapted from that:
using System;
using System.Collections.Generic;
namespace StackOverflow1560523
{
class Program
{
public struct Pair<T>
{
public T Low, High;
}
static bool FindCandidate(int candidate,
List<int> arr,
List<int> pool,
Pair<int> pair,
ref int iterations)
{
int lower = pair.Low, upper = pair.High;
while ((lower >= 0) && (upper < pool.Count))
{
int lowRange = candidate - arr[pool[lower]];
int highRange = arr[pool[upper]] - candidate;
iterations++;
if (lowRange < highRange)
lower -= 1;
else if (lowRange > highRange)
upper += 1;
else
return true;
}
return false;
}
static List<int> BuildOnesArray(string s)
{
List<int> arr = new List<int>();
for (int i = 0; i < s.Length; i++)
if (s[i] == '1')
arr.Add(i);
return arr;
}
static void BuildIndexes(List<int> arr,
ref List<int> even, ref List<int> odd,
ref List<Pair<int>> evenIndex, ref List<Pair<int>> oddIndex)
{
for (int i = 0; i < arr.Count; i++)
{
bool isEven = (arr[i] & 1) == 0;
if (isEven)
{
evenIndex.Add(new Pair<int> {Low=even.Count-1, High=even.Count+1});
oddIndex.Add(new Pair<int> {Low=odd.Count-1, High=odd.Count});
even.Add(i);
}
else
{
oddIndex.Add(new Pair<int> {Low=odd.Count-1, High=odd.Count+1});
evenIndex.Add(new Pair<int> {Low=even.Count-1, High=even.Count});
odd.Add(i);
}
}
}
static int FindSpacedOnes(string s)
{
// List of indexes of 1s in the string
List<int> arr = BuildOnesArray(s);
//if (s.Length < 3)
// return 0;
// List of indexes to odd indexes in arr
List<int> odd = new List<int>(), even = new List<int>();
// evenIndex has indexes into arr to bracket even numbers
// oddIndex has indexes into arr to bracket odd numbers
List<Pair<int>> evenIndex = new List<Pair<int>>(),
oddIndex = new List<Pair<int>>();
BuildIndexes(arr,
ref even, ref odd,
ref evenIndex, ref oddIndex);
int iterations = 0;
for (int i = 1; i < arr.Count-1; i++)
{
int target = arr[i];
bool found = FindCandidate(target, arr, odd, oddIndex[i], ref iterations) ||
FindCandidate(target, arr, even, evenIndex[i], ref iterations);
if (found)
return iterations;
}
return iterations;
}
static IEnumerable<string> PowerSet(int n)
{
for (long i = (1L << (n-1)); i < (1L << n); i++)
{
yield return Convert.ToString(i, 2).PadLeft(n, '0');
}
}
static void Main(string[] args)
{
for (int i = 5; i < 64; i++)
{
int c = 0;
string hardest_string = "";
foreach (string s in PowerSet(i))
{
int cost = find_spaced_ones(s);
if (cost > c)
{
hardest_string = s;
c = cost;
Console.Write("{0} {1} {2}\r", i, c, hardest_string);
}
}
Console.WriteLine("{0} {1} {2}", i, c, hardest_string);
}
}
}
}
The principal 49 differences are:
- Exhaustive search of solutions
This code generates a power set of data to find the hardest input to solve for this algorithm. - All solutions versus hardest to solve
The code for the previous question generated all the solutions using a python generator. This code just displays the hardest for each pattern length. - Scoring algorithm
This code checks the distance from the middle element to its left- and right-hand edge. The python code tested whether a sum was above or below 0. - Convergence on a candidate
The current code works from the middle towards the edge to find a candidate. The code in the previous problem worked from the edges towards the middle. This last change gives a large performance improvement. - Use of even and odd pools
Based on the observations at the end of this write-up, the code searches pairs of even numbers of pairs of odd numbers to find L and U, keeping M fixed. This reduces the number of searches by pre-computing information. Accordingly, the code uses two levels of indirection in the main loop of FindCandidate and requires two calls to FindCandidate for each middle element: once for even numbers and once for odd ones.
The general idea is to work 48 on indexes, not the raw representation of 47 the data. Calculating an array where the 46 1's appear allows the algorithm to run in 45 time proportional to the number of 1's in 44 the data rather than in time proportional 43 to the length of the data. This is a standard 42 transformation: create a data structure 41 that allows faster operation while keeping 40 the problem equivalent.
The results are out 39 of date: removed.
Edit: 2009-10-16 18:48
On 38 yx's data, which is given some credence 37 in the other responses as representative 36 of hard data to calculate on, I get these 35 results... I removed these. They are out 34 of date.
I would point out that this data 33 is not the hardest for my algorithm, so 32 I think the assumption that yx's fractals 31 are the hardest to solve is mistaken. The 30 worst case for a particular algorithm, I 29 expect, will depend upon the algorithm itself 28 and will not likely be consistent across 27 different algorithms.
Edit: 2009-10-17 13:30
Further 26 observations on this.
First, convert the 25 string of 0's and 1's into an array of indexes 24 for each position of the 1's. Say the length 23 of that array A is X. Then the goal is to 22 find
0 <= L < M < U <= X-1
such that
A[M] - A[L] = A[U] - A[M]
or
2*A[M] = A[L] + A[U]
Since A[L] and A[U] sum 21 to an even number, they can't be (even, odd) or 20 (odd, even). The search for a match could 19 be improved by splitting A[] into odd and 18 even pools and searching for matches on 17 A[M] in the pools of odd and even candidates 16 in turn.
However, this is more of a performance 15 optimization than an algorithmic improvement, I 14 think. The number of comparisons should 13 drop, but the order of the algorithm should 12 be the same.
Edit 2009-10-18 00:45
Yet another 11 optimization occurs to me, in the same vein 10 as separating the candidates into even and 9 odd. Since the three indexes have to add 8 to a multiple of 3 (a, a+x, a+2x -- mod 7 3 is 0, regardless of a and x), you can 6 separate L, M, and U into their mod 3 values:
M L U
0 0 0
1 2
2 1
1 0 2
1 1
2 0
2 0 1
1 0
2 2
In 5 fact, you could combine this with the even/odd 4 observation and separate them into their 3 mod 6 values:
M L U
0 0 0
1 5
2 4
3 3
4 2
5 1
and so on. This would provide 2 a further performance optimization but not 1 an algorithmic speedup.
Wasn't able to come up with the solution 7 yet :(, but have some ideas.
What if we start 6 from a reverse problem: construct a sequence 5 with the maximum number of 1s and WITHOUT 4 any evenly spaced trios. If you can prove 3 the maximum number of 1s is o(n), then you 2 can improve your estimate by iterating only 1 through list of 1s only.
This may help....
This problem reduces to 22 the following:
Given a sequence of positive 21 integers, find a contiguous subsequence 20 partitioned into a prefix and a suffix such 19 that the sum of the prefix of the subsequence 18 is equal to the sum of the suffix of the 17 subsequence.
For example, given a sequence 16 of [ 3, 5, 1, 3, 6, 5, 2, 2, 3, 5, 6, 4 ]
, we would find a subsequence of [ 3, 6, 5, 2, 2]
with 15 a prefix of [ 3, 6 ]
with prefix sum of 9
and a suffix 14 of [ 5, 2, 2 ]
with suffix sum of 9
.
The reduction is 13 as follows:
Given a sequence of zeros and 12 ones, and starting at the leftmost one, continue 11 moving to the right. Each time another one 10 is encountered, record the number of moves 9 since the previous one was encountered and 8 append that number to the resulting sequence.
For 7 example, given a sequence of [ 0, 1, 1, 0, 0, 1, 0, 0, 0, 1 0 ]
, we would 6 find the reduction of [ 1, 3, 4]
. From this reduction, we 5 calculate the contiguous subsequence of 4 [ 1, 3, 4]
, the prefix of [ 1, 3]
with sum of 4
, and the suffix 3 of [ 4 ]
with sum of 4
.
This reduction may be computed 2 in O(n)
.
Unfortunately, I am not sure where to 1 go from here.
For the simple problem type (i.e. you search 14 three "1" with only (i.e. zero or more) "0" between it), Its quite simple: You 13 could just split the sequence at every "1" and 12 look for two adjacent subsequences having 11 the same length (the second subsequence 10 not being the last one, of course). Obviously, this 9 can be done in O(n) time.
For the more complex 8 version (i.e. you search an index i and an 7 gap g>0 such that s[i]==s[i+g]==s[i+2*g]=="1"
), I'm not sure, if there 6 exists an O(n log n) solution, since there are possibly 5 O(n²) triplets having this property (think of 4 a string of all ones, there are approximately 3 n²/2 such triplets). Of course, you are looking 2 for only one of these, but I have currently 1 no idea, how to find it ...
A fun question, but once you realise that 5 the actual pattern between two '1's does 4 not matter, the algorithm becomes:
- scan look for a '1'
- starting from the next position scan for another '1' (to the end of the array minus the distance from the current first '1' or else the 3rd '1' would be out of bounds)
- if at the position of the 2nd '1' plus the distance to the first 1' a third '1' is found, we have evenly spaces ones.
In code, JTest 3 fashion, (Note this code isn't written to 2 be most efficient and I added some println's 1 to see what happens.)
import java.util.Random;
import junit.framework.TestCase;
public class AlgorithmTest extends TestCase {
/**
* Constructor for GetNumberTest.
*
* @param name The test's name.
*/
public AlgorithmTest(String name) {
super(name);
}
/**
* @see TestCase#setUp()
*/
protected void setUp() throws Exception {
super.setUp();
}
/**
* @see TestCase#tearDown()
*/
protected void tearDown() throws Exception {
super.tearDown();
}
/**
* Tests the algorithm.
*/
public void testEvenlySpacedOnes() {
assertFalse(isEvenlySpaced(1));
assertFalse(isEvenlySpaced(0x058003));
assertTrue(isEvenlySpaced(0x07001));
assertTrue(isEvenlySpaced(0x01007));
assertTrue(isEvenlySpaced(0x101010));
// some fun tests
Random random = new Random();
isEvenlySpaced(random.nextLong());
isEvenlySpaced(random.nextLong());
isEvenlySpaced(random.nextLong());
}
/**
* @param testBits
*/
private boolean isEvenlySpaced(long testBits) {
String testString = Long.toBinaryString(testBits);
char[] ones = testString.toCharArray();
final char ONE = '1';
for (int n = 0; n < ones.length - 1; n++) {
if (ONE == ones[n]) {
for (int m = n + 1; m < ones.length - m + n; m++) {
if (ONE == ones[m] && ONE == ones[m + m - n]) {
System.out.println(" IS evenly spaced: " + testBits + '=' + testString);
System.out.println(" at: " + n + ", " + m + ", " + (m + m - n));
return true;
}
}
}
}
System.out.println("NOT evenly spaced: " + testBits + '=' + testString);
return false;
}
}
I thought of a divide-and-conquer approach 59 that might work.
First, in preprocessing 58 you need to insert all numbers less than 57 one half your input size (n/3) into a list.
Given 56 a string: 0000010101000100
(note that this particular example 55 is valid)
Insert all primes (and 1) from 54 1 to (16/2) into a list: {1, 2, 3, 4, 5, 6, 7}
Then 53 divide it in half:
100000101 01000100
Keep doing this until 52 you get to strings of size 1. For all size-one 51 strings with a 1 in them, add the index 50 of the string to the list of possibilities; otherwise, return 49 -1 for failure.
You'll also need to return 48 a list of still-possible spacing distances, associated 47 with each starting index. (Start with the 46 list you made above and remove numbers as 45 you go) Here, an empty list means you're 44 only dealing with one 1 and so any spacing 43 is possible at this point; otherwise the 42 list includes spacings that must be ruled 41 out.
So continuing with the example above:
1000 0101 0100 0100
10 00 01 01 01 00 01 00
1 0 0 0 0 1 0 1 0 1 0 0 0 1 0 0
In 40 the first combine step, we have eight sets 39 of two now. In the first, we have the possibility 38 of a set, but we learn that spacing by 1 37 is impossible because of the other zero 36 being there. So we return 0 (for the index) and 35 {2,3,4,5,7} for the fact that spacing by 34 1 is impossible. In the second, we have 33 nothing and so return -1. In the third we 32 have a match with no spacings eliminated 31 in index 5, so return 5, {1,2,3,4,5,7}. In 30 the fourth pair we return 7, {1,2,3,4,5,7}. In 29 the fifth, return 9, {1,2,3,4,5,7}. In the 28 sixth, return -1. In the seventh, return 27 13, {1,2,3,4,5,7}. In the eighth, return 26 -1.
Combining again into four sets of four, we 25 have:
1000
: Return (0, {4,5,6,7})
0101
: Return (5, {2,3,4,5,6,7}), (7, {1,2,3,4,5,6,7})
0100
: Return 24 (9, {3,4,5,6,7})
0100
: Return (13, {3,4,5,6,7})
Combining 23 into sets of eight:
10000101
: Return (0, {5,7}), (5, {2,3,4,5,6,7}), (7, {1,2,3,4,5,6,7})
01000100
: Return 22 (9, {4,7}), (13, {3,4,5,6,7})
Combining into 21 a set of sixteen:
10000101 01000100
As we've progressed, we 20 keep checking all the possibilities so far. Up 19 to this step we've left stuff that went 18 beyond the end of the string, but now we 17 can check all the possibilities.
Basically, we 16 check the first 1 with spacings of 5 and 15 7, and find that they don't line up to 1's. (Note 14 that each check is CONSTANT, not linear 13 time) Then we check the second one (index 12 5) with spacings of 2, 3, 4, 5, 6, and 7-- or 11 we would, but we can stop at 2 since that 10 actually matches up.
Phew! That's a rather 9 long algorithm.
I don't know 100% if it's 8 O(n log n) because of the last step, but everything 7 up to there is definitely O(n log n) as far as I can 6 tell. I'll get back to this later and try 5 to refine the last step.
EDIT: Changed my 4 answer to reflect Welbog's comment. Sorry 3 for the error. I'll write some pseudocode 2 later, too, when I get a little more time 1 to decipher what I wrote again. ;-)
I'll give my rough guess here, and let those 7 who are better with calculating complexity 6 to help me on how my algorithm fares in 5 O-notation wise
- given binary string 0000010101000100 (as example)
- crop head and tail of zeroes -> 00000 101010001 00
- we get 101010001 from previous calculation
- check if the middle bit is 'one', if true, found valid three evenly spaced 'ones' (only if the number of bits is odd numbered)
- correlatively, if the remained cropped number of bits is even numbered, the head and tail 'one' cannot be part of evenly spaced 'one',
- we use 1010100001 as example (with an extra 'zero' to become even numbered crop), in this case we need to crop again, then becomes -> 10101 00001
- we get 10101 from previous calculation, and check middle bit, and we found the evenly spaced bit again
I have no idea how to calculate 4 complexity for this, can anyone help?
edit: add 3 some code to illustrate my idea
edit2: tried 2 to compile my code and found some major 1 mistakes, fixed
char *binaryStr = "0000010101000100";
int main() {
int head, tail, pos;
head = 0;
tail = strlen(binaryStr)-1;
if( (pos = find3even(head, tail)) >=0 )
printf("found it at position %d\n", pos);
return 0;
}
int find3even(int head, int tail) {
int pos = 0;
if(head >= tail) return -1;
while(binaryStr[head] == '0')
if(head<tail) head++;
while(binaryStr[tail] == '0')
if(head<tail) tail--;
if(head >= tail) return -1;
if( (tail-head)%2 == 0 && //true if odd numbered
(binaryStr[head + (tail-head)/2] == '1') ) {
return head;
}else {
if( (pos = find3even(head, tail-1)) >=0 )
return pos;
if( (pos = find3even(head+1, tail)) >=0 )
return pos;
}
return -1;
}
I came up with something like this:
def IsSymetric(number):
number = number.strip('0')
if len(number) < 3:
return False
if len(number) % 2 == 0:
return IsSymetric(number[1:]) or IsSymetric(number[0:len(number)-2])
else:
if number[len(number)//2] == '1':
return True
return IsSymetric(number[:(len(number)//2)]) or IsSymetric(number[len(number)//2+1:])
return False
This 3 is inspired by andycjw.
- Truncate the zeros.
- If even then test two substring 0 - (len-2) (skip last character) and from 1 - (len-1) (skip the first char)
- If not even than if the middle char is one than we have success. Else divide the string in the midle without the midle element and check both parts.
As to the complexity 2 this might be O(nlogn) as in each recursion 1 we are dividing by two.
Hope it helps.
Ok, I'm going to take another stab at the 27 problem. I think I can prove a O(n log(n)) algorithm 26 that is similar to those already discussed 25 by using a balanced binary tree to store 24 distances between 1's. This approach was 23 inspired by Justice's observation about 22 reducing the problem to a list of distances 21 between the 1's.
Could we scan the input 20 string to construct a balanced binary tree 19 around the position of 1's such that each 18 node stores the position of the 1 and each 17 edge is labeled with the distance to the 16 adjacent 1 for each child node. For example:
10010001 gives the following tree
3
/ \
2 / \ 3
/ \
0 7
This 15 can be done in O(n log(n)) since, for a 14 string of size n, each insertion takes O(log(n)) in 13 the worst case.
Then the problem is to search 12 the tree to discover whether, at any node, there 11 is a path from that node through the left-child 10 that has the same distance as a path through 9 the right child. This can be done recursively 8 on each subtree. When merging two subtrees 7 in the search, we must compare the distances 6 from paths in the left subtree with distances 5 from paths in the right. Since the number 4 of paths in a subtree will be proportional 3 to log(n), and the number of nodes is n, I 2 believe this can be done in O(n log(n)) time.
Did 1 I miss anything?
This seemed liked a fun problem so I decided 11 to try my hand at it.
I am making the assumption 10 that 111000001 would find the first 3 ones 9 and be successful. Essentially the number 8 of zeroes following the 1 is the important 7 thing, since 0111000 is the same as 111000 6 according to your definition. Once you find 5 two cases of 1, the next 1 found completes 4 the trilogy.
Here it is in Python:
def find_three(bstring):
print bstring
dict = {}
lastone = -1
zerocount = 0
for i in range(len(bstring)):
if bstring[i] == '1':
print i, ': 1'
if lastone != -1:
if(zerocount in dict):
dict[zerocount].append(lastone)
if len(dict[zerocount]) == 2:
dict[zerocount].append(i)
return True, dict
else:
dict[zerocount] = [lastone]
lastone = i
zerocount = 0
else:
zerocount = zerocount + 1
#this is really just book keeping, as we have failed at this point
if lastone != -1:
if(zerocount in dict):
dict[zerocount].append(lastone)
else:
dict[zerocount] = [lastone]
return False, dict
This is 3 a first try, so I'm sure this could be written 2 in a cleaner manner. Please list the cases 1 where this method fails down below.
One inroad into the problem is to think 26 of factors and shifting.
With shifting, you 25 compare the string of ones and zeroes with 24 a shifted version of itself. You then take 23 matching ones. Take this example shifted 22 by two:
1010101010
1010101010
------------
001010101000
The resulting 1's (bitwise ANDed), must 21 represent all those 1's which are evenly 20 spaced by two. The same example shifted 19 by three:
1010101010
1010101010
-------------
0000000000000
In this case there are no 1's which 18 are evenly spaced three apart.
So what does 17 this tell you? Well that you only need 16 to test shifts which are prime numbers. For 15 example say you have two 1's which are six 14 apart. You would only have to test 'two' shifts 13 and 'three' shifts (since these divide six). For 12 example:
10000010
10000010 (Shift by two)
10000010
10000010 (We have a match)
10000010
10000010 (Shift by three)
10000010 (We have a match)
So the only shifts you ever need 11 to check are 2,3,5,7,11,13 etc. Up to the 10 prime closest to the square root of size 9 of the string of digits.
Nearly solved?
I think I am closer 8 to a solution. Basically:
- Scan the string for 1's. For each 1 note it's remainder after taking a modulus of its position. The modulus ranges from 1 to half the size of the string. This is because the largest possible separation size is half the string. This is done in O(n^2). BUT. Only prime moduli need be checked so O(n^2/log(n))
- Sort the list of modulus/remainders in order largest modulus first, this can be done in O(n*log(n)) time.
- Look for three consecutive moduli/remainders which are the same.
- Somehow retrieve the position of the ones!
I think the biggest 7 clue to the answer, is that the fastest 6 sort algorithms, are O(n*log(n)).
WRONG
Step 1 5 is wrong as pointed out by a colleague. If 4 we have 1's at position 2,12 and 102. Then 3 taking a modulus of 10, they would all have 2 the same remainders, and yet are not equally 1 spaced apart! Sorry.
I assume the reason this is nlog(n) is due 8 to the following:
- To find the 1 that is the start of the triplet, you need to check (n-2) characters. If you haven't found it by that point, you won't (chars n-1 and n cannot start a triplet) (O(n))
- To find the second 1 that is the part of the triplet (started by the first one), you need to check m/2 (m=n-x, where x is the offset of the first 1) characters. This is because, if you haven't found the second 1 by the time you're halfway from the first one to the end, you won't... since the third 1 must be exactly the same distance past the second. (O(log(n)))
- It O(1) to find the last 1 since you know the index it must be at by the time you find the first and second.
So, you have n, log(n), and 7 1... O(nlogn)
Edit: Oops, my bad. My brain had 6 it set that n/2 was logn... which it obviously 5 isn't (doubling the number on items still 4 doubles the number of iterations on the 3 inner loop). This is still at n^2, not solving 2 the problem. Well, at least I got to write 1 some code :)
Implementation in Tcl
proc get-triplet {input} {
for {set first 0} {$first < [string length $input]-2} {incr first} {
if {[string index $input $first] != 1} {
continue
}
set start [expr {$first + 1}]
set end [expr {1+ $first + (([string length $input] - $first) /2)}]
for {set second $start} {$second < $end} {incr second} {
if {[string index $input $second] != 1} {
continue
}
set last [expr {($second - $first) + $second}]
if {[string index $input $last] == 1} {
return [list $first $second $last]
}
}
}
return {}
}
get-triplet 10101 ;# 0 2 4
get-triplet 10111 ;# 0 2 4
get-triplet 11100000 ;# 0 1 2
get-triplet 0100100100 ;# 1 4 7
I think I have found a way of solving the 21 problem, but I can't construct a formal 20 proof. The solution I made is written in 19 Java, and it uses a counter 'n' to count 18 how many list/array accesses it does. So 17 n should be less than or equal to stringLength*log(stringLength) if 16 it is correct. I tried it for the numbers 15 0 to 2^22, and it works.
It starts by iterating 14 over the input string and making a list 13 of all the indexes which hold a one. This 12 is just O(n).
Then from the list of indexes 11 it picks a firstIndex, and a secondIndex 10 which is greater than the first. These 9 two indexes must hold ones, because they 8 are in the list of indexes. From there 7 the thirdIndex can be calculated. If the 6 inputString[thirdIndex] is a 1 then it halts.
public static int testString(String input){
//n is the number of array/list accesses in the algorithm
int n=0;
//Put the indices of all the ones into a list, O(n)
ArrayList<Integer> ones = new ArrayList<Integer>();
for(int i=0;i<input.length();i++){
if(input.charAt(i)=='1'){
ones.add(i);
}
}
//If less than three ones in list, just stop
if(ones.size()<3){
return n;
}
int firstIndex, secondIndex, thirdIndex;
for(int x=0;x<ones.size()-2;x++){
n++;
firstIndex = ones.get(x);
for(int y=x+1; y<ones.size()-1; y++){
n++;
secondIndex = ones.get(y);
thirdIndex = secondIndex*2 - firstIndex;
if(thirdIndex >= input.length()){
break;
}
n++;
if(input.charAt(thirdIndex) == '1'){
//This case is satisfied if it has found three evenly spaced ones
//System.out.println("This one => " + input);
return n;
}
}
}
return n;
}
additional 5 note: the counter n is not incremented when 4 it iterates over the input string to construct 3 the list of indexes. This operation is 2 O(n), so it won't have an effect on the 1 algorithm complexity anyway.
Assumption:
Just wrong, talking about log(n) number 19 of upper limit of ones
EDIT:
Now I found that using 18 Cantor numbers (if correct), density on 17 set is (2/3)^Log_3(n) (what a weird function) and 16 I agree, log(n)/n density is to strong.
If 15 this is upper limit, there is algorhitm 14 who solves this problem in at least O(n*(3/2)^(log(n)/log(3))) time 13 complexity and O((3/2)^(log(n)/log(3))) space 12 complexity. (check Justice's answer for 11 algorhitm)
This is still by far better than 10 O(n^2)
This function ((3/2)^(log(n)/log(3))) really 9 looks like n*log(n) on first sight.
How did I get this formula?
Applaying 8 Cantors number on string.
Supose that length 7 of string is 3^p == n
At each step in generation 6 of Cantor string you keep 2/3 of prevous 5 number of ones. Apply this p times.
That 4 mean (n * ((2/3)^p)) -> (((3^p)) * ((2/3)^p)) remaining 3 ones and after simplification 2^p.
This 2 mean 2^p ones in 3^p string -> (3/2)^p ones 1 . Substitute p=log(n)/log(3) and get
((3/2)^(log(n)/log(3)))
While scanning 1s, add their positions to 9 a List. When adding the second and successive 8 1s, compare them to each position in the 7 list so far. Spacing equals currentOne (center) - previousOne 6 (left). The right-side bit is currentOne 5 + spacing. If it's 1, the end.
The list of 4 ones grows inversely with the space between 3 them. Simply stated, if you've got a lot 2 of 0s between the 1s (as in a worst case), your 1 list of known 1s will grow quite slowly.
using System;
using System.Collections.Generic;
namespace spacedOnes
{
class Program
{
static int[] _bits = new int[8] {128, 64, 32, 16, 8, 4, 2, 1};
static void Main(string[] args)
{
var bytes = new byte[4];
var r = new Random();
r.NextBytes(bytes);
foreach (var b in bytes) {
Console.Write(getByteString(b));
}
Console.WriteLine();
var bitCount = bytes.Length * 8;
var done = false;
var onePositions = new List<int>();
for (var i = 0; i < bitCount; i++)
{
if (isOne(bytes, i)) {
if (onePositions.Count > 0) {
foreach (var knownOne in onePositions) {
var spacing = i - knownOne;
var k = i + spacing;
if (k < bitCount && isOne(bytes, k)) {
Console.WriteLine("^".PadLeft(knownOne + 1) + "^".PadLeft(spacing) + "^".PadLeft(spacing));
done = true;
break;
}
}
}
if (done) {
break;
}
onePositions.Add(i);
}
}
Console.ReadKey();
}
static String getByteString(byte b) {
var s = new char[8];
for (var i=0; i<s.Length; i++) {
s[i] = ((b & _bits[i]) > 0 ? '1' : '0');
}
return new String(s);
}
static bool isOne(byte[] bytes, int i)
{
var byteIndex = i / 8;
var bitIndex = i % 8;
return (bytes[byteIndex] & _bits[bitIndex]) > 0;
}
}
}
Here are some thoughts that, despite my 41 best efforts, will not seem to wrap themselves 40 up in a bow. Still, they might be a useful 39 starting point for someone's analysis.
Consider 38 the proposed solution as follows, which 37 is the approach that several folks have 36 suggested, including myself in a prior version 35 of this answer. :)
- Trim leading and trailing zeroes.
- Scan the string looking for 1's.
- When a 1 is found:
- Assume that it is the middle 1 of the solution.
- For each prior 1, use its saved position to compute the anticipated position of the final 1.
- If the computed position is after the end of the string it cannot be part of the solution, so drop the position from the list of candidates.
- Check the solution.
- If the solution was not found, add the current 1 to the list of candidates.
- Repeat until no more 1's are found.
Now consider input strings 34 strings like the following, which will not 33 have a solution:
101
101001
1010010001
101001000100001
101001000100001000001
In general, this is the 32 concatenation of k strings of the form j 31 0's followed by a 1 for j from zero to k-1.
k=2 101
k=3 101001
k=4 1010010001
k=5 101001000100001
k=6 101001000100001000001
Note 30 that the lengths of the substrings are 1, 2, 3, etc. So, problem 29 size n has substrings of lengths 1 to k 28 such that n = k(k+1)/2.
k=2 n= 3 101
k=3 n= 6 101001
k=4 n=10 1010010001
k=5 n=15 101001000100001
k=6 n=21 101001000100001000001
Note that k also 27 tracks the number of 1's that we have to 26 consider. Remember that every time we see 25 a 1, we need to consider all the 1's seen 24 so far. So when we see the second 1, we 23 only consider the first, when we see the 22 third 1, we reconsider the first two, when 21 we see the fourth 1, we need to reconsider 20 the first three, and so on. By the end 19 of the algorithm, we've considered k(k-1)/2 18 pairs of 1's. Call that p.
k=2 n= 3 p= 1 101
k=3 n= 6 p= 3 101001
k=4 n=10 p= 6 1010010001
k=5 n=15 p=10 101001000100001
k=6 n=21 p=15 101001000100001000001
The relationship 17 between n and p is that n = p + k.
The process 16 of going through the string takes O(n) time. Each 15 time a 1 is encountered, a maximum of (k-1) comparisons 14 are done. Since n = k(k+1)/2, n > k**2, so 13 sqrt(n) > k. This gives us O(n sqrt(n)) or 12 O(n**3/2). Note however that may not be 11 a really tight bound, because the number 10 of comparisons goes from 1 to a maximum 9 of k, it isn't k the whole time. But I'm 8 not sure how to account for that in the 7 math.
It still isn't O(n log(n)). Also, I 6 can't prove those inputs are the worst cases, although 5 I suspect they are. I think a denser packing 4 of 1's to the front results in an even sparser 3 packing at the end.
Since someone may still 2 find it useful, here's my code for that 1 solution in Perl:
#!/usr/bin/perl
# read input as first argument
my $s = $ARGV[0];
# validate the input
$s =~ /^[01]+$/ or die "invalid input string\n";
# strip leading and trailing 0's
$s =~ s/^0+//;
$s =~ s/0+$//;
# prime the position list with the first '1' at position 0
my @p = (0);
# start at position 1, which is the second character
my $i = 1;
print "the string is $s\n\n";
while ($i < length($s)) {
if (substr($s, $i, 1) eq '1') {
print "found '1' at position $i\n";
my @t = ();
# assuming this is the middle '1', go through the positions
# of all the prior '1's and check whether there's another '1'
# in the correct position after this '1' to make a solution
while (scalar @p) {
# $p is the position of the prior '1'
my $p = shift @p;
# $j is the corresponding position for the following '1'
my $j = 2 * $i - $p;
# if $j is off the end of the string then we don't need to
# check $p anymore
next if ($j >= length($s));
print "checking positions $p, $i, $j\n";
if (substr($s, $j, 1) eq '1') {
print "\nsolution found at positions $p, $i, $j\n";
exit 0;
}
# if $j isn't off the end of the string, keep $p for next time
push @t, $p;
}
@p = @t;
# add this '1' to the list of '1' positions
push @p, $i;
}
$i++;
}
print "\nno solution found\n";
I thought I'd add one comment before posting 14 the 22nd naive solution to the problem. For 13 the naive solution, we don't need to show 12 that the number of 1's in the string is 11 at most O(log(n)), but rather that it is 10 at most O(sqrt(n*log(n)).
Solver:
def solve(Str):
indexes=[]
#O(n) setup
for i in range(len(Str)):
if Str[i]=='1':
indexes.append(i)
#O((number of 1's)^2) processing
for i in range(len(indexes)):
for j in range(i+1, len(indexes)):
indexDiff = indexes[j] - indexes[i]
k=indexes[j] + indexDiff
if k<len(Str) and Str[k]=='1':
return True
return False
It's basically 9 a fair bit similar to flybywire's idea and 8 implementation, though looking ahead instead 7 of back.
Greedy String Builder:
#assumes final char hasn't been added, and would be a 1
def lastCharMakesSolvable(Str):
endIndex=len(Str)
j=endIndex-1
while j-(endIndex-j) >= 0:
k=j-(endIndex-j)
if k >= 0 and Str[k]=='1' and Str[j]=='1':
return True
j=j-1
return False
def expandString(StartString=''):
if lastCharMakesSolvable(StartString):
return StartString + '0'
return StartString + '1'
n=1
BaseStr=""
lastCount=0
while n<1000000:
BaseStr=expandString(BaseStr)
count=BaseStr.count('1')
if count != lastCount:
print(len(BaseStr), count)
lastCount=count
n=n+1
(In my defense, I'm 6 still in the 'learn python' stage of understanding)
Also, potentially 5 useful output from the greedy building of 4 strings, there's a rather consistent jump 3 after hitting a power of 2 in the number 2 of 1's... which I was not willing to wait 1 around to witness hitting 2096.
strlength # of 1's
1 1
2 2
4 3
5 4
10 5
14 8
28 9
41 16
82 17
122 32
244 33
365 64
730 65
1094 128
2188 129
3281 256
6562 257
9842 512
19684 513
29525 1024
I'll try to present a mathematical approach. This 39 is more a beginning than an end, so any 38 help, comment, or even contradiction - will 37 be deeply appreciated. However, if this 36 approach is proven - the algorithm is a 35 straight-forward search in the string.
Given 34 a fixed number of spaces
k
and a stringS
, the 33 search for a k-spaced-triplet takesO(n)
- We 32 simply test for every0<=i<=(n-2k)
ifS[i]==S[i+k]==S[i+2k]
. The test takes 31O(1)
and we do itn-k
times wherek
is a constant, so 30 it takesO(n-k)=O(n)
.Let us assume that there is an 29 Inverse Proportion between the number of 28
1
's and the maximum spaces we need to search 27 for. That is, If there are many1
's, there 26 must be a triplet and it must be quite dense; If 25 there are only few1
's, The triplet (if any) can 24 be quite sparse. In other words, I can prove 23 that if I have enough1
's, such triplet must 22 exist - and the more1
's I have, a more dense 21 triplet must be found. This can be explained 20 by the Pigeonhole principle - Hope to elaborate on this later.Say 19 have an upper bound
k
on the possible number 18 of spaces I have to look for. Now, for each 171
located inS[i]
we need to check for1
inS[i-1]
and 16S[i+1]
,S[i-2]
andS[i+2]
, ...S[i-k]
andS[i+k]
. This takesO((k^2-k)/2)=O(k^2)
for each 151
inS
- due to Gauss' Series Summation Formula. Note that this differs from 14 section 1 - I'm havingk
as an upper bound 13 for the number of spaces, not as a constant 12 space.
We need to prove O(n*log(n))
. That is, we need 11 to show that k*(number of 1's)
is proportional to log(n)
.
If we 10 can do that, the algorithm is trivial - for 9 each 1
in S
whose index is i
, simply look for 8 1
's from each side up to distance k
. If two 7 were found in the same distance, return 6 i
and k
. Again, the tricky part would be finding 5 k
and proving the correctness.
I would really 4 appreciate your comments here - I have been 3 trying to find the relation between k
and 2 the number of 1
's on my whiteboard, so far 1 without success.
Below is a solution. There could be some 3 little mistakes here and there, but the 2 idea is sound.
Edit: It's not n * log(n)
PSEUDO 1 CODE:
foreach character in the string
if the character equals 1 {
if length cache > 0 { //we can skip the first one
foreach location in the cache { //last in first out kind of order
if ((currentlocation + (currentlocation - location)) < length string)
if (string[(currentlocation + (currentlocation - location))] equals 1)
return found evenly spaced string
else
break;
}
}
remember the location of this character in a some sort of cache.
}
return didn't find evenly spaced string
C# code:
public static Boolean FindThreeEvenlySpacedOnes(String str) {
List<int> cache = new List<int>();
for (var x = 0; x < str.Length; x++) {
if (str[x] == '1') {
if (cache.Count > 0) {
for (var i = cache.Count - 1; i > 0; i--) {
if ((x + (x - cache[i])) >= str.Length)
break;
if (str[(x + (x - cache[i]))] == '1')
return true;
}
}
cache.Add(x);
}
}
return false;
}
How it works:
iteration 1:
x
|
101101001
// the location of this 1 is stored in the cache
iteration 2:
x
|
101101001
iteration 3:
a x b
| | |
101101001
//we retrieve location a out of the cache and then based on a
//we calculate b and check if te string contains a 1 on location b
//and of course we store x in the cache because it's a 1
iteration 4:
axb
|||
101101001
a x b
| | |
101101001
iteration 5:
x
|
101101001
iteration 6:
a x b
| | |
101101001
a x b
| | |
101101001
//return found evenly spaced string
How about a simple O(n) solution, with O(n^2) space? (Uses 27 the assumption that all bitwise operators 26 work in O(1).)
The algorithm basically works 25 in four stages:
Stage 1: For each bit in 24 your original number, find out how far away 23 the ones are, but consider only one direction. (I 22 considered all the bits in the direction 21 of the least significant bit.)
Stage 2: Reverse 20 the order of the bits in the input;
Stage 19 3: Re-run step 1 on the reversed input.
Stage 18 4: Compare the results from Stage 1 and 17 Stage 3. If any bits are equally spaced 16 above AND below we must have a hit.
Keep 15 in mind that no step in the above algorithm 14 takes longer than O(n). ^_^
As an added benefit, this 13 algorithm will find ALL equally spaced ones 12 from EVERY number. So for example if you 11 get a result of "0x0005" then there are 10 equally spaced ones at BOTH 1 and 3 units 9 away
I didn't really try optimizing the code 8 below, but it is compilable C# code that 7 seems to work.
using System;
namespace ThreeNumbers
{
class Program
{
const int uint32Length = 32;
static void Main(string[] args)
{
Console.Write("Please enter your integer: ");
uint input = UInt32.Parse(Console.ReadLine());
uint[] distancesLower = Distances(input);
uint[] distancesHigher = Distances(Reverse(input));
PrintHits(input, distancesLower, distancesHigher);
}
/// <summary>
/// Returns an array showing how far the ones away from each bit in the input. Only
/// considers ones at lower signifcant bits. Index 0 represents the least significant bit
/// in the input. Index 1 represents the second least significant bit in the input and so
/// on. If a one is 3 away from the bit in question, then the third least significant bit
/// of the value will be sit.
///
/// As programed this algorithm needs: O(n) time, and O(n*log(n)) space.
/// (Where n is the number of bits in the input.)
/// </summary>
public static uint[] Distances(uint input)
{
uint[] distanceToOnes = new uint[uint32Length];
uint result = 0;
//Sets how far each bit is from other ones. Going in the direction of LSB to MSB
for (uint bitIndex = 1, arrayIndex = 0; bitIndex != 0; bitIndex <<= 1, ++arrayIndex)
{
distanceToOnes[arrayIndex] = result;
result <<= 1;
if ((input & bitIndex) != 0)
{
result |= 1;
}
}
return distanceToOnes;
}
/// <summary>
/// Reverses the bits in the input.
///
/// As programmed this algorithm needs O(n) time and O(n) space.
/// (Where n is the number of bits in the input.)
/// </summary>
/// <param name="input"></param>
/// <returns></returns>
public static uint Reverse(uint input)
{
uint reversedInput = 0;
for (uint bitIndex = 1; bitIndex != 0; bitIndex <<= 1)
{
reversedInput <<= 1;
reversedInput |= (uint)((input & bitIndex) != 0 ? 1 : 0);
}
return reversedInput;
}
/// <summary>
/// Goes through each bit in the input, to check if there are any bits equally far away in
/// the distancesLower and distancesHigher
/// </summary>
public static void PrintHits(uint input, uint[] distancesLower, uint[] distancesHigher)
{
const int offset = uint32Length - 1;
for (uint bitIndex = 1, arrayIndex = 0; bitIndex != 0; bitIndex <<= 1, ++arrayIndex)
{
//hits checks if any bits are equally spaced away from our current value
bool isBitSet = (input & bitIndex) != 0;
uint hits = distancesLower[arrayIndex] & distancesHigher[offset - arrayIndex];
if (isBitSet && (hits != 0))
{
Console.WriteLine(String.Format("The {0}-th LSB has hits 0x{1:x4} away", arrayIndex + 1, hits));
}
}
}
}
}
Someone will probably comment 6 that for any sufficiently large number, bitwise 5 operations cannot be done in O(1). You'd 4 be right. However, I'd conjecture that 3 every solution that uses addition, subtraction, multiplication, or 2 division (which cannot be done by shifting) would 1 also have that problem.
Obviously we need to at least check bunches 44 of triplets at the same time, so we need 43 to compress the checks somehow. I have a 42 candidate algorithm, but analyzing the time 41 complexity is beyond my ability*time threshold.
Build 40 a tree where each node has three children 39 and each node contains the total number 38 of 1's at its leaves. Build a linked list 37 over the 1's, as well. Assign each node 36 an allowed cost proportional to the range 35 it covers. As long as the time we spend 34 at each node is within budget, we'll have 33 an O(n lg n) algorithm.
--
Start at the root. If 32 the square of the total number of 1's below 31 it is less than its allowed cost, apply 30 the naive algorithm. Otherwise recurse on 29 its children.
Now we have either returned 28 within budget, or we know that there are 27 no valid triplets entirely contained within 26 one of the children. Therefore we must check 25 the inter-node triplets.
Now things get incredibly 24 messy. We essentially want to recurse on 23 the potential sets of children while limiting 22 the range. As soon as the range is constrained 21 enough that the naive algorithm will run 20 under budget, you do it. Enjoy implementing 19 this, because I guarantee it will be tedious. There's 18 like a dozen cases.
--
The reason I think 17 that algorithm will work is because the 16 sequences without valid triplets appear 15 to go alternate between bunches of 1's and 14 lots of 0's. It effectively splits the nearby 13 search space, and the tree emulates that 12 splitting.
The run time of the algorithm 11 is not obvious, at all. It relies on the 10 non-trivial properties of the sequence. If 9 the 1's are really sparse then the naive 8 algorithm will work under budget. If the 7 1's are dense, then a match should be found 6 right away. But if the density is 'just 5 right' (eg. near ~n^0.63, which you can 4 achieve by setting all bits at positions 3 with no '2' digit in base 3), I don't know 2 if it will work. You would have to prove 1 that the splitting effect is strong enough.
No theoretical answer here, but I wrote 35 a quick Java program to explore the running-time 34 behavior as a function of k and n, where 33 n is the total bit length and k is the number 32 of 1's. I'm with a few of the answerers 31 who are saying that the "regular" algorithm 30 that checks all the pairs of bit positions 29 and looks for the 3rd bit, even though it 28 would require O(k^2) in the worst case, in 27 reality because the worst-case needs sparse 26 bitstrings, is O(n ln n).
Anyway here's the 25 program, below. It's a Monte-Carlo style 24 program which runs a large number of trials 23 NTRIALS for constant n, and randomly generates 22 bitsets for a range of k-values using Bernoulli 21 processes with ones-density constrained 20 between limits that can be specified, and 19 records the running time of finding or failing 18 to find a triplet of evenly spaced ones, time 17 measured in steps NOT in CPU time. I ran 16 it for n=64, 256, 1024, 4096, 16384* (still 15 running), first a test run with 500000 trials 14 to see which k-values take the longest running 13 time, then another test with 5000000 trials 12 with narrowed ones-density focus to see 11 what those values look like. The longest 10 running times do happen with very sparse 9 density (e.g. for n=4096 the running time 8 peaks are in the k=16-64 range, with a gentle 7 peak for mean runtime at 4212 steps @ k=31, max 6 runtime peaked at 5101 steps @ k=58). It 5 looks like it would take extremely large 4 values of N for the worst-case O(k^2) step 3 to become larger than the O(n) step where 2 you scan the bitstring to find the 1's position 1 indices.
package com.example.math;
import java.io.PrintStream;
import java.util.BitSet;
import java.util.Random;
public class EvenlySpacedOnesTest {
static public class StatisticalSummary
{
private int n=0;
private double min=Double.POSITIVE_INFINITY;
private double max=Double.NEGATIVE_INFINITY;
private double mean=0;
private double S=0;
public StatisticalSummary() {}
public void add(double x) {
min = Math.min(min, x);
max = Math.max(max, x);
++n;
double newMean = mean + (x-mean)/n;
S += (x-newMean)*(x-mean);
// this algorithm for mean,std dev based on Knuth TAOCP vol 2
mean = newMean;
}
public double getMax() { return (n>0)?max:Double.NaN; }
public double getMin() { return (n>0)?min:Double.NaN; }
public int getCount() { return n; }
public double getMean() { return (n>0)?mean:Double.NaN; }
public double getStdDev() { return (n>0)?Math.sqrt(S/n):Double.NaN; }
// some may quibble and use n-1 for sample std dev vs population std dev
public static void printOut(PrintStream ps, StatisticalSummary[] statistics) {
for (int i = 0; i < statistics.length; ++i)
{
StatisticalSummary summary = statistics[i];
ps.printf("%d\t%d\t%.0f\t%.0f\t%.5f\t%.5f\n",
i,
summary.getCount(),
summary.getMin(),
summary.getMax(),
summary.getMean(),
summary.getStdDev());
}
}
}
public interface RandomBernoulliProcess // see http://en.wikipedia.org/wiki/Bernoulli_process
{
public void setProbability(double d);
public boolean getNextBoolean();
}
static public class Bernoulli implements RandomBernoulliProcess
{
final private Random r = new Random();
private double p = 0.5;
public boolean getNextBoolean() { return r.nextDouble() < p; }
public void setProbability(double d) { p = d; }
}
static public class TestResult {
final public int k;
final public int nsteps;
public TestResult(int k, int nsteps) { this.k=k; this.nsteps=nsteps; }
}
////////////
final private int n;
final private int ntrials;
final private double pmin;
final private double pmax;
final private Random random = new Random();
final private Bernoulli bernoulli = new Bernoulli();
final private BitSet bits;
public EvenlySpacedOnesTest(int n, int ntrials, double pmin, double pmax) {
this.n=n; this.ntrials=ntrials; this.pmin=pmin; this.pmax=pmax;
this.bits = new BitSet(n);
}
/*
* generate random bit string
*/
private int generateBits()
{
int k = 0; // # of 1's
for (int i = 0; i < n; ++i)
{
boolean b = bernoulli.getNextBoolean();
this.bits.set(i, b);
if (b) ++k;
}
return k;
}
private int findEvenlySpacedOnes(int k, int[] pos)
{
int[] bitPosition = new int[k];
for (int i = 0, j = 0; i < n; ++i)
{
if (this.bits.get(i))
{
bitPosition[j++] = i;
}
}
int nsteps = n; // first, it takes N operations to find the bit positions.
boolean found = false;
if (k >= 3) // don't bother doing anything if there are less than 3 ones. :(
{
int lastBitSetPosition = bitPosition[k-1];
for (int j1 = 0; !found && j1 < k; ++j1)
{
pos[0] = bitPosition[j1];
for (int j2 = j1+1; !found && j2 < k; ++j2)
{
pos[1] = bitPosition[j2];
++nsteps;
pos[2] = 2*pos[1]-pos[0];
// calculate 3rd bit index that might be set;
// the other two indices point to bits that are set
if (pos[2] > lastBitSetPosition)
break;
// loop inner loop until we go out of bounds
found = this.bits.get(pos[2]);
// we're done if we find a third 1!
}
}
}
if (!found)
pos[0]=-1;
return nsteps;
}
/*
* run an algorithm that finds evenly spaced ones and returns # of steps.
*/
public TestResult run()
{
bernoulli.setProbability(pmin + (pmax-pmin)*random.nextDouble());
// probability of bernoulli process is randomly distributed between pmin and pmax
// generate bit string.
int k = generateBits();
int[] pos = new int[3];
int nsteps = findEvenlySpacedOnes(k, pos);
return new TestResult(k, nsteps);
}
public static void main(String[] args)
{
int n;
int ntrials;
double pmin = 0, pmax = 1;
try {
n = Integer.parseInt(args[0]);
ntrials = Integer.parseInt(args[1]);
if (args.length >= 3)
pmin = Double.parseDouble(args[2]);
if (args.length >= 4)
pmax = Double.parseDouble(args[3]);
}
catch (Exception e)
{
System.out.println("usage: EvenlySpacedOnesTest N NTRIALS [pmin [pmax]]");
System.exit(0);
return; // make the compiler happy
}
final StatisticalSummary[] statistics;
statistics=new StatisticalSummary[n+1];
for (int i = 0; i <= n; ++i)
{
statistics[i] = new StatisticalSummary();
}
EvenlySpacedOnesTest test = new EvenlySpacedOnesTest(n, ntrials, pmin, pmax);
int printInterval=100000;
int nextPrint = printInterval;
for (int i = 0; i < ntrials; ++i)
{
TestResult result = test.run();
statistics[result.k].add(result.nsteps);
if (i == nextPrint)
{
System.err.println(i);
nextPrint += printInterval;
}
}
StatisticalSummary.printOut(System.out, statistics);
}
}
# <algorithm>
def contains_evenly_spaced?(input)
return false if input.size < 3
one_indices = []
input.each_with_index do |digit, index|
next if digit == 0
one_indices << index
end
return false if one_indices.size < 3
previous_indexes = []
one_indices.each do |index|
if !previous_indexes.empty?
previous_indexes.each do |previous_index|
multiple = index - previous_index
success_index = index + multiple
return true if input[success_index] == 1
end
end
previous_indexes << index
end
return false
end
# </algorithm>
def parse_input(input)
input.chars.map { |c| c.to_i }
end
I'm having trouble with the worst-case scenarios 10 with millions of digits. Fuzzing from /dev/urandom
essentially 9 gives you O(n), but I know the worst case 8 is worse than that. I just can't tell how 7 much worse. For small n
, it's trivial to 6 find inputs at around 3*n*log(n)
, but it's surprisingly 5 hard to differentiate those from some other 4 order of growth for this particular problem.
Can 3 anyone who was working on worst-case inputs 2 generate a string with length greater than 1 say, one hundred thousand?
More Related questions
We use cookies to improve the performance of the site. By staying on our site, you agree to the terms of use of cookies.