
Explanation of my algorithm:

We start by creating an integer array that is the size of our alphabet (256 for extended
ASCII). This array maps a character code to an integer indicating whether we've seen
that character already.
We then iterate over our string.

We look at the current character. If the array value for that character is set to 1,
we've seen it before. If the array value for that character isn't set, we set the array value
to 1 and increase a counter indicating how many unique characters we've seen so far.
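
The marking step above can be sketched on its own as follows. This is a minimal illustration, not part of the full solution: the class name `UniqueCounter` and the method `countUnique` are hypothetical, and the sketch assumes every character code is below 256.

```java
class UniqueCounter {
    // Hypothetical helper: counts the distinct characters in s.
    // Assumes every character code is below 256.
    static int countUnique(String s) {
        int[] seen = new int[256];
        int unique = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (seen[c] != 1) {
                // First time we meet this character: mark it and count it.
                seen[c] = 1;
                unique++;
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        System.out.println(countUnique("avcdcdvxor")); // prints 7
    }
}
```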

We keep going until the next character would be unique character number
[math]k+1[/math]. The substring we just iterated over, which contains [math]k[/math]
unique characters, is our first "block."

This block/substring has a size. If it is greater than the size of our largest unique
character substring, we update our largest unique character substring.

We then generate the next successive block of [math]k[/math] unique characters,
starting right after the first block.

We can do this efficiently by using the integer 2 to mark characters we've already seen
in this second block, so the array never has to be cleared between blocks. If the second
block we generate is bigger than our largest unique character substring so far, we update
our largest unique character substring.
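
Reusing the array with a new integer per block can be sketched like this. The class `StampedSeenSet` and its method names are hypothetical, chosen only for illustration, and character codes are again assumed to be below 256. The stamp starts at 1 because a fresh Java `int` array is filled with zeros.

```java
class StampedSeenSet {
    // seen[c] == stamp means c was already seen in the current block.
    private final int[] seen = new int[256];
    // Starts at 1 so that the array's default zeros count as "not seen".
    private int stamp = 1;

    // Starts a new block: every earlier mark becomes stale in constant time.
    void startNewBlock() {
        stamp++;
    }

    // Marks c as seen in the current block; returns true if it was new.
    boolean markIfNew(char c) {
        if (seen[c] == stamp) {
            return false;
        }
        seen[c] = stamp;
        return true;
    }

    public static void main(String[] args) {
        StampedSeenSet set = new StampedSeenSet();
        System.out.println(set.markIfNew('a')); // prints true
        System.out.println(set.markIfNew('a')); // prints false
        set.startNewBlock();                    // block 2 marks with the integer 2
        System.out.println(set.markIfNew('a')); // prints true
    }
}
```

The design choice here is to trade a full array reset per block for a single increment, which matters when many short blocks are generated.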

Now we have generated two blocks, each of which has [math]k[/math] unique
characters. The bigger of the two blocks is our current result. However, we're missing
one edge case: the largest substring of [math]k[/math] unique characters could be
composed of the end characters of the first block concatenated with the beginning
characters of the second block. For example, suppose our string is "avcdcdvxor" and
[math]k=3[/math]. The first block will be "avc". The second block will be "dcdv".
(Observe that each has [math]k[/math] unique characters.) However, a longer string of
[math]k[/math] unique characters can be made from the letters in the middle:
"vcdcdv". This third possible substring can never span more than two adjacent blocks.
If it did, it would contain an entire block plus at least one character past that
block's end, so the block could have been extended without exceeding [math]k[/math]
unique characters. This is a contradiction since our code generates the largest blocks
it can with [math]k[/math] unique characters.
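
To make the block structure in this example concrete, here is a small sketch that splits a string into successive blocks greedily. The class `BlockSplitter` and the method `splitIntoBlocks` are hypothetical names, and the sketch assumes character codes below 256. It only shows the partition; it does not search for the substrings that straddle two blocks.

```java
import java.util.ArrayList;
import java.util.List;

class BlockSplitter {
    // Splits text into successive blocks, each extended as far as possible
    // while holding at most k distinct characters.
    static List<String> splitIntoBlocks(String text, int k) {
        List<String> blocks = new ArrayList<>();
        if (k < 1) {
            return blocks;
        }
        int[] seen = new int[256];
        int stamp = 0;
        int start = 0;
        while (start < text.length()) {
            stamp++; // new block: older marks become stale
            int unique = 0;
            int end = start;
            while (end < text.length()) {
                char c = text.charAt(end);
                if (seen[c] != stamp) {
                    if (unique == k) {
                        break; // c would be distinct character number k + 1
                    }
                    seen[c] = stamp;
                    unique++;
                }
                end++;
            }
            blocks.add(text.substring(start, end));
            start = end;
        }
        return blocks;
    }

    public static void main(String[] args) {
        System.out.println(splitIntoBlocks("avcdcdvxor", 3)); // prints [avc, dcdv, xor]
    }
}
```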

Therefore, our next step is to find the largest block of [math]k[/math] unique
characters in between the last two blocks we've seen. One way to do this is to take the
last character of the first block and go right until we've reached [math]k[/math] unique
characters. Then we start expanding our substring to the left while we can still maintain
[math]k[/math] unique characters. We'll call this block 3. If block 3 is bigger than our
other blocks, we update our largest unique character substring.

Then, we perform a similar procedure. We take the first character of the second
block and go left until we've reached [math]k[/math] unique characters. Then we start
expanding our substring to the right while we can still maintain [math]k[/math] unique
characters. We'll call this block 4. If block 4 is bigger than our other blocks, we update
our largest unique character substring.
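
The two boundary scans (block 3 and block 4) can be sketched as below. The class `BoundaryWindow` and its methods are hypothetical names for illustration. For simplicity the sketch allocates a fresh `boolean` array per scan instead of reusing one integer array, and it assumes [math]k \geq 1[/math], valid indices, and character codes below 256.

```java
class BoundaryWindow {
    // Farthest index reachable from 'from' while moving by 'step' (+1 or -1)
    // and keeping at most k distinct characters in the visited range.
    static int scan(String text, int from, int step, int k) {
        boolean[] seen = new boolean[256];
        int unique = 0;
        int last = from;
        for (int i = from; i >= 0 && i < text.length(); i += step) {
            char c = text.charAt(i);
            if (!seen[c]) {
                if (unique == k) {
                    break; // c would be distinct character number k + 1
                }
                seen[c] = true;
                unique++;
            }
            last = i;
        }
        return last;
    }

    // Block 3: start at the last character of the first block, go right, then expand left.
    static String block3(String text, int lastOfFirstBlock, int k) {
        int end = scan(text, lastOfFirstBlock, 1, k);
        int start = scan(text, end, -1, k);
        return text.substring(start, end + 1);
    }

    // Block 4: start at the first character of the second block, go left, then expand right.
    static String block4(String text, int firstOfSecondBlock, int k) {
        int start = scan(text, firstOfSecondBlock, -1, k);
        int end = scan(text, start, 1, k);
        return text.substring(start, end + 1);
    }

    public static void main(String[] args) {
        // In "avcdcdvxor" with k = 3, the first block ends at index 2
        // and the second block starts at index 3.
        System.out.println(block3("avcdcdvxor", 2, 3)); // prints vcdcdv
        System.out.println(block4("avcdcdvxor", 3, 3)); // prints vcdcdv
    }
}
```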

So far, we've found the largest substring of [math]k[/math] unique characters in the
first two blocks. So now we just repeat. We generate a new block after the first two
blocks and check its size. Then we check the sizes of the blocks in between the second
block and our new block.

We repeat until we've seen every block in the string and the blocks between them. By
the time we're done, our result will contain the longest substring with [math]k[/math]
unique characters.
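
One way to sanity-check the result on small inputs is to compare it against a brute-force search. The sketch below is not the block algorithm described above; it is a plain quadratic scan over every starting index, with the hypothetical names `BruteForceCheck` and `longestWithAtMostK`, and it assumes character codes below 256.

```java
class BruteForceCheck {
    // Tries every start index and extends right while at most k distinct
    // characters are present. Returns the first longest substring found.
    static String longestWithAtMostK(String text, int k) {
        String best = "";
        for (int start = 0; start < text.length(); start++) {
            boolean[] seen = new boolean[256];
            int unique = 0;
            for (int end = start; end < text.length(); end++) {
                char c = text.charAt(end);
                if (!seen[c]) {
                    seen[c] = true;
                    unique++;
                }
                if (unique > k) {
                    break;
                }
                if (end - start + 1 > best.length()) {
                    best = text.substring(start, end + 1);
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(longestWithAtMostK("avcdcdvxor", 3)); // prints vcdcdv
    }
}
```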

Example Java code:


[code java]
public class LongestUniqueSubstring {
    private String text;
    private int k;
    private int n;
    // existing[c] == blockNum means character c was already seen in the current scan.
    private int[] existing;
    // Scan counter; incrementing it invalidates all marks without clearing the array.
    private int blockNum;

    public static void main(String[] args) {
        String text = "avcdcdvxor";
        int k = 3;
        String longestSubstring = new LongestUniqueSubstring(text, k).toString();
        System.out.println("The longest substring of '" + text + "' consisting of no more than "
                + k + " unique characters is:\n\n" + longestSubstring);
    }

    public LongestUniqueSubstring(String text, int k) {
        this.text = text;
        this.k = k;
        this.n = text.length();
    }

    @Override
    public String toString() {
        return this.findLongestUniqueSubstring();
    }

    public String findLongestUniqueSubstring() {
        // Assumes every character code in the text is below 256.
        this.existing = new int[256];
        this.blockNum = 0;
        int leftIndexBest = 0;
        int rightIndexBest = 0;
        // An empty text is rejected here too; substring(0, 1) below would throw on it.
        if (this.k < 1 || this.n == 0) {
            return "";
        }
        for (int i = 0; i < n; i++) {
            // Generate the next block, starting right after the previous one.
            int startIndex = i;
            int endIndex = this.findIndexOfKthUniqueCharacter(startIndex, 1);
            if (endIndex - startIndex > rightIndexBest - leftIndexBest) {
                leftIndexBest = startIndex;
                rightIndexBest = endIndex;
            }
            // The first block has no previous block to straddle.
            if (i == 0) {
                i = endIndex;
                continue;
            }
            // Block 3: from the last character of the previous block, go right, then expand left.
            int startIndexRight = Math.max(0, startIndex - 1);
            int endIndexRight = this.findIndexOfKthUniqueCharacter(startIndexRight, 1);
            startIndexRight = this.findIndexOfKthUniqueCharacter(endIndexRight, -1);
            // Block 4: from just inside the current block, go left, then expand right.
            int endIndexLeft = Math.min(n - 1, startIndex + 1);
            int startIndexLeft = this.findIndexOfKthUniqueCharacter(endIndexLeft, -1);
            endIndexLeft = this.findIndexOfKthUniqueCharacter(startIndexLeft, 1);
            // Keep whichever of block 3 and block 4 is longer, if it beats the best so far.
            if (endIndexRight - startIndexRight > endIndexLeft - startIndexLeft) {
                if (endIndexRight - startIndexRight > rightIndexBest - leftIndexBest) {
                    leftIndexBest = startIndexRight;
                    rightIndexBest = endIndexRight;
                }
            } else {
                if (endIndexLeft - startIndexLeft > rightIndexBest - leftIndexBest) {
                    leftIndexBest = startIndexLeft;
                    rightIndexBest = endIndexLeft;
                }
            }
            // Skip to the end of this block; the loop increment moves to the next one.
            i = endIndex;
        }
        return this.text.substring(leftIndexBest, rightIndexBest + 1);
    }

    // Scans from startIndex in the given direction (1 = right, -1 = left) and returns the
    // last index that still keeps the scanned range at no more than k unique characters.
    private int findIndexOfKthUniqueCharacter(int startIndex, int direction) {
        this.blockNum++;
        int numUnique = 0;
        int endIndex = 0;
        for (int i = startIndex;
                (direction == 1 && i < this.n) || (direction == -1 && i >= 0);
                i += direction) {
            char character = this.text.charAt(i);
            if (this.existing[character] != this.blockNum) {
                numUnique++;
            }
            if (numUnique <= k) {
                this.existing[character] = this.blockNum;
            }
            // One character too many: step back to the last valid index.
            if (numUnique > k) {
                i -= direction;
            }
            // Stop on overflow or when the scan reaches either end of the text.
            if (numUnique > k || (direction == 1 && i == this.n - 1)
                    || (direction == -1 && i == 0)) {
                endIndex = i;
                break;
            }
        }
        return endIndex;
    }
}
[/code]
