GENOME SEQUENCING

In this article, I will discuss my solution to the Genome Sequencing puzzle on CodinGame. My solution
uses a brute-force approach.

Merge method
First things first: merging two strings. This method returns the shortest string that contains both
inputs when they overlap. It does this by superimposing the two strings and sliding them apart one
character per step. As soon as the overlapping parts match, we return the merged string. If the strings
never overlap, it returns an empty string.
    public static String merge(String s1,String s2) {
        if (s1.length()<s2.length()) {
            String tmp = s2;
            s2 = s1;
            s1 = tmp;
        }
        if (s1.indexOf(s2)!=-1) return s1;
        for (int i = s2.length()-1; i > 0; i--) {
            String sub1 = s2.substring(0,i);
            String sub2 = s2.substring(s2.length()-i);
            if (s1.substring(0,i).equals(sub2)) {
                String result = s2.substring(0,s2.length()-i)+s1;
                return result;
            }
            if (s1.substring(s1.length()-i).equals(sub1)) {
                String result = s1+s2.substring(i);
                return result;
            }
        }
        return "";
    }

    if (s1.length()<s2.length()) {
        String tmp = s2;
        s2 = s1;
        s1 = tmp;
    }
The rest of the method assumes that s1 is the longer string and s2 the shorter one, so we first do this
quick swap when needed.

    if (s1.indexOf(s2)!=-1) return s1;

If the longer string already contains the shorter one, the best merge is s1 itself, so we return it.

    String sub1 = s2.substring(0,i);
    String sub2 = s2.substring(s2.length()-i);
sub1 is the first i characters of s2, and sub2 is the last i characters of s2.

    if (s1.substring(0,i).equals(sub2)) {
        String result = s2.substring(0,s2.length()-i)+s1;
        return result;
    }
If the front of s1 matches the back of s2, then s2 goes first and s1 follows. Because i counts down from
the largest possible overlap, the first match found is the longest overlap, so this is the best possible
merge.

    if (s1.substring(s1.length()-i).equals(sub1)) {
        String result = s1+s2.substring(i);
        return result;
    }
If the back of s1 matches the front of s2, then s1 goes first and s2 follows. For the same reason, this is
the best possible merge.
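
To make the behaviour concrete, here is a small sketch that runs merge on a few hand-picked inputs. The class name MergeDemo and the sample strands are assumptions chosen for illustration; the merge body repeats the logic shown above so that the example compiles on its own.

```java
class MergeDemo {
    // Same logic as the article's merge method, repeated so this example is self-contained.
    static String merge(String s1, String s2) {
        if (s1.length() < s2.length()) {
            String tmp = s2;
            s2 = s1;
            s1 = tmp;
        }
        if (s1.indexOf(s2) != -1) return s1;
        for (int i = s2.length() - 1; i > 0; i--) {
            String sub1 = s2.substring(0, i);
            String sub2 = s2.substring(s2.length() - i);
            if (s1.substring(0, i).equals(sub2)) {
                return s2.substring(0, s2.length() - i) + s1;
            }
            if (s1.substring(s1.length() - i).equals(sub1)) {
                return s1 + s2.substring(i);
            }
        }
        return "";
    }

    public static void main(String[] args) {
        System.out.println(merge("AGATTA", "GATTACA"));  // overlap "GATTA": prints AGATTACA
        System.out.println(merge("GATTACA", "TTA"));     // containment: prints GATTACA
        System.out.println(merge("AAC", "CCTT"));        // overlap "C": prints AACCTT
        System.out.println(merge("AT", "GC").isEmpty()); // no overlap: prints true
    }
}
```

Note that the argument order does not matter, since the method swaps the strings itself when the first one is shorter.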

findSmall method
Given the string built so far, we recursively try to find the shortest string that can be made by
combining it with the remaining DNA strands.
    public static void findSmall(String LAST,int I,int J,Vector<String> strips,int N)
    {
        if (N==1) {
            shortest=Math.min(LAST.length(),shortest);
            return;
        }
        strips.set(I,LAST);
        strips.removeElementAt(J);
        if (I>J) I--;
        for (int i = 0; i < N; i++) {
            if (i==I) continue;
            String merged = merge(LAST,strips.elementAt(i));
            if (!merged.equals("")) {
                Vector<String> check = new Vector<String>();
                check.addAll(0,strips);
                findSmall(merged,I,i,check,N-1);
            }
            else {
                // Each call modifies the list it is given, so each gets its own copy.
                Vector<String> check1 = new Vector<String>(strips);
                Vector<String> check2 = new Vector<String>(strips);
                findSmall(LAST+strips.elementAt(i),I,i,check1,N-1);
                findSmall(strips.elementAt(i)+LAST,I,i,check2,N-1);
            }
        }
    }

    public static void findSmall(String LAST,int I,int J,Vector<String> strips,int N)

• LAST = the merged string so far
• I,J = positions of the 2 strands last merged
• strips = the DNA strands left to be merged
• N = number of DNA strands left

    if (N==1) {
        shortest=Math.min(LAST.length(),shortest);
        return;
    }
If only 1 strand is left, everything has been combined. We update shortest, which holds the length of
the shortest combined sequence found so far, and then we return.

    strips.set(I,LAST);
    strips.removeElementAt(J);
    if (I>J) I--;
Store the string built so far at position I, and remove the strand at J. Together, these two steps take
both merged strands out of the list and put the new, combined strand in their place. If I is greater
than J, we decrement it, because removing the strand at J shifts every later element down by one.
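
The index bookkeeping is easy to get wrong, so here is a minimal sketch of just those three lines in isolation. The class IndexShiftDemo, the helper replaceAndRemove and the sample strands are hypothetical names and data introduced only for this illustration.

```java
import java.util.Arrays;
import java.util.Vector;

class IndexShiftDemo {
    // Mirrors the three bookkeeping lines of findSmall and returns the adjusted I.
    static int replaceAndRemove(Vector<String> strips, String last, int I, int J) {
        strips.set(I, last);        // the combined strand takes slot I
        strips.removeElementAt(J);  // the other merged strand disappears
        if (I > J) I--;             // elements after J moved down by one
        return I;
    }

    public static void main(String[] args) {
        Vector<String> strips = new Vector<>(Arrays.asList("AAC", "GG", "CCTT", "TA"));
        // Suppose the strands at positions 2 and 0 were just merged into "AACCTT".
        int I = replaceAndRemove(strips, "AACCTT", 2, 0);
        System.out.println(strips + " I=" + I); // prints [GG, AACCTT, TA] I=1
    }
}
```

After the call, position I still points at the combined strand, which is what the loop in findSmall relies on when it skips i==I.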

    String merged = merge(LAST,strips.elementAt(i));
Compute the best merge of the string built so far with the i-th strand.

    if (!merged.equals("")) {
        Vector<String> check = new Vector<String>();
        check.addAll(0,strips);
        findSmall(merged,I,i,check,N-1); // N is decremented (1 more merge!)
    }
If merging is possible, we create a fresh copy of the list of DNA strands. Then we call findSmall
recursively, passing merged as the string built so far.

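    else {
        // Each call modifies the list it is given, so each gets its own copy.
        Vector<String> check1 = new Vector<String>(strips);
        Vector<String> check2 = new Vector<String>(strips);
        findSmall(LAST+strips.elementAt(i),I,i,check1,N-1);
        findSmall(strips.elementAt(i)+LAST,I,i,check2,N-1);
    }
If no merging is possible (O_O), we can try 2 combinations, each with its own copy of the list, because
findSmall modifies the list it receives:

(I) LAST+strips.get(i)
(II) strips.get(i)+LAST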
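
Both orders are worth trying because they can behave differently in later merges. The following sketch is only an illustration with made-up strands: the class ConcatOrderDemo, the helper candidates and the strand "CA" are assumptions, not part of the original solution.

```java
class ConcatOrderDemo {
    // Returns the two candidate strings tried when LAST and a strand share no overlap.
    static String[] candidates(String last, String strand) {
        return new String[] { last + strand, strand + last };
    }

    public static void main(String[] args) {
        String[] options = candidates("AT", "GC");
        String next = "CA"; // hypothetical strand that still has to be merged later
        for (String option : options) {
            // A candidate that already contains the next strand costs no extra characters.
            System.out.println(option + " contains " + next + "? " + option.contains(next));
        }
        // prints:
        // ATGC contains CA? false
        // GCAT contains CA? true
    }
}
```

Here GCAT absorbs the later strand for free, while ATGC would have to grow, so trying only one order could miss the shorter result.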

getDNA method
This method is called by main to start the search. It brute-forces every pair of strands as the first
two to be merged, and then calls findSmall() for each pair. The best result ends up in shortest.
    public static void getDNA(Vector<String> strips,int N) {
        for (int i = 0; i < N-1; i++) {
            for (int j = i+1; j < N; j++) {
                String merged = merge(strips.elementAt(i),strips.elementAt(j));
                if (!merged.equals("")) {
                    Vector<String> check = new Vector<String>();
                    check.addAll(0,strips);
                    findSmall(merged,i,j,check,N-1);
                }
            }
        }
    }
I feel this method is pretty self-explanatory. It tries every possible pair of strands as the first 2 to
combine, and calls findSmall only if there is any use merging the two strands, that is, if they overlap.

Note: getDNA does not handle a test case where no merges are possible at all. This is why, in the main
method, we initialize shortest to the sum of the lengths of all the strings.

Conclusion
Yaay! We are done. A simple solution to the Genome Sequencing puzzle, which might seem tough at
first sight. The code shared here is in Java. The main method and import statements are missing, but I
believe the reader can add them. (The strips argument passed to getDNA is the collection of DNA
strands.)