Contents
Problem Description
A. Three Approaches to this problem (First Step)
   1. Duplet generation and linear search with comparison
      Theory
      Practice
      Efficiency
   2. Duplet generation and binary search with comparison
      Theory
      Practice
      Efficiency
   3. Making a GET/POST request
      Theory
      Practice
      Efficiency
   4. Other possible approaches
      i. Hamming distance
      ii. Microsoft Common Speller Application Programming Interface (CSAPI)
B. How to make my problem use any desired method at the wave of the wand making my problem scalable and flexible
   i. Method 1: Basic Refactoring
   ii. Method 2: Using Factory Pattern
   iii. Method 3: Using Spring Framework
C. Final Step- Wrong Spelling Generator
   Theory
   Practice
   Output
   Main program with final step piped to first step
Karan Bhandari
Problem Description
Spell check Solution
Before we plunge into the problem, let us define a duplet with respect to this problem. It is the name I am using for a pair of adjacent characters. Any word can be exploded into a collection of duplets. For example:
Word: Marvel, duplets: {ma,ar,rv,ve,el}
Now if the user inputs marvol, its duplets are {ma,ar,rv,vo,ol}, while the duplets of the dictionary entry marvel are {ma,ar,rv,ve,el}: 3 out of 5 match.
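As a minimal sketch of the idea above (not the full program), the following self-contained snippet explodes two words into duplets and counts how many duplets of the input can be paired with a duplet of the dictionary entry. The class name DupletDemo and its helper names are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class DupletDemo {

    // Explode a word into its pairs of adjacent characters,
    // e.g. "marvel" -> [ma, ar, rv, ve, el]
    static List<String> duplets(String word) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < word.length() - 1; i++) {
            result.add(word.substring(i, i + 2));
        }
        return result;
    }

    // Count the duplets of lhs that can be paired off with an unused duplet of rhs
    static int matches(List<String> lhs, List<String> rhs) {
        List<String> unused = new ArrayList<>(rhs);
        int count = 0;
        for (String d : lhs) {
            if (unused.remove(d)) { // removes the first equal element, if any
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> input = duplets("marvol");
        List<String> entry = duplets("marvel");
        System.out.println(input + " vs " + entry + ": "
                + matches(input, entry) + " of " + input.size() + " match");
    }
}
```

Each dictionary duplet is consumed at most once, so a repeated duplet in the input is not counted twice against a single occurrence in the dictionary word.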
With duplets we can perform fuzzy string matching. The user input is divided into duplets, and the string we compare against (from the dictionary) is also divided into duplets. For brevity we call the user input the 'left hand side' (LHS) and the dictionary string the 'right hand side' (RHS). To perform the spell check I am setting the strictness factor to 55%. That is, if at least 55 percent of the duplets of the LHS match the duplets of the RHS, we arrive at an approximate equality, or fuzzy equality.

For example, if the user inputs 'marvel', the LHS is {ma,ar,rv,ve,el}; the dictionary contains marvel, so the RHS is {ma,ar,rv,ve,el}. That is a 100% match, so marvellous. If the user inputs 'marvol' instead, 3 of the 5 duplets match those of marvel (60%), so we are still above the 55% strictness factor. Here we can surmise that if marvol did not exist in the dictionary then marvel is the closest match.

One may complain that certain words are similar, like call and ball, and may reach similar strictness ratios against the same input. We will hail the one with the maximum strictness factor as the new emperor.

I have copied a large list of English words (e.g. from /usr/share/dict/words on a Unix system) to a file called dictionaryFile.txt and placed it in the location where the code is compiled. In the main function the program warns you if your input contains uppercase characters; it detects them with a regular expression pattern.

Practice
Code
import java.io.BufferedReader;
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;
public class SpellCheckDupletFashion {

    // Minimum strictness factor a dictionary word needs to be suggested
    static double STRICT = 0.55;

    // Duplet generator
    // Marvel,  {ma,ar,rv,ve,el}
    // Achieve, {ac,ch,hi,ie,ev,ve}
    static public List<char[]> duplet(String input) {
        ArrayList<char[]> duplet = new ArrayList<char[]>();
        for (int i = 0; i < input.length() - 1; i++) {
            char[] charArr = new char[2];
            charArr[0] = input.charAt(i);
            charArr[1] = input.charAt(i + 1);
            duplet.add(charArr);
        }
        return duplet;
    }

    // Function that detects approximate equality or fuzzy equality
    static public double strictnessFactor(List<char[]> duplet1, List<char[]> duplet2) {
        // Working copy, so that every duplet of duplet2 is matched at most once
        List<char[]> slave = new ArrayList<char[]>(duplet2);
        int flag = 0;
        for (int i = duplet1.size(); --i >= 0;) {
            char[] duplet = duplet1.get(i);
            for (int j = slave.size(); --j >= 0;) {
                char[] toMatch = slave.get(j);
                if (duplet[0] == toMatch[0] && duplet[1] == toMatch[1]) {
                    slave.remove(j);
                    flag += 2;
                    break;
                }
            }
        }
        return (double) flag / (duplet1.size() + duplet2.size());
    }

    // Java version of Read or Console.ReadLine or Scanf
    public static String getString() throws IOException {
        InputStreamReader isr = new InputStreamReader(System.in);
        BufferedReader br = new BufferedReader(isr);
        return br.readLine();
    }

    public static String suggestMeTheRighteousOne(String userInput) {
        List<char[]> userDuplets = duplet(userInput);
        double maxStrictnessFactor = STRICT;
        // Empty rather than null, so that main can safely call isEmpty()
        String latestStrictWord = "";
        // Access file stream; try-with-resources closes it on every exit path
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(new FileInputStream("dictionaryFile.txt")))) {
            String strLine;
            while ((strLine = br.readLine()) != null) {
                List<char[]> dictionaryDuplets = duplet(strLine);
                double currStrictFactor = strictnessFactor(dictionaryDuplets, userDuplets);
                if (currStrictFactor == 1) {
                    return "Bravo, you have cracked the spelling bee contest, the word exists in the dictionary";
                } else if (currStrictFactor >= maxStrictnessFactor) {
                    latestStrictWord = strLine;
                    maxStrictnessFactor = currStrictFactor;
                }
            }
        } catch (Exception e) {
            System.out.println("Error espousing out of dictionary File-" + e.toString());
        }
        return latestStrictWord;
    }

    public static void main(String[] args) {
        String userInput = null;
        System.out.println("Enter word");
        try {
            userInput = getString();
        } catch (IOException e) {
            System.out.println("Error due to string insertion:" + e.toString());
        }
        if (userInput == null) return; // nothing was read, so there is nothing to check
        String suggested = suggestMeTheRighteousOne(userInput);
        if (suggested.isEmpty()) suggested = "NO SUGGESTION";
        System.out.println("Antidote:" + suggested);
        if (userInput.matches(".*[A-Z].*"))
            System.out.println("Beware- your string contains uppercase");
    }
}

Output
Efficiency
Suppose duplet generation for the user input takes m time intervals and duplet generation for an individual dictionary word takes p time intervals on average. Let the size of the dictionary be n, let one strictness factor check take h time intervals, and neglect the time for file operations. The total is then approximately k(m + p*n + n*h) time intervals for some constant k. This is of the order of O(n), since we read the file once and, treating word lengths as bounded, do a bounded amount of work per word. The next step reduces it slightly.
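Written out as a formula under the same assumptions, the estimate looks as follows. The symbols L_u and L_w (lengths of the user input and of a dictionary word) and the constant c are introduced here only for illustration; the bound on h follows from the two nested loops in strictnessFactor.

```latex
T(n) \approx k\,(m + p\,n + h\,n) = k\,m + k\,n\,(p + h) \in O(n),
\qquad h \le c\,(L_u - 1)(L_w - 1)
```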
2. Duplet generation and binary search with comparison
Practice
int first = 0;
int last = dictionaryWords.size() - 1;
while (first <= last) {
    int middle = (first + last) / 2;
    // Compare only the first letter of the key with the first letter of the middle word
    int cmp = key.substring(0, 1).compareTo(dictionaryWords.get(middle).substring(0, 1));
    if (cmp < 0) {
        last = middle - 1;
    } else if (cmp > 0) {
        first = middle + 1;
    } else {
        // Here we do linear strictness check between dictionaryWords.get(first) and
        // dictionaryWords.get(last) as done in previous section
        break;
    }
}
Efficiency
When only equality searching is considered, binary search delivers performance anywhere between O(1) and O(log n). But here the search is a hybrid, a mix of binary and linear: the binary phase narrows the candidates to a range that contains the words sharing the key's first letter, and the linear strictness check then scans that range. The worst case is therefore still O(n), but the execution time is mildly reduced in practice because only part of the dictionary is scanned.
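The hybrid described above could be sketched as follows. This is only an illustration under stated assumptions: the dictionary is already loaded into a sorted list of non-empty words, and the class name FirstLetterRange and its methods are hypothetical, not part of the original program. Two binary searches find the exact block of words that start with the key's first letter; the linear strictness check would then run over that block only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FirstLetterRange {

    // Index of the first word in the sorted list whose first letter is >= c
    static int lowerBound(List<String> sortedWords, char c) {
        int first = 0;
        int last = sortedWords.size(); // exclusive upper bound
        while (first < last) {
            int middle = (first + last) >>> 1;
            if (sortedWords.get(middle).charAt(0) < c) {
                first = middle + 1;
            } else {
                last = middle;
            }
        }
        return first;
    }

    // All dictionary words that start with the same letter as the key
    static List<String> candidates(List<String> sortedWords, String key) {
        char c = key.charAt(0);
        int from = lowerBound(sortedWords, c);
        int to = lowerBound(sortedWords, (char) (c + 1));
        return new ArrayList<>(sortedWords.subList(from, to));
    }

    public static void main(String[] args) {
        List<String> dictionary =
                Arrays.asList("ball", "call", "marble", "marvel", "mast", "zebra");
        // Only these candidates would go through the linear strictness check
        System.out.println(candidates(dictionary, "marvol"));
    }
}
```

One consequence of this design is that a typo in the first letter (for example narvel) removes the correct word from the candidate range, so the hybrid trades some recall for speed.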
href="http://dictionary.reference.com/browse/marvellous" onmousedown="return pk(this,{lk:'rtxtk5',en:'scpmean',io:'0',b:'dym',tp:'mid',m:'scpmean'})">marvellous</a></span><span class="baud">
100 65917    0 65917    0     0   107k      0 --:--:-- --:--:-- --:--:--  108k
------------------------------------
bhandari@linux-qty1:~> curl "http://dictionary.reference.com/browse/marvelous" | grep -o "Did you mean.*<\/a><\/span><span class=\"baud\">"
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 79203    0 79203    0     0   125k      0 --:--:-- --:--:-- --:--:--  125k

How to execute linux commands from java
// To execute a Linux command, call Linux.exeCmd(curl command) and then obtain the substring of the
// suggested "Did you mean" part in Java. Runtime.exec() does not start a shell, so a pipe such as
// "| grep" is not interpreted; pass only the curl command itself.
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;

public class Linux {
    public static ArrayList<String> exeCmd(String cmd) {
        ArrayList<String> arr = new ArrayList<String>();
        try {
            Process p = Runtime.getRuntime().exec(cmd);
            BufferedReader stdInput = new BufferedReader(new InputStreamReader(p.getInputStream()));
            BufferedReader stdError = new BufferedReader(new InputStreamReader(p.getErrorStream()));
            String s;
            // Collect standard output first, then standard error
            while ((s = stdInput.readLine()) != null) arr.add(s);
            while ((s = stdError.readLine()) != null) arr.add(s);
        } catch (IOException e) {
            // On failure, fall through and return whatever was collected so far
        }
        return arr;
    }
}
Or GET request in Java
Create an object of type HttpURLConnection and read from its InputStream. Here endpoint is the string URL of a dictionary website that allows GET requests, and get_connection is a helper (not shown) that opens the connection.
HttpURLConnection conn = get_connection(endpoint, "GET");
conn.connect();
Efficiency
This method does not involve disk operations on our side: we do not need to read a dictionary from the hard drive, so we avoid rotational delay, seek time and disk latency. Instead we have network-related delays such as propagation delay, queueing delay, transmission delay and processing delay. We could improve network efficiency by adding an institutional cache (a shared caching proxy) within the LAN environment.
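The GET approach could be sketched in plain Java as below. This is a hedged illustration, not the original implementation: the class DidYouMean and both method names are made up, and the shape of the "Did you mean" markup is an assumption based on the grep pattern shown earlier, so the regular expression may need adjusting for whatever the website actually returns. The main method exercises only the parsing step on a sample string, so it runs without network access.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DidYouMean {

    // Fetch the body of a page with a plain GET request
    static String fetch(String endpoint) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(endpoint).toURL().openConnection();
        conn.setRequestMethod("GET");
        StringBuilder body = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line).append('\n');
            }
        } finally {
            conn.disconnect();
        }
        return body.toString();
    }

    // Pull the suggested word out of a "Did you mean ... <a ...>word</a>" fragment.
    // The markup shape is an assumption; returns null when no suggestion is present.
    static String extractSuggestion(String html) {
        Matcher m = Pattern.compile("Did you mean.*?<a[^>]*>([^<]+)</a>").matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String sample = "Did you mean <a href=\"http://dictionary.reference.com/browse/marvellous\">"
                + "marvellous</a>";
        System.out.println(extractSuggestion(sample));
    }
}
```

In a real run, the string returned by fetch(endpoint) would be passed to extractSuggestion; a null result would then be read as "no suggestion offered".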
B. How to make my problem use any desired method at the wave of the wand making my problem scalable and flexible
i. Method 1: Basic Refactoring
1. Extract suggestMeTheRighteousOne() into an interface.
2. Use string builders instead of strings.
3. Explain parameters and functions, and add comments to support Javadoc.
ii. Method 2: Using Factory Pattern
Suppose type 1 extracts the dictionary from a file, type 2 uses a dictionary from FTP, type 3 a database, type 4 an API call, and type 5 an HTTP request. We could use a switch case or an if-else block to select between them, or even have several instances of derived classes calling the spell check polymorphically.
iii. Method 3: Using Spring Framework
Here I try to achieve inversion of control and inject dependencies by using the Spring framework.
Suppose my manager asks me to read the dictionary from a file. I will implement a read method that uses a file, an input stream reader and a buffered reader, and deploy it to the client application. Now if my manager asks me to change the source from a file to FTP or a datastore, I am stumped. Twitch.tv already has millions of users, so I would need to change a million clients: a bad idea. One remedy is to declare an interface and pass the interface to the read method, so that read calls the right service. But there is still a partial dependency. So instead I create an XML file, declare all beans, args, values and properties in it, and access it via an ApplicationContext variable. That way I can change the XML on the server. The client just calls interfaceInstance.read() and is oblivious of the method used to access the dictionary.
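The interface and factory ideas from Methods 1 and 2 can be sketched in plain Java as follows. Everything here is hypothetical naming for illustration (DictionarySources, DictionaryReader, the two stand-in readers and the type numbers); the readers return fixed word lists instead of touching a file or an FTP server. Spring itself is not shown: roughly speaking, it would move the choice made inside create() out of the code and into the XML bean definitions.

```java
import java.util.Arrays;
import java.util.List;

public class DictionarySources {

    // The client depends only on this interface, never on a concrete source
    interface DictionaryReader {
        List<String> read();
    }

    // Stand-in for the file-based reader (type 1); a real one would read dictionaryFile.txt
    static class FileDictionaryReader implements DictionaryReader {
        public List<String> read() {
            return Arrays.asList("marble", "marvel");
        }
    }

    // Stand-in for a remote source such as FTP (type 2)
    static class FtpDictionaryReader implements DictionaryReader {
        public List<String> read() {
            return Arrays.asList("marvel", "marvellous");
        }
    }

    // Factory: the only place that knows which concrete class belongs to which type
    static DictionaryReader create(int type) {
        switch (type) {
            case 1: return new FileDictionaryReader();
            case 2: return new FtpDictionaryReader();
            default: throw new IllegalArgumentException("Unknown dictionary type: " + type);
        }
    }

    // Client code: works with whatever reader it is handed
    static boolean inDictionary(DictionaryReader reader, String word) {
        return reader.read().contains(word);
    }

    public static void main(String[] args) {
        System.out.println(inDictionary(create(1), "marvellous"));
        System.out.println(inDictionary(create(2), "marvellous"));
    }
}
```

Switching the dictionary source then means changing one argument to create(), while inDictionary() and any other client code stay untouched.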
Practice
import java.util.ArrayList;

public class WrongSpellingGenerator {
    String wordUnderScrutiny;

    public WrongSpellingGenerator(String userInput) {
        wordUnderScrutiny = userInput;
    }

    // Every variant of the word with exactly one character removed
    ArrayList<String> missingCharacterCulprits() {
        ArrayList<String> returnList = new ArrayList<String>();
        for (int i = 1; i < wordUnderScrutiny.length() + 1; i++)
            returnList.add(wordUnderScrutiny.substring(0, i - 1)
                    + wordUnderScrutiny.substring(i, wordUnderScrutiny.length()));
        return returnList;
    }

    ArrayList<String> jaggedCases() {
        String original = wordUnderScrutiny;
        ArrayList<String> returnList = new ArrayList<String>();
        // Uppercase every occurrence of the character found at position i
        for (int i = 0; i < wordUnderScrutiny.length(); i++)
            returnList.add(wordUnderScrutiny.replace(
                    String.valueOf(wordUnderScrutiny.charAt(i)),
                    String.valueOf(wordUnderScrutiny.charAt(i)).toUpperCase()));
        // Drop one character and uppercase everything after it
        for (int i = 1; i < wordUnderScrutiny.length() + 1; i++)
            returnList.add(wordUnderScrutiny.substring(0, i - 1)
                    + wordUnderScrutiny.substring(i, wordUnderScrutiny.length()).toUpperCase());
        // Drop one character and uppercase everything before it
        for (int i = 2; i < wordUnderScrutiny.length() + 1; i++)
            returnList.add(wordUnderScrutiny.substring(0, i - 1).toUpperCase()
                    + wordUnderScrutiny.substring(i, wordUnderScrutiny.length()));
        wordUnderScrutiny = original;
        return returnList;
    }

    // For every vowel present in the word, replace it with each of the five vowels in turn
    ArrayList<String> vowelsConvulator() {
        String original = wordUnderScrutiny;
        ArrayList<String> returnList = new ArrayList<String>();
        char[] vowels = {'a', 'e', 'i', 'o', 'u'};
        for (char ch : vowels) {
            if (wordUnderScrutiny.contains(String.valueOf(ch))) {
                for (char cha : vowels) {
                    returnList.add(wordUnderScrutiny.replace(ch, cha));
                }
            }
            wordUnderScrutiny = original;
        }
        return returnList;
    }
}
Output
Enter word
constitutionally
Antidote:Bravo, you have cracked the spelling bee contest, the word exists in the dictionary
-----------Wrong Spelling Generator-----------------
Missing char instances:[onstitutionally, cnstitutionally, costitutionally, contitutionally, consitutionally, consttutionally, constiutionally, constittionally, constituionally, constitutonally, constitutinally, constitutioally, constitutionlly, constitutionaly, constitutionaly, constitutionall]
Jagged cases instances[Constitutionally, cOnstitutiOnally, coNstitutioNally, conStitutionally, consTiTuTionally, constItutIonally, consTiTuTionally, constitUtionally, consTiTuTionally, constItutIonally, cOnstitutiOnally, coNstitutioNally, constitutionAlly, constitutionaLLy, constitutionaLLy, constitutionallY, ONSTITUTIONALLY, cNSTITUTIONALLY, coSTITUTIONALLY, conTITUTIONALLY, consITUTIONALLY, constTUTIONALLY, constiUTIONALLY, constitTIONALLY, constituIONALLY, constitutONALLY, constitutiNALLY, constitutioALLY, constitutionLLY, constitutionaLY, constitutionalY, constitutionall, Cnstitutionally, COstitutionally, CONtitutionally, CONSitutionally, CONSTtutionally, CONSTIutionally, CONSTITtionally, CONSTITUionally, CONSTITUTonally, CONSTITUTInally, CONSTITUTIOally, CONSTITUTIONlly, CONSTITUTIONAly, CONSTITUTIONALy, CONSTITUTIONALL]
Vowels convulators[constitutionally, constitutionelly, constitutionilly, constitutionolly, constitutionully, constatutaonally, constetuteonally, constitutionally, constotutoonally, constututuonally, canstitutianally, censtitutienally, cinstitutiinally, constitutionally, cunstitutiunally, constitationally, constitetionally, constititionally, constitotionally, constitutionally]
Main program with final step piped to first step
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

public class SpellCheckDupletFashion {

    // Minimum strictness factor a dictionary word needs to be suggested
    static double STRICT = 0.55;

    // Duplet generator
    // Marvel,  {ma,ar,rv,ve,el}
    // Achieve, {ac,ch,hi,ie,ev,ve}
    static public List<char[]> duplet(String input) {
        ArrayList<char[]> duplet = new ArrayList<char[]>();
        for (int i = 0; i < input.length() - 1; i++) {
            char[] charArr = new char[2];
            charArr[0] = input.charAt(i);
            charArr[1] = input.charAt(i + 1);
            duplet.add(charArr);
        }
        return duplet;
    }

    // Function that detects approximate equality or fuzzy equality
    static public double strictnessFactor(List<char[]> duplet1, List<char[]> duplet2) {
        // Working copy, so that every duplet of duplet2 is matched at most once
        List<char[]> slave = new ArrayList<char[]>(duplet2);
        int flag = 0;
        for (int i = duplet1.size(); --i >= 0;) {
            char[] duplet = duplet1.get(i);
            for (int j = slave.size(); --j >= 0;) {
                char[] toMatch = slave.get(j);
                if (duplet[0] == toMatch[0] && duplet[1] == toMatch[1]) {
                    slave.remove(j);
                    flag += 2;
                    break;
                }
            }
        }
        return (double) flag / (duplet1.size() + duplet2.size());
    }

    // Java version of Read or Console.ReadLine or Scanf
    public static String getString() throws IOException {
        InputStreamReader isr = new InputStreamReader(System.in);
        BufferedReader br = new BufferedReader(isr);
        return br.readLine();
    }

    public static String suggestMeTheRighteousOne(String userInput) {
        List<char[]> userDuplets = duplet(userInput);
        double maxStrictnessFactor = STRICT;
        String latestStrictWord = "";
        // Access file stream; try-with-resources closes it on every exit path
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(new FileInputStream("dictionaryFile.txt")))) {
            String strLine;
            while ((strLine = br.readLine()) != null) {
                List<char[]> dictionaryDuplets = duplet(strLine);
                double currStrictFactor = strictnessFactor(dictionaryDuplets, userDuplets);
                if (currStrictFactor == 1) {
                    return "Bravo, you have cracked the spelling bee contest, the word exists in the dictionary";
                } else if (currStrictFactor >= maxStrictnessFactor) {
                    latestStrictWord = strLine;
                    maxStrictnessFactor = currStrictFactor;
                }
            }
        } catch (Exception e) {
            System.out.println("Error espousing out of dictionary File-" + e.toString());
        }
        return latestStrictWord;
    }

    public static void main(String[] args) {
        String userInput = null;
        System.out.println("Enter word");
        try {
            userInput = getString();
        } catch (IOException e) {
            System.out.println("Error due to string insertion:" + e.toString());
        }
        if (userInput == null) return; // nothing was read, so there is nothing to check
        String suggested = suggestMeTheRighteousOne(userInput);
        if (suggested.isEmpty()) suggested = "NO SUGGESTION";
        System.out.println("Antidote:" + suggested);
        if (userInput.matches(".*[A-Z].*"))
            System.out.println("Beware- your string contains uppercase");
        // Final step: feed the same input to the wrong spelling generator
        System.out.println("-----------Wrong Spelling Generator-----------------");
        WrongSpellingGenerator wrong = new WrongSpellingGenerator(userInput);
        System.out.println("Missing char instances:" + wrong.missingCharacterCulprits());
        System.out.println("Jagged cases instances" + wrong.jaggedCases());
        System.out.println("Vowels convulators" + wrong.vowelsConvulator());
    }
}