International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-6367 (Print), ISSN 0976-6375 (Online), Volume 4, Issue 3, May-June (2013), pp. 176-181. IAEME: www.iaeme.com/ijcet.asp. Journal Impact Factor (2013): 6.1302 (Calculated by GISI), www.jifactor.com


DATA MINING WITH HUMAN GENETICS TO ENHANCE GENE BASED ALGORITHM AND DNA DATABASE SECURITY
Vijay Arputharaj J
Research Scholar, Department of Computer Science, Karpagam University, Coimbatore, Tamil Nadu, India
Dr. R. Manicka Chezian
Associate Professor, Department of Computer Science, NGM College (Autonomous), Pollachi, Tamil Nadu, India

ABSTRACT
The goal of data mining in a DNA database is to examine possible combinations of DNA sequences and to derive a common code or algorithm that describes how sequences change under mutation. Because data mining is well suited to analyzing and extracting data, it also helps in formulating this common algorithm. In the study of human genetics, an important goal of data mining is to understand the mapping between inter-individual variation in human DNA sequences and variability in disease and mutation susceptibility. In lay terms, it is used to find out how changes in an individual's DNA sequence affect the risk of developing common diseases and mutations, while keeping the data highly secure. This investigation also supports parental identification algorithms for DNA sequences and genome expressions. Data mining and data extraction techniques address the need to analyze large, complex, information-rich DNA sequence data sets. Regulation of gene expression includes the processes that cells and viruses use to control how the information in genes is turned into gene products. An important challenge in using large-scale gene expression data for biological classification arises when the expression data set being analyzed involves multiple classes. Data mining is used to overcome this kind of problem.

Key Words: Data mining, DNA Database, DNA Sequence, Gene Expression, Biological classification, Multiple class

1. INTRODUCTION
The Human Genome Project is a worldwide scientific research effort whose main aim is to determine the sequence of chemical base pairs that make up DNA, and to identify and map the genes of the human genome from both a physical and a functional standpoint. A DNA database, or DNA databank, is a database that stores DNA data. A DNA databank can be used for parental comparison, analysis of genetic diseases, genetic fingerprinting in criminology, genetic genealogy, and similar tasks. In human genetics, an important goal of data mining is to understand the mapping between individual variation in human DNA sequences and variability in mutation susceptibility and parental identification, and to relate it to algorithms for database security. India is densely populated, so there is a great need for DNA databases that may help prevent various types of fraud, such as passport fraud. Data mining and data extraction techniques address the need to analyze large, complex, information-rich DNA sequence data sets. Several visualization and data mining techniques are already available; they are used to validate existing methods, and to try to discover new ones, for differentiating coding DNA sequences (exons) from non-coding DNA sequences (introns). Because data mining is well suited to analyzing and extracting data, it also helps in formulating a common algorithm.

2. LITERATURE STUDY
2.1 INTERNATIONAL STATUS
In northern countries, data exploration techniques have been designed to classify DNA sequences using many different classification techniques, including rule-based classifiers and neural networks.
Visualization of both the original data and the data mining results is used to help verify patterns and to understand the distinction between the different types of data and classifications. Forensic identification problems are examples in which the study of DNA profiles is a common approach. Such problems have been presented and treated with a focus on the use of Object-Oriented Bayesian Networks (OOBN). The use of DNA databases, which began in 1995 in England, has created new challenges. In Portugal, legislation for the construction of a genetic database was defined in 2008. Cryptographic, authentication, and high-definition security approaches for databases are used in several countries, such as Thailand, the US, and the UK.

2.2 NATIONAL STATUS
Genetic features and environmental factors are both involved in multifactorial diseases. Studying them requires data mining tools, and a two-phase approach using a specific genetic algorithm has been proposed. The first phase, the feature selection problem, uses a genetic algorithm (GA). To deal with this very specific problem, advanced mechanisms were introduced into the genetic algorithm, such as sharing, random immigrants, and dedicated genetic operators, and a particular distance operator was defined. Then, in the second phase,
a clustering based on the features selected during the previous phase uses the k-means clustering algorithm.

INDIA, CHENNAI: The FBI has a DNA index system. The UK has a similar database. If Parliament passes the DNA Profiling Bill, 2007, India will soon join the league, creating a national DNA database that will help police arrest serial offenders and give a boost to forensic investigation. The bill, drafted and sent to all ministries and departments for their feedback, has been modified. The final version has been sent to the law ministry, which has sent it to the legal department for final drafting.

2.3 SIGNIFICANCE OF THE STUDY
This research is significant for society as a whole: the identity of each citizen can be stored in a secured DNA database, which guards against fraud such as passport fraud and ration card fraud. The research also advances and aids criminal and forensic databases, and the application is useful for both the government and society. It deals primarily with advancing a genetic algorithm with proper security features for DNA databases, and it enhances the special features of DNA database security.

3. RESEARCH STUDY AND DEVELOPMENT
3.1 AIMS AND OBJECTIVES
To enhance database security: this research deals primarily with advancing a genetic algorithm with proper security features for DNA databases, and it enhances the special features of DNA database security.
To map relationships between DNA sequences and variability in disease and mutation susceptibility.
To provide an effective solution for parental identification algorithms for DNA sequences and genome expressions.

3.2 MATERIAL AND METHODS
1. Data mining and information retrieval
2. Visual analytics and collaboration
3. Combination of parallel algorithms for sequence analysis
4. Seamless high-performance computing
5. Security algorithms
a) Reverse Encryption algorithm to protect data
b) Advance Cryptography algorithm to protect data
c) Advanced Encryption Standard (AES)

Among the above methodologies, the data mining technique is used for knowledge discovery from the entire DNA database. There can be three levels of genome data mining. The simplest is an in-depth analysis of the result of a single query using a genome browser. At this level, one may start with a gene or marker name, or by mapping a sequence to the genome. Cross-comparison of various annotation 'tracks' may help make sense of the query region. This is the most popular use of any genome browser. Data mining is the opposite of
information retrieval in the sense that it is not based on predetermined criteria; it uncovers hidden patterns by exploring the data. Information retrieval, by contrast, is based on predetermined criteria: for example, retrieving the group of people who belong to a certain class, have a certain mortgage plan, or have certain characteristics that are already known. Visual analytics and parallel algorithms are used in implementing security in the database. Seamless high-performance computing relates to the speed of access to database records.

Cryptography is usually referred to as "the study of secrets", although nowadays it is most closely associated with encryption. Encryption is the process of converting plain ("unhidden") text into cryptic ("hidden") text to secure it against data thieves. The process has a second part, in which the cryptic text is decrypted at the other end so that it can be understood. In the broad field of cryptography, encryption is the process of encoding messages (or information) in such a way that hackers cannot read them but authorized parties can. In an encryption scheme, the message or information, referred to as plaintext, is encrypted using an encryption algorithm, turning it into unreadable ciphertext. This is usually done with an encryption key, which specifies how the message is to be encoded. Decryption is then performed by the authorized party. Encryption is thus a method of hiding data so that it cannot be read by anyone who does not know the key. The key is used to lock and unlock data.
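The idea that one key both locks and unlocks data can be illustrated with a deliberately simple toy cipher. The sketch below is not one of the algorithms listed in the paper and is not secure; the function names `keystream` and `xor_cipher`, the sample DNA string, and the keys are all assumptions made for illustration. It stretches a key into a pseudo-random byte stream with SHA-256 and XORs that stream with the data, so applying the same function twice with the same key restores the original.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Stretch the key into `length` bytes by hashing key + counter (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR the data with the keystream; the same call encrypts and decrypts."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))

plain = b"ATGCGTACGTTAGC"                      # hypothetical DNA record
cipher = xor_cipher(plain, b"secret key")      # "lock" with the key
restored = xor_cipher(cipher, b"secret key")   # "unlock" with the same key
print(cipher.hex(), restored)
```

Decrypting with a different key produces unrelated bytes rather than the original record, which is the property the text describes.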
To encrypt data, one performs mathematical operations on it, and the result looks like garbage to anyone who does not know how to reverse the operations. The Advanced Encryption Standard (AES) is a specification for the encryption of electronic data established by the U.S. National Institute of Standards and Technology (NIST) in 2001.

STEPS:
1. KeyExpansion: round keys are derived from the cipher key using Rijndael's key schedule.
2. InitialRound
   1. AddRoundKey: each byte of the state is combined with the round key using bitwise xor.
3. Rounds
   1. SubBytes: a non-linear substitution step where each byte is replaced with another according to a lookup table.
   2. ShiftRows: a transposition step where each row of the state is shifted cyclically a certain number of steps.
   3. MixColumns: a mixing operation which operates on the columns of the state, combining the four bytes in each column.
   4. AddRoundKey
4. Final Round (no MixColumns)
   1. SubBytes
   2. ShiftRows
   3. AddRoundKey
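The steps above can be sketched in pure Python for the 128-bit key size, encrypting a single 16-byte block. This is a teaching sketch under stated assumptions, not the implementation used in this research: the function names are chosen here for illustration, it handles one block only (no mode of operation, no padding, no decryption), and it makes no attempt at speed or side-channel resistance. A vetted cryptographic library should be used for real data.

```python
def xtime(a):
    # Multiply a byte by 2 in GF(2^8), reducing by the AES polynomial 0x11B.
    a <<= 1
    return (a ^ 0x11B) if a & 0x100 else a

def gmul(a, b):
    # Multiply two bytes in GF(2^8) by shift-and-add.
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime(a)
        b >>= 1
    return result

def rotl8(x, n):
    # Rotate an 8-bit value left by n bits.
    return ((x << n) | (x >> (8 - n))) & 0xFF

def build_sbox():
    # SubBytes lookup table: multiplicative inverse followed by an affine map.
    sbox = []
    for x in range(256):
        inv = 1
        for _ in range(254):      # x**254 is the inverse of x (0 maps to 0)
            inv = gmul(inv, x)
        sbox.append(inv ^ rotl8(inv, 1) ^ rotl8(inv, 2)
                    ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63)
    return sbox

SBOX = build_sbox()

def expand_key(key):
    # KeyExpansion: derive 11 round keys (16 bytes each) from a 16-byte key.
    words = [list(key[i:i + 4]) for i in range(0, 16, 4)]
    rcon = 1
    for i in range(4, 44):
        temp = list(words[i - 1])
        if i % 4 == 0:
            temp = [SBOX[b] for b in temp[1:] + temp[:1]]   # RotWord, then SubWord
            temp[0] ^= rcon
            rcon = xtime(rcon)
        words.append([a ^ b for a, b in zip(words[i - 4], temp)])
    return [sum(words[4 * r:4 * r + 4], []) for r in range(11)]

def add_round_key(state, round_key):
    # AddRoundKey: bitwise xor of each state byte with the round key.
    return [s ^ k for s, k in zip(state, round_key)]

def sub_bytes(state):
    # SubBytes: replace each byte using the lookup table.
    return [SBOX[b] for b in state]

def shift_rows(state):
    # State is column-major (index = 4 * column + row); row r rotates left by r.
    return [state[4 * ((c + r) % 4) + r] for c in range(4) for r in range(4)]

def mix_columns(state):
    # MixColumns: combine the four bytes of each column in GF(2^8).
    out = []
    for c in range(4):
        col = state[4 * c:4 * c + 4]
        for r in range(4):
            out.append(gmul(col[r], 2) ^ gmul(col[(r + 1) % 4], 3)
                       ^ col[(r + 2) % 4] ^ col[(r + 3) % 4])
    return out

def encrypt_block(block, key):
    round_keys = expand_key(key)
    state = add_round_key(list(block), round_keys[0])    # initial round
    for rnd in range(1, 10):                             # nine main rounds
        state = mix_columns(shift_rows(sub_bytes(state)))
        state = add_round_key(state, round_keys[rnd])
    state = shift_rows(sub_bytes(state))                 # final round, no MixColumns
    return bytes(add_round_key(state, round_keys[10]))

key = bytes(range(16))
block = bytes.fromhex("00112233445566778899aabbccddeeff")
print(encrypt_block(block, key).hex())
```

The key and block in the usage lines are the AES-128 example inputs from FIPS 197, Appendix C.1, so the printed ciphertext can be compared with the value published there.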

3.3 FINDINGS
The success of DNA sequencing, that is, of determining the chemical bases of DNA, changes with the biological changes associated with age. It yields new knowledge about fundamental biological processes. The initial segment of the task, called mapping, fragmented the chromosomes into groups, each a combined set of regulated expressions. High Data mined Processors can be used to point out the location of these grouped genes and their expression. Age correlated with an increasing percentage of sperm with highly damaged DNA (range: 0-83%) and tended to correlate inversely with the percentage of apoptotic sperm (range: 0.3%-23%).

Gene mutations can prevent one or more of the proteins a cell depends on from working properly. By changing a gene's instructions for making a protein, a mutation can cause the protein to malfunction or to be missing entirely. When a mutation alters a protein that plays a critical role in the body, it can disrupt normal development or cause a medical condition. A condition caused by mutations in one or more genes is called a genetic disorder.

FUTURE OF GENOMIC RESEARCH
Develop and apply genome-based strategies for the early detection, diagnosis, and treatment of diseases.
Develop new technologies to study genes and DNA on a large scale and to store genomic data efficiently.

5. RESULT AND DISCUSSION
The work yields new knowledge about fundamental biological processes. High Data mined Processors can be used to point out the location of the grouped genes and their expression. Various algorithms and ideas for DNA database security were also identified.

AGE CORRELATION
Age correlated with an increasing percentage of sperm with highly damaged DNA (range: 0-83%) and tended to correlate inversely with the percentage of apoptotic sperm (range: 0.3%-23%). The success of DNA sequencing, that is, of determining the chemical bases of DNA, changes with the biological changes associated with age. The initial segment of the task, called mapping, fragmented the chromosomes into groups, each a combined set of regulated expressions.

6. CONCLUSION
The module on aging in DNA genome expression sequences has been completed successfully. The research has yet to achieve its further goals and objectives regarding disease, mutation susceptibility, and parental modules with DNA database security.

