Vol 15 No 3 - May 2015

International Journal of Computer Science
and Business Informatics

(IJCSBI.ORG)
ISSN: 1694-2507 (Print)
ISSN: 1694-2108 (Online)

Table of Contents, VOL 15, NO 3, MAY 2015
A Hybrid Algorithm for Improvement of XML Documents Clustering (p. 1)
Somayeh Ghazanfari and Hassan Naderi

A Novel Collaborative Filtering Friendship Recommendation Based on Smartphones (p. 16)
Dhananjaya G. M., Sachin C. Raykar and Mushtaq Ahmed D. M.

Efficient Computational Tools for Nonlinear Flight Dynamic Analysis in the Full Envelope (p. 26)
P. Lathasree and Abhay A. Pashilkar

Quantitative Aspects of Knowledge - Knowledge Potential and Utility (p. 45)
Syed V. Ahamed

A Novel Approach for Recommending Items based on Association Rule Mining (p. 60)
Vasundhara M. S. and Gururaj K. S.

Cluster Integrated Self Forming Wireless Sensor Based System for Intrusion Detection and Perimeter Defense Applications (p. 70)
A. Inigo Mathew, M. Raj Kumar, S. R. Boselin Prabhu and Dr. S. Sophia

A Hybrid Algorithm for Improvement of XML Documents Clustering
Somayeh Ghazanfari
Department of Computer Engineering, Islamic Azad University,
Science and Research Campus, Lorestan, Iran
Hassan Naderi
Department of Computer Engineering, Iran University of Science and Technology,
Resalat St., Tehran, Iran
ABSTRACT
As Extensible Markup Language (XML) documents are now widely used on the
Web, improving the speed and accuracy of search engines based on these
documents is important. Clustering can be effective in improving the speed
of a search engine. Algorithms for clustering XML documents can be divided
into pairwise and incremental algorithms. The main challenge in the class
of incremental algorithms, such as the level-structure-based XCLS, XCLS+
and XCLS++, is that the order of the input XML documents influences the
clustering. In this paper, the sensitivity of incremental XML clustering
algorithms is illustrated using a representative algorithm, XCLS+. A
solution to this problem is proposed which includes two interleaved phases:
online and semi-offline. Experimental results show that, for a large number
of documents, the proposed algorithm has a higher speed and a relatively
higher precision than previous incremental algorithms such as XCLS+.
Keywords
Incremental algorithms, XML clustering, XCLS+, Order of input documents.
1. INTRODUCTION
The widespread use of the web and the internet generates a large amount of
data and information. The growth of stored data requires automatic tools
that intelligently transform large amounts of data into information and
knowledge. Data mining has been proposed as a solution to this issue [1],
and clustering is an important data mining technique. A cluster is a set of
data items that are highly similar to each other (intra-cluster similarity)
and less similar to the items of other clusters (inter-cluster similarity).
Clustering can be applied to various types of data such as images, numbers,
documents and texts. XML has become the standard for data transmission and
development [2], and most internet data is in semi-structured forms such as
XML documents [3]. This is because of the simplicity, extensibility, easy
access and openness of XML. XML is therefore one of the data types on which
clustering can be performed.
One of the problems of incremental algorithms is that the order of the
input documents affects the clustering. In this paper, a new algorithm
called WXCLS+ is introduced to reduce the sensitivity of incremental
clustering algorithms to the order of the input documents. The new
algorithm combines online clustering with hierarchical clustering, which is
an offline method. This combination prevents clusters from growing too
large and enhances the accuracy of clustering.
This paper is organized as follows. Section 2 discusses related work on
XML document clustering algorithms, especially incremental ones. Section 3
introduces the problem of the sensitivity of incremental clustering
algorithms to the order of the input documents. The XCLS+ algorithm is used
in this paper as a sample to show this problem and as a baseline for our
proposed method. The algorithm proposed to overcome these difficulties is
presented in Section 4. The simulation results and the experimental
evaluation of the algorithm are presented in Section 5. Finally, Section 6
concludes the paper and suggests future work.
2. RELATED WORK
Clustering algorithms for XML documents can be categorized into pairwise
and incremental approaches. In the pairwise approach, the clustering
algorithm is assumed to have all documents available at the start, and some
of these methods may examine each document repeatedly. The incremental
approach, on the other hand, receives only one document at a time and must
therefore examine and cluster each document only once. The main goal of the
incremental approach is higher clustering speed while maintaining
acceptable accuracy. Using a global criterion for computing similarity is
an important point in incremental methods. To decrease processing time,
incremental algorithms reduce each cluster to a single representative. A
cluster representative is an aggregated document which combines the
documents of a cluster into a single document. To cluster a set of
documents, it is necessary to have a similarity calculation method. The
similarity between a document and a cluster representative is an
appropriate measure for determining the document's cluster. Methods for
determining the similarity between an XML document and a cluster
representative are generally divided into three categories: content-based
[10], structure-based [2] [5] [11], and combinations of the two
[1] [3] [4] [8] [9]. For example, XCLS, XCLS+ and XCLS++ are all
incremental algorithms that consider the structure of documents [5,6].
XCLS clusters heterogeneous documents well, but it does not consider the
node relations in the tree structure; therefore, it is not suitable for
homogeneous documents. The XCLS+ algorithm, introduced after XCLS, keeps
more information in its level structure than XCLS: in addition to the
element names, it contains information about their parents. The XCLS+
similarity criterion is based on the parent-child relationship [7]. The
XCLS++ algorithm improved the similarity criterion of the XCLS+ algorithm
by considering father-child relations. Despite all attempts to improve
clustering in the XCLS, XCLS+ and XCLS++ algorithms, they still suffer from
sensitivity to the order of the input documents: changing the order of the
input documents produces different clustering results. This problem occurs
when very similar (homogeneous) documents are entered one after another.
XCLS+ has been selected in this research as an example to show this problem
and to be compared with our proposed method.
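The order sensitivity described above can be illustrated with a minimal sketch of single-pass clustering against cluster representatives. This is not XCLS+: the Jaccard similarity over tag sets, the union-based representative, the threshold of 0.5 and the three toy documents are all assumptions chosen only to keep the example small.

```python
def jaccard(a, b):
    """Stand-in similarity between two tag sets (not the LevelSim measure)."""
    return len(a & b) / len(a | b)


def incremental_cluster(docs, threshold):
    """Single-pass clustering: each document is seen once and is either merged
    into the most similar cluster representative or starts a new cluster."""
    reps = []      # one aggregated tag set (representative) per cluster
    members = []   # document names per cluster
    for name, tags in docs:
        best, best_sim = None, 0.0
        for idx, rep in enumerate(reps):
            sim = jaccard(tags, rep)
            if sim > best_sim:
                best, best_sim = idx, sim
        if best is not None and best_sim >= threshold:
            reps[best] = reps[best] | tags   # representative absorbs the document
            members[best].append(name)
        else:
            reps.append(set(tags))
            members.append([name])
    return members


# Hypothetical documents, described only by their tag names.
doc_a = ("A", {"a", "b"})
doc_b = ("B", {"a", "b", "c", "d"})
doc_c = ("C", {"c", "d"})

print(incremental_cluster([doc_a, doc_b, doc_c], 0.5))  # one cluster
print(incremental_cluster([doc_a, doc_c, doc_b], 0.5))  # two clusters
```

The same three documents end up in one cluster or in two depending only on the order in which they arrive, because the representative that each new document is compared with depends on what was merged before it.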
3. XCLS+ ALGORITHM
In XCLS+, each XML document and each cluster representative is modeled by
a level structure object. The level structure stores each element's parent
as well as the element's name. A new input document is compared with the
updated level structures of the clusters and is merged with the most
similar cluster representative (Fig. 1) [5].
[Figure 1 diagram: level structures whose nodes are labeled with number pairs (for example 1,0; 10,1; 13,10), merged into the "Cluster Level Structure"]
Fig. 1: Cluster Level structure merging in XCLS+ method
The formula used to calculate the similarity between an XML document and a
cluster representative is as follows:
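\[
\mathrm{LevelSim}_{1\rightarrow 2} =
\frac{0.5\sum_{i=0}^{L-1}\left(CN1_{i}+CP1_{i}\right) r^{L-i-1}
      + 0.5\sum_{j=0}^{L-1}\left(CN2_{j}+CP2_{j}\right) r^{L-j-1}}
     {\sum_{k=0}^{L-1} N_{k}\, r^{L-k-1}
      + 0.5\left(\sum_{i=0}^{L-1} CP1_{i}\, r^{L-i-1}
      + \sum_{j=0}^{L-1} CP2_{j}\, r^{L-j-1}\right)}
\tag{1}
\]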
The parameters used in this formula are:

CN1_i: Sum of occurrences of every common element in level i of object 1.
CN2_j: Sum of occurrences of every common element in level j of object 2.
CP1_i: Number of occurrences of all common elements in level i of object 1
       which have the same parent.
CP2_j: Number of occurrences of all common elements in level j of object 2
       which have the same parent.
N_k:   Number of elements in level k of the document.
r:     Base weight, the increasing factor of weight. This is usually larger
       than 1 to indicate that higher-level elements have more importance
       than lower-level elements.
L:     Number of levels in the document.
In equation (1), CP denotes the number of common elements which have the
same parent; its effect is clearest for homogeneous XML documents. Instead
of numeric tags, XCLS+ uses the elements' own names so that a full search
can be performed. Fig. 2 shows a tabular view of a document, including the
element name (Tag Name), parent name (Parent) and level number (Level) [5].
Fig. 2: Tabular view of an XML document suitable for XCLS+ clustering
The steps for matching two objects in the XCLS+ method are as follows:
1) First, the level structures of both objects are turned into a tabular
   representation. The tables must be sorted by element name.
2) Start by searching for common elements in the first level of both
   tables. If at least one common element is found, record the number of
   common elements, together with the level number, for object 1 (CN1_0)
   and for object 2 (CN2_0), then go to step 3; otherwise, go to step 4.
3) Move both objects to their next level tables (level i++, level j++) and
   search for common elements in these new levels. If at least one common
   element is found, record the number of common elements, together with
   the level number, for object 1 (CN1_i) and for object 2 (CN2_j), then
   repeat step 3; otherwise, go to step 4.
4) The element names are compared. Because the element names are sorted,
   the level changes only in the table whose current row has the smaller
   element name: a later row of that table may still contain a common
   element, whereas the reverse is impossible.
5) Matching continues until one table reaches its final row.
This structural matching of two objects has the advantage of finding all
common elements between both objects. Common elements with the same parents
are found in the same way as described for common elements.
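To make the quantities CN and CP concrete, here is a simplified sketch that counts them per level from two tabular views. It is not the sorted-table walk of steps 1 to 5: it uses set lookups instead, and it assumes that an element is "common" if its name occurs anywhere in the other object and that "same parent" means the same (tag, parent) name pair occurs in the other object. The function name `common_counts` and the two toy documents are hypothetical.

```python
def common_counts(table_a, table_b, levels_a):
    """Per-level counts for object A measured against object B.

    Each table row is (tag_name, parent_name, level).
    cn[i]: rows in level i of A whose tag name also occurs in B.
    cp[i]: rows in level i of A whose (tag, parent) pair also occurs in B.
    """
    tags_b = {tag for tag, _parent, _level in table_b}
    pairs_b = {(tag, parent) for tag, parent, _level in table_b}
    cn = [0] * levels_a
    cp = [0] * levels_a
    for tag, parent, level in table_a:
        if tag in tags_b:
            cn[level] += 1
        if (tag, parent) in pairs_b:
            cp[level] += 1
    return cn, cp


# Hypothetical tabular views: (Tag Name, Parent, Level)
doc1 = [("movie", None, 0), ("title", "movie", 1),
        ("cast", "movie", 1), ("actor", "cast", 2)]
doc2 = [("movie", None, 0), ("title", "movie", 1), ("actor", "movie", 1)]

cn1, cp1 = common_counts(doc1, doc2, levels_a=3)
cn2, cp2 = common_counts(doc2, doc1, levels_a=2)
print(cn1, cp1)  # [1, 1, 1] [1, 1, 0]
print(cn2, cp2)  # [1, 2] [1, 1]
```

In this toy case "actor" is a common element in both documents, but it hangs under different parents, so it contributes to CN and not to CP.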
Despite its advantages over the XCLS method, XCLS+ has some problems, which
are introduced below [5].
3.1 First problem of XCLS+ and its solution
According to Equation (1), LevelSim is intended to be a value between 0 and
1, where 0 indicates completely different objects and 1 indicates
homogeneous objects. LevelSim is not symmetric, meaning that LevelSim1→2
differs from LevelSim2→1. This asymmetry is problematic when the documents
are homogeneous, and in some cases the similarity even exceeds 1. Fig. 3
shows the structure of two homogeneous XML documents named Movie1 and
Movie2.
Fig. 3: Two sample XML documents (Movie1,Movie2)
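Using the XCLS+ formula we have:
\[
\mathrm{LevelSim}_{1\rightarrow 2} =
\frac{0.5\,(2\cdot 2^{4}+2\cdot 2^{3}+4\cdot 2^{2}+0\cdot 2^{1}+0\cdot 2^{0})
      + 0.5\,(2\cdot 2^{2}+2\cdot 2^{1}+4\cdot 2^{0})}
     {1\cdot 2^{4}+1\cdot 2^{3}+5\cdot 2^{2}+3\cdot 2^{1}+2\cdot 2^{0}
      + 0.5\,\big((1\cdot 2^{4}+1\cdot 2^{3}+2\cdot 2^{2}+0+0)
      + (1\cdot 2^{2}+1\cdot 2^{1}+2\cdot 2^{0})\big)}
= 0.555
\]
\[
\mathrm{LevelSim}_{2\rightarrow 1} =
\frac{0.5\,(2\cdot 2^{4}+2\cdot 2^{3}+4\cdot 2^{2}+0\cdot 2^{1}+0\cdot 2^{0})
      + 0.5\,(2\cdot 2^{2}+2\cdot 2^{1}+4\cdot 2^{0})}
     {1\cdot 2^{2}+1\cdot 2^{1}+2\cdot 2^{0}
      + 0.5\,\big((1\cdot 2^{4}+1\cdot 2^{3}+2\cdot 2^{2}+0+0)
      + (1\cdot 2^{2}+1\cdot 2^{1}+2\cdot 2^{0})\big)}
= 1.428
\]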
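The arithmetic of this example can be rechecked with a short sketch of Equation (1). The per-level counts are read off the numbers printed in the worked example, and the split of the numerator coefficients 2, 2, 4 into CN = CP = 1, 1, 2 is inferred from the CP terms in the denominator. The sketch also assumes, as the printed exponents suggest, that each object's levels are weighted using that object's own number of levels (5 for Movie1, 3 for Movie2). The function names are hypothetical.

```python
def weighted_sum(counts, r=2.0):
    """Sum counts[i] * r**(L - i - 1) over the L levels listed in counts."""
    levels = len(counts)
    return sum(c * r ** (levels - i - 1) for i, c in enumerate(counts))


def level_sim_xcls_plus(cn1, cp1, cn2, cp2, n, r=2.0):
    """Equation (1); n holds the per-level node counts N_k of one object."""
    both1 = [a + b for a, b in zip(cn1, cp1)]
    both2 = [a + b for a, b in zip(cn2, cp2)]
    numerator = 0.5 * weighted_sum(both1, r) + 0.5 * weighted_sum(both2, r)
    denominator = weighted_sum(n, r) + 0.5 * (
        weighted_sum(cp1, r) + weighted_sum(cp2, r))
    return numerator / denominator


# Per-level counts taken from the worked example (r = 2).
movie1_nodes = [1, 1, 5, 3, 2]   # N_k when Movie1 is in the denominator
movie2_nodes = [1, 1, 2]         # N_k when Movie2 is in the denominator
cn1 = cp1 = [1, 1, 2, 0, 0]      # common elements seen from Movie1
cn2 = cp2 = [1, 1, 2]            # common elements seen from Movie2

# The numerator and the CP part of the denominator do not change when the
# two objects are swapped, so only the N_k argument differs between calls.
sim_12 = level_sim_xcls_plus(cn1, cp1, cn2, cp2, movie1_nodes)
sim_21 = level_sim_xcls_plus(cn1, cp1, cn2, cp2, movie2_nodes)
print(round(sim_12, 3), round(sim_21, 3))  # 0.556 1.429
```

The two values are 40/72 and 40/28; the text reports them truncated to three decimals (0.555 and 1.428). Only the N_k term differs between the two directions, which is what makes the measure asymmetric and lets it exceed 1.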
This significant difference between the two similarity values of Movie1 and
Movie2 is due to the N_k variable in the denominator of the formula.
Because of this variable, the number of nodes in the input document becomes
important. For example, in homogeneous documents the numerator of the
fraction is large because many nodes are common. Now, (a) if the input
document has fewer nodes, the denominator is small and, as a result, the
similarity is high; (b) if the input document has more nodes, the
denominator is large and, as a result, the similarity decreases
dramatically.
3.2 Our solution for this problem

To solve this problem, the similarity formula can be redefined as follows:
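\[
\mathrm{LevelSim}_{1\rightarrow 2} =
\frac{\sum_{i=0}^{L-1}\left(CN1_{i}+CP1_{i}\right) r^{L-i-1}
      + \sum_{j=0}^{L-1}\left(CN2_{j}+CP2_{j}\right) r^{L-j-1}}
     {\sum_{k=0}^{L-1} N_{k}\, r^{L-k-1}
      + \sum_{k=0}^{L-1} M_{k}\, r^{L-k-1}
      + \sum_{i=0}^{L-1} CP1_{i}\, r^{L-i-1}
      + \sum_{j=0}^{L-1} CP2_{j}\, r^{L-j-1}}
\tag{2}
\]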
In this formula a new variable M_k is defined: M_k is the number of nodes
in the k-th level of the cluster being compared. Using this formula, the
similarity of the documents of Fig. 3 is as follows:
LevelSim1→2 = LevelSim2→1 = 0.8
So formula (2) is not only symmetric, but its result is also more
acceptable than the result of formula (1).
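The value 0.8 can be rechecked with a short sketch of formula (2), using the per-level counts of the Movie1/Movie2 example (node counts 1, 1, 5, 3, 2 and 1, 1, 2; common counts CN = CP = 1, 1, 2 on both sides, inferred from the printed arithmetic). As in that arithmetic, each object's levels are assumed to be weighted with that object's own number of levels, and the function names are hypothetical.

```python
def weighted_sum(counts, r=2.0):
    """Sum counts[i] * r**(L - i - 1) over the L levels listed in counts."""
    levels = len(counts)
    return sum(c * r ** (levels - i - 1) for i, c in enumerate(counts))


def level_sim_symmetric(cn1, cp1, cn2, cp2, n, m, r=2.0):
    """Formula (2): n and m are the per-level node counts of the two objects."""
    both1 = [a + b for a, b in zip(cn1, cp1)]
    both2 = [a + b for a, b in zip(cn2, cp2)]
    numerator = weighted_sum(both1, r) + weighted_sum(both2, r)
    denominator = (weighted_sum(n, r) + weighted_sum(m, r)
                   + weighted_sum(cp1, r) + weighted_sum(cp2, r))
    return numerator / denominator


movie1_nodes = [1, 1, 5, 3, 2]
movie2_nodes = [1, 1, 2]
cn1 = cp1 = [1, 1, 2, 0, 0]
cn2 = cp2 = [1, 1, 2]

sim_12 = level_sim_symmetric(cn1, cp1, cn2, cp2, movie1_nodes, movie2_nodes)
sim_21 = level_sim_symmetric(cn2, cp2, cn1, cp1, movie2_nodes, movie1_nodes)
print(sim_12, sim_21)  # 0.8 0.8
```

Both directions evaluate to 80/100, because every term of the numerator and denominator now appears once for each object.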
3.3 Second problem of XCLS+
After comparing two objects and determining their similarity value, the
XCLS+ algorithm merges them. When the number of documents is high, changing
the order of the input documents affects the results of the clustering
algorithm, so the algorithm has difficulty identifying the right cluster
for an input document. In our solution to this problem, we first
pre-cluster the input documents into more coherent clusters: a document
joins a cluster only if its similarity to the cluster representative is
sufficiently decisive; otherwise the new document creates a new cluster.
The result of this decision is a large number of small clusters in the
first phase of our approach. In the second phase, these small rigid
clusters are merged by an offline algorithm to create the final list of
clusters. This hybrid approach (a combination of an online and an offline
clustering algorithm) aims to gain the speed of online algorithms as well
as the precision of offline algorithms. The approach is called WXCLS+ and
is described in detail in Section 4.
Fig. 4: WXCLS+, a hybrid clustering algorithm to overcome the problem of the order of input documents
4. COMBINATION OF ONLINE WITH OFFLINE CLUSTERING ALGORITHMS
WXCLS+ calculates the similarity of documents using our proposed equation
(2), which uses the level structure to obtain the similarity of XML
documents. However, the main difference between the new technique and
previously suggested incremental algorithms such as XCLS+ lies in the
clustering process. In the prior methods clustering is done completely
online, whereas the new method combines online and offline clustering to
overcome the problem of the order of input documents. Fig. 4 shows the
proposed algorithm. According to this algorithm, clustering is performed in
two phases: an online (incremental) phase and an offline (hierarchical)
phase.
Online phase
A threshold variable named BunchSize defines the maximum size of the
incrementally created clusters. So, if N is the total number of documents,
the number of clusters will be at least k = N/BunchSize. The major
difference in the new method is that it creates smaller but rigid clusters,
which are then used for comparison and document clustering.
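As a quick check of this bound with values used in the evaluation section (N = 700 heterogeneous documents, BunchSize = 50 or 100), and assuming the bound is rounded up to a whole number of clusters:

\[
k \ge \left\lceil \frac{N}{\mathrm{BunchSize}} \right\rceil, \qquad
\left\lceil \frac{700}{50} \right\rceil = 14, \qquad
\left\lceil \frac{700}{100} \right\rceil = 7 .
\]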
Offline phase
After all documents have been processed, there are a number of
sufficiently small and rigid clusters. In the second phase, these clusters
are combined using a hierarchical clustering algorithm. In other words,
when the traffic load is low and the number of clusters is higher than a
predetermined value, clustering is performed in offline mode using a
hierarchical clustering algorithm. This hybrid approach (a combination of
an online and an offline clustering algorithm) aims to gain the speed of
online algorithms as well as the precision of offline algorithms. Moreover,
in previous methods, as the number of documents increases and clusters
grow, comparing a document with a cluster representative becomes time
consuming, which reduces clustering speed. It must be mentioned that this
algorithm is sensitive to the value of BunchSize. If the value is very
small, the overhead of the program increases. If, instead, the value is
very large, the algorithm suffers, like the previous algorithms, from the
enlargement of the cluster representatives. Fig. 5 shows graphically the
different steps of the proposed clustering algorithm.
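The two phases can be sketched as follows. This is an illustration of the control flow only, not the authors' implementation: a Jaccard similarity over tag sets stands in for formula (2), the representative is a simple union of tags, full clusters are assumed to stop accepting documents once they reach BunchSize, and the offline phase is a plain agglomerative merge with an assumed stopping threshold. All names and parameter values are hypothetical.

```python
def jaccard(a, b):
    """Stand-in similarity between two tag sets (not formula (2))."""
    return len(a & b) / len(a | b)


def online_phase(docs, threshold, bunch_size):
    """Incremental pass: build small clusters of at most bunch_size documents."""
    clusters = []  # each cluster: {"rep": set of tags, "docs": [names]}
    for name, tags in docs:
        best, best_sim = None, 0.0
        for cluster in clusters:
            if len(cluster["docs"]) >= bunch_size:
                continue  # assumption: a full cluster no longer grows
            sim = jaccard(tags, cluster["rep"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["rep"] |= tags
            best["docs"].append(name)
        else:
            clusters.append({"rep": set(tags), "docs": [name]})
    return clusters


def offline_phase(clusters, merge_threshold):
    """Agglomerative pass: repeatedly merge the most similar pair of clusters."""
    clusters = [{"rep": set(c["rep"]), "docs": list(c["docs"])} for c in clusters]
    while len(clusters) > 1:
        best_pair, best_sim = None, merge_threshold
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                sim = jaccard(clusters[i]["rep"], clusters[j]["rep"])
                if sim >= best_sim:
                    best_pair, best_sim = (i, j), sim
        if best_pair is None:
            break  # no pair is similar enough to merge
        i, j = best_pair
        clusters[i]["rep"] |= clusters[j]["rep"]
        clusters[i]["docs"] += clusters[j]["docs"]
        del clusters[j]
    return clusters


# Hypothetical input: four documents of one kind, two of another.
docs = [(f"d{k}", {"a", "b"}) for k in range(1, 5)]
docs += [(f"e{k}", {"x", "y"}) for k in range(1, 3)]

small = online_phase(docs, threshold=0.9, bunch_size=2)
final = offline_phase(small, merge_threshold=0.5)
print([c["docs"] for c in small])  # three clusters of two documents each
print([c["docs"] for c in final])  # the two 'a, b' clusters are merged
```

With BunchSize = 2 the online phase produces three small clusters, and the offline phase then merges the two that hold the same kind of document, leaving one cluster per kind.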
[Figure 5 diagram: the N XML documents (N = number of XML documents, B = BunchSize) are processed in bunches B1, B2, ..., Bn; the online phase clusters them incrementally by level similarity, and the semi-offline phase merges pairs of clusters with a hierarchical algorithm]
Fig. 5: The WXCLS+ clustering method
5. EVALUATION OF THE PROPOSED METHOD
Both the XCLS+ and WXCLS+ methods were implemented in C# using Microsoft
Visual Studio 2010. Three external criteria, Entropy, Purity and FScore
[3], [17], are used to compare the two methods. The evaluation criteria
were computed under the same conditions on each data set. Two data sets are
used to evaluate the performance of WXCLS+ against XCLS+: homogeneous
documents (single-type DTD) [15] and heterogeneous documents (multi-type
DTD) [16]. The results for the two sets are examined and shown separately.
The heterogeneous set consists of 700 XML documents, while the homogeneous
set contains 120 department documents based on four Sub_DTDs.
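For readers unfamiliar with the three criteria, the sketch below computes them from clusters of true class labels using commonly used definitions: size-weighted purity, size-weighted entropy normalized by the logarithm of the number of classes, and a class-weighted best-match F-score. The exact variants used in the experiments are not stated in the text, so these definitions, the function name `evaluate` and the toy inputs are assumptions.

```python
import math
from collections import Counter


def evaluate(clusters):
    """Return (entropy, purity, fscore) for a clustering.

    clusters: list of non-empty clusters, each a list of true class labels.
    """
    n = sum(len(members) for members in clusters)
    class_sizes = Counter(label for members in clusters for label in members)
    q = len(class_sizes)                      # number of true classes
    entropy = 0.0
    purity = 0.0
    best_f = {label: 0.0 for label in class_sizes}
    for members in clusters:
        counts = Counter(members)
        size = len(members)
        purity += max(counts.values()) / n    # weight of the majority class
        if q > 1:
            h = -sum((c / size) * math.log(c / size) for c in counts.values())
            entropy += (size / n) * h / math.log(q)
        for label, c in counts.items():
            precision = c / size
            recall = c / class_sizes[label]
            f = 2 * precision * recall / (precision + recall)
            best_f[label] = max(best_f[label], f)
    fscore = sum(class_sizes[label] / n * best_f[label] for label in class_sizes)
    return entropy, purity, fscore


perfect = evaluate([["a", "a"], ["b", "b", "b"]])
mixed = evaluate([["a", "a", "b", "b"]])
print(perfect)
print(mixed)
```

A perfect clustering gives entropy 0 with purity and F-score 1, while putting two equally sized classes into a single cluster gives entropy 1, purity 0.5 and an F-score of 2/3. Lower entropy and higher purity and F-score therefore indicate better clustering.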
First, both clustering approaches are applied to the heterogeneous data set
of 700 documents. Tables 1, 2, 3 and 4 show the clustering results of
XCLS+ and WXCLS+ for different threshold values, with the same order of
input documents and with different orders of input documents.
Table 1: Results on heterogeneous documents with the same order of input documents

THRESHOLD         ENTROPY           PURITY            FSCORE
WXCLS+  XCLS+     WXCLS+  XCLS+     WXCLS+  XCLS+     WXCLS+  XCLS+
0.7     0.7       0.05    0.03      0.9     0.9       0.9     0.9
0.8     0.8       0       0.01      1       0.9       0.9     0.9
0.9     0.9       0       0.01      1       0.9       0.9     0.9

Table 2: Results of the XCLS+ algorithm on heterogeneous documents with different orders of input documents (each criterion is reported for three different input orders)

ALGORITHM  THRESHOLD  ENTROPY             PURITY              FSCORE
                      1     2     3       1     2     3       1     2     3
XCLS+      0.7        0     0.04  0.02    1     0.92  0.95    0.99  0.93  0.96
XCLS+      0.8        0     0     0       1     1     1       0.99  0.99  0.99
XCLS+      0.9        0     0     0       1     1     1       0.98  0.97  0.98

Table 3: Results of the WXCLS+ algorithm on heterogeneous documents with different orders of input documents (BunchSize = 50; each criterion is reported for three different input orders)

ALGORITHM  THRESHOLD  ENTROPY             PURITY              FSCORE
                      1     2     3       1     2     3       1     2     3
WXCLS+     0.7        0.05  0.05  0.05    0.92  0.92  0.92    0.93  0.93  0.93
WXCLS+     0.8        0     0.01  0       1     0.97  1       0.99  0.97  0.99
WXCLS+     0.9        0     0     0.01    1     1     0.97    0.99  0.99  0.99

Table 4: Results of the WXCLS+ algorithm on heterogeneous documents with different orders of input documents (BunchSize = 100; each criterion is reported for three different input orders)

ALGORITHM  THRESHOLD  ENTROPY             PURITY              FSCORE
                      1     2     3       1     2     3       1     2     3
WXCLS+     0.7        0.06  0.05  0.05    0.9   0.92  0.92    0.91  0.93  0.93
WXCLS+     0.8        0     0     0       1     1     1       0.99  0.99  0.99
WXCLS+     0.9        0     0     0       1     1     1       0.99  0.99  0.99

With the new method, we were looking for an approach whose clustering
process is not affected by altering the order of the input documents.
According to the results in these four tables, this goal was achieved.
Another objective of the WXCLS+ method was to improve the clustering
results compared to the XCLS+ method. In some parts of these four tables,
however, the XCLS+ method is better than our method. The reason is that the
XCLS+ and WXCLS+ algorithms consider the parent-child relationship when
comparing documents, so they are more effective on homogeneous documents.
The WXCLS+ and XCLS+ methods are therefore evaluated next on homogeneous
documents, where WXCLS+ shows a considerable improvement over XCLS+.
To evaluate the XCLS+ and WXCLS+ methods on homogeneous documents, we used
the DTD of the department data set to create four sub-DTDs, and created a
total of 80 homogeneous XML documents. The faculty, staff and grad student
nodes were eliminated from Sub_DTD1; the undergrad student, faculty and
staff nodes were removed from Sub_DTD2; the undergrad student, staff and
grad student nodes were omitted from Sub_DTD3; and the undergrad student,
faculty and grad student nodes were discarded from Sub_DTD4.
To evaluate the homogeneous documents, all of the sub-DTDs are put into one
class. Then the positions of the documents are changed to observe the
evaluation criteria at different threshold values. The FScore criterion is
used here to evaluate both the XCLS+ and WXCLS+ methods. Tables 5 and 6
show the evaluation results for several changes of the order of the input
documents for XCLS+ and WXCLS+. For a fair comparison, several document
orders are fixed in advance and given as input to the program, so that the
evaluation results are obtained under identical conditions.
Table 5: Results of the evaluation on homogeneous documents by the XCLS+ method

ALGORITHM  THRESHOLD  FScore (first)  FScore (second)  FScore (third)  FScore (fourth)
XCLS+      0.8        0.84            0.9              0.89            0.98
XCLS+      0.87       0.46            0.9              0.89            0.98

Table 6: Results of the evaluation on homogeneous documents by the WXCLS+ method (BunchSize = 25)

ALGORITHM  THRESHOLD  FScore (first)  FScore (second)  FScore (third)  FScore (fourth)
WXCLS+     0.8        1               1                1               1
WXCLS+     0.87       1               1                1               1

Tables 5 and 6 show that the proposed method gives better results than the
XCLS+ method and clusters the documents exactly (FScore = 1) for the
threshold values 0.8 and 0.87.
6. CONCLUSION AND FUTURE WORK
Incremental algorithms such as XCLS and XCLS+ perform clustering at an
acceptable speed. However, a careful study of XCLS+ shows two major
problems: (1) the structural similarity between two documents, as computed
by the defined similarity formula, is asymmetric; (2) because of the
incremental nature of the algorithm, clusters grow as the number of
documents increases, and the quality of the clustering process decreases.
Two proposals are offered to solve these problems: (1) by defining a new
variable, the asymmetry in the calculation of the similarity between two
objects (a document and a cluster) has been resolved; (2) by combining
offline and online clustering algorithms, we avoid the enlargement of the
cluster representatives, which affects the quality and speed of the
clustering process. The advantage of this new approach is that the accuracy
and speed of the clustering process are significantly improved.
In this paper, the three criteria Purity, Entropy and FScore were used for
evaluation. These criteria have some problems for homogeneous documents, so
new evaluation criteria can be defined in future work. Another issue that
can be addressed is the similarity formula between documents: a formula
with fewer variables would increase the efficiency of the algorithm.
Finally, other offline algorithms can be combined with the online method in
order to cluster XML documents.
REFERENCES
[1] Han, J. and Kamber, M., 2001. Data mining: concepts and techniques. San
Francisco, CA: Morgan Kaufmann, Vol. 5.
[2] Bray, T. Paoli, J. Sperberg-McQueen, C.M. Maler, E. and Yergeau, F., 2004. Extensible
markup language (xml) 1.0.
[3] Nayak, R., 2008. Fast and effective clustering of XML data using structural
information, Knowledge and Information Systems, Vol. 14, No. 2, pp. 197-215.
[4] Nayak, R. and Xu, S., 2006. XCLS: a fast and effective clustering algorithm for
heterogeneous XML documents, In Advances in Knowledge Discovery and Data Mining
Springer Berlin Heidelberg.
[5] Alishahi, M. Ravakhah, M. Shakeriaski, B. and Naghibzade, M., 2009. XML document
clustering based on common tag names anywhere in the structure, In Computer.
[6] Naghibzadeh, M., 2010. Tag Name Structure-based Clustering of XML Documents,
International Journal of Computer and Electrical Engineering (IJCEE), No. 2.
[7] Qaramaleki, A. K. E. and Naderi, H., 2013. A New Online XML Document Clustering
Based on XCLS++, International Journal of Computer Science and Business
Informatics, Vol. 2, No. 1.
[8] Nierman, A. and Jagadish, H. V., 2002. Evaluating Structural Similarity in XML
Documents, In WebDB, Vol. 2, pp. 61-66.
[9] Peng, J. Dong, Q. and Yang, S., 2008. Similarity in Chinese text processing, A New
Similarity competing method based on concept, series F: Information science, Vol. 51, No.
9, pp. 1212-1230.
[10] Ghosh, S. and Mitra, P., 2008. Combining content and structure similarity for XML
document classification using composite SVM kernels, In ICPR, pp. 1-4.
[11] Choi, I. Moon, B. and Kim, H. J., 2007. A clustering method based on path similarities
of XML data, Data and Knowledge Engineering, Vol. 60, No. 2, pp. 361-376.
[12] Tran, T. Nayak, R. and Bruza, P., 2008. Combining structure and content similarities
for XML document clustering, In Proceedings of the 7th Australasian Data Mining
Conference, Vol. 87, Australian Computer Society, Inc, pp. 219-225.
[13] Kim, W., 2008. XML document similarity measure in terms of the structure and
contents, In Proceedings of the International Conference on Computer Engineering and
Applications (CEA 2008), pp. 205-21.
[14] Viyanon, W. Madria, S. K. and Bhowmick, S. S., 2008. XML data integration based
on content and structure similarity using keys, In On the Move to Meaningful Internet
Systems: OTM, Springer Berlin Heidelberg, pp. 484-493.
[15] Dalamagas, T. Cheng, T. Winkel, K. J. and Sellis, T., 2006. A methodology for
clustering XML documents by structure, Information Systems, Vol. 31, No. 3, pp. 187-228.
[16] Lian, W. Cheung, D. L. Mamoulis, N. and Yiu, S. M., 2004. An efficient and scalable
algorithm for clustering XML documents by structure, Knowledge and Data Engineering,
IEEE Transactions on, Vol. 16, No. 1, pp. 82-96.
[17] Zhao, Y. and Karypis, G., 2001. Criterion functions for document clustering:
Experiments and analysis, Technical report, pp. 01-40.
BIOGRAPHY
Somayeh Ghazanfari received her master's degree in computer engineering in 2015 from
Islamic Azad University of Lorestan, Iran. Her current research area is data mining,
especially clustering.
Hassan Naderi received his PhD degree in 2006 from INSA-LYON university of France.
His current research areas are text mining, search engine and massive data processing.
This paper may be cited as:

Ghazanfari, S. and Naderi, H., 2015. A Hybrid Algorithm for Improvement
of XML Documents Clustering. International Journal of Computer
Science and Business Informatics, Vol. 15, No. 3, pp. 1-15.
A Novel Collaborative Filtering Friendship Recommendation Based on Smartphones
Dhananjaya G. M. and Sachin C. Raykar
M.Tech.4th Sem Computer Science and Engineering
AMC Engineering College
18th Km, Bannerghatta Road, Bangalore-83
Mushtaq Ahmed D. M.
Associative Lecturer
Department of Computer Science and Engineering
AMC Engineering College, Bangalore-83
ABSTRACT
Existing social networking services recommend friends to users based on their social
graphs, which may not be the most appropriate way to reflect a user's preferences on
friend selection in real life. In this paper, we present Friendsbook, a novel
semantic-based friend recommendation system for social networks, which recommends
friends to users based on their lifestyles rather than their social graphs. By taking
advantage of sensor-rich smartphones, Friendsbook discovers the lifestyles of users from
user-centric sensor data, measures the similarity of lifestyles between users, and
recommends friends to users if their lifestyles have high similarity. Inspired by text
mining, we model a user's daily life as life documents, from which his/her lifestyles
are extracted using the Latent Dirichlet Allocation algorithm. We further propose a
similarity metric to measure the similarity of lifestyles between users, and calculate
users' impact in terms of lifestyles with a friend-matching graph. Upon receiving a
request, Friendsbook returns a list of people with the highest recommendation scores to
the query user. Finally, Friendsbook integrates a feedback mechanism to further improve
the recommendation accuracy. We have implemented Friendsbook on Android-based
smartphones, and evaluated its performance in both small-scale experiments and
large-scale simulations. The results show that the recommendations accurately reflect
the preferences of users in choosing friends.
Keywords
Social network, Daily Activity, Smartphone sensor, Lifestyles, Friends recommendation.
1. INTRODUCTION
In our daily lives, we may perform hundreds of activities, which form
meaningful sequences that shape our lives. In this paper, we use the word
activity to refer specifically to actions taken in the order of seconds,
such as "sitting", "walking" or "typing", while we use the term lifestyle
to refer to higher-level abstractions of daily lives, such as "office work"
or "shopping". For instance, the "shopping" lifestyle mostly consists of
the "walking" activity, but may also contain the "standing" or the
"sitting" activities. To model daily lives properly, we draw an analogy
between people's daily lives and documents, as shown in Figure 1. Previous
research on probabilistic topic models in text mining has treated documents
as mixtures of topics, and topics as mixtures of words. Inspired by this,
we can similarly treat our daily lives (or life documents) as a mixture of
lifestyles (or topics), and each lifestyle as a mixture of activities (or
words). Note that, in essence, we represent daily lives with "life
documents", whose semantic meanings are reflected through their topics,
which are the lifestyles in our study. Just as words serve as the basis of
documents, people's activities naturally serve as the primitive vocabulary
of these life documents.
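The document analogy can be made concrete with a minimal sketch in which a day's recognized activities are treated as the "words" of a life document. The activity labels and function names are hypothetical, and the sketch stops at the bag-of-activities representation that a topic model would take as input; it does not perform the topic extraction itself.

```python
from collections import Counter


def life_document(activity_stream):
    """Turn a sequence of recognized activities into bag-of-activity counts."""
    return Counter(activity_stream)


def activity_distribution(doc):
    """Normalize the counts of a life document into activity frequencies."""
    total = sum(doc.values())
    return {activity: count / total for activity, count in doc.items()}


# Hypothetical stream of activities recognized during a shopping trip.
stream = ["walking", "walking", "standing", "sitting"]
doc = life_document(stream)
dist = activity_distribution(doc)
print(dist)  # {'walking': 0.5, 'standing': 0.25, 'sitting': 0.25}
```

In this representation, a lifestyle such as "shopping" would correspond to a topic that puts most of its weight on "walking" and some on "standing" and "sitting".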
1.1 Mobile Computing

The usage of mobile devices has increased dramatically over the last decade.
It is now estimated that there are more than 1 billion mobile users in the
world.
1.1.1 Smartphones
Smartphones are becoming more and more popular and more and more powerful
in people's lives. People use smartphones in daily activities for accessing
and storing information in various situations. In this paper, we present a
work in progress for detecting and automating some of these activities.
1.1.2 Sensors
These smartphones (e.g., iPhone or Android-based smartphones) are
equipped with a rich set of embedded sensors, such as GPS, accelerometer,
microphone, gyroscope, and camera.
2. RELATED WORKS
The idea of extracting usage patterns and routines from smartphone usage
data is not novel. A body of research has explored quantitative methods for
mining patterns of human activity from large datasets. Eagle and Pentland
[19] demonstrate the ability to use mobile devices to recognize social
patterns, identify significant locations, and model organizational rhythms.
Farrahi and Gatica-Perez [3] suggest that human interaction data, or human
proximity, obtained from mobile phone Bluetooth sensor data, can be
integrated with human location data, obtained from mobile cell tower
connections, to mine meaningful details about human activities from large
and noisy datasets.
Bian and Holtzman [2] presented Matchmakers, a collaborative filtering
friend recommendation system based on personality matching.
Kwon and Kim [5] proposed a friend recommendation method using physical and
social context. However, the authors did not explain what the physical and
social context is or how to obtain this information.
Yu et al. [1] recommended geographically related friends in social networks
by combining GPS information and social network structure. Hsu et al. [18]
studied the problem of link recommendation in weblogs and similar social
networks and content-based recommendation using mutual declared
interests.
3. METHODOLOGY
The proposed design is called FriendSeeker, a new recommendation system for
social networks, which suggests friends to users based on their life-styles
rather than social graphs. FriendSeeker discovers the life-styles of users
from user-centric sensor data and personal interests, measures the
similarity of life-styles between users, and recommends friends to users if
their life-styles have a high similarity. The proposed design builds a
general friend recommendation system using the Latent Dirichlet Allocation
(LDA) algorithm, and friend recommendations are provided to the user. We
then propose a similarity metric to measure the similarity of life-styles
between users, and calculate each user's impact in terms of life-styles
with a friend-matching graph. Upon receiving a request, FriendSeeker
returns a list of people with the highest recommendation scores to the
query user. Finally, the proposed design can be implemented on
Android-based systems or smartphones. The results demonstrate that the
recommendations accurately reflect the preferences of users in choosing
friends. The system architecture of the proposed work, which forms the
basis of this paper, is shown in Fig. 1.
Fig. 1: System Architecture
3.1 Life Style Modeling

Life-styles and activities are reflections of daily life at two different
levels: daily life can be treated as a mixture of life-styles, and each
life-style as a mixture of activities. Taking advantage of recent
developments in the field of text mining, the daily life of each user is
modeled accordingly.
3.2 Activity Recognition

For activity recognition, two motion sensors, the accelerometer and the
gyroscope, are used to infer users' motion activities. There are two
mainstream approaches: supervised learning and unsupervised learning. Here,
unsupervised learning is used to recognize activities: the popular K-means
clustering algorithm groups the data into clusters, where each cluster
represents an activity.
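The clustering step described above can be sketched as follows. This is only an illustration, assuming each sensor window has already been reduced to a small numeric feature vector; the function name `kmeans`, the two-dimensional features and the sample values are hypothetical and are not taken from the system itself.

```python
import random

def kmeans(points, k, iterations=50, seed=0):
    """Group feature vectors into k clusters with plain Lloyd iterations."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

# Hypothetical features, e.g. (mean accelerometer, mean gyroscope) magnitude.
samples = [(0.1, 0.0), (0.2, 0.1), (0.0, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels, centers = kmeans(samples, k=2)
```

Each resulting label plays the role of one "activity" word; the physical meaning of a cluster is not needed for the later similarity computations.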
3.3 Life-style Extraction using LDA
It is also worth noting that, because our system uses unsupervised learning
algorithms to recognize activities and the topic model to discover
life-styles, the physical meanings of the generated activities (or cluster
centers of the K-means algorithm) and topics are unknown to us. As has been
noted, this kind of meaning can be estimated through an additional step of
comparing the topic activations to the actual structure of the subject's
day and then identifying topics that correspond to possible daily routines.
In Friendbook, since only the similarity of activities or topic patterns is
compared, there is no need to infer the physical meaning of each cluster
center or topic. On the contrary, not revealing the physical meaning of
activities and topics has advantages from the perspective of preserving
privacy.
3.4 Friend-Matching Graph and User Impact
The friend-matching graph is used to represent the similarity between
users' life-styles and how they influence other people in the graph. In
particular, the link weight between two users represents the similarity of
their life-styles. Based on the friend-matching graph, we can obtain a
user's affinity, reflecting how likely that user is to be chosen as another
user's friend in the network.
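A minimal sketch of how such a graph could be built is given below. It assumes that each user is described by a life-style probability vector and, purely for illustration, uses cosine similarity as the link weight; the similarity metric actually used by the system is not specified in this section, and all names and values here are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two life-style vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def build_friend_matching_graph(life_styles, threshold):
    """Return {user: {neighbor: weight}}; an edge exists where similarity >= threshold."""
    graph = {u: {} for u in life_styles}
    users = list(life_styles)
    for i, u in enumerate(users):
        for v in users[i + 1:]:
            s = cosine_similarity(life_styles[u], life_styles[v])
            if s >= threshold:
                graph[u][v] = s  # undirected edge, stored in both directions
                graph[v][u] = s
    return graph

# Hypothetical life-style vectors over three life-styles.
life_styles = {
    "u1": [0.7, 0.2, 0.1],
    "u2": [0.6, 0.3, 0.1],
    "u3": [0.0, 0.1, 0.9],
}
graph = build_friend_matching_graph(life_styles, threshold=0.5)
```

With these sample vectors only `u1` and `u2` are linked, while `u3` stays isolated.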
3.5 User Impact Ranking
The impact ranking indicates a user's capability to establish friendships
in the network. Following PageRank, which is used in web page ranking, the
idea is that a user's ranking is reflected by his or her neighbors in the
friend-matching graph and by how much those neighbors recommend the user as
a friend.
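The PageRank-style idea above can be sketched with a simple power iteration. This is an assumption-laden illustration rather than the system's actual update rule: the function `impact_ranking`, the weighting of each neighbor's contribution by normalized link weight, and the choice to leave isolated users at their initial score are all hypothetical.

```python
def impact_ranking(graph, damping=0.85, iterations=100):
    """Iteratively score users on a weighted, undirected friend-matching graph."""
    n = len(graph)
    rank = {u: 1.0 / n for u in graph}  # uniform initial score
    for _ in range(iterations):
        new_rank = {}
        for u, neighbors in graph.items():
            if not neighbors:
                new_rank[u] = rank[u]  # isolated user: score never changes
                continue
            # Each neighbor v passes on a share of its score proportional
            # to the weight of the link (v, u) among all of v's links.
            incoming = sum(
                rank[v] * graph[v][u] / sum(graph[v].values())
                for v in neighbors
            )
            new_rank[u] = (1.0 - damping) / n + damping * incoming
        rank = new_rank
    return rank

# Hypothetical graph: "a" is linked to "b" and "c"; "d" is isolated.
graph = {
    "a": {"b": 1.0, "c": 0.5},
    "b": {"a": 1.0},
    "c": {"a": 0.5},
    "d": {},
}
ranks = impact_ranking(graph)
```

In this toy graph the isolated user `d` keeps the initial score 1/4, which can exceed the score of a weakly connected user.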
3.6 Friend Recommendation
When the server receives a user's request, it extracts the user's
life-style vector and, based on it, recommends friends to the user.
Recommendation results are highly dependent on users' preferences. Some
users may prefer the system to recommend users with high impact, while
other users may want to find users with the most similar life-styles.
Algorithm: Friend recommendation
Input: The query user i, the recommendation coefficient, and the required
number of recommended friends from the system, v.
Output: Friend list Fi.
1. Fi ← ∅, Q ← ∅
2. extract i's life-style vector Li using the LDA algorithm
3. for each life-style zk whose probability in Li is not zero do
4.   put the users in the entry of zk into Q
5. end for
6. for each user j do
7.   S(i,j) ← 0
8. end for
9. for each user j in the database do
10.   compute the recommendation score Ri(j)
11. end for
12. sort all users in decreasing order according to Ri(j)
13. put the top v users in the sorted list into Fi
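The steps of the algorithm can be sketched in Python as follows. The scoring rule is an assumption: the text does not give the formula for Ri(j), so the sketch blends similarity and impact linearly with a coefficient `beta`. The reverse index from life-style to users, the `dot_similarity` helper and all sample values are likewise hypothetical.

```python
def dot_similarity(a, b):
    """Illustrative similarity: dot product of two life-style vectors."""
    return sum(x * y for x, y in zip(a, b))

def recommend_friends(query, life_styles, index, impact, beta, v,
                      similarity=dot_similarity):
    """Return the top-v candidates for `query`, best first."""
    candidates = set()
    for k, prob in enumerate(life_styles[query]):
        if prob > 0:  # life-style k is present in the query user's vector
            candidates.update(index.get(k, ()))
    candidates.discard(query)
    # Assumed score: weighted mix of life-style similarity and user impact.
    scores = {
        j: beta * similarity(life_styles[query], life_styles[j])
           + (1.0 - beta) * impact[j]
        for j in candidates
    }
    return sorted(scores, key=scores.get, reverse=True)[:v]

life_styles = {
    "i": [0.8, 0.2, 0.0],
    "a": [0.7, 0.3, 0.0],
    "b": [0.1, 0.9, 0.0],
    "c": [0.0, 0.0, 1.0],
}
index = {0: ["i", "a", "b"], 1: ["i", "a", "b"], 2: ["c"]}  # life-style -> users
impact = {"i": 0.25, "a": 0.20, "b": 0.30, "c": 0.25}
friends = recommend_friends("i", life_styles, index, impact, beta=0.8, v=2)
```

A larger `beta` favors users with similar life-styles, while a smaller one favors users with high impact, which matches the two preferences described above.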
4. RESULTS
4.1 Friend Recommendation Results
Four free parameters are used to generate the friend recommendation
results: the similarity threshold for the friend-matching graph, Sthr; the
threshold that controls the number of dominant life-styles; the damping
factor, which emphasizes the importance of the friend-matching graph; and
the number of life-styles. Based on empirical studies, we used the
following default values: the similarity threshold Sthr is set to 0.5, the
threshold on dominant life-styles is set to 0.8, the damping factor is set
to 0.85, and the number of life-styles is set to 10. The IDs and
recommendation scores of the recommended friends are shown in the list.
Note that Friendsbook returns the IDs of users instead of their real names
because of privacy concerns in our experiments. Figs. 2b and 2c show the
user feedback interfaces. Users can connect with people in the recommended
friend list through our system and can also give a rating to the
recommended friends. Note that we intentionally anonymize user information
in Fig. 2 to protect the privacy of the subjects. In the real system, when
a user wants to use the system, he/she is encouraged to complete his/her
personal profile, e.g., name and photo. The name and photo, along with the
similarity score of each recommended friend, are then shown to the user.
Fig. 2. User interfaces of Friendsbook.
Fig. 3. Gray-image representation of the similarity among the eight users.
In Fig. 3, user 1 has a strong relationship with user 2 and user 5, user 3
has a strong relationship with user 7, user 6 has a relationship with the
abovementioned users, although not a very strong one, while user 4 and
user 8 have no relationship with the others at all. This result is
consistent with the ground truth of professions shown in Table 1, because
people with the same profession usually have similar life-styles.
Table 1. Profession of Users
Table 2. User impact ranking of eight users
The user impact rankings of the eight users are shown in Table 2. The
top-ranked users are users 1 and 7, followed by users 4 and 8, who seem to
have high scores. However, users 4 and 8 are not supposed to rank higher
than others because they are not connected with anyone. Indeed, because of
this, they always maintain their initial score. Since we only have eight
users in the system, each user starts with 1/8 = 0.125 as the initial
score, as described in Algorithm 1, so the scores of users 4 and 8 end up
even higher than those of some of the connected users.
5. CONCLUSION
In this paper, we presented the design and implementation of Friendsbook, a
semantic-based friend recommendation system for social networks. Different
from the friend recommendation mechanisms based on social graphs in
existing social networking services, Friendsbook extracts life-styles from
user-centric data collected from sensors on the smartphone and recommends
potential friends to users if they share similar life-styles. We
implemented Friendsbook on Android-based smartphones and evaluated its
performance in both small-scale experiments and large-scale simulations.
The results showed that the recommendations accurately reflect the
preferences of users in choosing friends. Future work is four-fold. First,
we would like to evaluate our system in large-scale field experiments.
Second, we intend to implement life-style extraction using LDA and the
iterative matrix-vector multiplication method in user impact ranking
incrementally. Third, the similarity threshold used for the friend-matching
graph is fixed in our current prototype of Friendsbook. Finally, we plan to
incorporate more sensors on mobile phones into the system and also to use
information from wearable devices (e.g., Fitbit, iWatch, Google Glass,
Nike+, and Galaxy Gear) to discover more interesting and meaningful
life-styles. Ultimately, we expect to incorporate Friendsbook into existing
social services (e.g., Facebook, Twitter, LinkedIn) so that Friendsbook can
use more information for life-style discovery.
6. ACKNOWLEDGMENTS
The authors would like to thank the anonymous reviewers, whose insightful
comments have helped improve the presentation of this paper significantly.
This work was supported in part by NSF CNS-1017156 and CNS-0953238, and in
part by NSFC 61273079, ANRNSFC 61061130563, NSFC 61373167 and the Natural
Science Foundation of Hubei Province No. 2013CFB297.
REFERENCES
[1] X. Yu, A. Pan, L.-A. Tang, Z. Li, and J. Han. Geo-friends recommendation in
GPS-based cyber-physical social network. Proc. of ASONAM, pages 361-368, 2011.
[2] L. Bian and H. Holtzman. Online friend recommendation through personality
matching and collaborative filtering. Proc. of UBICOMM, pages 230-235, 2011.
[3] K. Farrahi and D. Gatica-Perez. Probabilistic mining of socio-geographic routines
from mobile phone data. IEEE J. Select. Topics Signal Process., vol. 4, no. 4, pp.
746-755, Aug. 2010.
[4] T. Huynh, M. Fritz, and B. Schiele. Discovery of Activity Patterns using Topic Models.
Proc. of UbiComp, 2008.
[5] J. Kwon and S. Kim. Friend recommendation method using physical and social
context. International Journal of Computer Science and Network Security, 10(11):116-
120, 2010.
[6] K. Farrahi and D. Gatica-Perez. Probabilistic mining of sociogeographic routines from
mobile phone data. Selected Topics in Signal Processing, IEEE Journal of, 4(4):746-
755, 2010.
[7] K. Farrahi and D. Gatica-Perez. Discovering Routines from Large-scale Human
Locations using Probabilistic Topic Models. ACM Transactions on Intelligent Systems
and Technology (TIST), 2(1), 2011.
[8] Z. Wang, C. E. Taylor, Q. Cao, H. Qi, and Z. Wang. Demo: Friendbook: Privacy
Preserving Friend Matching based on Shared Interests. Proc. of ACM SenSys, pages
397-398, 2011.
[9] J. Biagioni, T. Gerlich, T. Merrifield, and J. Eriksson. EasyTracker: Automatic Transit
Tracking, Mapping, and Arrival Time Prediction Using Smartphones. Proc. of SenSys,
pages 68-81, 2011.
[10] B. A. Frigyik, A. Kapila, and M. R. Gupta. Introduction to the Dirichlet distribution
and related processes. Department of Electrical Engineering, University of
Washington, UWEETR-2010-0006, 2010.
[11] L. Gou, F. You, J. Guo, L. Wu, and X. L. Zhang. Sfviz: Interest-based friends
exploration and recommendation in social networks. Proc. of VINCI, page 15, 2011.
[12] Q. Li, J. A. Stankovic, M. A. Hanson, A. T. Barth, J. Lach, and G. Zhou. Accurate,
Fast Fall Detection Using Gyroscopes and Accelerometer-Derived Posture
Information. Proc. of BSN, pages 138-143, 2009.
[13] G. Spaargaren and B. Van Vliet. Lifestyles, Consumption and the Environment: The
Ecological Modernization of Domestic Consumption. Environmental Politics, 9(1):50-
76, 2000.
[14] A. D. Sarma, A. R. Molla, G. Pandurangan, and E. Upfal. Fast distributed pagerank
computation. Springer Berlin Heidelberg, pages 11-26, 2013.
[15] S. Reddy, M. Mun, J. Burke, D. Estrin, M. Hansen, and M. Srivastava. Using Mobile
Phones to Determine Transportation Modes. ACM Transactions on Sensor Networks
(TOSN), 6(2):13, 2010.
[16] I. Ropke. The Dynamics of Willingness to Consume. Ecological Economics,
28(3):399-420, 1999.
[17] Y. Zheng, Y. Chen, Q. Li, X. Xie, and W.-Y. Ma. Understanding Transportation
Modes Based on GPS Data for Web Applications. ACM Transactions on the Web
(TWEB), 4(1):1-36, 2010.
[18] W. H. Hsu, A. King, M. Paradesi, T. Pydimarri, and T. Weninger. Collaborative and
structural recommendation of friends using weblog-based social network analysis.
Proc. of AAAI Spring Symposium Series, 2006.
[19] N. Eagle and A. S. Pentland. Reality Mining: Sensing Complex Social Systems.
Personal and Ubiquitous Computing, 10(4):255-268, March 2006.

Dhananjaya, G. M., Mushtaq, A. D. M., and Raykar, S. C., 2015. A Novel
Collaborative Filtering Friendship Recommendation Based on
Smartphones. International Journal of Computer Science and Business
Informatics, Vol. 15, No. 3, pp. 16-25.
Efficient Computational Tools for

Nonlinear Flight Dynamic Analysis
in the Full Envelope
P. Lathasree
CSIR-National Aerospace Laboratories
Old Airport Road, PB No. 1779, Bangalore - 560017
Abhay A. Pashilkar
CSIR-National Aerospace Laboratories
Old Airport Road, PB No. 1779, Bangalore - 560017
ABSTRACT
Equilibrium analysis for an aircraft is very important for control law design and
development. Computation of equilibrium point is also required to initialize the aircraft
model in flight simulation. This equilibrium point is obtained by solving for the zeros of the
right-hand sides of the aircraft equations of motion simultaneously. Mathematically, this is
achieved using conventional numerical optimization methods, which are iterative and
may require a large number of iterations. The typical flight envelope of a fighter aircraft ranges
from 20% to 200% of the speed of sound and sea level to 15 km in terms of altitude.
This places significant computational demand to generate hundreds of linearized aircraft
mathematical models needed for control law design and evaluation. The Approximate Trim
calculations, proposed in this paper, provide good initial guess values throughout the flight
envelope for the conventional optimization methods resulting in faster convergence.
Thus the time and effort required to generate the aircraft mathematical models is reduced.
The aerodynamic database is obtained by wind tunnel testing. To reduce the wind tunnel
testing costs, the aerodynamic database with respect to angle of attack is generated within
the aircraft performance limits. This results in a reduction in the range of the aerodynamic
data with respect to angle of attack as speed increases. Therefore, only three points are
available for the interpolation at the extreme points in the flight envelope. In order to solve
this problem, we propose barycentric (triangular) interpolation in combination with the
conventional rectangular interpolation for these two dimensional tables.
Keywords
Flight Dynamics, Equilibrium Analysis, Numerical Optimisation, Flight Envelope,
Interpolation.
Nomenclature
$h$ = Altitude, m
$\dot{h}$ = Time derivative of altitude, m/s
$I_{XY}$ = Inertia cross product
$I_{YZ}$ = Inertia cross product
$I_{ZX}$ = Inertia cross product
$J$ = Inertia matrix
$L$ = Rolling moment
$M$ = Pitching moment
$N$ = Yawing moment
$L_1, L_2, L_3$ = Elements of direction cosine matrix
$M_1, M_2, M_3$ = Elements of direction cosine matrix
$N_1, N_2, N_3$ = Elements of direction cosine matrix
$p_B$ = Body axis roll rate, deg/s
$q_B$ = Body axis pitch rate, deg/s
$r_B$ = Body axis yaw rate, deg/s
$p_T$ = Earth axis roll rate, deg/s
$q_T$ = Earth axis pitch rate, deg/s
$r_T$ = Earth axis yaw rate, deg/s
$T_{EB}$ = Transformation matrix from Earth to Body axis
$u_B$ = Body axis forward velocity, m/s
$v_B$ = Body axis lateral velocity, m/s
$w_B$ = Body axis vertical velocity, m/s
$\dot{u}_B$ = Time derivative of body axis forward velocity, m/s^2
$\dot{v}_B$ = Time derivative of body axis lateral velocity, m/s^2
$\dot{w}_B$ = Time derivative of body axis vertical velocity, m/s^2
$V_N$ = Inertial aircraft velocity along North, with respect to Earth axis
$V_E$ = Inertial aircraft velocity along East, with respect to Earth axis
$V_D$ = Inertial aircraft velocity along Down, with respect to Earth axis
$\dot{V}_N$ = Time derivative of inertial aircraft velocity along North, with respect to Earth axis
$\dot{V}_E$ = Time derivative of inertial aircraft velocity along East, with respect to Earth axis
$\dot{V}_D$ = Time derivative of inertial aircraft velocity along Down, with respect to Earth axis
$x$ = Position in X direction, m
$y$ = Position in Y direction, m
$\dot{x}$ = Time derivative of position in X direction, m/s
$\dot{y}$ = Time derivative of position in Y direction, m/s
$\alpha$ = Angle of attack, deg
$\beta$ = Angle of sideslip, deg
$\dot{\alpha}$ = Time derivative of angle of attack, deg/s
$\dot{\beta}$ = Time derivative of angle of sideslip, deg/s
$\gamma$ = Flight path angle, deg
$\phi$ = Aircraft bank angle, deg
$\theta$ = Aircraft pitch angle, deg
$\psi$ = Aircraft heading angle, deg
$\dot{\phi}$ = Time derivative of aircraft bank angle, deg/s
$\dot{\theta}$ = Time derivative of aircraft pitch angle, deg/s
$\dot{\psi}$ = Time derivative of aircraft heading angle, deg/s
$\rho$ = Air density, kg/m^3
Qbar = Dynamic pressure, Pa
$C_{L\alpha}$ = $C_L$-AoA curve slope
AoA = Angle of attack, deg
$C_L$ = Lift force coefficient
$C_D$ = Drag force coefficient
$C_m$ = Pitching moment coefficient
Mass = Aircraft mass, kg
Sref = Aircraft wing area, m^2
PLA = Power Lever Angle, deg
n = Load factor (ratio of lift force to weight)
1. INTRODUCTION
The flight envelope of any fighter aircraft is encompassed by Mach number
and altitude and ranges from 20% to 200% of the speed of sound and sea
level to 15 km altitude. Aircraft exhibit non-linear behavior within this
range of speeds. They are represented by non-linear mathematical models.
A common approach for analyzing aircraft dynamics consists of local
stability and controllability analysis by linearizing the equations of motion.
This requires linearization of the nonlinear aircraft dynamic model at many
chosen analysis points within the flight envelope. Before linearization, it
is required to determine the values of the states and controls such that the
aircraft is in equilibrium at each analysis point. The linearisation of the
aircraft non-linear model about an operating point is achieved by using
small perturbations in the motion of the airplane about the equilibrium point.
The linear system matrices are determined by numerical perturbation using
the Taylor series expansion approach about the equilibrium. As the
linearization needs to be carried at hundreds of such points within the flight
envelope, there is a need to develop efficient computational methods for this
purpose.
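The numerical perturbation idea can be illustrated with a short central-difference sketch. It assumes a generic state-space model x' = f(x, u); the pendulum used here is only a hypothetical stand-in for the aircraft model, and the function names are illustrative.

```python
import math

def linearize(f, x0, u0, eps=1e-6):
    """Central-difference A = df/dx and B = df/du evaluated at (x0, u0)."""
    def derivative_column(vector_index, idx):
        plus = [list(x0), list(u0)]
        minus = [list(x0), list(u0)]
        plus[vector_index][idx] += eps    # perturb one state or control up
        minus[vector_index][idx] -= eps   # and down by the same amount
        f_plus, f_minus = f(*plus), f(*minus)
        return [(a - b) / (2.0 * eps) for a, b in zip(f_plus, f_minus)]

    a_cols = [derivative_column(0, i) for i in range(len(x0))]
    b_cols = [derivative_column(1, i) for i in range(len(u0))]
    # Transpose the lists of columns into row-major matrices.
    A = [list(row) for row in zip(*a_cols)]
    B = [list(row) for row in zip(*b_cols)]
    return A, B

def pendulum(x, u):
    """Toy nonlinear model: x = [angle, rate], u = [torque]."""
    theta, omega = x
    return [omega, -math.sin(theta) + u[0]]

A, B = linearize(pendulum, x0=[0.0, 0.0], u0=[0.0])
```

For this toy model the equilibrium is the origin, and the sketch returns approximately A = [[0, 1], [-1, 0]] and B = [[0], [1]]. In the setting of the paper, the same perturbation would be repeated at each trim point in the flight envelope.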
Modern fighter aircraft are designed to be unstable to achieve high
maneuverability, and therefore a flight controller is required for stability and
control augmentation (Bugajski and Enns, 1992; Chetty, Deodhare and
Misra, 2002). For this flight controller design, we need to generate
hundreds of linearized aircraft mathematical models.
Conventional multivariable numerical optimization methods are used for
aircraft trim (Stevens and Lewis, 1992; Rolfe and Staples, 1991).
The aircraft trim is achieved by solving the first order differential equations
that represent the aircraft equations of motion. These conventional methods may
take a large number of iterations to arrive at the solution. Hence, the
generation of hundreds of linearized aircraft mathematical models for flight
control law design and evaluation requires considerable time and effort.
The large aerodynamic and engine databases representing a fighter aircraft
are generally accessed for analysis and synthesis tasks by using linear
interpolation. These databases take the form of multidimensional data
tables. As an example, about 400 data tables are used to represent the
aerodynamic and engine characteristics of a typical tailless delta wing
fighter aircraft. The aerodynamic database is normally generated using wind
tunnel testing, analytical tools and Computational Fluid Dynamics tools.
Generally, linear interpolation with rectangular points is used for the
two-dimensional data tables (Rolfe and Staples, 1991; Allerton, 2009).
To reduce the wind tunnel testing outside the flight envelope, the
aerodynamic database is made available with two-dimensional data tables
tapered at the extreme points of the flight envelope. This is most commonly
seen where the aerodynamic parameters are joint functions of aircraft speed
and angle of attack (i.e., the angle made by the wing with respect to the
direction of air flow). The conventional way of linear interpolation
requires four points, whereas at the boundaries of the flight envelope only
three points are available for interpolation. Therefore, to exploit the full
aerodynamic or engine database, suitable interpolation schemes are required.
In this paper, the authors propose to use approximate trim calculations that
provide close-to-trim initial guess values for the conventional optimization
routines. This results in faster convergence and hence reduced time and
effort to generate hundreds of linearized aircraft mathematical models.
It allows us to generate equilibrium points throughout the flight envelope.
It is also proposed to employ the barycentric interpolation scheme where
only three points are available for interpolation, in addition to the
conventional rectangular interpolation, thus enabling full coverage of the
aerodynamic database.
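As an illustration of the three-point case, the sketch below applies the standard barycentric formula to a triangle of table points. It is a generic example, not the implementation used in this work; the function name, the (Mach, AoA) breakpoints and the coefficient values are hypothetical.

```python
def barycentric_interpolate(point, triangle, values):
    """Linearly interpolate `values` given at the three `triangle` vertices."""
    (x1, y1), (x2, y2), (x3, y3) = triangle
    x, y = point
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("degenerate triangle: vertices are collinear")
    # Barycentric weights; they sum to one and are all non-negative
    # exactly when the point lies inside the triangle.
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    return w1 * values[0] + w2 * values[1] + w3 * values[2]

# Hypothetical table corner: (Mach, AoA in deg) breakpoints and coefficients.
triangle = [(0.8, 0.0), (0.9, 0.0), (0.8, 10.0)]
coefficients = [0.10, 0.12, 0.55]
value = barycentric_interpolate((0.85, 2.0), triangle, coefficients)
```

The interpolant reproduces the tabulated value at each vertex and is exact for any function that is linear in both table variables.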
2. METHODOLOGY
As discussed already, we need good initial guess values for the optimization
methods for faster convergence. The process of obtaining aircraft trim using
optimization method and the triangular interpolation are discussed now.
2.1 Aircraft Trim
Aircraft trim, or equilibrium, is defined as the state of the aircraft when
the resultant forces and moments about its center of gravity (c.g.) are zero.
Mathematically, an aircraft is said to be in equilibrium or trim when all
the state derivatives vanish simultaneously, i.e. are equal to zero.
This assumes a certain number of states to define the aircraft flight.
The well-known set of equations of flight that adequately describes rigid
airplane motion is the Six Degree Of Freedom (6 DOF) equations of motion.
Their derivation is described in any standard textbook (McRuer,
Ashkenas and Graham, 1973; Nandan Sinha and Ananthkrishnan, 2013).
This equation set is given by Eqns (1) & (2).
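$$
\begin{bmatrix} \dot{u}_B \\ \dot{v}_B \\ \dot{w}_B \end{bmatrix}
= -\begin{bmatrix} 0 & -r_T & q_T \\ r_T & 0 & -p_T \\ -q_T & p_T & 0 \end{bmatrix}
\begin{bmatrix} u_B \\ v_B \\ w_B \end{bmatrix}
+ T_{EB} \begin{bmatrix} \dot{V}_N \\ \dot{V}_E \\ \dot{V}_D \end{bmatrix}
\tag{1}
$$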
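$$
\begin{bmatrix} \dot{p}_B \\ \dot{q}_B \\ \dot{r}_B \end{bmatrix}
= J^{-1} \left\{ \begin{bmatrix} L \\ M \\ N \end{bmatrix}
- \begin{bmatrix} p_B \\ q_B \\ r_B \end{bmatrix} \times
J \begin{bmatrix} p_B \\ q_B \\ r_B \end{bmatrix} \right\}
\tag{2}
$$
When expanded, the cross-product term gives the inertia cross products
$I_{XY}$, $I_{YZ}$, $I_{ZX}$ multiplying $p_B^2$, $q_B^2$, $r_B^2$, together
with terms in $p_B q_B$, $q_B r_B$ and $r_B p_B$.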
The above six-degree-of-freedom equations have six states, namely $u_B$,
$v_B$, $w_B$, $p_B$, $q_B$, $r_B$. Six more states, namely $x$, $y$, $h$,
$\phi$, $\theta$, $\psi$, are derived from the above six states to completely
describe the aircraft flight state, as in equations (3) and (4).
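$$
\begin{bmatrix} \dot{\psi} \\ \dot{\theta} \\ \dot{\phi} \end{bmatrix}
= \frac{1}{\cos\theta}
\begin{bmatrix}
0 & \sin\phi & \cos\phi \\
0 & \cos\phi\cos\theta & -\sin\phi\cos\theta \\
\cos\theta & \sin\phi\sin\theta & \cos\phi\sin\theta
\end{bmatrix}
\begin{bmatrix} p_B \\ q_B \\ r_B \end{bmatrix}
\tag{3}
$$
$$
\begin{bmatrix} \dot{x} \\ \dot{y} \\ \dot{h} \end{bmatrix}
= \begin{bmatrix} L_1 & M_1 & N_1 \\ L_2 & M_2 & N_2 \\ L_3 & M_3 & N_3 \end{bmatrix}
\begin{bmatrix} u_B \\ v_B \\ w_B \end{bmatrix}
\tag{4}
$$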
All of these twelve states can be simultaneously constant for an aircraft only
on the ground. This means that, for a rigid aircraft, equilibrium is possible
only if the aircraft is resting on the ground. However, the following
assumptions are made for up and away flight states. With the assumption that
the Earth is flat, the last three equations and the heading rate can be
ignored. We are then left with only eight equations, which can result in a
quasi-steady state. The following flight states, which fall into the
equilibrium state defined above, are very useful for flight dynamics analysis.
2.1.1 Flight Trim States
For the aircraft to be trimmed for different flight states, constraints relevant
to that state need to be satisfied in addition to the equality mentioned above.
Each trim type or flight state can be described by mathematical constraints
according to the nature of the aircraft flight. The description of different
well understood states follows next.
Straight and Level flight:

A level flight is defined as flight with wings level, implying zero roll angle,
and a constant flight path angle for a given Mach number and altitude.
When translated to mathematical constraints, these conditions are given by
Eqn (5).
$$
\begin{bmatrix} \dot{u}_B \\ \dot{v}_B \\ \dot{w}_B \end{bmatrix} = 0
\ \text{and}\
\begin{bmatrix} \dot{p}_B \\ \dot{q}_B \\ \dot{r}_B \end{bmatrix} = 0;
\quad
\begin{bmatrix} \dot{x} \\ \dot{y} \end{bmatrix} \neq 0,\
\phi = 0\ \text{and}\ \dot{h}\ \text{constant}
\tag{5}
$$
If $\dot{h}$ is zero, it means a wings level, horizontal flight with zero
flight path angle.
If $\dot{h}$ is not equal to zero, then the flight can be climbing or gliding
with wings level.
Level Turn:
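A steady turning flight is one where the wings are not level ($\phi \neq 0$).
It can still be a level turn with constant turn rate at a specified load
factor for a given Mach number and altitude. The equilibrium conditions in
this flight state are given by Eqn (6).
$$
\begin{bmatrix} \dot{u}_B \\ \dot{v}_B \\ \dot{w}_B \end{bmatrix} = 0
\ \text{and}\
\begin{bmatrix} \dot{p}_B \\ \dot{q}_B \\ \dot{r}_B \end{bmatrix} = 0;
\quad
\begin{bmatrix} \dot{x} \\ \dot{y} \\ \dot{h} \end{bmatrix} \neq 0,\
\phi \neq 0
\tag{6}
$$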
Pull Up / Push over:
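A pull-up is defined as that state of the aircraft where the aircraft has its
wings level and is pitching up at a constant pitch rate or load factor for a
given Mach number and altitude. The steady state conditions to be satisfied
for a steady pull-up are given by Eqn (7).
$$
\begin{bmatrix} \dot{u}_B \\ \dot{v}_B \\ \dot{w}_B \end{bmatrix} = 0;\
\begin{bmatrix} \dot{p}_B \\ \dot{q}_B \\ \dot{r}_B \end{bmatrix} = 0;\
\begin{bmatrix} \dot{x} \\ \dot{y} \end{bmatrix} \neq 0,\
\begin{bmatrix} p_B \\ r_B \end{bmatrix} = 0;\
\phi = 0\ (\dot{\theta} \neq 0 \ \text{or}\ q_B \neq 0)
\tag{7}
$$
For a pull-up, the load factor is greater than one and for a push-over, the
load factor is less than one.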
2.1.2 Trimming Strategy
Equilibrium flight is obtained mathematically by solving the nonlinear
aircraft equations of motion such that the state derivatives
$\dot{p}_B, \dot{q}_B, \dot{r}_B, \dot{u}_B, \dot{v}_B, \dot{w}_B = 0$.
A multi-variable numerical optimization algorithm (Newton-Raphson) is used
to solve these nonlinear flight equations.
The control settings obtained as a result of solving the nonlinear flight
equations (aircraft trim) are referred to as trim points, and equilibrium
analysis is carried out at these trim points. Any flight state at trim has to
satisfy the steady state conditions discussed above according to the nature of
that flight state, so the state derivatives are driven to zero along with the
constraint equations for that flight state. In the computing environment, the
optimization algorithm solves the non-linear flight equations by adjusting
the control variables and other appropriate state variables to satisfy the
relevant equalities discussed above. Associated with the six equations of
accelerations are the six unknown controls. The influence of each control
setting on the corresponding acceleration is given by:
* the Power Lever Angle (PLA) controlling the acceleration $\dot{V}$ or $\dot{u}_B$
* the aileron setting controlling the roll acceleration, $\dot{p}_B$
* the rudder controlling the yaw acceleration, $\dot{r}_B$
* the elevator controlling the pitch acceleration, $\dot{q}_B$
* alpha controlling the vertical linear acceleration, $\dot{\alpha}$ or $\dot{w}_B$, and
* beta controlling the lateral linear acceleration, $\dot{\beta}$ or $\dot{v}_B$
The use of all six equations results in a six degree of freedom trim. A block
schematic of the trim algorithm is shown in Figure 1.
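The role of the initial guess in a Newton-Raphson trim solve can be illustrated with a small, self-contained sketch. The two-equation residual below is a toy stand-in for the aircraft model (unknowns: angle of attack and thrust); its coefficients, the function names and the finite-difference Jacobian are all assumptions made for illustration.

```python
import math

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def newton_raphson(residual, guess, tol=1e-10, max_iter=100, eps=1e-7):
    """Return (solution, number of Newton steps taken)."""
    x = list(guess)
    for steps in range(max_iter):
        r = residual(x)
        if max(abs(v) for v in r) < tol:
            return x, steps
        # Forward-difference Jacobian, built column by column.
        cols = []
        for i in range(len(x)):
            xp = list(x)
            xp[i] += eps
            cols.append([(a - b) / eps for a, b in zip(residual(xp), r)])
        J = [list(row) for row in zip(*cols)]
        dx = solve_linear(J, [-v for v in r])
        x = [a + d for a, d in zip(x, dx)]
    raise RuntimeError("Newton-Raphson did not converge")

def trim_residual(x):
    """Toy force balance: vertical and horizontal residuals (normalized)."""
    alpha, thrust = x
    weight = 1.0
    lift = 5.0 * alpha
    drag = 0.02 + 0.5 * alpha ** 2
    return [lift + thrust * math.sin(alpha) - weight,
            thrust * math.cos(alpha) - drag]

solution, steps_far = newton_raphson(trim_residual, [1.0, 1.0])
_, steps_near = newton_raphson(trim_residual, [0.2, 0.04])
```

Starting close to the solution, the solver needs no more steps than when started far away, which is the motivation for supplying a good initial guess to the trim routine.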
Figure 1. Aircraft Trim Algorithm (block diagram: Flight Condition (Mach
number and altitude), Constraint Routine, Aircraft Model, Scalar Cost
Function, Minimization Algorithm, Trim Data)
It is observed that the conventional optimization routines may take a large
number of iterations (around 100) to arrive at the solution if the initial guess
values are not close to trim. As we need to generate hundreds of linearised
aircraft mathematical models for flight controller design, it is desirable to
have faster convergence of the optimization methods, i.e. arriving at the
solution in fewer iterations. This leads to a reduction in the design
time and effort. Hence, we propose to use approximate calculations
that can provide close-to-trim initial guess values.
Using the example of straight and level flight, the steps involved in the
approximate trim calculations are explained below.
The approximate trim calculations for the Steady Level Flight case at the
chosen Mach number and Altitude are given by:
Qbar 0.5 * * V 2
( mass * 9.81)
(8)
Qbar * Sref * CL

From equation (8), we can see that the approximate value of trim angle of
attack can be calculated. Figure 2 provides the procedural steps for
approximate trim calculations.
1. From Figure 2, it is noticed that corresponding to the trim Angle of
Attack (AoA), drag can be computed from the CD vs. AoA curve.
2. For a level flight, Thrust = Drag at equilibrium/trim. Thus we obtain
the thrust.
3. Further, it is understood that the thrust is a function of Mach number,
Altitude and throttle position. Knowing the thrust value, Mach
number and altitude, the power lever angle (PLA) required for trim
is estimated based on the inverse calculations of the engine database.
From the static engine database,
$$\mathrm{Thrust} = f(\text{Mach number}, \text{Altitude}, \mathrm{PLA})$$
With inverse formulation,
$$\mathrm{PLA} = f^{-1}(\text{Mach number}, \text{Altitude}, \mathrm{Thrust})$$
Morelli has addressed the issue of global non-linear parametric
modeling for steady aerodynamics with an example of the F-16 (Morelli,
1995; 1998). The concept of replacing the engine database in table
look-up form by global non-linear polynomial models has been
used here. The technique of multivariate orthogonal functions in one
and two variables is used to arrive at the global non-linear
polynomial models. The technique of multivariate orthogonal
polynomials has also been used to model unsteady aerodynamics
(Pashilkar and Pradeep, 1999).
The global nonlinear polynomial models are obtained as functions of Mach
number and altitude. The polynomial coefficients are
given below.
a1 = 29302*mach**2 - 60149*mach + 15661
a2 = 39*mach**2 + 1104*mach - 75
a3 = 164380*mach**2 - 529390*mach + 424320
a4 = 7528*mach**2 - 8797*mach - 1985
T1 = a1*mach + a2*za/1.e3
T2 = a3*mach + a4*za/1.e3
platrim = 30. + (Drag - T1) / (T2-T1) *(130.-30.)
(where mach is Mach number and za is pressure altitude)
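The inverse engine calculation above can be wrapped in a small function. This is a minimal sketch that copies the listed coefficients verbatim. It reads T1 and T2 as the model thrust at PLA = 30 and PLA = 130 respectively, which is inferred from the interpolation formula rather than stated in the text. The function names are hypothetical, and the units of `za` and of thrust are not given in the source, so they are left unspecified.

```python
def thrust_endpoints(mach, za):
    """Polynomial thrust model evaluated at the two PLA end points.

    Assumption: T1 corresponds to PLA = 30 and T2 to PLA = 130,
    as implied by the linear interpolation used for platrim.
    """
    a1 = 29302 * mach**2 - 60149 * mach + 15661
    a2 = 39 * mach**2 + 1104 * mach - 75
    a3 = 164380 * mach**2 - 529390 * mach + 424320
    a4 = 7528 * mach**2 - 8797 * mach - 1985
    t1 = a1 * mach + a2 * za / 1.0e3
    t2 = a3 * mach + a4 * za / 1.0e3
    return t1, t2


def pla_trim(drag, mach, za):
    """Estimate the trim PLA by linear interpolation between T1 and T2.

    For level flight the required thrust equals the drag (step 2).
    """
    t1, t2 = thrust_endpoints(mach, za)
    return 30.0 + (drag - t1) / (t2 - t1) * (130.0 - 30.0)


# Illustrative call with made-up inputs (not a case from the paper).
t1, t2 = thrust_endpoints(mach=0.4, za=4572.0)
print(pla_trim(0.5 * (t1 + t2), mach=0.4, za=4572.0))
```

A drag equal to T1 returns PLA = 30 and a drag equal to T2 returns PLA = 130; values outside that band extrapolate linearly and would need clipping in practice.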
4. Generally, the pitching moment coefficient is comprised of an aerodynamic
component and an engine component. Corresponding to the thrust
obtained in Step (2), the pitch moment contribution due to the engine is
computed first and thereby the corresponding pitching moment
coefficient (i.e. Cmthrust). Similarly, Cmaero is computed using
the Cm vs. AoA curve.
Hence, Cmtotal = Cmthrust + Cmaero
5. For trim, Cmtotal should be equal to zero. The elevator required to
satisfy Cmtotal = 0 is the trim elevator. In this manner, we obtain
approximate trim values of AoA, throttle position and elevator. This
is a non-iterative procedure.
These approximate trim values are used as initial guess values for the
conventional optimization method.
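The five steps can be sketched end to end for level flight. The sketch below replaces the aerodynamic database look-ups with a hypothetical linear model (constant lift-curve slope, parabolic drag polar, linear pitching moment); the class `LinearAero`, its fields and all numerical values are assumptions made for illustration and are not taken from the paper's database. The PLA step is omitted here because it depends on the engine model.

```python
from dataclasses import dataclass

G = 9.81  # m/s^2, as used in equation (8)


@dataclass
class LinearAero:
    """Hypothetical linear aerodynamic model standing in for table look-ups."""
    cl_alpha: float  # lift-curve slope, per rad
    cd0: float       # zero-lift drag coefficient
    k: float         # induced-drag factor: CD = cd0 + k * CL**2
    cm0: float       # pitching moment at zero alpha and zero elevator
    cm_alpha: float  # pitching moment slope with alpha, per rad
    cm_de: float     # pitching moment slope with elevator, per rad


def approximate_level_trim(mass, s_ref, rho, v, aero, cm_thrust=0.0):
    """Non-iterative estimate of (alpha, thrust, elevator) for level flight."""
    qbar = 0.5 * rho * v**2
    # Step 1 / equation (8): lift balances weight.
    alpha = (mass * G) / (qbar * s_ref * aero.cl_alpha)
    # Steps 1-2: drag at the trim alpha, and thrust = drag.
    cl = aero.cl_alpha * alpha
    cd = aero.cd0 + aero.k * cl**2
    thrust = qbar * s_ref * cd
    # Steps 4-5: elevator that makes Cm(aero) + Cm(thrust) = 0.
    elevator = -(aero.cm0 + aero.cm_alpha * alpha + cm_thrust) / aero.cm_de
    return alpha, thrust, elevator


# Made-up aircraft and flight condition, for illustration only.
aero = LinearAero(cl_alpha=4.0, cd0=0.02, k=0.1,
                  cm0=0.02, cm_alpha=-0.6, cm_de=-1.0)
alpha, thrust, elevator = approximate_level_trim(
    mass=9000.0, s_ref=27.9, rho=0.9, v=150.0, aero=aero)
print(alpha, thrust, elevator)
```

The returned triple would then seed the conventional optimizer as its initial guess; with a real database each coefficient evaluation becomes an interpolated table look-up.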
[Figure 2 flow chart, for level flight at a given Mach number and altitude:
(a) From the CL-alpha curve: Qbar = 0.5*rho*V^2; alpha_trim = (mass*9.81)/(Qbar*Sref*CL_alpha).
(b) Obtain CD at alpha_trim from the CD-alpha curve. Thrust = Drag (level flight). Thrust = f(Mach number, Alt, PLA); the inverse formulation results in PLA = f^-1(Mach number, Alt, Thrust). Accordingly, obtain Cm(thrust).
(c) From the Cm-alpha curve: Cm(tot) = Cm(aero) + Cm(thrust). For trim, Cm(tot) = 0 => de_trim = (Cm(tot) - Cm0 - Cm_alpha*alpha_trim)/Cm_de.]
Figure 2. Procedural steps for Approximate Trim Calculations
The same steps are used for the approximate trim calculations of pull up and
level turn trim options. The approximate trim calculations for the pull up /
push over and level turn are given below.
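Pull Up / Push over (for given Mach number, Altitude and the Load
factor, n)
$$\mathrm{Qbar} = 0.5\,\rho\,V^{2}$$
$$\alpha_{trim} = n \times \frac{\mathrm{mass} \times 9.81}{\mathrm{Qbar} \times \mathrm{Sref} \times C_{L_\alpha}} \qquad (10)$$
$$q = (n - 1) \times \frac{9.81}{V}$$
Level Turn (for given Mach number, Altitude and Alpha)
$$\mathrm{Qbar} = 0.5\,\rho\,V^{2}$$
$$\cos\phi = \max\left(\min\left(1.0,\ \frac{\mathrm{mass} \times 9.81}{\mathrm{Qbar} \times \mathrm{Sref} \times C_{L_\alpha} \times \alpha}\right),\ -1.0\right) \qquad (11)$$
$$\theta = \sin^{-1}(\cos\phi \times \sin\alpha)$$
$$\dot{\psi} = \tan\phi \times \frac{9.81}{V}$$
$$p = -\dot{\psi} \times \sin\theta$$
$$q = \dot{\psi} \times \cos\theta \times \sin(\phi)$$
$$r = \dot{\psi} \times \cos\theta \times \cos(\phi)$$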
As these values are very close to trim, the convergence is faster and the
trim is obtained in far fewer iterations (around 10). This leads to a
significant reduction in time and effort when it is required to generate
hundreds of linearized aircraft mathematical models for control law design
and evaluation.
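The pull-up and level-turn initial guesses can be sketched as follows. The relations are coded as read here from equations (10) and (11); where the typeset equations are ambiguous (the bank angle obtained from the clipped weight-to-lift ratio, and the minus sign on the roll rate p) the sketch assumes standard coordinated-turn kinematics. Function names and the numerical inputs are hypothetical, and angles are in radians.

```python
import math

G = 9.81  # m/s^2, as used in the equations above


def pull_up_guess(mass, s_ref, rho, v, cl_alpha, n):
    """Approximate alpha and pitch rate q for a pull-up at load factor n."""
    qbar = 0.5 * rho * v**2
    alpha = n * mass * G / (qbar * s_ref * cl_alpha)
    q = (n - 1.0) * G / v
    return alpha, q


def level_turn_guess(mass, s_ref, rho, v, cl_alpha, alpha):
    """Approximate bank, pitch attitude, turn rate and body rates (p, q, r).

    alpha must be non-zero; a bank angle of exactly 90 degrees is outside
    the useful range because tan(phi) is then unbounded.
    """
    qbar = 0.5 * rho * v**2
    # Weight-to-lift ratio, clipped to the valid range of a cosine.
    cos_phi = max(min(1.0, mass * G / (qbar * s_ref * cl_alpha * alpha)), -1.0)
    phi = math.acos(cos_phi)
    theta = math.asin(cos_phi * math.sin(alpha))
    psi_dot = math.tan(phi) * G / v
    p = -psi_dot * math.sin(theta)
    q = psi_dot * math.cos(theta) * math.sin(phi)
    r = psi_dot * math.cos(theta) * math.cos(phi)
    return phi, theta, psi_dot, (p, q, r)


# Made-up flight condition: alpha is set to twice the 1 g value,
# so lift is twice the weight and the bank angle is about 60 degrees.
alpha_1g, q_1g = pull_up_guess(9000.0, 27.9, 0.9, 150.0, 4.0, n=1.0)
phi, theta, psi_dot, (p, q, r) = level_turn_guess(
    9000.0, 27.9, 0.9, 150.0, 4.0, alpha=2.0 * alpha_1g)
print(math.degrees(phi), psi_dot, (p, q, r))
```

A quick consistency check on the body rates is that p, q and r together have the magnitude of the turn rate, since they are the components of the same angular velocity vector.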
In the following section, the issue of interpolation where only three points
are available is discussed.
2.2 Barycentric Interpolation
Large aerodynamic and engine databases are used for the flight dynamic
analysis. This data is accessed for analysis and synthesis tasks by table
look-up and linear interpolation. The reason for some of the data tables made
available in the hypercube format is already discussed. The typical Mach
number and AoA envelope is shown in Figure 3, where only a limited range of
angle of attack is available at higher Mach numbers.
Figure 3. Angle of Attack vs. Mach number envelope for a fighter aircraft
(black vertical line indicates Mach number 1.0)
Table 1. 2D table as a function of Mach number and Angle of Attack
M => .00 .30 .50 .60 .70 .80 .90 .95 1.00 1.05 1.10 1.20
AOA
-15.00 .2130 .2130 .2019 .1877 .1883
-14.00 .2130 .2130 .2019 .1877 .1883
-12.00 .2130 .2130 .2019 .1877 .1883 .1500
-10.00 .2130 .2130 .2019 .1877 .1883 .1510 .1508 .1628 .1610 .1819 .1700
-8.00 .2130 .2130 .2019 .1877 .1883 .1524 .1508 .1628 .1610 .1819 .1700 .1443
-6.00 .2130 .2130 .2019 .1877 .1883 .1539 .1508 .1628 .1610 .1819 .1700 .1443
-5.00 .2130 .2130 .2019 .1877 .1883 .1543 .1508 .1628 .1610 .1819 .1700 .1443
-4.00 .2130 .2130 .2019 .1877 .1883 .1543 .1508 .1628 .1610 .1819 .1700 .1443
-2.00 .2130 .2130 .2019 .1877 .1883 .1530 .1508 .1628 .1610 .1819 .1700 .1443
.00 .2072 .2072 .1965 .1822 .1826 .1532 .1514 .1618 .1657 .1849 .1699 .1382
2.00 .2034 .2034 .1940 .1810 .1814 .1509 .1502 .1599 .1689 .1816 .1659 .1320
4.00 .2036 .2036 .1920 .1797 .1799 .1490 .1465 .1573 .1655 .1734 .1586 .1270
6.00 .2039 .2039 .1892 .1761 .1763 .1447 .1397 .1514 .1547 .1649 .1510 .1223
8.00 .2052 .2052 .1894 .1755 .1757 .1402 .1324 .1435 .1418 .1554 .1438 .1199
10.00 .2076 .2076 .1927 .1783 .1784 .1382 .1300 .1358 .1330 .1486 .1408 .1202
11.00 .2081 .2081 .1930 .1783 .1781 .1384 .1315 .1332 .1310 .1486 .1420 .1220
12.00 .2083 .2083 .1931 .1783 .1784 .1387 .1323 .1301 .1292 .1513 .1444 .1238
13.00 .2085 .2085 .1927 .1790 .1788 .1333 .1303 .1255 .1273 .1528 .1442 .1244
14.00 .2080 .2080 .1910 .1787 .1787 .1314 .1256 .1200 .1222 .1493 .1391 .1213
15.00 .2078 .2078 .1895 .1778 .1782 .1316 .1223 .1172 .1150 .1398 .1294 .1134
16.00 .2083 .2083 .1879 .1758 .1771 .1352 .1203 .1160 .1107 .1294 .1200 .1037
17.00 .2093 .2093 .1877 .1756 .1770 .1394 .1221 .1172 .1101 .1220 .1156 .0979
18.00 .2104 .2104 .1904 .1783 .1794 .1447 .1250 .1189 .1107 .1163 .1136 .0961
19.00 .2115 .2115 .1948 .1831 .1849 .1483 .1284 .1199 .1104 .1128 .1125 .0986
20.00 .2132 .2132 .2011 .1899 .1918 .1506 .1288 .1179 .1094 .1104 .1104 .0950
21.00 .2157 .2157 .2079 .1967 .1987 .1527 .1283 .1155 .1081 .1093 .1093
22.00 .2189 .2189 .2158 .2034 .2035 .1587 .1245 .1120 .1067 .1081 .1108
23.00 .2225 .2225 .2227 .2088 .2052 .1654 .1213 .1107 .1045 .1051 .1103
24.00 .2269 .2269 .2285 .2139 .2087 .1655 .1171 .1113 .1029 .1075
25.00 .2312 .2312 .2345 .2208 .2179 .1612 .1138 .1137 .1050
26.00 .2345 .2345 .2384 .2269 .2281 .1573 .1122 .1144
27.00 .2370 .2370 .2386 .2299 .2341 .1575 .1143 .1120
28.00 .2393 .2393 .2356 .2305 .2364 .1537 .1191 .1102
29.00 .2394 .2394 .2292 .2288 .2380 .1467
30.00 .2377 .2377 .2237 .2252 .2271 .1334
31.00 .2342 .2342 .2210 .2222
32.00 .2295 .2295 .2171 .2183
33.00 .2242 .2242 .2297 .2277
Accordingly, from Table 1, it is observed that in the areas marked with ovals
only three points are available for interpolation instead of the conventional four
points. To address this issue, we used Barycentric interpolation, which
facilitates full coverage of the aerodynamic database.
Given a point r which lies within a triangle bounded by three vertices
($r_1, r_2, r_3$) in the plane, the barycentric weights ($\lambda_1, \lambda_2, \lambda_3$) for each vertex
respectively are given by (Wikipedia, 2014):
$$\begin{bmatrix} \lambda_1 \\ \lambda_2 \end{bmatrix} = \begin{bmatrix} (x_1 - x_3) & (x_2 - x_3) \\ (y_1 - y_3) & (y_2 - y_3) \end{bmatrix}^{-1} \begin{bmatrix} (x - x_3) \\ (y - y_3) \end{bmatrix}$$
$$\lambda_3 = 1 - \lambda_1 - \lambda_2$$
where
$$r_1 = (x_1, y_1),\ r_2 = (x_2, y_2),\ r_3 = (x_3, y_3),\ r = (x, y)$$
If the function values at the three vertices ($r_1, r_2, r_3$) are given by the scalars
($z_1, z_2, z_3$) respectively, then the linearly interpolated value at point r is given
by:
$$z = \sum_{i=1}^{3} \lambda_i z_i$$
It is noted that the weights ($\lambda_1, \lambda_2, \lambda_3$) are all greater than zero if the point r
lies within the triangle. If the point lies on an edge, the weight of the
opposite vertex is zero.
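The three-point interpolation can be sketched directly from the weight definition, with the 2x2 inverse written out explicitly. The function names and the sample triangle are hypothetical; in the application the two coordinates would be Mach number and angle of attack, and the vertex values would be the three available table entries.

```python
def barycentric_weights(r1, r2, r3, r):
    """Barycentric weights of point r with respect to triangle (r1, r2, r3)."""
    (x1, y1), (x2, y2), (x3, y3) = r1, r2, r3
    x, y = r
    # Determinant of [[x1 - x3, x2 - x3], [y1 - y3, y2 - y3]].
    det = (x1 - x3) * (y2 - y3) - (x2 - x3) * (y1 - y3)
    if det == 0.0:
        raise ValueError("triangle vertices are collinear")
    lam1 = ((y2 - y3) * (x - x3) - (x2 - x3) * (y - y3)) / det
    lam2 = (-(y1 - y3) * (x - x3) + (x1 - x3) * (y - y3)) / det
    lam3 = 1.0 - lam1 - lam2
    return lam1, lam2, lam3


def barycentric_interpolate(r1, r2, r3, z1, z2, z3, r):
    """Linearly interpolated value at r from the three vertex values."""
    lam1, lam2, lam3 = barycentric_weights(r1, r2, r3, r)
    return lam1 * z1 + lam2 * z2 + lam3 * z3


# Illustrative triangle and vertex values (not entries from Table 1).
corners = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
weights = barycentric_weights(*corners, (0.25, 0.25))
value = barycentric_interpolate(*corners, 1.0, 3.0, 4.0, (0.25, 0.25))
print(weights, value)
```

A negative weight signals that the query point lies outside the triangle, which is a convenient test for selecting the correct set of three table entries.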
3. RESULTS
As discussed already, with the approximate trim calculations used as initial
guess values for conventional optimization methods we can have faster
convergence. Accordingly, a study is carried out for different flight
conditions within the flight envelope for level flight. The results are
tabulated and presented in Table 2.
Table 2. Comparison of trim without and with Approximate Trim Calculations
(number of iterations to converge)
Sl No.   Case                      Conventional           Conventional optimization method
                                   optimization method    with approximate trim calculations
1        0.3100M and 7.7374 km     198                    10
2        0.4881M and 12.198 km     163                    11
3        0.4000M and 4.572 km      86                     17
4        0.7889M and 9.6387 km     43                     15
5        1.2458M and 9.6387 km     47                     24
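The saving per case can be read off Table 2 with a few lines of arithmetic. The snippet below only restates the tabulated counts, interpreted as iteration counts in line with the "around 100" and "around 10" figures quoted earlier; the variable names are introduced here for illustration.

```python
# (case, iterations without, iterations with approximate trim), from Table 2.
cases = [
    ("0.3100M and 7.7374 km", 198, 10),
    ("0.4881M and 12.198 km", 163, 11),
    ("0.4000M and 4.572 km", 86, 17),
    ("0.7889M and 9.6387 km", 43, 15),
    ("1.2458M and 9.6387 km", 47, 24),
]

for case, without, with_guess in cases:
    ratio = without / with_guess
    print(f"{case}: {without} -> {with_guess} iterations ({ratio:.1f}x fewer)")

total_without = sum(without for _, without, _ in cases)
total_with = sum(with_guess for _, _, with_guess in cases)
print(f"total: {total_without} -> {total_with} iterations")
```

The reduction is largest for the low-speed cases and smallest for the supersonic case, where the conventional method already needed fewer than 50 iterations.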
With Barycentric interpolation, it is possible to cover the full aerodynamic
database with respect to Figure 3.
4. CONCLUSIONS
For the nonlinear flight dynamic analysis and flight controller design,
hundreds of linearised aircraft mathematical models are required. Aircraft
trim is obtained by using the conventional numerical optimization methods.
Approximate trim calculations are used to provide good initial guess values
for the optimization methods for faster convergence. This also ensures
global convergence within the flight envelope for the generation of
equilibrium points. Similarly, for the cases at extreme pockets of the
aerodynamic database where only three points are available for
interpolation, we have used the Barycentric or triangular interpolation.
Employing approximate trim calculations for optimization methods and
Barycentric interpolation results in a computationally efficient nonlinear
flight dynamic analysis and flight controller design with full coverage of the
aerodynamic database.
REFERENCES
[1] A. A. Pashilkar and S. Pradeep, 1999. Unsteady Aerodynamic modeling using
Multivariate Orthogonal Polynomials. In: AIAA Atmospheric Flight Mechanics
Conference and Exhibit. Portland, OR, August 9-11, 1999. Reston: AIAA
Publications.
[2] Brian L. Stevens and Frank L. Lewis, 1992. Aircraft Control and Simulation. New
York: John Wiley & Sons Inc.
[3] D. J. Bugajski and D. F. Enns, 1992. Nonlinear control law with application to high
angle-of-attack flight. AIAA Journal of Guidance, Control, and Dynamics, 15(3),
pp. 761-767.
[4] David Allerton, 2009. Principles of Flight Simulation. Great Britain: Wiley
Publications.
[5] Eugene A. Morelli, 1995. Global Nonlinear Aerodynamic Modeling using multivariate
orthogonal functions. AIAA Journal of Aircraft, 32(2), pp. 270-277.
[6] Eugene A. Morelli, 1998. Global Nonlinear Parametric Modeling with Application to
F-16 Aerodynamics. In: American Control Conference. Philadelphia, Pennsylvania,
June 24-28, 1998. Piscataway, New Jersey: IEEE Publications.
[7] Wikipedia, 2014. Barycentric coordinate system.
http://en.wikipedia.org/wiki/Barycentric_coordinate_system, accessed on 23rd June
2014.
[8] J. M. Rolfe and K. J. Staples eds., 1991. Flight Simulation. New York: Cambridge
University Press.
[9] Mcruer D.T., Ashkenas I. and Graham D., 1973. Aircraft Dynamics and Automatic
Control. Princeton: Princeton University Press.
[10] Nandan K. Sinha and N. Ananthkrishnan, 2013. Elementary Flight Dynamics with an
Introduction to Bifurcation and Continuation methods. New York: CRC Press.
[11] Shyam Chetty, Girish Deodhare and B. B. Misra, 2002. Design, development and
flight testing of control law for the Indian Light Combat Aircraft. In: AIAA Guidance
Navigation and Control Conference and Exhibit. Monterey, CA, August 5-8, 2002.
Reston: AIAA Publications.

Lathasree, P. and Pashilkar, A. A., 2015. Efficient Computational Tools For
Nonlinear Flight Dynamic Analysis In The Full Envelope. International
Journal of Computer Science and Business Informatics, Vol. 15, No. 3, pp.
26-44.
Quantitative Aspects of Knowledge

Knowledge Potential and Utility
Syed V. Ahamed
Professor Emeritus, Computer Science Department
City University of New York,
New York City
ABSTRACT
In this paper, we enhance and extend the quantitative theory of knowledge. It emphasizes
the truism that academic knowledge is acquired over time by a process of learning from
faculty and staff at colleges and universities. A formal model of the student environment,
from high schools to the various levels of universities granting doctoral degrees, is
assumed in this research. In this paper, we also include the effects of learning in
postsecondary schools and in post-doctoral institutions. The net effect is that most human
beings continue to learn, but to varying degrees depending on the characteristics of the
student/employee, the faculty attitude to teaching/job environment and the duration of such
interactive process. The proposed model allows for desirable growth of individuals who
reward the society in a beneficial way. This is the primary reason for the development of
the model. However, the same model is also applicable for those who live to hurt and
destroy social values. The Mafia schools and warring nations train their terrorists and offer
them all the lethal tools of hurt and destruction. The systematic decay of human
civilizations appears as scientific to the negative thinkers as the science would appear
attractive to the civilized societies. The portrait of different forms of life is thus tracked as
a mathematical approximation based on statistics and norms drawn from the society itself.
The model is predictive and becomes a good leading indicator of where living and learning
can take any individual over a given period. Different scenarios of student personality (those who
learn to earn, who learn to learn, and who love to learn) are presented in conjunction with
faculty personality (those who teach to earn, who teach to educate, and who love to teach) and are
examined to gauge the knowledge potential gained by the learners from postsecondary
training centers to postdoctoral centers for advanced research and social contributions.
Keywords
Knowledge Acquisition, Knowledge Potential or KnP, Knowledge Deployment, Utility of
Knowledge, Annual Income.
1. INTRODUCTION
Quantitative measures of knowledge exist in the literature [1], even though
they are not widely used. A greatly enhanced model in this paper is based
on two axioms: (a) that learning and living are two continuous processes
and (b) that society rewards human beings according to their expected
contribution to the job environment. The caliber of learning is established
by the knowledge potential gained by the individual¹ at any given stage in
life. At the lowest level of mandatory secondary school, the Knowledge
Potential² (or KnP) is relatively low, at a level of 0 K at graduation from
high school or at a subzero level for lower levels. Through continued
schooling, the KnP can reach levels of 100 K at the Bachelor's Degree level,
levels of 220 K at the Master's Degree level and attain levels in a wide
range of 270 K to 1004 K (or even higher) at the Doctoral Degree levels.
Differences in universities, faculty and facilities influence the institution tier
levels. Though not important at the lower levels of learning, these
differences become influential in the KnP gained by the students at the
Master's and Doctoral Degree levels. Individual differences in student
capabilities are also reflected in their achievements and Grade Point
Averages (GPAs). In addition, students who wish only to finish their degree,
with the ulterior motive of learning only to earn, tend to saturate early, whereas students who actively
pursue the degree to learn to learn acquire higher KnPs at an
exponential rate throughout their post-graduate programs and perhaps the rest
of their lives.
There is a surprising extent of correlation between annual incomes and
the KnPs gained at almost all levels of education from secondary school to
postdoctoral training. The study confirms two universal observations. First,
those learning only to earn and gratify their own lower level needs [2] as
human beings reach a premature saturation level at the lowest income level
of about 20-22 thousand dollars (2012 National Labor Statistics). Second,
those learning to learn the skills to gratify the outstanding
social/technological needs saturate at 4 to 5 times the annual income (2012
National Labor Statistics) at graduation. Further, those who continue to
live and learn together reach a much higher level, and the accelerated growth
continues until the biological process of aging hinders the learning, memory and
retention functions.
¹ The learning scenario is universal in all situations of a student in a school, a disciple in a shrine,
an apprentice in a job, a child from a parent, an intelligent chip in a network, etc. The flow of
knowledge (unidirectional or bidirectional), like the flow of power, is the prime feature in
consideration.
² The symbol K should be treated as degrees of knowledge and not as degrees Kelvin as a
measure of the temperature. The interpretation should depend on the context of knowledge and
not on temperature. Thermodynamics (with A and K to designate temperature) is a branch
of Physics whereas Knowledge Science (with (K)) is a branch of Learning and Education and
Retention of knowledge after various levels of Schooling.
2. REPRESENTATION OF GENERIC FORMAT OF
INTERACTIONS
Knowledge is generated when any noun (object) performs a verb function;
how the verb is performed adds more dimension(s) to the new or the older
knowledge already collected and stored by the noun object(s). The value of
such knowledge can rank as low as a triviality, a reiteration of what is already
known, or as high as new oracle(s) of perpetual wisdom. The structure of
knowledge can be founded on this truism.
2.1 Truisms about the Structure of Knowledge
Knowledge results from the effects of interactions between noun objects
(no's) and verb functions (vf's) and vice versa. For example, when one
human talks (*) to another and the other responds, knowledge is generated.
How the interaction takes place adds another dimension to the interaction
process and its effects. For example, if "talks" is replaced by "yells", then
the effects that follow can be different.
There are five components (a through e) in such an interactive process.
A noun object no1 initiates a verb function vf and the mode of interaction is
established as *. This basic elementary process is represented as
no1 * vf .
Broken down further, this process is written as:
no1 *12 vf12 no2 ; or as
no1 vf12* no2
and its response from no2 is written as:
This entire element of any elementary transactional process can be written
as (i) a forward process by no1 (full lines), followed by (ii) a backward process
by no2 (dashed lines), in a time sequence:
[Diagram: the entire personality of no1 (act/react) and the entire personality of no2 (react/act), linked by *vf21 and vf12*. Steps a through e form the forward process (i); steps f through j form the backward process (ii). Diagram label: "Followed by Interactive Process".]
Represented as a diagram, the a to j interactive process is depicted as:
[Diagram: participants no1 and no2 connected by the verb functions vf12* and *vf21]
Figure 1. Depiction of an interactive process between two participants no1 and no2.
This diagram does not have an easy flow chart that can be implemented on a computer
system. However, the diagram can be partitioned into two symmetric halves, one for
each participant, linked via a current interactive event in a process.
Any number of these processes will give rise to an interaction, and
knowledge is accumulated at each of the minor steps a through j in each
process, depicted by the directional arrows. Significant knowledge
is added when these steps are arranged in an orderly and systematic fashion.
Such accumulation can occur over a few microseconds in
computerized and networked elements, or over decades and
lifetimes in cultures and societies. In Figure 1, the methodology for the
accumulation of knowledge has syntactic and semantic relations between
the elements a to e, then f to j, and then again a to j in a
contextual sense. The rules for the flow and accumulation of knowledge
have their cultural and societal foundations.
2.2 Computational Approach to the Generic Interactive Process
The logical and functional processes in Figure 1 are not readily
programmable on a typical computer system. Programming of social
computers can become a selected expertise. Alternatively, definitive
approaches become necessary to force the constraints in the social
elements of any social system to be simulated on any typical computer
system.
Two such parameters are the reversibility of the social elements and the
continuous scanning of all parameters of each social element to force
the computer system to emulate the social system. Social systems act
and react in real time, and the simulation software should be able to
track the changes of all parameters that influence the social interaction.
However, the representations in Figure 1 can be decomposed by
realizing that the roles of the interactive participants are reversible and
symmetric, i.e., the processes of no1 or no2 can be imaged in subroutines
but with the parameters being updated from those from no2 to no1
respectively, and then vice versa. A programmable flow chart of the
generic interaction process is shown in Figure 2.
The generality of the interactive processes depicted in Figures 1 and 2
is exemplified in the following three situations. First, a student and
teacher interaction is modified by a history of prior events stored and
updated in the computer memory. This depiction is programmable by
two routines or tasks for the CPU that function for no1 and no2
alternately to depict one or more series of interactions. Second, an atom
of carbon can interact with a molecule (two atoms) of oxygen to form a
molecule of CO2. The bonding between the atoms is a programmable
set of events that makes one molecule of CO2. Third is a universal
example for all species. One of the XX or XY chromosomes from the
male sperm interacts with one of the female XX chromosomes to make
the genetic imprint of the unborn baby. Randomness and statistical
coupling occur during most natural processes, such as the birth
process of a fetus, or the germination of a seed. Such interactive
processes are innumerable and most prevalent in nature.
2.3 Interaction of Knowledge Elements in Human Minds
An element of knowledge in the mind is like a chromosome in the womb.
Under controlled environments, a new specimen (or even a new species)
may evolve. Largely, the processes are probabilistic and circumstantial and
the new product of knowledge-evolution can occur as a coincidence or as a
matter of intense training in shrines and universities. It is our contention that
pearls of wisdom and invention can be farmed by careful implanting of
pearl fragments in the tissues of oysters. A nursery for pearls can be
reconstructed in the universities much like an artificial pearl-farm in tropical
oceans.
3. KNOWLEDGE ACQUISITION IN INSTITUTIONS
Most institutions generally offer a systematic and stylized format of
learning for students. Typical schooling in the United States consists of
Secondary and High schooling followed by formalized junior and/or senior
college education and finally graduate education, for the Master's, Doctoral
and Post-Doctoral training or internships.
During the last few decades, knowledge has been gained in a series of classroom
sessions with well-defined faculty and over finite durations of time (class
hours per week during semesters and 2 or 3 semesters per year). Knowledge
gain can thus be integrated based on the attitudes of the students, the setting
of the institution, the type and quality of faculty members, and the duration of
the degree(s). On a statistical basis, the parameters that facilitate the
educational status, or the potential of knowledge of each student, become
quantifiable. In a sense, the compilation of knowledge in the human mind
becomes an integrative process and it can be represented as a knowledge
potential (KnP) in degrees of knowledge, symbolized as a finite number of
K in the knowledge domain.
3.1 Knowledge Potential Defined
Knowledge potential of a student is a number (measured in degrees of
knowledge or K) gained by the student³ from numerous faculty members
over the student-faculty contact (interactive and unidirectional), integrated over the duration of
the study/contact. The parameters in the learning process(es) are individual
and/or statistical, the integration is mathematical, and the type(s) of
interaction is definable by the social/cultural modes of behavior such as
collegiate, friendly, congenial as in civilized and elite circles, or even
hostile, detrimental or destructive as in brutal or invasive settings or wars. A
(temporarily) stationary baseline of knowledge is desirable in most
situations and can be arbitrarily chosen to suit the particular study. For
graduate studies, we have suggested that at school graduation the knowledge is
at 0 K.
In a strict sense, the knowledge potential of any individual should be
considered as zero at the formation of the seminal cell with inception of XX
or XY chromosomes derived from the male and the female of the parent
members in any given species. The knowledge is thus embedded in the
genetic code with certain degrees of conformity to offer the physiology of
the member species and a certain degree of latitude to give the freedom of
the personality of the fetus. For genetic studies, the baseline with a KnP of
0 K is perhaps founded in the knowledge embedded in the genetic code of
parents or ancestors.
Knowledge potential has a utilitarian value. In an immediate sense, it
indicates how that potential can be utilized for solving current problem(s) at
hand. While the quality of the solution may be the highest in the direction
of the specialization, the enhanced training that was necessary to attain the
KnP will also be valuable for solving generic problems. For example, a
Master's degree holder in Biochemistry with a KnP of 240 K may solve a
problem in organic chemistry much better than a layman. In a longer term
perspective, KnP multiplied by the expected contributions for 30 years in
the career trail would have a utilitarian value of 7200 knowledge-years.
Certain precautionary rules should be considered since the KnP value can
swing up (or down) by the job effects, social setting, diligence of the
individual, etc. In reality, the acquired skill over a lifetime can be
significant.
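As a worked instance of the longer-term estimate above, take the quoted KnP of 240 K and a 30-year career, and assume (as the estimate does implicitly) that the KnP stays constant over the career. The symbols U and T are introduced here only for illustration:
$$U = \mathrm{KnP} \times T = 240 \times 30 = 7200 \ \text{knowledge-years}$$
If the KnP drifts up or down with job effects, as cautioned above, the product would be replaced by a sum of the yearly KnP values over the career.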
³ The scenario is universal in all situations of a student in a school, a disciple in a shrine, an
apprentice in a job, a child from a parent, an intelligent chip in a network, etc. The flow
of knowledge (unidirectional or bidirectional), like the flow of power, is the prime
feature in consideration.
[Figure 2 diagram: two symmetric halves about a center line of symmetry. Left half: any knowledge-centric object no1 interacting with a similar object no2; right half: any knowledge-centric object no2 interacting with a similar object no1. Each half shows the role of the participant including memory from 0 to t of the verb functions (vf12*, *vf21), the time t effects that monitor the current process, and the current step from t to t + Δt. The halves are joined by "A Step in the Interactive Process". Notation: interactive; unidirectional.]
Figure 2. The depiction of a step in the interaction that has built-in memory effects for both participants and the effect on the current event in a
chain of interactions. Full lines indicate student to faculty learning interaction and dashed lines indicate faculty to student teaching interaction.
In a true sense, the net utilitarian value should be an integrative process over
every learning experience of the individual. Furthermore, the acceleration
of the learning process and its retention are both generally the highest in the
early job experiences compared to those in the declining years of one's
career. Some of these deliberations are considered by technical managers in
corporations.
3.2 Student Traits
Students offer various mindsets to learning depending on who is teaching
what, when the teaching occurs, and how the teaching occurs.
These variables contribute to the mindset in a psychological framework
defined by kristivity (st) in mind, a parameter unique to the student.
Next, the path of communication and the area of psychological contact
combine to offer kristance (= kristivity × path length / contact area, in kohms) that facilitates the flow
of knowledge as current, and the growth of KnP by an incremental amount.
Initially, the quantity of knowledge received depends on the of the student.
3.3 Faculty Factors
In the prior section, the faculty factor (fst) influences the who, how and
when aspects of the knowledge delivered to the student. This factor,
though not very critical in the early stages of learning, becomes important as the
student develops a personality and a mindset of his/her own. Thus the
kenergy, i.e., knowledge energy, delivered over a time will become
Kenergy = (KnPf - KnPs) * {Kurrent (as a function of fst and
kristance)} * Duration of Study.
The stored version (or memory effects) of this kenergy enhances the KnP of
the student. It is important to note this energy could be counter-productive
and act as a drain on the energy already stored in the student KnP
previously acquired. This condition frequently appears as confusion or
negation on the part of the student. In general, this is a frequent situation,
found during a period of culture shock or when negative propaganda is
delivered by TV and Internet.
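The kenergy relation can be written as a one-line function. This is a minimal sketch: the text does not give the form of Kurrent as a function of fst and kristance, so the kurrent is simply passed in as a number, and the function name, argument names and sample values are hypothetical.

```python
def kenergy(knp_faculty, knp_student, kurrent, duration):
    """Kenergy = (KnPf - KnPs) * Kurrent * Duration of Study.

    kurrent is supplied by the caller because its dependence on the
    faculty factor and kristance is not specified in the text.
    """
    return (knp_faculty - knp_student) * kurrent * duration


# Faculty KnP above the student's: positive kenergy delivered.
gain = kenergy(knp_faculty=300.0, knp_student=100.0, kurrent=0.5, duration=2.0)

# Source KnP below the student's: negative kenergy, one reading of the
# counter-productive "drain" case described above.
drain = kenergy(knp_faculty=50.0, knp_student=100.0, kurrent=0.5, duration=2.0)

print(gain, drain)
```

The sign of the result follows the sign of the KnP difference, which is how the sketch represents a drain on previously stored knowledge.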
3.4 University Facilities and Settings
The environmental and extraneous factors, such as classrooms, libraries,
duration of the commute, housing, and student facilities provide tangential
effects of learning. Such effects may sometimes have emotional influences
on the net change in the Student KnP. The gain of student KnP due to these
factors may add or subtract some marginal numbers to the final KnP gained.
Such effects are included by incrementing or decrementing the KnP gained.
4. GRADUATE EDUCATION
4.1 Masters Degree Students
The details of the gain in knowledge potential and a basis for the
quantification of knowledge potential or KnP are both presented in
Reference [3]. Knowledge potential is derived (almost) as the measure of
temperature is when an object (student) is in a hot/cold setting (Shrine/Mafia
institution). The KnP rises to gain kenergy to serve and benefit the society
or sinks low to deplete the morality and spread violence⁴. In this paper the
conductive mode of knowledge-transfer is considered, even though
inspirational and transmission modes are known to exist.
Knowledge potential thus serves as an indication of how well and how
quickly individuals can address, comprehend and gainfully solve problems
in unique, distinctive and creative fashion(s) that are also economical and
productive. The concepts have been applied to the educational platform as
students go from high school through to doctoral degrees (if they
do). In a generic sense, it is a universal principle that if a solution to any
problem is to be reached, the knowledge potential in each and every prior
solution has to be evaluated and excelled by students.
The gain in KnP for Masters Degree students is presented in Figures 3 and
4. The GPA along the X-axis is a good indication of how well the
students have integrated their learning to become knowledgeable. Five
trends (A through E) are shown for the cases where
students with good and bad learning attitudes learn from excellent, average
and poor faculty members. The good students learn how to learn
while learning the course material and become proactive toward the additional
course material taught, thus boosting their KnPs. The average students do
learn, but only to pass the examinations and complete the degree. In a similar
mode, the average faculty can teach the course material, whereas the
excellent faculty would learn (love) to teach what they teach and how they
teach.
This latter synergy of faculty-student interaction generates a series of
Verb-functions (VFs) from the faculty to teach the foundations of course
material knowledge, and conversely (VFs) from the students to distill concepts
from knowledge and infuse them into the wisdom trail of productive lives.
4.2 Doctoral Degree Students
The expected KnP for the PhD students is shown in Figure 5.
Three trends (A, B, and C) and two curves (D and E) are depicted.
Footnote 4: This analogy can be taken only to a certain extent since no mature
society needs Mafia to survive, whereas cooling is desirable for life in hot
environments.
[Figure: Snapshot of Knowledge Potential (KnP) at Graduation with Masters Degree.
X-axis: student GPA, marked at 2.5 (C+), 3.0 (B Av.), 3.5 (B+) and 4.0 (A and A+).
Y-axis: KnP, from 120K to 300K. The dashed line is the National Average Line.]
Legend:
(A) Average Faculty and Av. Student Attitude (National Average).
(B) Poor Faculty-Good Student Attitude. Good students who multiply and
increment their own effort with poor faculty's teaching.
(C) Excellent Faculty-Bad Student Attitude. Poor students who do not
multiply their own efforts with excellent faculty's teaching.
(D) Excellent Faculty-Good Student Attitude. Highest KnP at Masters level.
Good students who multiply and increment their own effort with excellent
faculty's teaching.
(E) Poor Faculty-Bad Student Attitude. Lowest KnP at Masters level. Poor
students who do not multiply their own efforts with poor faculty's teaching.
Figure 3. Expected knowledge potential (KnP) of different students at the
completion of Masters degree. Some segments of these trends are exclusive by
definition. For example, a student with really bad attitude (Trend E) gets
expelled from the Masters degree program during the first one or two semesters
and does not reach the high end of trend E. Conversely, students with good
attitude rarely remain in the lower section of trends C and D, but may decline
to trends A, B or E during the Masters program by neglect or by abandoning
their early attitudes. Student effort is thus a fundamental element in
acquiring a high KnP. The figure serves as a warning to those slipping and as
an incentive for those who have fallen behind. Please see Section B under the
current heading.
[Figure: Snapshot of Knowledge Potential (KnP) Distributions at Masters Level.
X-axis: student GPA, marked at 2.5 (C+), 3.0 (B Av.), 3.5 (B+) and 4.0 (A and A+).
Y-axis: KnP, from 120K to 300K. Trends (A) through (E) are plotted. Annotations
in the plot: students graduate with MS; students expelled from MS; Min. KnP
(200) at Masters, top universities; Min. KnP (180) at Masters, average
universities; Min. KnP (174) at Masters, low universities.]
Figure 5. Expected knowledge potential (KnP) of different students after 24
months in the Masters program. The minimum KnP level is tolerated by lower
strata of universities and low quality of faculty members in such
universities. Since the KnP is low in trends B and A, most universities strive
to at least meet or better the National average of the KnP level of 180 K at
the Masters Degree level. The top stratum of Masters degree holders with KnP
of 280+ (see trend D) in most cases outperform doctoral degree holders with
poor student attitude, poor faculty and at low level universities. Please see
trend A in Figure 5.
[Figure: Snapshot of Knowledge Potential (KnP) at Graduation with Doctoral Degree.
X-axis: GPA, from 3.0 to 4.0. Y-axis: KnP, from 200K to 1200K. Regions marked
in the plot: Rare and Exceptional, Superior Students, Median Population,
Expelled. Trends A, B and C and curves D and E are plotted.]
Legend:
(A) Plodders and Non-Multiplier Students, and Poor Faculty; Poorest Achievers.
(B) Multiplier Students and Poor Faculty; Average Achievers.
(C) Non-Multiplier Students and Excellent Faculty; Average Achievers.
(D) Accelerator Students and Poor Faculty; High Achievers.
(E) Accelerator Students and Excellent Faculty; Highest Achievers.
Notes: Non-Multipliers are average Students (GPA=3.5), Average Faculty
(GPA=3.8) at the doctoral years. Plodders are students who keep plugging away
as they did in the MS degree program. They do not change their attitude toward
gaining knowledge (KnP) during the doctoral years. Accelerator students
combine their own skill sets exponentially (i.e., the exponential of the KnP
gained adds to their prior KnP) with the (poor, average or excellent) KnP
delivered by faculty teaching. The numbers are statistically averaged. In
reality, students can traverse the area between A and E during the doctoral
years.
Figure 6. Expected knowledge potential (KnP) of different students at the
completion of Doctoral degree. As can be expected, Plodders do the worst (A)
and Multiplier students at poor universities do gain enough KnP to graduate.
The accelerator students (D and E) do the best but are extremely rare, even
though some faculty members and professionals show this rare gift of
accelerating faster than teachers and mentors. Multiplier students do better
than plodders but still are not able to take full advantage of the faculty
talent. Exceptional non-multiplier students at excellent universities will do
as well as multiplier (B and C) students at low level universities.
[Figures 6 a and b: KnP, salary and population distribution by level of
education. X-axis categories: Doctoral Degree, Masters, Bachelor, Assoc, Post
Secondary, Coll no Deg, HS Diploma, < than HS. Y-axis: 0 to 400. Panel a: Red,
KnP (top curve); Blue, Salary, $ in 1000's (2012); Green, Population
Distribution (lowest curve). Panel b: KnPs (Red) and Salaries (Blue).]
Figures 6 a and b. The trend indicates that at the highest levels of
education, the KnP and salaries are the highest for this sparsely populated
segment of the population, and vice versa. In addition, at the lowest KnP
levels, the national minimum wage ($7.75 per hour, 2012 rate) law also
influences the total compensation. The HS diploma holders and post secondary
trained employees are comparable in both their KnPs and salary levels. The
KnPs are derived from the training and its duration, whereas the salary level
is surveyed.
The choice of the most creative mentor is of significance to the future
contributions of their doctoral student. To this extent, the training of
advanced PhD students becomes an art rather than a job. The Art of
Scientific Investigation [4] in teaching becomes the practice of the superior
faculty members and mentors as much as the art of learning to learn
becomes the responsibility of the rare and exceptional students as depicted
by the two exponential curves D and E in Figure 5. Post-doctoral training
and internships can also be quantified along the basis of trends and curves
presented in this paper.
The best of the students learn how to learn from the knowledge they have
already received and then go on to apply the newly gained knowledge to
further their KnP. An accelerative trend is established. The KnP thus grows
at an exponential rate (see footnote 5) that reaches as high as 1154 K for the
Doctoral students (with excellent faculty and 5 FTE years in an excellent
university). Comparatively, the more mundane students reach just enough, as
low as 369 K (with poor faculty, 5 FTE years at an inferior university), to
get their Ph.D. degrees as job seekers! Unfortunately, after 5 FTE years of
their lives, students in the lowest strata of doctoral students end up with a
KnP that is just about equal to, or even less than, the KnP of top MS students
with excellent faculty in top institutions when they both finish their
degrees. Top Masters Degree graduates are sometimes more coveted than
low-level doctoral degree graduates, much as the top Bachelors degree students
are preferred over the lower strata of Masters Degree students. This
preference is reflected in the starting salaries reported by the Salary
Surveys in the United States. The tracking of the statistically averaged
trajectories for KnP and the (2012) starting salary for Doctoral, Masters, and
Bachelors degree holders is evident (see footnote 6) in Figures 6 a and b.
5. CONCLUSIONS
The KnPs developed in this paper are indicative of the employee's or
student's ability to solve significant problems in a creative and beneficial
fashion. Whereas these curves reflect the generally accepted notion that
more education leads to better pay, we have a predictive model in which
higher education implies a higher computable KnP and thus a higher income.
This intermediate parameter (KnP) is computed based on employee or
student traits, industry or university setting, and the quality of management
or teaching/research faculty. We also indicate the parameters that
influence the final KnP of the student at graduation, and the training received
afterwards is treated as an extrapolation of the gain in the KnP during the
employment or college/institutional years. The model is entirely predictive
but subject to the sampling error in the student, faculty and university
populations. By and large, the model is as accurate as the age and health
prediction in any culture or society. Individual differences continue to
exist; however, the circumstances can be consciously altered to maximize the
possibility of being constructive and creative by extrapolating the knowledge
gained thus far into the environment of the culture and society.
Footnote 5: Out of the 20 students mentored, we found 10% (or even less: one
with the traits of an accelerated learner and the other with an inclination to
learn but unable to follow through) who were in the top category, 60-70% in
the mid range, and about 20-30% who just wanted a Ph.D. degree to append to
their names.
Footnote 6: After the Bachelors degree level in Figures 6 a and b, a slight
bump in salary is seen. This is ascribed to the fact that the more promising
BS degree holders are lured into jobs while they could have easily enrolled in
the Graduate programs of universities. Further, the desire to earn at the BS
degree level shows a higher psychological peak than the desire to learn, thus
the better students may compete and get higher salaries than average (B/B+)
students who enroll in the Masters Degree programs.
REFERENCES
[1] Syed V. Ahamed, Next Generation Knowledge Machines, Design and Architecture,
Elsevier Insights, Hardcover September 2013.
[2] Abraham H. Maslow, Towards a Psychology of Being, Sublime Books (March 7,
2014); see also S. V. Ahamed, An enhanced need pyramid for the information age
human being, in Proceedings of the Fifth Hawaii International Conference, Fifth
International Conference on Business, Hawaii, May 26-29, 2005; see also, An
enhanced need pyramid of the information age human being, paper presented at the
International Society of Political Psychology, (ISSP) 2005 Scientific Meeting,
Toronto, July 3-6, 2005.
[3] Syed V. Ahamed, Next Generation Knowledge Machines, Design and Architecture,
Elsevier Insights, Hardcover September 2013. See Chapters 9, 10, and 11 for the
Development of Knowledge Potential in Universities.
[4] Syed V. Ahamed and Victor B Lawrence, The Art of Scientific Innovation, Prentice
Hall, 2005.

Syed, V. A., 2015. Quantitative Aspects of Knowledge Knowledge
Potential and Utility. International Journal of Computer Science and
Business Informatics, Vol. 15, No. 3, pp. 45-59.
A Novel Approach for

Recommending Items based on
Association Rule Mining
Vasundhara M. S.
M.Tech student, Computer Science Department
Visvesvaraya Technological University, Belgaum
GSSSIETW Mysore, India
Gururaj K. S.
Associate Professor and Head, Information Science Department
Visvesvaraya Technological University, Belgaum
GSSSIETW, Mysore, India
ABSTRACT
Online shopping has become a trend. People prefer to shop online
rather than going out to shop, as it provides an easier and quicker way
to purchase items of their choice with quick transactions. Recommendation
systems are widely used for recommending products to end users in their fields
of interest. In existing systems, most recommendations are given to users
based on their browsing history, which may or may not match the user's
interest, and the quality of the recommended items is not guaranteed. This
paper aims to find an efficient approach that uses the Data Mining concept
called Association Rule Mining together with content-based and collaborative
filtering in order to recommend only relevant information to the buyers. Items
are recommended to buyers based on the content of the buyer's past buying
history and the opinions of other users, which are used to assess the quality
of the item. Association Rule Mining is used for extracting useful information
from the transaction dataset and producing efficient and effective
recommendations based on the buyer's interest, thus better satisfying the
buyer. Similarly, for music and videos the recommendation is based on the
keywords set by the Business Entity using a Feature-based recommendation
system.
Keywords
Association Rule Mining, Pattern Discovery, Knowledge-based Filtering, Data
Mining.
1. INTRODUCTION
Online shopping has become fashionable. Customers are interested
in purchasing different products online as it is easy, convenient and gives
a wide variety of products on a single platform. Customers can sit at home,
shop conveniently and order products. Another advantage of online
shopping is that recommendations are given based on already purchased
products, which helps customers shop in their fields of interest.
Many shopping websites provide recommendations based on their own
recommender system. These recommendations may be relevant or
irrelevant to the customers.
The recommender system provides recommendations to ease the
user's search, help users buy more attractive products and increase the
vendor's profit [11]. The recommendations are given based on an analysis of
the purchase patterns of the customers. Frequently purchased products are
clustered together and recommended to the customers.
Web Usage Mining (WUM), which provides relevant information to the
buyers, is used [5]. It stores the user's behavior on the Internet and
processes that data to generate recommendations when the user logs in the next
time.
By combining the knowledge gained from the analysis of the user's
navigational patterns with other information, we can fulfill the requirements
of customers through web personalization [15]. The purpose of
web personalization is to improve usability and user retention by
predicting user needs. From the knowledge gained by
analyzing purchasing patterns, contents searched, individual interests and
the user profile data, we can predict user needs or the relevant information
for the customer. The user profile contains demographic information (such as
the user's personal details) for each user of a Web site, as well as
information about the user's interests and preferences gathered through
registration forms [6].
This work presents a new approach for recommending products to
the buyers by combining the features of Content Filtering [1], Collaborative
Filtering [2], Keyword-based Filtering [7] and Association Rule Mining [8]
to produce efficient and effective recommendations.
2. RELATED WORKS
2.1 Content-based Filtering

The Content-based recommendation system recommends products to the
buyers based on the content of the buyer's past purchase history [1].
The purchase history gives an overview of the kinds of products in which the
buyer is generally interested among the different types of products. Fig 1
shows the Content-based Filtering Algorithm. It is used for separating items
based on the buyer's area of interest. Like other systems, content-based
filtering has some limitations, such as assessing the quality of an item. For
example, content-based filtering cannot differentiate between a good article
and a bad article if both of them use the same terminology.
The recommendations are given by applying the Content-based filtering
algorithm. Items are recommended based on the contents of purchased
items using the Dynamic User Profile [6]. Each time the user logs out, the
items are categorized and stored in the web profile of the user on the browser
side, thus reducing performance problems, and the recommendations are
built offline.
Fig 1: Content-based Filtering
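The content-based step described above can be sketched as follows. This is only an illustration under stated assumptions, not the algorithm of Fig 1: each item is assumed to be described by a set of content terms, the profile is a simple term count over purchased items, and the function names and sample data are hypothetical.

```python
from collections import Counter


def build_profile(purchased: list[set[str]]) -> Counter:
    """Count how often each content term occurs in the purchased items."""
    profile = Counter()
    for terms in purchased:
        profile.update(terms)
    return profile


def content_score(profile: Counter, terms: set[str]) -> int:
    """Sum the profile weights of the terms that a candidate item contains."""
    return sum(profile[t] for t in terms)


def recommend_by_content(profile: Counter, catalog: dict[str, set[str]],
                         n: int) -> list[str]:
    """Return up to n catalog items with a positive score, best first."""
    scored = [(content_score(profile, terms), item)
              for item, terms in catalog.items()]
    scored = [(s, item) for s, item in scored if s > 0]
    scored.sort(key=lambda p: (-p[0], p[1]))  # ties broken by item id
    return [item for _, item in scored[:n]]


# Hypothetical purchase history and catalog.
history = [{"python", "programming"}, {"python", "data"}]
catalog = {"B1": {"python", "data"}, "B2": {"cooking"}, "B3": {"programming"}}
content_recs = recommend_by_content(build_profile(history), catalog, n=2)
```

Note that the score depends only on shared terms, which is exactly why this kind of filter cannot tell a good item from a bad one that uses the same terminology.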

2.2 Collaborative-based Filtering
Content-based filtering cannot differentiate between a good article and a bad
article if both of them use the same terminology. To overcome this
problem, a Collaborative-based Filtering system is used because it is based on
the opinions of other users. Collaborative-based filtering works by creating a
user profile [2]. A new user is matched against the user profiles to find
neighbors who have tastes similar to those of the new user.
The idea is to recommend based on the opinions of like-minded users.
Recommendations can be based on overall top-selling items or past buying
history. The two main categories of CF are user-based and item-based
algorithms [3]. The user-based algorithm uses the user profile to generate
recommendations. It forms a set of users (neighbors) who have a similar history
and applies different algorithms to produce top-N recommendations for the
active user [4]. The item-based algorithm provides recommendations by first
forming a user rating model as categories and then producing recommendations
for the user. This algorithm looks into the set of items the user has rated,
computes their similarity to the target item i and then selects the k most
similar items from the set of items that the user has rated. The
recommendation is computed by taking the weighted average of the target user's
ratings on these similar items. The similarity between two items is measured
by computing the cosine of the angle between the two vectors. The similarity
between two item-sets i and j is computed as follows:
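$$\mathrm{sim}(i, j) = \cos(\vec{i}, \vec{j}) = \frac{\vec{i} \cdot \vec{j}}{\|\vec{i}\|_2 \, \|\vec{j}\|_2}$$
where "$\cdot$" denotes the dot product of the two vectors and i, j are the two item sets.
Fig 2 shows the Collaborative-based Filtering process.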
Fig 2: Collaborative Filtering Process
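The cosine similarity and the weighted-average prediction of the item-based algorithm can be sketched as below. It is a minimal illustration, assuming each item is represented by a vector of the ratings it received from a fixed list of users; the function names and the sample vectors are hypothetical.

```python
import math


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between two rating vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def predict_rating(target: str, user_ratings: dict[str, float],
                   item_vectors: dict[str, list[float]], k: int) -> float:
    """Weighted average of the user's ratings on the k items most similar
    to the target item, with the similarities as weights."""
    sims = [(cosine_similarity(item_vectors[target], item_vectors[i]), i)
            for i in user_ratings if i != target]
    top = sorted(sims, reverse=True)[:k]
    denom = sum(abs(s) for s, _ in top)
    if denom == 0:
        return 0.0
    return sum(s * user_ratings[i] for s, i in top) / denom


# Hypothetical item vectors and ratings by the active user.
item_vectors = {"X": [1.0, 0.0], "A": [1.0, 0.0], "B": [0.0, 1.0]}
active_user = {"A": 4.0, "B": 2.0}
predicted = predict_rating("X", active_user, item_vectors, k=2)
```

In this example item A is identical in direction to the target X and item B is orthogonal to it, so only the rating on A contributes to the prediction.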

The result is presented by applying item-based collaborative filtering,
which gives good results in both quality and performance. The algorithm
generates a recommendation model by analyzing the similarities between the
various items in the dataset and then recommends items.
For a large number of items, recommending items based on ratings is a
difficult task. To overcome this we apply the hybrid algorithm [12]. Here
we combine the features of the content-based filtering and collaborative-based
filtering algorithms for efficient and effective recommendation. For
example, we apply the content-based filtering technique to the content of the
items and filter them. On the obtained result, we apply the collaborative
filtering technique to filter items based on their ratings, and the final
recommendations are given.
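A two-stage hybrid of this kind can be sketched as follows. The sketch makes simplifying assumptions that are not taken from the paper: the content stage keeps any item sharing at least one term with the buyer's profile, and the collaborative stage is reduced to ranking by the mean rating given by other users. All names and data are hypothetical.

```python
def hybrid_recommend(profile_terms: set[str],
                     catalog: dict[str, set[str]],
                     ratings: dict[str, list[float]],
                     n: int) -> list[str]:
    """Content filter first, then rank the survivors by mean user rating."""
    # Stage 1 (content-based): keep items that share a term with the profile.
    candidates = [item for item, terms in catalog.items()
                  if terms & profile_terms]

    # Stage 2 (collaborative, simplified): mean rating from other users.
    def mean_rating(item: str) -> float:
        r = ratings.get(item, [])
        return sum(r) / len(r) if r else 0.0

    candidates.sort(key=lambda item: (-mean_rating(item), item))
    return candidates[:n]


# Hypothetical data: B3 is highly rated but fails the content filter.
catalog = {"B1": {"python", "web"}, "B2": {"python", "data"}, "B3": {"cooking"}}
ratings = {"B1": [3.0, 4.0], "B2": [5.0], "B3": [5.0, 5.0]}
hybrid_recs = hybrid_recommend({"python"}, catalog, ratings, n=2)
```

Running the content stage first keeps the rating-based stage small, which is the motivation given above for handling a large number of items.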
The basic presumption is that there is enough historical data for
measuring similarity between products or users, which is not always true. For
products with frequently changing patterns, it is difficult to give
recommendations using content-based filtering and collaborative-based
filtering techniques.
2.3 Feature-based Top-N Recommendation Algorithm

As enough historical data are not available for recommending items with
frequently changing patterns, we use Feature-based filtering, which uses the
keywords set by the business entity for recommendation [7]. The top-N
recommendations are given based on the number of keywords matched to the
target item. Here the recommendation is based on the type of item
the user has purchased. The keywords of the items that are frequently
bought are considered for recommendation. For example, if a person buys a
music CD with keywords like singer name, album name, etc., the system
recommends the nearest or most closely related CDs based on the keywords and
the rating given by the user [10]. For frequently changing product catalogs, we
first find the similar product set based on the keywords of the new item, then
find recommended products for that set using the recommendation model and
produce the top-N recommendations [9]. Fig 3 shows the Feature-based
recommendation.
Fig 3: Feature-based recommendation Algorithm

Fig 4 shows the top-N recommendation. The recommendation is based
on the combined features of the items. If the user is viewing a product on a
web page, then the recommendation is based on that product. If one or more
products are in the cart, then the recommendation is based on the selected
products.
Fig 4: top-N recommendation
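The keyword-matching idea behind the feature-based top-N recommendation can be sketched as follows. This is an assumption-laden illustration rather than the algorithm of Fig 3 or Fig 4: items are ranked purely by the number of keywords they share with the target item, and the function name, keywords and item ids are hypothetical.

```python
def top_n_by_keywords(target_keywords: set[str],
                      catalog: dict[str, set[str]],
                      n: int) -> list[str]:
    """Rank catalog items by the number of keywords shared with the target."""
    matches = {item: len(kw & target_keywords) for item, kw in catalog.items()}
    ranked = sorted((item for item in matches if matches[item] > 0),
                    key=lambda item: (-matches[item], item))
    return ranked[:n]


# Hypothetical music catalog with keywords set by the business entity.
cd_catalog = {
    "CD1": {"singer_x", "pop"},
    "CD2": {"singer_x", "album_y", "pop"},
    "CD3": {"jazz"},
}
target = {"singer_x", "album_y", "pop"}  # keywords of the CD just bought
top_cds = top_n_by_keywords(target, cd_catalog, n=2)
```

Because the ranking needs only the keywords of the items, it works for a new item that has no purchase or rating history yet.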
3. ASSOCIATION RULE MINING

Association Rule Mining (ARM) [8] is one of the most current data mining
techniques, designed to group objects together from large databases, aiming to
extract interesting correlations and relations among huge amounts of data.
Association Rule Mining helps in generating strong rules by considering a
few parameters (measures). Based on this, we set the confidence and
support values for generating more efficient and effective recommendations.
In this system, the confidence specifies the number of chapters to be matched
between A and B [13], and the support specifies the number of recommendations
to be given to the user.
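For reference, in the standard formulation of association rules the support of a rule A -> B is the fraction of transactions that contain both A and B, and the confidence is the fraction of the transactions containing A that also contain B. The sketch below computes these two standard measures; it does not reproduce the system-specific use of the parameters described above, and the function names and transactions are hypothetical.

```python
def support(transactions: list[set[str]], itemset: set[str]) -> float:
    """Fraction of transactions that contain every item of the itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions: list[set[str]],
               antecedent: set[str], consequent: set[str]) -> float:
    """support(A and B) / support(A); 0.0 when A never occurs."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, antecedent | consequent) / base


# Hypothetical transaction dataset (one set of item ids per purchase).
transactions = [
    {"I11", "I22", "I24"},
    {"I12", "I22", "I45"},
    {"I12", "I45", "I11"},
]
sup_12_45 = support(transactions, {"I12", "I45"})        # 2 of 3 transactions
conf_12_45 = confidence(transactions, {"I12"}, {"I45"})  # every I12 has I45
conf_11_22 = confidence(transactions, {"I11"}, {"I22"})  # half of I11 has I22
```

Raising the minimum support and confidence thresholds keeps only the rules that hold for many transactions, which is what "stronger rules" refers to.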
4. MAJOR TABLES USED IN BOOK SELLING WEBSITE
All the stored information is extracted from the user profile to find the
information required for recommendation. A few major tables are shown
below.
Table 1: Item Details

Field Name        Description
Item_ID           Item's ID
Item_Name         Item's Name
SubCategory_ID    Item's Sub_Category ID
Item_Image        Item's Image
Item_Cost         Cost of the Item
Item_Details      Details of the Item
Quantity          Quantity of the Item
Keywords          Item's Keywords

Table 1 shows the Item Details. The Item Details table stores all the details
of the Item.
Table 2: Transaction Details

Field Name          Description
Transaction_ID      Transaction ID
Email_ID            Email-id of the customer
Transaction_Date    Date of transaction
Dispatched_Date     Date of dispatch
Status              Status of order

Table 2 shows the Transaction Details. The Transaction Details table stores
all the transaction details of the Item.
Table 3: Rating

Field Name    Description
RatingId      Rating ID
Item_ID       Item's ID
Email_ID      Email-id of the customer
Rating        Rating of the Item
PostedDate    Date of rating posted

Table 3 shows the Rating. The Rating table stores all the ratings of the Item.
Table 4: Cart Details

Field Name    Description
Cart_ID       Cart ID
Item_ID       Item's ID
Email_ID      Email-id of the customer
Quantity      Quantity of the Item

Table 4 shows the Cart Details. The Cart Details table stores the details of
the items that are present in the cart.
5. METHODOLOGY
The purpose of this recommendation system is to recommend Items to the
buyer that suit their interest.
This system has the following steps:
1. Find the category and subcategory of the Items that the buyer has
bought earlier from the buyer's web profile.
2. From the transaction database, find all those transactions whose
category and subcategory (if there is any) are the same as those found in
step 1.
3. For content-based recommendation: perform content-based filtering
on the category/subcategory found in step 1 to find the items that
are most similar to the books that the buyer has bought earlier, based
on the content of the buyer's past history record. Apply
association rules to those transactions and find the Items that the
buyer can buy afterward. Adjust the support and confidence
parameters to get stronger rules.
4. For collaborative-based recommendation: perform the same as step 3 by
applying collaborative-based filtering.
5. For content and collaborative-based recommendation: on the result
of step 3, perform item-based collaborative filtering, find the
list of items in descending order of recommendation and apply
association rules. In this step the system actually evaluates the quality of
the recommended books based on the ratings given to those items by
other buyers.
6. For feature-based recommendation: perform the same as step 3 by
applying the feature-based extraction process.
7. Arrange the intersection of the results in descending order of
recommendation as given by the previous steps.
The outcome of step 7 is the final set of recommendations for the buyer.
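Step 7 can be sketched as follows. The text does not specify how the scores of the earlier steps are merged, so this sketch assumes that each step returns a mapping from item to a numeric score, keeps only the items present in every result, and orders them by the sum of their scores. The function name, scores and item ids are hypothetical.

```python
def combine_rankings(*results: dict[str, float]) -> list[str]:
    """Intersect several {item: score} results and sort the common items
    by their total score, highest first (summing is an assumption)."""
    if not results:
        return []
    common = set(results[0])
    for r in results[1:]:
        common &= set(r)
    return sorted(common, key=lambda item: (-sum(r[item] for r in results), item))


# Hypothetical outputs of the content-based and collaborative steps.
content_result = {"B1": 0.9, "B2": 0.4, "B3": 0.7}
collab_result = {"B1": 0.5, "B3": 0.8, "B4": 0.9}
final_recs = combine_rankings(content_result, collab_result)
```

Items that appear in only one of the results (B2 and B4 here) are dropped by the intersection, so the final list contains only items supported by every step.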
All these steps are performed when the buyer is offline and the results are
stored in the buyer's web profile. When the buyer comes online the next time,
the recommendations are generated automatically. This Item
recommendation system is represented by the block diagram shown in Fig 5.
Figure 5: Block diagram of Recommendation system
6. RESULT
The existing system considers only the content-based filtering and
collaborative-based filtering techniques for recommending items. The
drawback is that giving recommendations for items with frequently changing
patterns, like movies and music, is difficult. Also, the browsing history was
considered, which gives irrelevant recommendations to the user. In our
proposed system we consider the purchase history for recommending
products to the users. Along with the content-based filtering and
collaborative-based filtering techniques, we use feature-based extraction for
recommending other Items like music, videos etc., and Association Rule
Mining for better recommendation by specifying support and confidence
values. Table 5 shows the set of Items present.
Table 5: Item-set

Item
I11   I12   I13   I14   I15
I21   I22   I23   I24   I25
I31   I32   I33   I34   I35
I41   I42   I43   I44   I45
I51   I52   I53   I54   I55
Table 6: User's Browsed Items

Users    Browsed Items
User1    I11 I12 I25 I32 I15 I41 I42 I43 I44 I45
User2    I21 I22 I23 I24 I25 I12 I52 I52 I32 I45
User3    I31 I45 I33 I34 I35 I12 I45 I11 I22 I24
In Table 6 we can see that User1, User2 and User3 have browsed many
Items. In the existing system, if we recommend items based on browsing, then
all the above Items will be recommended.
Table 7: Purchased Items

Users    Purchased Items
User1    I11 I22 I24
User2    I12 I22 I45
User3    I12 I45 I11
In our proposed system, if we recommend based on the purchase history from
Table 7, then only 5 Items, i.e. I11, I22, I12, I24 and I45, will be
recommended for a new user.
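The count of five items can be checked with a short sketch that collects the distinct items of the purchase histories in Table 7. The function name is hypothetical; the data are the three rows of Table 7.

```python
def distinct_items(history: dict[str, list[str]]) -> list[str]:
    """Distinct items over all users, in order of first appearance."""
    seen: list[str] = []
    for items in history.values():
        for item in items:
            if item not in seen:
                seen.append(item)
    return seen


# Purchase histories as listed in Table 7.
purchases = {
    "User1": ["I11", "I22", "I24"],
    "User2": ["I12", "I22", "I45"],
    "User3": ["I12", "I45", "I11"],
}
candidates = distinct_items(purchases)
```

The nine purchases reduce to five distinct candidate items, compared with the much larger set obtained from the browsing histories.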
7. CONCLUSION
The recommender system benefits users by enabling them to find items
they like. The item-based recommendation produces recommendations based
on the content of the items purchased and matches them to the dynamic user
profile. The item-based technique allows a collaborative-based filtering
algorithm that produces high-quality recommendations, and the feature-based
recommendation system increases the number of recommendations that match the
user's interest. Hence we combine the features of Content-based and
Collaborative-based filtering for recommendation. We also use an associative
model, which gives stronger recommendations for the user's choice.
ACKNOWLEDGMENT
We are thankful to Dr. Sumithra Devi K.A, Principal, GSSSIETW and Mrs.
Radhasudarshan, Associate Professor and Head, Department of Computer
Science and Engineering for their constant guidance and support.
REFERENCES
[1] Robin van Meteren, Maarten van Someren.: Using Content-Based Filtering for
Recommendation by NetlinQ Group, Gerard Brandtstraat 26-28
[2] J. Konstan, B. Miller, D. Maltz, J. Herlocker, L. Gordon, and J. Riedl. GroupLens:
Applying collaborative filtering to Usenet news. Communications of the ACM,
40(3):77-87, 1997.
[3] J. Herlocker, J. Konstan, A. Borchers, and J. Riedl. An algorithm framework for
performing collaborative filtering. In Proceedings of SIGIR, pages 77, 87, 1999.
[4] M. Deshpande and G. Karypis. Item-based top-n recommendation algorithms.
ACM Transactions on Information Systems, 2003.
[5] Anand S.T, Abhay Kumar, Asim G.B.: Book Recommendation system based on
Combine Features of Content Based filtering, Collaborative filtering and
Association Rule Mining. In Proceedings of the International Advance Computing
Conference (IACC) 2014, pp.500-503
[6] Ting Chen, Wei-Li Han, Hai-Dong Wang, Yi-Xun Zhou, Bin Xu, Bin-Yu Zang.
Content Recommendation System based on private dynamic user profile, 19-22,
August 2007.
[7] EuiHong (Sam) Han and George Karypis. Feature-Based Recommendation
System, ACM 2005.
[8] Kotsiantis, Kanellopoulos, Association Rules Mining: A Recent Overview, GESTS
International Transactions on Computer Science and Engineering, Vol.32 (1), 2006,
pp. 71-82.
[9] Slimani, Lazzez, Efficient Analysis of Pattern and Association Rule Mining
Approaches.
[10] V.Manvitha and M.Sunitha Reddy, Music Recommendation System Using
Association Rule Mining and Clustering Technique To Address Coldstart Problem,
IJECS Volume 3 Issue 7 July, 2014 Page No.6855-6858.
[11] Lalita Sharma, Anju Gera, A Survey of Recommendation System: Research
Challenges, International Journal of Engineering Trends and Technology (IJETT) -
Volume4Issue5- May 2013, ISSN: 2231-5381, Page 89-92.
[12] Robin Burke, Hybrid Recommender Systems: Survey and Experiments.
[13] Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques,
Second Edition, Elsevier, 2006.
[14] Luo Zhenghua, Realization of Individualized Recommendation System On Books
Sale, International Conference on Management of e-Commerce and e-Government,
978-0-7695-4853-1/12 IEEE, 2012.
[15] Baraglia and Silvestri, Dynamic Personalization of Web Sites without User
Intervention, Vol. 50, No. 2 COMMUNICATIONS OF THE ACM, February 2007.

Vasundhara, M. S. and Gururaj, K. S., 2015. A Novel Approach for
Recommending Items based on Association Rule Mining. International
Journal of Computer Science and Business Informatics, Vol. 15, No. 3, pp. 60-69.
ISSN: 1694-2108 | Vol. 15, No. 3. MAY 2015 69

Cluster Integrated Self Forming
Wireless Sensor Based System for
Intrusion Detection and Perimeter
Defense Applications
A. Inigo Mathew and M. Raj Kumar

UG Student, Department of Electronics and Communication Engineering
SVS College of Engineering, Coimbatore, India.
S. R. Boselin Prabhu
Assistant Professor, Department of Electronics and Communication Engineering
SVS College of Engineering, Coimbatore, India.
Dr. S. Sophia
Professor, Department of Electronics and Communication Engineering
Sri Krishna College of Engineering and Technology, Coimbatore, India.
ABSTRACT
Intrusion detection and perimeter defense are major concerns for military and
civil applications. In military use, the purpose is mostly to monitor remote
high-altitude areas and areas with limited access and extreme weather
conditions, and to provide force protection. Given suitable sensors, the system
can detect, identify, and classify threats in advance based on their number and
type (whether they are armored vehicles or men on foot), the type and amount of
weapons they carry, and so on. This system provides a reliable real-time war
picture and better situational awareness. This further helps to improve troop
readiness and decrease reaction time. In addition, the collected data can be
used in tactical planning to deploy troops effectively. In civil applications,
economic zones such as oil fields and gold mines can be protected from
intruders and attackers. Industrial complexes and production facilities can be
protected with reduced manpower and improved efficiency. The basic criteria
that have to be taken into account while deploying wireless sensors for such
applications are discussed. In particular, locating the intruder in terms of
latitude and longitude coordinates, given the distance from the sensor node to
the target, is discussed here.
Keywords
Radar signals, quantization error, friend identification, power management,
perimeter defense
I. INTRODUCTION
Recent trends and advances in microelectronics have led to the creation of
Micro-Electro-Mechanical Systems, commonly referred to as MEMS [3]. MEMS
overcame the limitations of system-on-chip technology by providing the
ability to sense physical parameters and to control the real world through
actuators, instead of just performing logical operations. Alongside the
advances in MEMS in Silicon Valley, RF technology has evolved for
long-distance, low-power applications, and digital circuits have shrunk the
circuitry into a single chip and minimized fabrication cost and time. The
sequence of processes such as sensing, processing, communication, and
integration has led to advances in wireless sensor networks (WSN). A device
that performs such sensing operations within its range is called a mote,
which is available as a prototype or as a commercial product. In this paper,
wireless motes are used for border surveillance and for the detection and
tracking of enemies in a hostile environment to secure our mainland.
Surveillance needs the capability to detect, track, identify, and classify
enemies and to prioritize them according to the threat. Surveillance
normally needs a high degree of stealth in order to avoid detection. Placing
our soldiers directly along the border puts their lives at risk; the
solution is to place wireless sensor motes along the border to listen to the
ground. The problem with wireless sensor networks is power backup.
Energy-efficient algorithms have to be deployed to tackle this problem and
improve the endurance of the motes. The main objective of this paper is to
discuss how to detect, classify, and track intruders at the border to
protect our perimeter. A field-deployed wireless sensor must be able to
detect the presence of intruders, count them, locate them, track them, and
identify them.
II. CHALLENGES IN DEPLOYMENT
a) Field noise
Sensors mainly convert one form of energy signal into another, mostly an
analog signal into a digital one for error-free transmission. Digital
systems, on the other hand, have their own problems to tackle. The worst
enemy of any electronic system is noise. It may be natural noise from the
outside environment or internal noise from the system itself. This matters
because our high-precision sensor system works on small sample rates,
mostly small numbers of photons in the case of optical sensors and of
electrons in the case of low-power circuits [4]. In addition, during the
conversion of signals to digital form, quantization, aliasing, and the bit
error rate (BER) after analog-to-digital conversion (ADC) also affect
system performance.
b) Field variation
The environment taken for study will not remain the same for a long
period. The nature of the environment may change over time due to climatic
conditions, which affects the vision of the system. For example, infrared
sensors may be affected by heat emitted from vehicles, flames, explosions
in the area, and the activities of our own soldiers, so they cannot be
relied upon. Radar signals are affected by moisture and mist [4]. Computer
vision may be affected by improper illumination and shadow formation.
c) Background signals
Separating the target from the background environment is a major issue.
Computer vision faces the same issue when separating a target from the
background outside laboratory conditions. Some sensing methods, such as
ranging systems like RADAR, LIDAR, and SONAR, can easily be fooled by
noise and multipath interference.
The aspects of sensing human presence can be brought together under a single
roof as follows.
Figure 1: Sensing humans in an environment
III. DECISIONS TO BE TAKEN WHILE SENSING HUMAN PRESENCE
a) Presence
A decision has to be taken on whether any human beings are present. While
detecting presence, the system must not mistake components of the outside
environment for a human being. Consider a scenario in which enemy soldiers
are airdropped into our territory and dummies are used among them (for
example, if dead bodies are airdropped, the system will mistake them for
soldiers); this is a serious issue. Presence has to be established with no
chance of error so that the system can be made foolproof.
b) Count
The number of enemy soldiers or intruders present in our territory has to
be identified accurately so that precise and valuable intelligence about
the hostile environment can be provided to our soldiers. Countermeasures
can then be taken, or our tactics planned, accordingly.
c) Location
Locating targets is very important for mounting surprise attacks on the
enemy, so that he is caught with no idea of what is happening. In some
scenarios, locating the target is very important so that we can eliminate
the threat with indirect fire support elements such as mortars, artillery
shells, and even unguided or guided rockets such as Pinaka and Hellfire.
d) Tracking
The course of the enemy or intruder may change over time and has to be
checked continuously, a task known as tracking. It is the same as
locating, but it is repeated continuously over a long duration. Tracked
data has to be continuously shared with our soldiers to improve the
reliability of the intelligence.
e) Identification
In some situations our soldiers also have to be placed in front of the
line, and the system must not mistake our soldiers for intruders and take
counter actions against them. For some sensors that operate within sensing
range, this situation is handled using a Friend or Foe (FOF) check. This is
handled by the system itself to make it foolproof.
IV. REQUIREMENT FOR THE SYSTEM TO BE MILITARY COMPATIBLE
a) Physical attributes
In most scenarios the sensor nodes are hand deployed and transported to
the field by vehicle or by a soldier in his backpack, which means the
sensor must be small in size and weight [5]. On some occasions the sensors
may be airdropped from transport aircraft or UAVs, which means the sensor
node has to be ruggedized.
b) Self formation
Deployed sensor nodes must identify friendly nodes within their range and
form a network themselves, transferring data using hopping techniques as
in ad hoc networks, because of power constraints. Although the sensors are
meant to be static, some nodes can get displaced by physical factors, and
such a node has to reconfigure itself with the network. If any sensor node
fails, reconfiguration has to be done without human intervention.
c) Data flow
During the early stages of WSN technology, particularly during the period
of first-generation sensor networks, one-way communication was sufficient.
Advances in technology have led us into a new phase of second-generation
sensor networks, in which the commander sometimes has to take control of
the sensor node. This requires duplex communication techniques, for
example to steer electro-optical sensors such as CCTV.
d) Coverage and network size
Coverage refers to the sensing area of the node. A military-standard
sensor requires an appreciable sensing area. The network size refers to
the number of nodes that can be connected to the network. The network must
be robust, self-healing, and configurable.
e) Life of the sensor
Some operations last for weeks, some for months, and some even for years.
In such cases the sensor must last long enough to provide intelligence on
the war picture, particularly a sensor placed in a hostile environment. If
the sensors are deployed to protect the homeland and strategic locations,
it is possible to replace the power source, which further extends the life
of the sensor. In some modes the sensor need not function at full
capacity, and power-saving algorithms are deployed there to save power and
thus extend its life.
f) Stealthy characteristics
Nowadays stealth is not only about hiding from human eyes. Stealth means
concealment from every form of illumination, which also includes
electronic and electromagnetic signatures. A deployed node must emit a
very small electronic signature.
g) Reliability
The data gathered must be reliable so that the commander can take
split-second decisions. The network must provide the necessary security to
prevent eavesdropping, tampering, and interception.
h) Denial of service
In any instance of physical attack on a sensor node, the node must be
capable of reporting it back to the command center using some switch
mechanism.
i) Tamper proof
Any single piece of data present in the sensor may compromise national
security if it gets into the hands of a third party. The node must
therefore be tamper proof to secure the data within it.
j) Cost
One of the deciding factors for implementing any project in real time is
the overall cost of the system in terms of implementation and maintenance.
This factor therefore has to be taken into account before and after
development of the product by adopting the latest technology.
V. MILITARY APPLICATION
One of the main applications of wireless sensor networks is security, such
as base protection and perimeter defense. In base protection, wireless
sensor motes are placed around the base to provide real-time data. All the
data gathered is relayed back to the command and control center located in
the base. The sensor motes look for intruders in any form by monitoring the
seismic and acoustic signatures of the intruder. If necessary, thermal
imagers can also be placed to identify targets. In a practical scenario it
is necessary to provide surveillance capability up to 10 kilometers from the
base, while target identification is sufficient up to four kilometers. Since
an area of 10 kilometer radius is covered for surveillance [1], our forces
have the time they need to get ready for combat.
In the case of perimeter defense, we may need to monitor a large area, such
as remote villages and roads connecting key locations, for hostile
activities. In such cases we may also need a real-time war picture of the
battle space, for which it is necessary to place electro-optical sensors
such as CCTV with day and night optics and thermal imagers.
a) Operational issues
The sensors have to be connected in a configuration that still provides
dynamic target confirmation details if one mote fails. If a mote senses a
target and then fails due to external or internal tampering before it
reports to the base station, another mote must be able to identify and track
the same target. Collecting more samples from many motes regarding the same
event improves the accuracy of the data but increases bandwidth usage [2].
In case of any mote failure, the system has to reconfigure dynamically on
its own.
To locate the exact latitude and longitude coordinates of the enemy, the
mote must first know its own location. Let us consider a stationary mote A
whose latitude and longitude are known to it. Now one of the sensors in the
mote detects the presence of an enemy, but it does not know the enemy's
latitude and longitude coordinates. Instead, the mote just finds the
distance from its sensor to the target, which is B. Using this value one
can get the coordinates of the target.
Normally the distance can be calculated using the following methods.
Let R be the radius of the earth, which is approximately 6,371 km, and let
$\Delta lat = lat_2 - lat_1$
$\Delta long = long_2 - long_1$
One method is to use the Haversine formula:
$a = \sin^2(\Delta lat / 2) + \cos(lat_1) \cos(lat_2) \sin^2(\Delta long / 2)$
$c = 2 \, \mathrm{atan2}(\sqrt{a}, \sqrt{1 - a})$
$d = R \cdot c$
According to the spherical law of cosines,
$d = \arccos(\sin(lat_1) \sin(lat_2) + \cos(lat_1) \cos(lat_2) \cos(long_2 - long_1)) \cdot R$
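As a minimal sketch of the two distance formulas above, the following Python functions compute the great-circle distance between two points given in decimal degrees. The function names and the constant name are illustrative assumptions and are not part of the original system; the earth radius of 6,371 km is the value quoted in the text.

```python
import math

EARTH_RADIUS_KM = 6371.0  # approximate radius quoted in the text


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km using the Haversine formula (degrees in)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    a = min(1.0, a)  # guard against rounding slightly above 1
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
    return EARTH_RADIUS_KM * c


def cosine_law_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km using the spherical law of cosines."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_lambda = math.radians(lon2 - lon1)
    x = (math.sin(phi1) * math.sin(phi2)
         + math.cos(phi1) * math.cos(phi2) * math.cos(d_lambda))
    x = max(-1.0, min(1.0, x))  # keep acos inside its domain
    return math.acos(x) * EARTH_RADIUS_KM
```

Both functions should agree closely for well-separated points; the Haversine form is generally preferred for very short distances because it is less sensitive to rounding.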
However, all these methods are used to find the distance between two points
whose latitude and longitude are known. In our case the reverse is required:
we need to calculate the latitude and longitude using the known distance.
In this method the directional factor affects the measured value. The
direction of the target can be found from the sensor that faces in that
direction, but this way of finding the direction is not very accurate. The
distance to the target in that direction, however, can be found accurately.
Let us consider an ultrasonic range finder, which is used here. The
distance to the target is found using the energy bounced back from the
target. However, objects in the environment increase the noise. This can be
addressed using adaptive filter technology, in which the cut-off frequency
of the software-defined filter can be changed dynamically. Thus the noise
can be suppressed.
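The paper does not specify the structure of the filter. As one possible illustration of a software-defined filter whose cut-off frequency can be changed at run time, the sketch below uses a first-order low-pass (exponential smoothing) filter. The class name, its parameters, and the choice of filter type are assumptions made for illustration only.

```python
import math


class AdjustableLowPass:
    """First-order low-pass filter whose cut-off can be retuned at run time."""

    def __init__(self, cutoff_hz, sample_rate_hz):
        self.sample_rate_hz = sample_rate_hz
        self.state = 0.0  # last filtered output
        self.set_cutoff(cutoff_hz)

    def set_cutoff(self, cutoff_hz):
        # Smoothing factor of a discretised RC filter: alpha = dt / (RC + dt).
        dt = 1.0 / self.sample_rate_hz
        rc = 1.0 / (2.0 * math.pi * cutoff_hz)
        self.alpha = dt / (rc + dt)

    def update(self, sample):
        # Move the output a fraction alpha towards the new sample.
        self.state += self.alpha * (sample - self.state)
        return self.state
```

In such a design, the mote could call `set_cutoff` with a lower frequency when the estimated noise level rises, trading response time for smoother range readings.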
The next factor is the position of the sensor mote. This data can be
found using the GPS chip built into the mote.
Consider X, the position of the mote, and Y, the distance to the target;
together they give the coordinates of the target. They are related by
Unknown coordinates = known coordinates + distance
Note: here the distance is related to the angle and also to the direction.
The directional angle can be related to the coordinates by the following
method.
Figure 2: Determining latitude with respect to equator
Figure 3: Determining longitude with respect to the prime meridian (Greenwich
meridian)
The equator is an imaginary line that divides the earth into two, the
northern and southern hemispheres. The latitude of the equator is 0 degrees.
Point A forms an angle with the equator, which is 30 degrees north. Latitude
is the degree to which a point is north or south of the equator. The prime
meridian is a line of reference that divides the earth into two, the eastern
and western hemispheres [14-19]. The longitude of the prime meridian is 0
degrees. Longitude is the degree to which a point is east or west of the
prime meridian. Here the point is 40 degrees east. Thus, using the
directional angle procedure, we can say that the unknown point here is 30
degrees north and 40 degrees east. Thus the position of the unknown target
can be found.
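The relation "unknown coordinates = known coordinates + distance" depends on both the range and the direction of the target. The paper does not give an explicit formula, so the sketch below assumes the standard spherical destination-point formula, with the bearing measured clockwise from north. The function name, parameter names, and the 6,371 km radius constant are illustrative assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # approximate radius quoted in the text


def destination_point(lat_deg, lon_deg, bearing_deg, distance_km):
    """Target (lat, lon) in degrees from the mote position, bearing and range."""
    lat1 = math.radians(lat_deg)
    lon1 = math.radians(lon_deg)
    theta = math.radians(bearing_deg)        # clockwise from north
    delta = distance_km / EARTH_RADIUS_KM    # angular distance in radians

    sin_lat2 = (math.sin(lat1) * math.cos(delta)
                + math.cos(lat1) * math.sin(delta) * math.cos(theta))
    lat2 = math.asin(max(-1.0, min(1.0, sin_lat2)))
    lon2 = lon1 + math.atan2(
        math.sin(theta) * math.sin(delta) * math.cos(lat1),
        math.cos(delta) - math.sin(lat1) * math.sin(lat2),
    )
    # Normalise longitude to the range [-180, 180).
    lon2_deg = (math.degrees(lon2) + 540.0) % 360.0 - 180.0
    return math.degrees(lat2), lon2_deg
```

For example, a mote at 30 degrees north and 40 degrees east that measures a target 2 km away on a bearing of 45 degrees would call `destination_point(30.0, 40.0, 45.0, 2.0)` to estimate the target coordinates.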
The real problem arises when four sensors, one for each direction, work
simultaneously. The reference points for 0°, 90°, 180°, and 270° have to be
well defined. This is similar to working on a graph. Assume that each sensor
covers 90°, which represents one quadrant. Four sensors represent four
quadrants. Each mote contains four sensors, and thus a single mote can cover
the full 360°.
Figure 4: Sensor arrangement in a single mote
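The quadrant arrangement can be sketched as a simple mapping between bearings and sensors. The numbering below (sensor 0 covering 0° up to but not including 90°, sensor 1 covering 90° up to 180°, and so on) is an assumption chosen for illustration; the paper only requires that the reference points be well defined.

```python
SECTOR_WIDTH_DEG = 90.0  # each sensor is assumed to cover one quadrant


def sensor_for_bearing(bearing_deg):
    """Index (0..3) of the sensor whose quadrant contains the bearing."""
    return int((bearing_deg % 360.0) // SECTOR_WIDTH_DEG)


def absolute_bearing(sensor_index, offset_deg):
    """Bearing from north for a detection offset_deg into a sensor's quadrant."""
    if not 0 <= sensor_index <= 3:
        raise ValueError("sensor_index must be 0, 1, 2 or 3")
    if not 0.0 <= offset_deg < SECTOR_WIDTH_DEG:
        raise ValueError("offset_deg must lie in [0, 90)")
    return sensor_index * SECTOR_WIDTH_DEG + offset_deg
```

With this convention a detection reported by one sensor can be converted to a single bearing for the whole mote, which is what the coordinate calculation needs.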
The next issue is the integrity of the data collected by the mote. For
better accuracy and confirmation, one of the basic principles of digital
communication is adopted: the more samples, the more accurate the data. Up
to three motes can therefore be placed in close range so that the ranges of
the motes form a triangulation pattern with one another. The combination of
data from these clusters of motes can be used to define the exact
coordinates of the enemy. The blind spot of one sensor in a particular
direction can also be eliminated with this group of sensors.
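The paper does not state how the readings of the three motes are combined. One common option, assumed here, is trilateration in a flat local coordinate frame: each mote contributes its position and its measured range, and the target is the point consistent with all three ranges. The function name and the tuple layout are illustrative assumptions.

```python
def trilaterate(m1, m2, m3):
    """Locate a target from three motes.

    Each argument is (x, y, r): mote position and measured range to the
    target, all in the same length unit in a flat local frame.
    """
    (x1, y1, r1), (x2, y2, r2), (x3, y3, r3) = m1, m2, m3

    # Subtracting the circle equations pairwise gives two linear equations:
    #   a*x + b*y = c   and   d*x + e*y = f
    a, b = 2.0 * (x2 - x1), 2.0 * (y2 - y1)
    c = r1 ** 2 - r2 ** 2 - x1 ** 2 + x2 ** 2 - y1 ** 2 + y2 ** 2
    d, e = 2.0 * (x3 - x2), 2.0 * (y3 - y2)
    f = r2 ** 2 - r3 ** 2 - x2 ** 2 + x3 ** 2 - y2 ** 2 + y3 ** 2

    det = a * e - b * d
    if abs(det) < 1e-12:
        raise ValueError("motes are collinear; position is ambiguous")
    return (c * e - b * f) / det, (a * f - c * d) / det
```

The collinearity check reflects why the motes should form a triangle rather than a line: three motes in a row cannot tell which side of the line the target is on.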
Figure 5: Deployment model of multiple motes
The figure above indicates that each sensor mote is covered by the adjacent
motes. This approach reduces blind spots and improves the integrity of the
data, but it increases bandwidth usage to its peak. This approach is also
helpful if any one of the motes fails.
b) Operational flow
The flow chart in Figure 6 explains the operational method of the system.
If the distance to the target is found to be infinity, the target is
ignored, because no target can be at infinite distance and infinity cannot
be measured. The range to be measured can also be predefined; that is, a
threshold value can be set based on the level of noise. For example, a
signal with an output voltage of 8 V is ignored if any output above 6 V is
treated as noise.
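The two rejection rules described above can be sketched as a small validity check. The function name, the parameter names, and the use of metres for distance are assumptions; the 6 V threshold is the example value from the text and would in practice be set from the measured noise level.

```python
import math

NOISE_THRESHOLD_V = 6.0  # example value from the text


def accept_reading(distance_m, output_voltage_v, threshold_v=NOISE_THRESHOLD_V):
    """Return True if a range reading should be passed on for processing."""
    # An infinite (or undefined) distance means no measurable target.
    if not math.isfinite(distance_m):
        return False
    # Outputs above the threshold are treated as noise and ignored.
    if output_voltage_v > threshold_v:
        return False
    return True
```

With the example values from the text, a reading with an output of 8 V is rejected, while a finite-distance reading at or below 6 V is accepted.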
VI. CONCLUSION
Implementing such self-forming sensors will reduce deployment and
maintenance costs and will also help to provide better situational awareness
and troop readiness in military scenarios. In civil applications, the
perimeter can be managed effectively using such wireless sensors. In the
future, the same will be implemented in hardware, and real-time operational
issues will be discussed.
Flow Chart:
Figure 6: Flow chart of the methodology
References
[1] http://www.memsnet.org/mems/what-is.html.
[2] Thiago Teixeira, Gershon Dublon, A survey of human-sensing: methods for detecting
presence, count, location, track, and identity, ACM Computing Surveys, Vol. V, No.
N, 20YY, Pages 1-77.
[3] Michael Winkler, Klaus-Dieter Tuchs, Kester Hughes, and Graeme Barclay.
Theoretical and practical aspects of military wireless sensor networks, Journal of
Telecommunications and Information Technology, pp. 37-45.
[4] K. Akkaya and M. Younis, A survey on routing protocols for wireless sensor
networks, Ad-hoc Netw., no. 3, pp. 325-349, 2005.
[5] Al-Karaki and Kamal, Routing techniques in wireless sensor networks: a survey,
IEEE Wirel. Commun., vol. 11, iss. 6, pp. 6-28, 2004.
[6] Niculescu, Communication paradigms for sensor networks, IEEE Commun. Mag.,
vol. 43, iss. 3, pp. 116-122.
[7] Bhattacharya, Kim, Prabh, and Abdelzaher, Energy conserving data placement and
asynchronous multicast in wireless sensor networks, Proceedings of the International
Conference on Mobile Systems, Applications, and Services (MobiSys), May 2003.
[8] Blum, Nagaraddi, Wood, Abdelzaher, Son, and Stankovic, An entity maintenance and
connection service for sensor networks, First Intl. Conference on Mobile Systems,
Applications, and Services (MobiSys), May 2003.
[9] Boselin Prabhu, SR & Sophia, S, 2012, A research on decentralized clustering
algorithms for dense wireless sensor networks, International Journal of Computer
Applications, vol. 57, no. 20, pp. 35-40.
[10] Boselin Prabhu, SR & Sophia, S, 2013, Mobility assisted dynamic routing for mobile
wireless sensor networks, International Journal of Advanced Information Technology,
vol. 3, no. 3, pp. 9-19.
[11] Boselin Prabhu, SR & Sophia, S, 2013, A review of energy efficient clustering
algorithm for connecting wireless sensor network fields, International Journal of
Engineering Research and Technology, vol. 2, no. 4, pp. 477-481.
[12] Boselin Prabhu, SR & Sophia, S, 2013, Variable power energy efficient clustering for
wireless sensor networks, Australian Journal of Basic and Applied Sciences, vol. 7,
no. 7, pp. 423-434.
[13] Boselin Prabhu, SR & Sophia, S, 2013, Capacity based clustering model for dense
wireless sensor networks, International Journal of Computer Science and Business
Informatics, vol. 5, no. 1, pp. 1-10.
[14] Boselin Prabhu, SR & Sophia, S, 2013, Hierarchical distributed clustering algorithm
for energy efficient wireless sensor networks, International Journal of Research in
Information Technology, vol. 1, no. 12, pp. 45-55.
[15] Boselin Prabhu, SR & Sophia, S, 2013, Real-world applications of distributed
clustering mechanism in dense wireless sensor networks, International Journal of
Computing Communications and Networking, vol. 2, no. 4, pp. 99-105.
[16] Boselin Prabhu, SR & Sophia, S, 2013, An integrated distributed clustering algorithm
for dense WSNs, International Journal of Computer Science and Business
Informatics, vol. 8, no. 1, pp. 1-12.
[17] Boselin Prabhu, SR, Inigo Mathew, A & Sophia, S, 2014, Modern cluster integration
of advanced weapon system and wireless sensor based combat system, Scholars
Journal of Engineering and Technology, vol. 2, no. 6A, pp. 786-794.
[18] Chen, Jamieson, Balakrishnan, and R Morris, Span: an energy-efficient coordination
algorithm for topology maintenance in ad hoc wireless networks, 6th ACM
MOBICOM Conference, 2001.
[19] Arras, Mozos, and Burgard, Using boosted features for the detection of people in 2d
range data, In Proc. of the int. conf. on robotics & automation, 2002.

Mathew, A. I., Kumar, M. R., Boselin, S. R. P. and Sophia, S. 2015. Cluster
Integrated Self Forming Wireless Sensor Based System for Intrusion
Detection and Perimeter Defense Applications. International Journal of
Computer Science and Business Informatics, Vol. 15, No. 3, pp. 70-83.