
EEEB383: Random Process Assignment

Semester 1, Academic Year 2015/2016

GROUP MEMBERS:
1) Mhd Naser EE092886
2) Nasser Salman Saleem S EE090568
3) Farshad Golshan EE088610
4) Mohammed Alawi EE092929
5) Mohamed Fathelrhman Gadalla EE090692

SECTION: 01
LECTURER: NAGALETCHUMI A/P BALASUBRAMANIAM, Pn.

Introduction:
Two unknown data sources, A and B, are given. Each source contains 1500000
samples. The data come from sensors that generate samples at a rate of
10k samples/second.
The generated data are random, and each record reaches 1500000 samples after a
fixed time. The term random process usually refers to a random variable that is
a function of time, and it is strictly defined in terms of an ensemble of time
functions. Such an ensemble of functions may in principle be generated using
many sets of identical sources.
Random variables and processes, like other types of data, can be classified in a
number of different ways, for example: continuous random variables, discrete
random variables, dependent and independent random variables, joint
continuous/discrete random variables, etc.

The random process data can be analyzed using loose-sense (wide-sense) Gaussian
processes and the autocorrelation function.
Set Up:
The U2351A is a data acquisition device which sends data from the hardware to
the computer or from the computer to the hardware. The device works with USB
ports, and a serial port is also available. Internally it has a USB-to-serial
and serial-to-USB converter. The device uses 10k samples/second to send and
receive the data without errors. The resolution is 16 bits. For the device to
work, one should understand its input requirements and send data which meet
those requirements. Similarly, to view the data obtained, software or a platform
must be installed on the computer to collect the data. The collected data are
not only shown in the platform but also saved to the system as a database. Some
data might not be important, so they can be filtered out in the hardware or in
the platform.

Literature Review:
Random Variables: A random variable, usually written as X, is a variable
whose possible values are numerical outcomes of a random phenomenon. There
are two types of random variables, discrete and continuous.
1. First Method; Continuous Random Variables:
Continuous random variables are random quantities that are measured on a
continuous scale. They can usually take any value over some interval, which
distinguishes them from discrete random variables, which can take on only a
countable sequence of values, such as the integers.
The cumulative distribution function (CDF), written F(t), is given by:

$$F(t) = P[X \le t] = \int_{-\infty}^{t} f(x)\,dx$$

where f(x) is the probability density function (PDF) of X. So the CDF is found
by integrating the PDF between the minimum value of X and t.

Similarly, the probability density function of a continuous random variable can
be obtained by differentiating the CDF:

$$f(x) = \frac{dF(x)}{dx}$$

Expected value and Variance:

The procedure for finding the expected value and variance of a continuous random
variable is similar to that for a discrete random variable. The difference is
that the sums in the formulas for discrete random variables are replaced by
integrals [2]:

$$E[X] = \int_{-\infty}^{\infty} x f(x)\,dx$$

The Variance of a Random Variable: if X is a random variable with mean E[X],
then the formula of the variance for X will be:

$$\mathrm{VAR}[X] = E[X^2] - (E[X])^2$$

The standard deviation is the square root of the variance:

$$\sigma_X = \sqrt{\mathrm{VAR}[X]}$$
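
The integral formulas for the expected value and variance can be checked numerically. The sketch below is only an illustration: it assumes a hypothetical random variable X that is uniform on [0, 2] (not one of the sensor signals) and uses MATLAB's `integral` function.

```matlab
% Hypothetical example: X uniform on [0, 2], so f(x) = 0.5 on that interval.
f = @(x) 0.5 * ones(size(x));

% E[X] and E[X^2] by numerical integration of x*f(x) and x^2*f(x).
EX  = integral(@(x) x .* f(x), 0, 2);
EX2 = integral(@(x) x.^2 .* f(x), 0, 2);

% Variance and standard deviation from the two moments.
varX   = EX2 - EX^2;
sigmaX = sqrt(varX);

fprintf('E[X] = %.4f, VAR[X] = %.4f, sigma = %.4f\n', EX, varX, sigmaX);
```

For this assumed PDF the closed-form values are E[X] = 1 and VAR[X] = 1/3, so the numerical results should agree with them to within integration tolerance.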

2. Second Method; Discrete Random Variables:

The second method uses a discrete random variable with non-negative integer
values. It allows the computation of probabilities for individual integer
values, through the probability mass function (PMF), or for sets of values,
including infinite sets [1]. Some experiments are hard to describe
mathematically, so their outcomes in the sample space are mapped to numbers by
a random variable.
The cumulative distribution function (CDF) F(x) of a discrete random variable
gives the probability that X takes a value less than or equal to x, i.e., F(x)
is the probability that the observed value of X will be at most x:

$$F(x) = P[X \le x] = \sum_{x_i \le x} P[X = x_i]$$

Just as the CDF can be obtained from the PMF, the PMF can be obtained from the
CDF.
Means and Variances of Random Variables:

The mean of a discrete random variable X is its weighted average.
To find the mean of X, multiply each value of X by its probability, then add all
the products:

$$E[X] = \sum_i x_i\,P[X = x_i]$$

The mean of a random variable X is called the expected value of X.

The Variance of a Random Variable: if X is a random variable with mean E[X],
then the formula of the variance for X will be:

$$\mathrm{VAR}[X] = E[X^2] - (E[X])^2$$

The standard deviation is the square root of the variance:

$$\sigma_X = \sqrt{\mathrm{VAR}[X]}$$
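
The discrete formulas can be illustrated with a short MATLAB sketch. It assumes a hypothetical fair six-sided die rather than the sensor data, and it also shows the PMF/CDF round trip mentioned above (`cumsum` to build the CDF, `diff` to recover the PMF).

```matlab
% Hypothetical example: fair six-sided die.
x = 1:6;              % possible values of X
p = ones(1, 6) / 6;   % PMF, P[X = x(i)]

% Mean: weighted average of the values.
EX = sum(x .* p);

% Variance and standard deviation: E[X^2] - (E[X])^2.
EX2    = sum(x.^2 .* p);
varX   = EX2 - EX^2;
sigmaX = sqrt(varX);

% CDF from PMF, then PMF back from CDF.
F     = cumsum(p);
pBack = diff([0 F]);

fprintf('E[X] = %.4f, VAR[X] = %.4f, sigma = %.4f\n', EX, varX, sigmaX);
```

For the assumed die, E[X] = 3.5 and VAR[X] = 35/12, and `pBack` reproduces `p` up to rounding.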

3. Third Method; Joint Random Variables:

The joint distribution function F of X and Y contains all the statistical
information about X and Y. In particular, given the joint distribution function
F of X and Y, the probability of any event defined in terms of X and Y can be
calculated [3]:

$$F_{X,Y}(x, y) = P[X \le x, Y \le y]$$

In the case when X and Y are both discrete random variables, it is more
convenient to use the joint probability mass function of X and Y:

$$P_{X,Y}(x, y) = P[X = x, Y = y]$$

The individual PMFs of X and Y are called the marginal PMFs:

$$P_X(x) = \sum_y P_{X,Y}(x, y), \qquad P_Y(y) = \sum_x P_{X,Y}(x, y)$$

The Covariance & Correlation coefficient of two Random Variables:
The covariance is a number that measures the common variation of X and Y. It is
defined as

$$\mathrm{cov}(X, Y) = E[(X - E[X])(Y - E[Y])] = E[XY] - E[X]E[Y]$$

The covariance is therefore the difference between E[XY] and E[X]E[Y].

The covariance can be normalized to produce what is known as the correlation
coefficient,

$$\rho_{X,Y} = \frac{\mathrm{cov}(X, Y)}{\sigma_X \sigma_Y}$$

The correlation coefficient is bounded by $-1 \le \rho_{X,Y} \le 1$. It has the
value 0 when the covariance is zero, the value +1 when X and Y are perfectly
correlated, and the value -1 when they are perfectly anti-correlated.
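
As a rough illustration of the marginal PMFs, covariance and correlation coefficient, the sketch below uses a small joint PMF table. The table values are hypothetical and chosen only so that the probabilities sum to 1; they are not derived from the sensor data.

```matlab
% Hypothetical joint PMF: P(i, j) = P[X = x(i), Y = y(j)].
x = [0 1];
y = [0 1];
P = [0.4 0.1;
     0.1 0.4];

% Marginal PMFs: sum the joint PMF over the other variable.
pX = sum(P, 2)';   % row vector over the values of X
pY = sum(P, 1);    % row vector over the values of Y

% First moments and E[XY].
EX  = sum(x .* pX);
EY  = sum(y .* pY);
EXY = x * P * y';  % double sum of x(i) * y(j) * P(i, j)

% Covariance and correlation coefficient.
covXY = EXY - EX * EY;
sigX  = sqrt(sum(x.^2 .* pX) - EX^2);
sigY  = sqrt(sum(y.^2 .* pY) - EY^2);
rho   = covXY / (sigX * sigY);

fprintf('cov = %.4f, rho = %.4f\n', covXY, rho);
```

With this assumed table, cov(X, Y) = 0.15 and the correlation coefficient is 0.6, which lies inside the [-1, 1] bound.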

Methodology:

Many software packages can be used to process such a large amount of data. In
this assignment, the proposed software is MATLAB. A MATLAB script is written for
the data recorded at 10k samples/second, and each sample is processed using the
functions programmed in that script.
The concepts of data mining and clustering can also be applied in MATLAB to
speed up the data search. This means that the data are grouped into different
categories and accessed with different commands. This saves processing time and
speeds up the data analysis.

Figure 1: Sensor data plotted in MATLAB

As shown in Figure 1, the sensors in the ideal condition have a voltage
fluctuation between -0.1 V and 0.5 V. It has therefore been decided to use three
ranges for each sensor. The conditions are as follows:
1. Ideal: -0.1 < V < +0.5
2. Positive sensor detection: +0.5 < V < +10
3. Negative sensor detection: -10 < V < -0.1

As the project problem specifies, 2 sensors with 10k samples/second sampling are
used. The group uses the sample data set A for its calculations. Each set
contains the sample data collected from two sensors. There are 1500000 samples
available for each sensor, which is equivalent to 1500000/10000 = 150 seconds of
data collection.


For this project we are assigned to analyse the sensor data in three different
ways using what we learnt in the random process course. The three methods which
have been decided by the group are:
1. Create two Continuous Random Variables.
2. Create two Discrete Random Variables.
3. Create Joint Discrete Random Variables.

1. Continuous Random Variable:

The project is based on two discrete CDFs, one for each sensor, evaluated over
the conditions explained previously. First, the sensor A data were saved as a
matrix in MATLAB and named matrixA. The same procedure was applied to the
sensor B data, which were named matrixB. The following MATLAB code was used to
calculate the probabilities of the positive and negative sensor detections.

clear;
clc;
% matrixA is expected to provide the variable sensorA
load matrixA
pa = 0; % counts the positive detections in data A
na = 0; % counts the negative detections in data A
for ii = 1:length(sensorA)
    a = sensorA(ii);
    if (a > 0.5)
        pa = pa + 1;
    end
    if (a < -0.1)
        na = na + 1;
    end
end
p_sensorA_po = pa/150e4; % probability of positive sensor A detection
p_sensorA_ne = na/150e4; % probability of negative sensor A detection
% matrixB is expected to provide the variable sensorB
load matrixB
pb = 0; % counts the positive detections in data B
nb = 0; % counts the negative detections in data B
for jj = 1:length(sensorB)
    b = sensorB(jj);
    if (b > 0.5)
        pb = pb + 1;
    end
    if (b < -0.1)
        nb = nb + 1;
    end
end
p_sensorB_po = pb/150e4; % probability of positive sensor B detection
p_sensorB_ne = nb/150e4; % probability of negative sensor B detection
The results of the Matlab calculations are:
na = 199007
pa = 384900 (approximate, back-computed from P (sensor A +) = 0.2566; the printed value 475361 duplicates pb)
nb = 233604
pb = 475361

Since the total number of recorded samples is 1500000, these counts are divided
by the total to obtain the probability of each case. The final results are as
follows:
P (sensor A +) = 0.2566
P (sensor A -) = 0.1327
P (sensor B +) = 0.3169
P (sensor B -) = 0.1557
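
The same relative frequencies can also be obtained without an explicit loop. The sketch below is an alternative formulation, not the group's code: it uses logical comparisons and `mean`, and the short `sensorA` vector is a synthetic stand-in (hypothetical values) so that the snippet runs without the recorded data.

```matlab
% Synthetic stand-in for the recorded samples (hypothetical values, not sensor data).
sensorA = [-5 -0.05 0.2 0.7 3 0.4 -0.2 0.1 0.6 0.0];

% Logical masks for the positive and negative detection ranges.
posA = sensorA > 0.5;
negA = sensorA < -0.1;

% The mean of a logical mask is the fraction of samples that satisfy it.
p_pos   = mean(posA);
p_neg   = mean(negA);
p_ideal = 1 - p_pos - p_neg;   % remaining samples fall in the ideal range

fprintf('P(+) = %.4f, P(-) = %.4f, P(ideal) = %.4f\n', p_pos, p_neg, p_ideal);
```

Dividing by the actual number of samples, which `mean` does implicitly, avoids hard-coding the record length of 1500000.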
Based on these results, a CDF has been created for each of the two sensors. The
CDFs are shown below:

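$$F_A(a) = \begin{cases}
0, & a < -10 \\
0.1327, & -10 \le a < -0.1 \\
0.7434, & -0.1 \le a < 0.5 \\
1, & a \ge 0.5
\end{cases}$$

$$F_B(b) = \begin{cases}
0, & b < -10 \\
0.1557, & -10 \le b < -0.1 \\
0.6831, & -0.1 \le b < 0.5 \\
1, & b \ge 0.5
\end{cases}$$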

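The corresponding piecewise-constant PDFs, with each range assigned its
probability, are shown below:

$$f_A(a) = \begin{cases}
0.1327, & -10 \le a < -0.1 \\
0.6107, & -0.1 \le a < 0.5 \\
0.2566, & 0.5 \le a < 10 \\
0, & \text{otherwise}
\end{cases}$$

$$f_B(b) = \begin{cases}
0.1557, & -10 \le b < -0.1 \\
0.5274, & -0.1 \le b < 0.5 \\
0.3169, & 0.5 \le b < 10 \\
0, & \text{otherwise}
\end{cases}$$

I. Expected values over the ideal voltage range for sensor A:

$$E[A] = \int_{-0.1}^{0.5} 0.6107\,x\,dx = 0.6107\left[\frac{x^2}{2}\right]_{-0.1}^{0.5} = 0.0733$$

$$E[A^2] = \int_{-0.1}^{0.5} 0.6107\,x^2\,dx = 0.6107\left[\frac{x^3}{3}\right]_{-0.1}^{0.5} = 0.0256$$

$$\mathrm{Var}[A] = E[A^2] - (E[A])^2 = 0.0256 - (0.0733)^2 = 0.0202$$

$$\sigma_A = \sqrt{\mathrm{Var}[A]} = \sqrt{0.0202} = 0.1421$$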

II. Expected values over the ideal voltage range for sensor B:

$$E[B] = \int_{-0.1}^{0.5} 0.5274\,x\,dx = 0.5274\left[\frac{x^2}{2}\right]_{-0.1}^{0.5} = 0.0633$$

$$E[B^2] = \int_{-0.1}^{0.5} 0.5274\,x^2\,dx = 0.5274\left[\frac{x^3}{3}\right]_{-0.1}^{0.5} = 0.0222$$

$$\mathrm{Var}[B] = E[B^2] - (E[B])^2 = 0.0222 - (0.0633)^2 = 0.0182$$

$$\sigma_B = \sqrt{\mathrm{Var}[B]} = \sqrt{0.0182} = 0.1349$$
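
The integrals over the ideal range -0.1 to 0.5 have simple closed forms, so the reported values can be cross-checked with a few lines of MATLAB. This is only a verification sketch; the helper names `m1` and `m2` are introduced here for illustration and are not part of the original script.

```matlab
% Ideal voltage range and the constant weights used for each sensor.
lo = -0.1;
hi = 0.5;
pA = 0.6107;
pB = 0.5274;

% Closed-form integrals of p*x and p*x^2 over [lo, hi].
m1 = @(p) p * (hi^2 - lo^2) / 2;
m2 = @(p) p * (hi^3 - lo^3) / 3;

% Sensor A.
EA = m1(pA);  EA2 = m2(pA);
varA = EA2 - EA^2;  sigmaA = sqrt(varA);

% Sensor B.
EB = m1(pB);  EB2 = m2(pB);
varB = EB2 - EB^2;  sigmaB = sqrt(varB);

fprintf('A: E = %.4f, E2 = %.4f, Var = %.4f, sigma = %.4f\n', EA, EA2, varA, sigmaA);
fprintf('B: E = %.4f, E2 = %.4f, Var = %.4f, sigma = %.4f\n', EB, EB2, varB, sigmaB);
```

Because the report rounds E[A], E[A^2], E[B] and E[B^2] to four decimals before computing the variances, the unrounded results from this sketch may differ from the reported variances and standard deviations in the last digit.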

2. Discrete Random Variable:

To build a discrete random variable, several points should be selected, and the
number of times each point occurs in the sensor data should be counted. However,
in this project the sensor data do not fall on exact values; for example, there
is no sample at exactly +7 V or -2 V, since all the sensor data are decimal
values. To overcome this problem, a very small range is given to each point; for
example, the point 7 V is given the range {7 < v < 7.2}. Since the range around
each point is small, it will not noticeably affect the accuracy of the result.
For this discrete random variable, five points have been assigned as follows:

1. At +7 V: 7 < v < 7.2
2. At -7 V: -7.2 < v < -7
3. At 0 V: -0.1 < v < 0.1
4. At +5 V: 5 < v < 5.2
5. At -5 V: -5.2 < v < -5
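
Counting the samples that fall into the five windows listed above might be sketched as follows. This is an assumption about how such a count could be organised, not the group's listing: the vector `v` holds synthetic values (hypothetical, not sensor data), and dividing the counts by the total number of samples is one possible normalisation.

```matlab
% Synthetic stand-in for the recorded samples (hypothetical values, not sensor data).
v = [7.1 -7.1 0.05 5.1 -5.1 7.15 0 3 -9 5.05];

% Window bounds for the points +7 V, -7 V, 0 V, +5 V and -5 V.
lower = [ 7.0 -7.2 -0.1 5.0 -5.2];
upper = [ 7.2 -7.0  0.1 5.2 -5.0];

% Count the samples strictly inside each window.
counts = zeros(1, numel(lower));
for k = 1:numel(lower)
    counts(k) = sum(v > lower(k) & v < upper(k));
end

% Relative frequency of each point with respect to all samples.
relFreq = counts / numel(v);

disp(counts);
disp(relFreq);
```

With the synthetic vector, two of the ten samples fall outside every window, so the relative frequencies sum to 0.8 rather than 1.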

MATLAB code similar to that used in the Continuous Random Variable method is
used to evaluate the results. The following is the code written in MATLAB.
