I. INTRODUCTION
II. RELATED WORKS
III. PROPOSED ALGORITHM
A. Preliminary
In this section we present the basic terminology and techniques
used in our framework.
Textons allow a compact representation of the different
appearances of an object. They have been shown to be effective
in categorizing generic object classes; see [28] for details.
A texture-layout filter is a pair (r, t) of an image region r and
a texton t. The region r is defined in coordinates relative to the
pixel i being classified. These features have been shown to be
sufficiently general to allow learning layout and context
information for the object classes.
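To make the definition concrete, the following minimal Python sketch computes a response for a texture-layout filter (r, t) at a pixel i. It assumes the response is the fraction of pixels inside the offset region whose texton index equals t, which is a common choice for this kind of feature but is not spelled out in this section. The function name `texture_layout_response`, the rectangle encoding, and the toy texton map are illustrative assumptions.

```python
def texture_layout_response(texton_map, pixel, region, texton):
    """Fraction of the region (placed relative to `pixel`) covered by `texton`.

    texton_map: 2D list of texton indices (list of rows).
    pixel:      (row, col) of the pixel i being classified.
    region:     (top, left, bottom, right) offsets relative to `pixel`;
                bottom and right are exclusive (hypothetical encoding).
    Region pixels that fall outside the image count as non-matching.
    """
    top, left, bottom, right = region
    if bottom <= top or right <= left:
        raise ValueError("region must have positive area")
    area = (bottom - top) * (right - left)
    rows, cols = len(texton_map), len(texton_map[0])
    # Clip the offset rectangle to the image bounds.
    r0, r1 = max(pixel[0] + top, 0), min(pixel[0] + bottom, rows)
    c0, c1 = max(pixel[1] + left, 0), min(pixel[1] + right, cols)
    matches = sum(
        1
        for r in range(r0, r1)
        for c in range(c0, c1)
        if texton_map[r][c] == texton
    )
    return matches / area


# Toy texton map (assumed data, not from the paper).
toy_map = [
    [0, 0, 1, 1],
    [0, 2, 1, 1],
    [2, 2, 1, 0],
    [2, 2, 0, 0],
]
# Region spans rows 0..1 and columns 1..2 of the image for pixel (1, 1).
print(texture_layout_response(toy_map, (1, 1), (-1, 0, 1, 2), 1))  # 0.5
```

Because the region is stored as offsets, the same filter can be evaluated at every pixel without change, which is what lets it capture layout relative to the pixel being classified.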
Joint Boost: Given the texture-layout features, many
techniques can be used to learn the object classes. The authors
of [28] employed an adapted version of the Joint Boost algorithm
for automatic feature selection and for learning the
texture-layout potentials. The process iteratively selects
discriminative texture-layout features as weak classifiers
and combines them into a powerful classifier.
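The combination step can be illustrated with a small Python sketch. It assumes the shared decision-stump form commonly used with Joint Boost (a step on the feature response for classes in a sharing set, a per-class constant otherwise) and assumes the strong classifier is the sum of the weak classifiers over rounds. The dictionary layout, the class names, and all parameter values are hypothetical.

```python
def weak_classifier(response, cls, stump):
    """Shared stump: a * [response > theta] + b for classes in the sharing set,
    and a class-specific constant k[cls] for all other classes."""
    if cls in stump["sharing_set"]:
        step = 1.0 if response > stump["theta"] else 0.0
        return stump["a"] * step + stump["b"]
    return stump["k"][cls]


def strong_classifier(responses, cls, stumps):
    """Sum of the weak classifiers, one feature response per boosting round."""
    return sum(weak_classifier(v, cls, s) for v, s in zip(responses, stumps))


# Two illustrative rounds (assumed values).
stumps = [
    {"sharing_set": {"cow", "sheep"}, "a": 2.0, "b": -1.0, "theta": 0.5,
     "k": {"sky": -0.8}},
    {"sharing_set": {"sky"}, "a": 1.5, "b": -0.5, "theta": 0.2,
     "k": {"cow": -0.3, "sheep": -0.3}},
]
responses = [0.7, 0.1]  # feature responses at one pixel, one per round

for cls in ("cow", "sheep", "sky"):
    print(cls, strong_classifier(responses, cls, stumps))
```

Sharing one stump among several classes is what keeps the number of weak classifiers manageable as the number of classes grows.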
This process leads to the problem of minimizing an error
function over the parameters, which unfortunately requires an
expensive search over the possible weak classifiers
(features) to find the optimal combination of the sharing set
N, features (r, t), and thresholds. Several optimizations
are possible to speed up the search for the optimal weak
classifiers. Since the set of all possible sharing sets is
exponentially large, a greedy approximation has been used
[28, 36]. To speed up the minimization over features, the
authors of [28] employed a random feature selection
procedure and performed sub-sampling of pixels.
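The random feature selection idea can be sketched as follows. This is a simplified, single-class illustration rather than the procedure of [28]: it draws a random fraction of the candidate features, fits a stump of the assumed form a * [v > theta] + b by weighted least squares for each threshold, and keeps the candidate with the lowest weighted squared error. The names `fit_stump` and `best_stump`, the `fraction` parameter, and the toy data are assumptions; class sharing and pixel sub-sampling are omitted.

```python
import random


def fit_stump(values, targets, weights, theta):
    """Closed-form weighted least-squares fit of h(v) = a * [v > theta] + b."""
    w_hi = wz_hi = w_lo = wz_lo = 0.0
    for v, z, w in zip(values, targets, weights):
        if v > theta:
            w_hi += w
            wz_hi += w * z
        else:
            w_lo += w
            wz_lo += w * z
    b = wz_lo / w_lo if w_lo > 0 else 0.0
    a = (wz_hi / w_hi - b) if w_hi > 0 else 0.0
    error = sum(
        w * (z - (a * (v > theta) + b)) ** 2
        for v, z, w in zip(values, targets, weights)
    )
    return a, b, error


def best_stump(feature_columns, targets, weights, thetas, fraction, rng):
    """Search only a random subset of the features instead of all of them."""
    count = max(1, round(fraction * len(feature_columns)))
    best = None
    for f in rng.sample(range(len(feature_columns)), count):
        for theta in thetas:
            a, b, error = fit_stump(feature_columns[f], targets, weights, theta)
            if best is None or error < best["error"]:
                best = {"feature": f, "theta": theta, "a": a, "b": b,
                        "error": error}
    return best


# Toy data: feature 0 separates the two groups, feature 1 is uninformative.
columns = [[0.1, 0.2, 0.8, 0.9], [0.5, 0.5, 0.5, 0.5]]
result = best_stump(columns, [-1, -1, 1, 1], [1.0] * 4,
                    [0.25, 0.5, 0.75], 1.0, random.Random(0))
print(result)
```

With `fraction` below 1.0 each round examines fewer features, trading some optimality of the chosen weak classifier for a proportional reduction in search time.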
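The error function is
$$ \sum_{c} \sum_{i} w_i^c \left( z_i^c - h_i^m(c) \right)^2 $$
where $w_i^c$ is the weight of example $i$ for class $c$ after $m-1$ rounds of running. The weight $w_i^c$ is then re-weighted as
$$ w_i^c = e^{-z_i^c H_i(c)}, \qquad z_i^c \in \{-1, +1\} $$
(+1 if example i has ground truth class c, -1 otherwise).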
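The re-weighting step can be illustrated with a short Python sketch. It assumes the exponential form usual for Joint Boost style methods, in which the weight of example i for class c is exp(-z * H), with H the strong classifier output after the newest weak classifier has been added. The function name `boosting_round_update` and the toy numbers are assumptions.

```python
import math


def boosting_round_update(H, z, h):
    """Add the new weak outputs h to the strong outputs H, then recompute
    the weights as w = exp(-z * H). All arguments are indexed [example][class]."""
    H_new = [[H_ic + h_ic for H_ic, h_ic in zip(H_row, h_row)]
             for H_row, h_row in zip(H, h)]
    w_new = [[math.exp(-z_ic * H_ic) for z_ic, H_ic in zip(z_row, H_row)]
             for z_row, H_row in zip(z, H_new)]
    return H_new, w_new


# One example, two classes; the example belongs to the first class only.
H = [[0.0, 0.0]]
z = [[+1, -1]]
h = [[0.5, 0.5]]  # the weak classifier votes positive for both classes

H, w = boosting_round_update(H, z, h)
print(w)  # weight shrinks for the correct vote and grows for the wrong one
```

Recomputing exp(-z * H) after each round is equivalent to multiplying the previous weight by exp(-z * h) for the newly added weak classifier, so examples that the new classifier gets wrong receive more attention in the next round.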
Initialization: The initial population of the genetic algorithm
(GA) is created by two methods: the first randomly selects
the d value, and the second uses the Joint Boost
algorithm.
Evolutionary operators: For the crossover operator, we use
the two-point crossover technique; for the mutation
operator, we use the swap mutation method.
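Both operators are standard and can be sketched in a few lines of Python. The chromosome is assumed here to be a plain list of genes; the actual encoding used by the GA is not given in this section, so the function names and the toy parents are illustrative only.

```python
import random


def two_point_crossover(parent_a, parent_b, rng):
    """Exchange the segment between two random cut points of equal-length parents."""
    n = len(parent_a)
    i, j = sorted(rng.sample(range(n + 1), 2))  # two distinct cut points
    child_a = parent_a[:i] + parent_b[i:j] + parent_a[j:]
    child_b = parent_b[:i] + parent_a[i:j] + parent_b[j:]
    return child_a, child_b


def swap_mutation(chromosome, rng):
    """Return a copy with the genes at two distinct random positions swapped."""
    mutated = list(chromosome)
    i, j = rng.sample(range(len(mutated)), 2)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated


rng = random.Random(1)
parent_a = [1, 2, 3, 4, 5, 6]
parent_b = [7, 8, 9, 10, 11, 12]
child_a, child_b = two_point_crossover(parent_a, parent_b, rng)
mutant = swap_mutation(parent_a, rng)
print(child_a, child_b, mutant)
```

Two-point crossover keeps each gene at its original position, while swap mutation only reorders genes within one chromosome, so neither operator introduces gene values that were absent from the parents.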
Selection: To select the parents for crossover, one parent is
chosen at random from the whole population, and the other is
chosen at random from the 50% of the population with the best
fitness values.
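The selection rule can be sketched as below. The sketch assumes that a higher fitness value is better and that ties are broken by the sort order; the function name `select_parents` and the toy population are hypothetical.

```python
import random


def select_parents(population, fitness, rng):
    """Pick one parent uniformly from the whole population and the other
    uniformly from the better half (assumes higher fitness is better)."""
    first = rng.choice(population)
    ranked = sorted(population, key=fitness, reverse=True)
    top_half = ranked[:max(1, len(ranked) // 2)]
    second = rng.choice(top_half)
    return first, second


rng = random.Random(0)
population = list(range(10))  # toy individuals whose fitness is their own value
pairs = [select_parents(population, lambda x: x, rng) for _ in range(100)]
print(pairs[:5])
```

Drawing one parent from the whole population preserves diversity, while restricting the other to the better half adds selection pressure toward fitter individuals.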
IV.
Each weak classifier is a decision stump shared among the classes in a set $S$:
$$ h_i(c) = \begin{cases} a\,[v_{[r,t]}(i) > \theta] + b & \text{if } c \in S \\ k^c & \text{otherwise} \end{cases} $$
where $v_{[r,t]}(i)$ is the response of feature $(r, t)$ at pixel $i$ and $\theta$ is the threshold.
With $N_{ij}$ the number of pixels of true class $i$ inferred as class $j$, and $L$ the label set, the global and average accuracies are
$$ \text{global} = \frac{\sum_{i \in L} N_{ii}}{\sum_{i,j \in L} N_{ij}}, \qquad \text{average} = \frac{1}{|L|} \sum_{i \in L} \frac{N_{ii}}{\sum_{j \in L} N_{ij}} $$
CONCLUSION
ACKNOWLEDGMENT
We would like to thank Philipp Krähenbühl for providing us
with the data set and for sending us materials related to
their work on the object recognition problem.
This work was partially supported by the project "Some
Advanced Statistical Learning Techniques for Computer
Vision", funded by the National Foundation of Science and
Technology Development, Vietnam, under grant number
102.01-2011.17. The Vietnam Institute for Advanced Study
in Mathematics provided part of the funding for this
work.
REFERENCES
Class       Joint Boost   Mean-shift   Hierarchical CRF   Theme-based CRF   GAST
building    62            63           80                 78.4              65.1
grass       98            98           96                 95.3              89.5
tree        86            89           86                 73.3              65.6
cow         58            66           74                 79.8              74.6
sheep       50            54           87                 74.3              71.8
sky         83            86           99                 76.3              77.9
aeroplane   60            63           74                 80.7              86.4
water       53            71           87                 53.3              60.8
face        74            83           86                 66.4              84.8
car         63            71           87                 88                75.8
bike        75            79           82                 79                92.5
flower      63            71           97                 92.4              92.8
sign        35            38           95                 77.6              81.2
bird        19            23           30                 41.7              72.4
book        92            88           86                 93.6              98.8
chair       15            23           31                 66.1              98.8
road        86            88           95                 83.3              69.4
cat         54            33           51                 82.5              94.5
dog         19            34           69                 64.2              85.5
body        62            43           66                 64.1              65
boat        -             32           -                  24.2              50.4
Table 1: Comparison of segmentation accuracy (%) of Joint Boost [18], Mean-shift [20], Hierarchical CRF [20], Theme-based CRF [19] and GAST.
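As a quick consistency check, the per-class GAST values of Table 1 can be averaged and compared with the average accuracy of 78.7 reported with Table 2. The Python sketch below does only that arithmetic; the variable names are illustrative.

```python
# Per-class GAST accuracy (%) copied from the GAST column of Table 1.
gast_accuracy = {
    "building": 65.1, "grass": 89.5, "tree": 65.6, "cow": 74.6,
    "sheep": 71.8, "sky": 77.9, "aeroplane": 86.4, "water": 60.8,
    "face": 84.8, "car": 75.8, "bike": 92.5, "flower": 92.8,
    "sign": 81.2, "bird": 72.4, "book": 98.8, "chair": 98.8,
    "road": 69.4, "cat": 94.5, "dog": 85.5, "body": 65.0, "boat": 50.4,
}

# Unweighted mean over the 21 classes.
average_accuracy = sum(gast_accuracy.values()) / len(gast_accuracy)
print(round(average_accuracy, 1))  # 78.7
```

The unweighted mean treats every class equally, so rare classes such as boat influence it as much as frequent classes such as grass.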
Rows give the true class and columns the inferred class, in the order: building, grass, tree, cow, sheep, sky, aeroplane, water, face, car, bike, flower, sign, bird, book, chair, road, cat, dog, body, boat. Each row lists its non-blank entries from left to right.

True class   Entries
building     65.1 0.0 4.1 0.0 0.4 3.6 2.6 1.9 2.1 1.0 3.1 0.2 3.4 0.0 1.0 0.0 5.9 0.0 1.7 3.7
grass        0.7 89.5 2.9 1.9 0.6 0.7 0.6 0.5 0.2 0.2 0.1 0.6 0.8 0.1 0.7 0.0
tree         5.4 65.6 1.4 6.7 1.1 3.3 0.4 0.2 0.2 7.1 1.2 0.6 0.3 0.8 0.7
cow          22.5 0.2 74.6 2.7
sheep        1.7 24.5 0.7 0.3 71.8 0.1 0.7 0.2
sky          2.3 2.5 0.6 0.1 77.9 0.4 2.5 0.5 7.7 0.8 0.7
aeroplane    3.1 5.4 0.3 4.2 86.4 0.6
water        4.9 7.6 0.6 0.2 8.5 60.8 0.1 1.5 2.8 0.8 0.6 3.5
face         1.7 2.3 84.8 6.2 0.3 4.6
car          14.7 1.8 0.9 75.8 5.8 1.1
bike         2.3 0.1 0.1 0.9 92.5 4.1
flower       4.7 0.8 0.8 0.9 92.8 0.1
sign         18.4 0.2 81.2 0.2
bird         0.4 13.1 0.1 11.7 72.4 0.9 0.9 0.4
book         0.6 0.1 98.8 0.5
chair        0.3 98.8 0.9
road         6.2 1.4 3.1 0.6 0.2 0.7 0.3 0.7 2.8 1.3 4.2 69.4 2.7 3.2 1.2
cat          5.5 94.5
dog          0.9 0.3 10.4 85.5 2.9
body         7.4 2.8 2.7 14 0.1 0.6 1.8 1.7 3.8 65
boat         18.3 9.1 22.1 50.4

global       77.7
avg          78.7
Table 2: Segmentation accuracy (%) of our system on the MSRC 21-class data set
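The two summary figures can be computed from a confusion matrix of pixel counts as sketched below. The sketch assumes the usual definitions: global accuracy is the fraction of all pixels labeled correctly, and average accuracy is the mean over classes of the fraction of each true class labeled correctly. The function names and the two-class toy matrix are assumptions, and every class is assumed to contain at least one pixel.

```python
def global_accuracy(counts):
    """Correctly labeled pixels divided by all pixels.
    counts[i][j] = number of pixels of true class i inferred as class j."""
    correct = sum(counts[i][i] for i in range(len(counts)))
    total = sum(sum(row) for row in counts)
    return correct / total


def average_accuracy(counts):
    """Mean over classes of the per-class accuracy (row-normalized diagonal)."""
    per_class = [counts[i][i] / sum(counts[i]) for i in range(len(counts))]
    return sum(per_class) / len(per_class)


# Toy two-class example: class 0 is frequent, class 1 is rare.
toy_counts = [
    [8, 2],
    [1, 1],
]
print(global_accuracy(toy_counts))   # 9 correct out of 12 pixels
print(average_accuracy(toy_counts))  # mean of 8/10 and 1/2
```

The toy matrix shows why both numbers are reported: the frequent class dominates the global figure, while the average figure exposes the weaker result on the rare class.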
Figure 1: Comparison of the global and average results obtained by Joint Boosting [18], Mean-shift, Hierarchical CRF [20], Theme-based CRF [19], Fully connected CRF [21], and GAST over 20 runs.