
An Approach to Improve Flexible Manufacturing Systems with Machine Learning Algorithms


Hang Li
Institute of Automation and Information System
Technische Universität München
Garching near Munich, Germany
hang.li@tum.de

Abstract: Industrial electricity consumption accounts for a considerable share of gross electricity consumption compared with other sectors such as residential and agriculture. One crucial solution to this problem is to optimize the production structure. The Industry 4.0 initiative provides a more adaptable and flexible perspective for the smart factory. On the other hand, it increases the complexity of a manufacturing system. Machine learning algorithms are a family of approaches well suited to controlling a complex system and optimizing a stochastic process. In order to improve the performance of a production system, it must first be formulated as an executable model; suitable control policies can then be selected for it. In this paper, a classification algorithm and the Q-learning algorithm are implemented to reduce the electricity consumption in an automation system. The simulation results show that both are capable of controlling the multi-route transporting system and that the system performs better with the machine learning algorithms.

Keywords: machine learning algorithm, flexible manufacturing system, Q-learning, electricity consumption reduction

I. INTRODUCTION

Nowadays, scientists are researching clean and renewable energy [1-4] to improve the domestic energy structure and to cope with global climate and environmental issues for sustainable development. Another effective method that researchers are pursuing is to cut down the energy consumption of production processes and residential life [5-7].

Generally, it is difficult to cut down residential electricity consumption. In contrast, with a proper control policy, it is possible to reduce the electricity consumption of the industrial sector. Industrial electricity consumption accounts for more than 40% of gross consumption worldwide, and in the majority of countries it exceeds the consumption of other segments such as agriculture, forestry, and residential. In the OECD nations, industrial electricity consumption stays at a high proportion [8]. In all the OECD European countries, industry accounts for more than 20% of electricity consumption. Moreover, in 39.1% of the countries, industrial electricity consumption exceeds 40% of the gross electricity capacity.

In this paper, we want to reduce the electricity consumption in the manufacturing system with machine learning methods and compare the performance of two different algorithms for plant optimization. Machine learning is programming computers to optimize a performance criterion using example data or past experience [9]. In industrial practice, a robot or other actuator should be trained to learn from previous data or experience as the basis of future decisions; its behaviour can then be optimized to complete a series of operations with minimum energy.

Recently, much of the research on and application of machine learning methods has focused on the commercial domain, e.g. retail companies analysing customer behaviour [10-11] and social networks recommending friends or information the users are interested in [12], and on the bioinformatics domain, such as mining huge amounts of data [13-14].

Compared with its great success in business and health care, the application to manufacturing systems is still far from mature enough to reduce electricity consumption. This is because the vast majority of current production processes are deterministic. Moreover, a considerable part of the research focuses on the scheduling problem. P. Priore et al. applied inductive learning, backpropagation neural networks, and case-based reasoning to improve the manufacturing system [15]. Y. R. Shiue et al. proposed a learning model for dynamic scheduling with a genetic algorithm [16]. Few researchers optimize the manufacturing system with reinforcement learning algorithms. In [17], Teo et al. implemented a reinforcement learning algorithm on a city logistics system in e-commerce and then evaluated the system by its nitrogen oxides emissions.

However, the proposal of Industry 4.0 provides a wide stage for the implementation of machine learning algorithms. One basic feature of future production is flexibility [18]. This concept of flexibility differs from the previous one. Essentially, the previous concept means producing several products on one line, whereas in the future they will be produced on several lines in parallel. Regarding the structure of the flexible manufacturing system, one possible method to analyse it is the Petri net diagram [19]. In practice, however, the Petri net (PN) model tends to be too large.

978-1-5090-3474-1/16/$31.00 ©2016 IEEE


Another method, the machine learning algorithm, will be implemented in the following part.

Suppose a flexible manufacturing system consists of $n$ work stations $W_1, W_2, \ldots, W_n$, and each workstation provides several processes, e.g. an unfinished product can be handled with $\{p_1, p_2, \ldots\}$ at workstation $W_i$. Theoretically, the manufacturing system can therefore produce a large number of kinds of final products. Assume that one product needs only two processes, $\{p_1 \mid W_i\}$ and $\{p_2 \mid W_j\}$. The conventional flexible manufacturing system will lead the product through the entire production line and stop only to handle $p_1$ at work station $W_i$ and $p_2$ at work station $W_j$, which results in a waste of time and energy.

In the laboratory of the Chair of Automation and Information Systems at TUM, Germany [20], a flexible manufacturing system is implemented as shown in Fig. 1.

Fig. 1. The definition of the basic research object

The dark conveyor belts represent the routing plan of the previous FMS, and the bold-frame conveyor belts represent the routing plan of the new FMS. This kind of structure avoids redundant travel of the product but, on the other hand, increases the complexity of the system. Nevertheless, it also increases flexibility and adaptability. If there is a problem at one of the work stations or at a conveyor belt between adjacent work stations, the productive efficiency will be affected. With our transporting structure, however, the product has more options and can go to another branch, so the production process of the system will not be interrupted.

The motivation of this paper is to optimize the new flexible manufacturing system from the energy point of view with machine learning algorithms, and to compare the system performance under different algorithms, i.e. the support vector machine (SVM) algorithm and the Q-learning algorithm. The rest of the paper is organized as follows. Section II illustrates the system modelling and formulates the research problem. Section III introduces the optimization methods. The simulation results are given in Section IV to evaluate the control strategy. The last section concludes the paper.

II. SYSTEM MODELING AND PROBLEM FORMULATION

A. System modelling

As described in the first part, the manufacturing system has been extended with several branches including conveyor belts (CB) and conveyor switches (CS). Conveyor belts transport the products to the target work station, and conveyor switches divert the products to the corresponding belts.

Under normal circumstances, a conveyor switch cannot work independently, so it must cooperate with transporters. Obviously, one conveyor switch connects at least two belts. If a switch connects only two belts, it is merely a diverter. When it connects three or more belts, the system must consider not only which product will be delivered first, but also the coordination when several belts are bringing products to the conveyor switch simultaneously. So the fundamental research unit in the system consists of one steering device and three conveyor belts, which is defined as the trident node. In other words, the transporting tasks in the flexible manufacturing system are undertaken by the trident nodes. The decomposition of the manufacturing system into trident nodes is shown in Fig. 1.

In this work, one conveyor switch with two horizontal conveyor belts and one vertical conveyor belt is used to explain the basic unit, i.e. the node in the transporting route. One node contains four actuators, namely one conveyor switch and three conveyor belts. The target of a node is to decide the transport order and coordinate the behaviour of the conveyor belts, so the key component in the node is the conveyor switch; all its options are numbered with the subscript $4i+4$. On the global level, a product goes from the feeding part to the palletizing part. According to the system structure in Fig. 1, for the $i$-th node, a product comes from the right side in most cases. So the state variable of the conveyor belt on the right side is numbered with the subscript $4i+1$. Because of the relatively shorter length of the vertical conveyor belt, its state variable is numbered with the subscript $4i+2$. Finally, the state variable of the conveyor belt on the left side of the conveyor switch is numbered with the subscript $4i+3$.

Referring to the state space $S_i$ of the $i$-th node, which includes three conveyor belts, its state is described by the following attributes:

Kinematics: $k_{4i+1,2,3}$, with range {0, 1}. Here, 0 means stationary and 1 means moving;
Direction: $d_{4i+1,2,3}$, with range {0, 1}. Here, 0 means the belt is moving away from the conveyor switch and 1 means the belt is moving toward the conveyor switch;
Load: $l_{4i+1,2,3}$, with range {0, 1}. Here, 0 means the conveyor belt is idle, and 1 means it is loaded.

Besides, conveyor belts shared between adjacent nodes occur throughout the system. That means the state variables of adjacent nodes are dependent. For the $i$-th node, its state space can be described as,
$$S_i = \begin{bmatrix} k_{4i+1} & k_{4i+2} & k_{4i+3} \\ d_{4i+1} & d_{4i+2} & d_{4i+3} \\ l_{4i+1} & l_{4i+2} & l_{4i+3} \end{bmatrix} \tag{1}$$

Here the rows hold the kinematics ($k$), direction ($d$) and load ($l$) attributes of the three belts. So the whole manufacturing system can be described with a block periodic tridiagonal matrix,

$$S = \begin{bmatrix}
k_1 & k_2 & k_3 & & & & \\
d_1 & d_2 & d_3 & & & & \\
l_1 & l_2 & l_3 & & & & \\
 & & & \ddots & & & \\
 & & & & k_{4i+1} & k_{4i+2} & k_{4i+3} \\
 & & & & d_{4i+1} & d_{4i+2} & d_{4i+3} \\
 & & & & l_{4i+1} & l_{4i+2} & l_{4i+3}
\end{bmatrix} \tag{2}$$

B. Problem formalization

Because of the characteristics of the motors mounted on the conveyor belts and conveyor switches, the motor power is constant at a given speed.

$$P = \frac{2\pi n T}{60} \tag{3}$$

$$E = P \, \frac{L}{v} \tag{4}$$

where $P$ represents the power of the motor, $n$ the rotation rate of the motor (with $T$ the motor torque), $E$ the energy consumption, $v$ the velocity of the product, and $L$ the length of the conveyor belt within the node. Then the conclusion is drawn that,

$$E \propto L \tag{5}$$

Similarly, for the conveyor switches,

$$E \propto \theta \tag{6}$$

Here, $\theta$ represents the turned angle of the conveyor switch.

The transporting sequence is determined first; next, the velocity of the belts is decided. The action space $A_i$ of the $i$-th node contains the actions of the three conveyor belts and the central switch. For the conveyor belts in the $i$-th node, the action attribute is the velocity $v_{4i+1,2,3}$, which takes one of three values: full speed, half speed, or one-third speed. For the conveyor switch in the same node, there is only one attribute, the orientation $o_{4i+4}$, with range {0, 1, 2, 3}. Here, 0 means the switch faces east, which is also the default value; 1 means south; 2 means west; and 3 means north. So the action space of the $i$-th node is described as follows,

$$A_i = \left[\, v_{4i+1} \;\; v_{4i+2} \;\; v_{4i+3} \;\middle|\; o_{4i+4} \,\right] \tag{7}$$

It is written in the form of an augmented matrix, because the full-speed belts are handled first and the slowest belt is handled last, i.e. the handling sequence is implied by the velocity array. The problem eventually turns into deciding the velocity of each conveyor belt.

C. System constraints

Besides, the system is subject to several constraints, where $j \in \{1, 2, 3\}$ indexes the belts of the $i$-th node.

1) For a conveyor belt: no load, no movement.
If $l_{4i+j} = 0$, then $k_{4i+j} = 0$. (8)

2) In one node, it is forbidden for all the CBs to move toward the CS.
$d_{4i+1} \,\&\, d_{4i+2} \,\&\, d_{4i+3} = 0$ (9)

3) In one node, no more than 2 CBs are loaded.
$l_{4i+1} + l_{4i+2} + l_{4i+3} \le 2$ (10)

4) In one node, the CS may not carry the product to a loaded CB.
If belt $4i+j$ is loaded, then the orientation $o_{4i+4}$ must not point to belt $4i+j$. (11)

5) The n-nodes group.
For an $n$-nodes group, the number of free CBs is $n + 2$. In the $n$-nodes group, the shared CBs take opposite direction values with respect to the adjacent CSs. Within the $n$-nodes group, the number of products is no more than $n + 1$.

III. PROCESS OPTIMIZATION STRATEGY

According to the physical structure of the FMS in this paper, the production is a stochastic process rather than a deterministic one, because of the unforeseen product variants and production routes. For a stochastic process, machine learning methods are beneficial for optimizing an automation system [21]. Essentially, the research object is the choice of actuator actions under the stochastic process, and the evaluation criterion is the energy consumption. So, the goal of the node is to minimize the energy consumption. Let the state at time $t$ be $s_t$. For a decision that begins at time 0, the initial state is given by $s_0$. At any time, the set of possible actions depends on the current state and is denoted $\mathcal{A}(s_t)$ [22].

The value of the state-action pair $(s, a)$ under the policy $\pi$, denoted by $Q^{\pi}(s, a)$, represents the expected return when starting in state $s$, taking action $a$, and following policy $\pi$ thereafter:

$$Q^{\pi}(s, a) = E_{\pi}\{R \mid s, a\} \tag{12}$$

where $E_{\pi}\{\cdot\}$ denotes the expected value, here of the return derived from the energy consumption, under the stochastic dynamics, given that the controller uses policy $\pi$. The optimal action-value function is defined as the maximum over all policies:

$$Q^{*}(s, a) = \max_{\pi} Q^{\pi}(s, a) \tag{13}$$
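As a small numerical illustration of this definition, the sketch below forms the element-wise maximum over the Q-tables of two policies. It is only a sketch under stated assumptions: the policy set is restricted to two hypothetical policies, the state and action names and all values are invented, and the return is taken to be the negative energy consumption so that a larger return means less energy.

```python
from typing import Dict, List, Tuple

# A Q-table maps a (state, action) pair to an expected return.
QTable = Dict[Tuple[str, str], float]


def pointwise_max(q_tables: List[QTable]) -> QTable:
    """Element-wise maximum over the Q-tables of a finite set of policies."""
    keys = q_tables[0].keys()
    return {key: max(q[key] for q in q_tables) for key in keys}


# Hypothetical returns (negative energy, an assumed sign convention).
q_policy_a: QTable = {("s0", "left"): -310.0, ("s0", "right"): -280.0}
q_policy_b: QTable = {("s0", "left"): -290.0, ("s0", "right"): -300.0}

q_star = pointwise_max([q_policy_a, q_policy_b])
```

With this sign convention, the largest entry of `q_star` for a given state belongs to the action with the lowest energy consumption.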
Once $Q^{*}$ is known, an optimal policy $\pi^{*}$ (i.e. one that minimizes the energy consumption) can be found by an optimization over the action argument:

$$\pi^{*}(s) = \arg\max_{a} Q^{*}(s, a) \tag{14}$$

The system chooses one product to transport to the optional branch based on a random number, and then calculates the energy consumption $E_1$ in this situation. The system may choose other options during the stochastic process and obtains the energy consumptions $E_2, E_3, \ldots, E_n$; then the optimal option is decided. This is the classification method.

Another control policy for this problem is the Q-learning algorithm. The Q-learning algorithm estimates $Q^{*}$ iteratively from the interaction between the actuators and the environment. It applies the following update to search for the optimal action:

$$Q_{t+1}(s_t, a_t) = Q_t(s_t, a_t) + \alpha \left[ r_{t+1} + \gamma \max_{a'} Q_t(s_{t+1}, a') - Q_t(s_t, a_t) \right] \tag{15}$$

Here, $r_{t+1}$ represents the reward observed by the controller while interacting with the environment; $\gamma$ represents the discount rate and $\alpha$ is the learning rate.

The system transporters fulfil the following conditions, under which it has been proved that $Q_t$ converges to $Q^{*}$ as $t \to \infty$ [23]:

1) Explicit, distinct values of the Q-function are stored and updated for each state-action pair.
2) The sum of the squares of the learning rates $\alpha_t$ is finite, whereas the sum of $\alpha_t$ is infinite.
3) The controller keeps trying all actions in all states with nonzero probability.

IV. SIMULATION RESULTS

A standard machine learning process generally comprises data collection, data preparation, data analysis, algorithm training, algorithm testing, and algorithm implementation. In the simulation process, a random number is generated first, and the system enters the corresponding sub scenario. Then the node decides the handling sequence. Simultaneously, the system keeps a record of the energy consumption. This process is simulated in the CoDeSys environment with code conforming to IEC 61131-3. Moreover, the test in each sub scenario is carried out 50 times, and the average electricity consumption is then calculated to evaluate the algorithm.

In the simulation, 6 groups of tests are executed according to the system structure. In the tests, all products pass through node 4. In the first three groups of simulations, two products pass the node group. In the last three simulations, only one product passes through the nodes and then chooses one node for further operation. Each scenario is specified by the truth table below.

TABLE I. THE SPECIFIC INFORMATION OF SUB SCENARIOS.

Sub scenario   CB17.load   CB18.load   CB19.load
1              True        True        False
2              True        False       True
3              False       True        True
4              True        False       False
5              False       True        False
6              False       False       True

An example of sub scenario 1 is displayed in Fig. 2. In this case, node 1 and node 2 are each occupied by one product, and the two products then pass node 4 through CB17 and CB18 respectively. The possible actions of each sub scenario are summarized in Table II.

Fig. 2. The simulation model of sub scenario 1.

TABLE II. THE ALTERNATIVE ACTIONS AND SEQUENCES OF EACH UNIT

Action 1          Action 2          Action 3          Action 4
① 17.load → 19    ① 18.load → 19    ① 17.load → 19    ① 18.load → 19
② 18.load → 19    ② 17.load → 19    ② 18.load → 17    ② 17.load → 18

① 18.load → 17    ① 19.load → 17    ① 18.load → 17    ① 19.load → 17
② 19.load → 17    ② 18.load → 17    ② 19.load → 18    ② 18.load → 19

① 17.load → 18    ① 19.load → 18    ① 17.load → 18    ① 19.load → 18
② 19.load → 18    ② 17.load → 18    ② 19.load → 17    ② 17.load → 19

17.load → 18      17.load → 19      -                 -
18.load → 17      18.load → 19      -                 -
19.load → 17      19.load → 18      -                 -

Here, the circled number gives the transporting sequence of the system, and the arrow gives the transporting direction.

Fig. 3. The possible routing plan for a product in sub scenario 1

Fig. 3 displays the possible actions in sub scenario 1. The dashed lines represent route 1, and the number in the dashed circle gives the transporting sequence of route 1. The dash-dot lines represent route 2, and the number in the dash-dot circle gives the transporting sequence of route 2. In the simulation process, a product consumes 40 units of electric energy when it passes one horizontal CB, and 30 units when it passes the vertical CB because of its shorter length. A CS consumes 15 units when it turns 90 degrees. According to equation (6), the electricity consumption of a CS is proportional to its turned angle, so a CS consumes 30 units of energy when it turns 180 degrees. After sufficient time, the system completes the algorithm training period and obtains the energy cost of each action in the sub scenarios, as shown in Table III.
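The unit costs quoted above can be turned into a small cost function. The following is a minimal sketch, assuming that the cost of a route is the plain sum of its belt passes and switch turns; the paper does not spell out how the totals of its energy table are composed, so the function names and the additive model are illustrative assumptions.

```python
from typing import Iterable

HORIZONTAL_CB_COST = 40   # units per product passing one horizontal conveyor belt
VERTICAL_CB_COST = 30     # units per product passing the shorter vertical belt
CS_COST_PER_90_DEG = 15   # units per 90-degree turn of a conveyor switch


def switch_cost(angle_deg: float) -> float:
    """Energy of one switch turn, proportional to the turned angle."""
    return CS_COST_PER_90_DEG * angle_deg / 90


def route_cost(horizontal_belts: int, vertical_belts: int,
               turn_angles_deg: Iterable[float]) -> float:
    """Assumed additive model: belt passes plus all switch turns of a route."""
    belts = horizontal_belts * HORIZONTAL_CB_COST + vertical_belts * VERTICAL_CB_COST
    return belts + sum(switch_cost(angle) for angle in turn_angles_deg)
```

For example, under this assumed model `route_cost(1, 1, [90])` covers one horizontal belt, one vertical belt and a single 90-degree turn.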
Fig. 5. The average energy consumption with two algorithms in scenario 2 (x-axis: trial times, 1 to 46; y-axis: electricity consumption, 270 to 320; series: SVM, Q-learning)

TABLE III. THE ENERGY CONSUMPTION IN THE DIVERSE SUB SCENARIOS WITH THE DIFFERENT ACTIONS

Sub scenario   Action 1   Action 2   Action 3   Action 4
1              260*       290        295        340
2              310        280*       290        340
3              310        280*       340        290
4              160*       180        -          -
5              160        130*       -          -
6              180        130*       -          -

The minimum of each row is marked with an asterisk. Together, the minima compose the support vectors in the solution space; in the algorithm implementation process, the system then chooses the corresponding action to reduce the energy consumption by applying the support vector machine algorithm.
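The selection rule described for Table III can be sketched in a few lines. The energy values below are copied from Table III; the function names are hypothetical, and the assumption that an untrained system picks each action with equal probability is ours. Under that assumption the row means reproduce the expected consumptions of 296.25, 305, 305, 170, 145 and 155 quoted in the results.

```python
from statistics import mean
from typing import Dict, List

# Energy cost of each alternative action, per sub scenario (from Table III).
ENERGY_TABLE: Dict[int, List[int]] = {
    1: [260, 290, 295, 340],
    2: [310, 280, 290, 340],
    3: [310, 280, 340, 290],
    4: [160, 180],
    5: [160, 130],
    6: [180, 130],
}


def best_action(costs: List[int]) -> int:
    """0-based index of the cheapest action of one sub scenario."""
    return min(range(len(costs)), key=costs.__getitem__)


def expected_cost(costs: List[int]) -> float:
    """Mean cost if every action is equally likely (assumed random choice)."""
    return mean(costs)


trained_costs = {s: costs[best_action(costs)] for s, costs in ENERGY_TABLE.items()}
untrained_costs = {s: expected_cost(costs) for s, costs in ENERGY_TABLE.items()}
```

Here `trained_costs` holds the energy of the action the system applies after training, and `untrained_costs` the baseline it is compared against.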
In the previous section, the other algorithm mentioned is the Q-learning algorithm. In the Q-learning algorithm, the parameters are first initialized as follows.

TABLE IV. THE PARAMETER SETTING OF THE Q-LEARNING ALGORITHM

parameter   value
0           0
            0.9
            0.9
            0.1

Then the reward is calculated under each circumstance as follows.

= arg arg( | )
2

The experiment has been run 50 times, and the results are displayed in Fig. 4 to Fig. 9.

Fig. 6. The average energy consumption with two algorithms in scenario 3 (x-axis: trial times, 1 to 46; y-axis: electricity consumption, 260 to 360; series: SVM, Q-learning)

Fig. 7. The average energy consumption with two algorithms in scenario 4 (x-axis: trial times, 1 to 46; y-axis: electricity consumption, 150 to 190; series: SVM, Q-learning)
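The Q-learning update of equation (15) can be sketched as a single tabular step. This is a minimal illustration rather than the simulated controller: the values 0.1 and 0.9 appear in Table IV, but assigning them to the learning rate and the discount rate is an assumption, as are the state and action names and the use of negative energy as the reward.

```python
from collections import defaultdict
from typing import DefaultDict, Hashable, Iterable, Tuple

QTable = DefaultDict[Tuple[Hashable, Hashable], float]


def q_update(q: QTable, state, action, reward: float, next_state,
             next_actions: Iterable, alpha: float, gamma: float) -> float:
    """One tabular Q-learning step: Q += alpha * (r + gamma * max Q' - Q)."""
    # A terminal next state has no actions, so its best value is taken as 0.
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return q[(state, action)]


ALPHA = 0.1   # learning rate (assumed assignment of a Table IV value)
GAMMA = 0.9   # discount rate (assumed assignment of a Table IV value)

q: QTable = defaultdict(float)   # every Q-value starts at 0
# Hypothetical episode: one action costing 280 energy units ends the episode.
new_value = q_update(q, "scenario_2", "action_2", -280.0, "done", [], ALPHA, GAMMA)
```

Repeating the step moves the stored value a fraction `ALPHA` of the way toward the observed target each time, which is why early trials fluctuate before the estimates settle.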

Fig. 4. The average energy consumption with two algorithms in scenario 1 (x-axis: trial times, 1 to 46; y-axis: electricity consumption, 260 to 330; series: SVM, Q-learning)

Fig. 8. The average energy consumption with two algorithms in scenario 5 (x-axis: trial times, 1 to 46; y-axis: electricity consumption, 130 to 190; series: SVM, Q-learning)
Fig. 9. The average energy consumption with two algorithms in scenario 6 (x-axis: trial times, 1 to 46; y-axis: electricity consumption, 130 to 190; series: SVM, Q-learning)

From the simulation results, in each sub scenario both algorithms improve the system performance, and both reduce the average electricity consumption.

The expected electricity consumption of the sub scenarios is 296.25, 305, 305, 170, 145, and 155 respectively. After 50 trials, the average electricity consumption declines by 9.33% and 11.46%; 5.97% and 7.93%; 5.57% and 8.13%; 4% and 5.65%; 5.24% and 9.52%; and 10.97% and 14.84% with the SVM and the Q-learning algorithm respectively, for sub scenarios 1 to 6. Both tend to converge to the minimum value, and the Q-learning algorithm converges quickly.

The average electricity consumption fluctuates at the very beginning with the SVM algorithm because of its training process, during which the system chooses its behaviour at random.

V. CONCLUSION

In this paper, the system simulation was completed first, and the flexible manufacturing system optimization problem was then formulated as a computer-executable mathematical model. Next, two algorithms were implemented: a classification algorithm and the Q-learning algorithm.

The classification algorithm is a model-based offline algorithm; after a period of time the system finds the optimal solution, which it always applies after the training process. The Q-learning algorithm is a model-free online algorithm, and it converges quickly with a proper reward function. In short, both algorithms can reduce the electricity consumption of the flexible manufacturing system, and the performance with the Q-learning algorithm is slightly better than with the SVM algorithm. For the optimization of the electricity consumption in a flexible manufacturing system, the Q-learning algorithm has the better application prospects.

REFERENCES

[1] J. M. Carrasco, L. G. Franquelo, J. T. Bialasiewicz, E. Galván, R. C. P. Guisado, M. Á. M. Prats, and N. Moreno-Alfonso, "Power-electronic systems for the grid integration of renewable energy sources: A survey," IEEE Transactions on Industrial Electronics, vol. 53, no. 4, pp. 1002-1016, 2006.
[2] S. Carley, "State renewable energy electricity policies: An empirical evaluation of effectiveness," Energy Policy, vol. 37, no. 8, pp. 3071-3081, 2009.
[3] H. Lund and B. V. Mathiesen, "Energy system analysis of 100% renewable energy systems: The case of Denmark in years 2030 and 2050," Energy, vol. 34, no. 5, pp. 524-531, 2009.
[4] P. D. Zhang, Y. L. Yang, S. Jin, Y. H. Zheng, L. S. Wang, and X. R. Li, "Opportunities and challenges for renewable energy policy in China," Renewable and Sustainable Energy Reviews, vol. 13, no. 2, pp. 439-449, 2009.
[5] Z. C. Guo and Z. X. Fu, "Current situation of energy consumption and measures taken for energy saving in the iron and steel industry in China," Energy, vol. 35, no. 11, pp. 4356-4360, 2010.
[6] J. K. Lu, T. M. Sookoor, V. Srinivasan, G. Gao, B. Holben, J. Stankovic, E. Field, and K. Whitehouse, "The smart thermostat: Using occupancy sensors to save energy in homes," in Proceedings of the 8th ACM Conference on Embedded Networked Sensor Systems, ACM, 2010.
[7] K. Langendoen, A. Baggio, and O. Visser, "Murphy loves potatoes: Experiences from a pilot sensor network deployment in precision agriculture," in Proceedings 20th IEEE International Parallel & Distributed Processing Symposium, IEEE, 2006.
[8] Electricity Information. Paris: International Energy Agency, 2012.
[9] E. Alpaydin, Introduction to Machine Learning. MIT Press, 2014.
[10] L. Y. Chu, "Development of an integrated intelligent product assortment optimization model for apparel retailing," Dissertation, The Hong Kong Polytechnic University, 2012.
[11] V. L. Miguéis, D. Van den Poel, A. S. Camanho, and J. F. Cunha, "Predicting partial customer churn using Markov for discrimination for modeling first purchase sequences," Advances in Data Analysis and Classification, vol. 6, no. 4, pp. 337-353, 2012.
[12] Eklaspur, M. Namrata, and S. Pashupatimath Anand, "A friend recommender system for social networks by life style extraction using probabilistic method-Friendtome," International Journal of Computer Science Trends and Technology (IJCST), vol. 3, no. 3, 2015.
[13] W. Yu, M. Gwinn, M. Clyne, A. Yesupriya, and M. J. Khoury, "A navigator for human genome epidemiology," Nature Genetics, vol. 40, no. 2, pp. 124-125, 2008.
[14] J. Bacardit and X. Llorà, "Large-scale data mining using genetics-based machine learning," Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol. 3, no. 1, pp. 37-61, 2013.
[15] P. Priore, D. de la Fuente, J. Puente, and J. Parreño, "A comparison of machine-learning algorithms for dynamic scheduling of flexible manufacturing systems," Engineering Applications of Artificial Intelligence, vol. 19, no. 3, pp. 247-255, 2006.
[16] Y. R. Shiue, R. S. Guh, and K. C. Lee, "Development of machine learning-based real time scheduling systems: Using ensemble based on wrapper feature selection approach," International Journal of Production Research, vol. 50, no. 20, pp. 5887-5905, 2012.
[17] J. S. Teo, E. Taniguchi, and A. G. Qureshi, "Evaluating city logistics measure in e-commerce with multiagent systems," Procedia - Social and Behavioral Sciences, vol. 39, pp. 349-359, 2012.
[18] B. Vogel-Heuser, D. Schütz, and P. Göhner, "Agentenbasierte Kopplung von Produktionsanlagen," Informatik-Spektrum, vol. 38, no. 3, pp. 191-198, 2015.
[19] M. C. Zhou and F. DiCesare, Petri Net Synthesis for Discrete Event Control of Manufacturing Systems, vol. 204. Springer Science & Business Media, 2012.
[20] S. Ulewicz, D. Schütz, and B. Vogel-Heuser, "Flexible real time communication between distributed automation software agents," in 22nd International Conference on Production Research (ICPR), Aug. 2013.
[21] H. Li, "The implementation of reinforcement learning algorithms on the elevator control system," in IEEE 20th Conference on Emerging Technologies & Factory Automation (ETFA), 2015.
[22] C. J. Watkins and P. Dayan, "Q-learning," Machine Learning, vol. 8, no. 3-4, pp. 279-292, 1992.
[23] T. Jaakkola, M. I. Jordan, and S. P. Singh, "On the convergence of stochastic iterative dynamic programming algorithms," Neural Computation, vol. 6, no. 6, pp. 1185-1201, 1994.
