Big Data Analysis of Hollow Fiber Direct Contact Membrane Distillation

Big Data Analysis of Hollow Fiber Direct Contact
Membrane Distillation (HFDCMD) for Simulation-based

Empirical Analysis
Albert S. Kim , Seo Jin Ki , and Hyeon-Ju Kim

a, *
Civil and Environmental Engineering, University of Hawaii at Manoa,
2540 Dole Street Holmes 383, Honolulu, Hawaii 96822, USA

b
Deep Ocean Water Application Research Center, Korea Research Institute
of Ships and Ocean Engineering, Goseong-gun, Gangwon-do 219-822,

Republic of Korea
A manuscript for
Desalination
Abstract
Keywords: Self-Organizing Map, Multiple linear regression

1. Introduction
Membrane distillation (MD) is a non-isothermal process in which the
partial pressure gradient of evaporated water vapor is the driving force for
mass transfer with an accompanied transformation between liquid and gas

phases [1]. MD received close attention due to its capability to produce clean
(distillated) water using various energy resources including industrial waste,
low-grade heat, and alternative ones. Rejection of solutes in MD processes is
based on selective evaporation of water molecules, while solute molecules
stay in the liquid feed phase. In principle, the rejection ratio of MD is as high
as that of reverse osmosis (RO), and even superior for the removal of arsenic
and boron [2, 3]. The current bottleneck of MD technology includes lower
distillate (permeate) flux and membrane wetting. Applied pressure of RO can
be flexibly adjusted for required performance, but the driving force of MD is
bounded to the vapor pressure difference between hot and cold water
streams, of which the theoretical maximum is the atmospheric pressure. To
minimize heat transfer across the membrane, fast stream flow tangent to the
membrane surface should be established, but high hydraulic pressure in
membrane channels can generate liquid flow through membrane pores, i.e.,
membrane wetting. The liquid entry pressure needs to be carefully estimated
based on physical and chemical properties of the membrane, under which
hydraulic pressure is applied to generate tangential flows.
Membrane distillation (MD) has several types in terms of condensation
phases and methods, which include direct contact membrane distillation
(DCMD), vacuum membrane distillation (VMD), sweep gas membrane
distillation (SGMD), and air gap membrane distillation (AGMD). DCMD has low
resistance to mass (and heat) transfer, it is easy to setup and operate the
system, and it does not require external condensers [1]. VMD provides higher
distillate flux than that of DCMD because the driving force is enhanced for
water evaporation on the hot feed side by applying low vacuum pressure (1
3% of atmospheric pressure) on the distillate side in the absence of air in
pore spaces. SGMD collects evaporated water using a sweep gas (typically
air) and condenses the mixed gas externally. Therefore, both VMD and SGMD
require external condensers. AGMD utilizes cold wall surfaces on which
evaporated water condenses and forms dews. As water vapor molecules stay
near the cold wall until the gas-to-liquid phase transformation is completed,
the mass transfer resistance of AGMD is the highest among others.
Hydrophobic MD membranes are often prepared as flat sheets. Hollow
fiber (HF) membranes are instead preferred in industrial applications because
of its high packing density, i.e, total membrane surface area available per
packed volume. HF membranes are characterized by pore size, outer (or
inner) diameter, and length, which are of orders of O(0.1) m, O(0.1) mm,
and O(0.1) m, respectively. Therefore, both geometrical ratios of the fiber
length to diameter and diameter to pore size are of the order of O(10 ). When
3
HFs of 1 mm diameter are packed in a vessel of inner diameter 10 cm for

example, the number of HFs can easily exceed 5,000. These multi-scale
properties of hollow fiber direct contact membrane distillation (HFDCMD)
processes make direct simulation studies of concurrent mass/heat transfer
phenomena formidable. Dominant transfer mechanisms of HFDCMD are
momentum/heat transfer in the lumen and shell sides and heat/mass
transfer in solid and void membrane regions. Characteristic parameters of

HFDCMD processes can be categorized into three groups: (1) operation
condition (temperatures and flow rate of the lumen and shell inlets), (2)
intrinsic membrane properties (pore size, porosity, tortuosity, and thermal
conductivity), and (3) membrane geometry (inner and outer radii, length,
and packing density). To overcome the above-mentioned fundamental
difficulties of HFDCMD investigation, a small number (of an order of O(10))
fibers were used for experimental [4] and modeling [5, 6] studies. On the
other hand, cylindrical cell model was developed for a large number of
densely packed HFs to investigate the vapor mass transfer as correlated with
heat transfer [7]. As implied, it is a difficult task to thoroughly model MD
phenomena using a finite number of HFs with the geometrical complexity. In
MD, mass transfer phenomena is strongly correlated with conductive and
convective heat transfer through the solid part and void pores of the
membrane, respectively. The prediction of vapor flux (i.e., distillate
production rate) requires to estimate of heat transfer coefficient in the lumen
and shell interfaces at different temperatures. Empirical correlations were
often employed to predict the amount of trans-membrane heat transfer.
These equations were developed for pure heat transfer along impermeable
ducts using experimental and theoretical results, and applied to coupled heat
and mass transfer in HFDCMD (see the next section for details).
When a model is developed and verified with experimental data, a
number of cases can be tested to collect physically meaningful cases and
further to propose optimal operation conditions. Usually, this approach can

noticeably reduce the number of (costly) experiments in variety of science
and engineering disciplines. When a large amount of data are produced
using a series of simulations, analysis of the big data is another important
task. The National Big Data Research and Development Initiative was
announced by the US White House to help solve some of the Nations most
pressing challenges for scientific discovery, environmental and biomedical
research, education and national security [8]. It aims to make the most of the
fast-growing volume of digital data and to improve knowledge and insights
from the large and complex collections of digital data. Specific engineering
processes such as MD have a variety of operation variables and options.
Controlling parameters of DCMD easily exceeds 10 (see Section 2.3.2). If
each parameter has 5 candidate values, the total number of operation cases
is 5 =9,765,625, which is almost 10 millions. It is practically impossible to
10
conduct this number of experiments within a limited amount of time to find

optimal conditions. If a physical model that can predict fundamental trend of
engineering phenomena is however available, key governing parameters of
the MD performance and their inter-relationships can be readily identified by
analyzing the big data of MD simulations. Along this scope, we use our
previous modeling tool, specifically developed for HFDCMD [7], generate
simulation outputs of an order of 10 million cases, and analyze the big data
using advanced statistical approach, i.e., self-organizing map as compared to
conventional multiple linear regression. In this paper, we aim to understand
the relative significance of core parameters and their importance on

mass/heat transfer phenomena, link conventional dimensionless numbers to
the direct simulation results, and finally provide a phenomenological
correlation for fast estimation of mass and heat transfer.
2. Theory and simulations

2.1. System configuration
We consider a cylindrical vessel containing a large number of hollow
fibers of an order of at least a few thousands. The large number of fibers and
geometrical complexity make formidable to do direct simulations of coupled
heat and mass transfer occurring through individual fibers. We previously
developed a model to analyze the performance of HFDCMD by considering a
cylindrical cell, which was assumed to contain important physical and
engineering characteristics [7].
We consider the hot-out/cold-in (HOCI) operation mode of HFDCMD,
where the hot and cold streams flow in the shell and lumen regions,
respectively, because this mode provides a higher distillate flux than the hotin/cold-out (HICO) mode. The hot feed stream flows with shell velocity v
carrying thermal energy of temperature T (typically 4090C). A cold
f
permeate stream of temperature T (525C) flows in the opposite direction

p
to the shell flow. This counter-current flow maintains the trans-membrane

temperature gradient pseudo-constant along the fiber.
The thermal conductivity of membrane materials for MD is of an order of

O(0.1) W/mK. The heat transfer from the hot feed to the cold permeate
consists of conduction though the solid part of the membrane and
convection by the vapor flux through the void membrane pores. Water
evaporates near pore inlets at the feed-membrane interface and migrates to
pore
outlets
of
the
permeate
side.
This
evaporation
precedes
the
transformation of hot water from its liquid to gas (vapor) phases, consuming
the ambient thermal energy of the shell stream as much as the latent heat.
Therefore, the vapor temperature at the pore inlet is close to the feed
temperature in the local thermodynamic equilibrium. As vaporized water
molecules diffuse through (tortuous) membrane pores, they carry thermal
energy from the feed-membrane to the permeate-membrane interfaces. This
increases the permeate temperature along the fiber. Viscous mass transfer is
negligible in DCMD because water and air molecules are confined in pore
spaces.
The counter-current cold stream obtains (i) the conducted heat at the
solid membrane surface and (ii) the convected mass and carried heat at the
pore outlets. The transfer rate of the water vapor determines the MD
performance with predetermined operation conditions. The hot feed stream
looses heat as much as the cold stream receives, and therefore the
feed/permeate stream temperatures decreases/increases along the fiber.
Because the lumen volume is usually smaller than the shell volume (per
fiber), the lumen temperature varies more rapidly in the longitudinal
direction than the shell temperature. In this case, the dominant transfer
mechanisms in each region are as follows: (i) momentum and heat transfer
in the lumen and shell regions and (ii) heat and mass transfer in the void and
solid
membrane
parts.
These
mechanisms
can
be
characterized
quantitatively using representative dimensionless numbers.

If N hollow fibers of inner and outer radii of a and b, respectively, are
f
packed in a cylindrical vessel of radius R , then the packing fraction is

v
calculated
as
and
average
volume
per
each
fiber
is
c L
2
where
) is the cell radius (see Fig. 1(a)). Throughout this paper, the inner, shell, and
cell surfaces indicates r=a, b, and c, respectively. Fluid velocities are
assumed to be zero on the inner and shell surfaces of the HF membrane, and
tangential stress and radial temperature gradient are presumed to be zero
on the cell surface. Mathematical details can be found in our previous work
[7].
2.2. Dimensionless number analysis for HFDCMD

2.2.1.Overview
In MD processes, heat transfer rate can be expressed as
(1)
where
i=f,
m,
and
p,
indicating
feed,
membrane,
and
permeate,
respectively. On the same line, T , T , and T , are temperature

f
differences between the bulk feed and feed-membrane interface, across the
membrane of thickness (=ba), and between the permeate-membrane
m
interface and bulk permeate stream, respectively. As temperature is a point

function in the cylindrical cell, h and T are often implied as length-averaged
i
quantities. Within the HF membrane, we have h =h +J H T , where h is

m
m0
m0
the pure heat transfer coefficient of the porous membrane and H is the
w
evaporation enthalpy of water. Phenomenologically, the vapor flux is

proportional to the vapor pressure difference between the hot feed and cold
permeate temperatures:
(2)
where C is the membrane mass transfer coefficient and P is the partial

m
pressure of the vapor gas. The complexity comes from the coupling of Q and
i
J through h . In steady state, Q in each region should be equal to each other

w
because there is neither source nor sink of heat in the vessel. Then, one
writes
(3)
Estimation of h and h is a crucial step for the performance analysis because
f
they
vary
with
geometrical
and
fluid
dynamic
properties.
Empirical
correlation previously developed using heat transfer experiments (or
simulations) is well accepted in MD literature. These correlations represent

the Nusselt number (Nu) as a function of Reynolds (Re), Prandtl (Pr), Grashoff
(Gr) numbers as well as the ratio of hydraulic diameter (d ) to channel length
h
(L).
Gryta et al. [9] summarized 13 correlations commonly used for the
estimation of the Nusselt number in cylindrical conducts under laminar flow
(see Fig 1(b)), and concluded that heat transfer for MD carried out in platand-frame modules can be best quantified by
(4)
For laminar flow (Re<2100) of hollow fiber and capillary modules, Gryta and
Tomaszewska [10] suggested two correlations:
(5)
and
(6)
and reported that Eq. (6) provides better match than Eq. (5) with
experimental observation. Some researchers used two different correlations
to estimate heat transfer coefficients in feed and permeate sides separately.
Kim et al. [11] used Eq. (6) for the tube (lumen) side stream and used
Groehns correlation for the shell side of a hollow fiber module [12]:
(7)
where is the yaw angle. Similarly, Edwie and Chung [13] used two different
empirical correlations for heat transfer of the hot feed and cold distillate
streams for HFDCMD. Phattaranawik et al. [14] and Ho et al. [15] used Eq. (6)
for the flat-sheet MD modules. These correlations were interchangeably used
for the flat-sheet and hollow-fiber modules by employing the hydraulic
diameter. In this light, selection of specific correlations seems to be
subjective to best explain experimental observations. A recent summary of
empirical correlations for heat transfer used for MD can be found elsewhere
[16].
The inner and outer radii of hollow fiber membrane (i.e., a and b,
respectively) are of an order of O(0.1) mm and the thickness has a
m
comparable size to a. This implies that curvature effect of the hollow fiber
needs to be carefully considered in solving governing equations. In a steady
state (t=0), heat and mass flux, q and J , across the membrane satisfies,
w
and
(8)
respectively. Because there is neither source nor sink of heat and mass in the
vessel, divergences of q and J must vanish. One can assume that heat
w
transfer across the HF membrane is predominant in the radial direction

(indicated using subscript r):
(9)
where T is the absolute temperature within membrane, is the thermal

conductivity
of
the
porous
membrane
and
is
the
effective
molar
enthalpy
using
=24.527
kJ/mol
and
=5.2434102
kJ/molK.
Ficks
law
may represent
through
the
va
por
flux
p
(10)
where n is the molar concentration of water vapor in pore spaces [mol/m ]
w
and D is the effective diffusion coefficient as a combination of Brownian and

eff
Knudsen diffusivity using Bosanquets relationship [7, 17, 18]. General
solutions for the coupled Eqs. (9) and (10) were analytically obtained using
perturbation theory, and an iterative method was used to calculate local and
length-averaged profiles of heat and mass fluxes along the hollow fiber [7].
2.2.2.Lumen and shell regions

Fluid mechanic properties in the lumen and shell regions include
stream speeds (i.e., u and v for the lumen and shell, respectively) and
hydraulic diameter (d ). Throughout this paper, variables for the lumen and
h
shell regions are indicated using subscripts, lmn and shl, respectively.
Thermodynamic properties of water are density (), dynamic viscosity (),
kinematic viscosity (), specific heat at constant pressure (c ), and thermal
p
conductivity (k ) as functions of temperature.

w
Hydraulic diameter
As discussed above, heat transfer correlations used for MD were developed
using conducts, as shown in Fig. 1(b). A cylindrical rode of radius b is inside a
thermally insulating impermeable pipe of radius c. The hydraulic diameter is
defined as
(11)
where V (=(c b )L) is the liquid volume and S (=2(b+c)L) is the wetted
2
surface area by the liquid. Then, we calculate
(12)
where =b c is the volume (or packing) fraction of the rode inside the
2
pipe. Figs. 1(a) and (b) have similar characteristics at r=c. Our previous
work
assumed
and
(in Fig. 1(a)), of which the former indicated thermally insulated surface of the
rode-containing pipe (in Fig. 1(b)). Therefore, the effective diameter for heat
transfer must be calculated using conducting surface only (instead of wetted
surface), S =2bL. Then we have
c
(13)
because S >S or <1.0. On summary, the characteristic diameters of the
w
shell and lumen regions are
(14)
and
(15)
respectively.
Reynolds number
The momentum transfer is characterized using Reynolds number, defined as
a ratio of inertial and viscous forces:
(16)
where the kinematic viscosity of water varies with fluid temperature: 1.307
(at 10C), 0.553 (at 50C), and 0.326 (at 90C) in 10 m /s. This noticeable
6
dependence of on stream temperatures significantly influences the

performance of MD in terms of mass and heat fluxes. (see Appendix for
functional forms of and with respect to temperature T). In Eq. (16), U is
the (cross-section averaged) flow speed and d is the hydraulic diameter. The
h
Reynolds number includes hydrodynamic characteristics in the longitudinal

direction of the fiber, and is calculated for lumen and shell regions
individually, as denoted Re
lmn
and Re , respectively. In the lumen region, U=

shl
u, d =2a, and = (T ); and in the shell region, U=v, d = d , and =

h
(T ), where T
shl
lmn
lmn
and T are the lumen and shell temperatures, respectively.

shl
Prandtl and Peclet numbers

The ratio of viscous to thermal diffusion rate is defined as Prandtl number:
(17)
where =k c [m /s] is the thermal diffusivity, c [J/kgK] is the specific
w
heat at constant pressure, and k [W/mK] is the thermal conductivity of

w
water [19] (see Appendix for their dependence on temperature). Prandtl

number includes only thermal and viscous properties of water, which are
independent of module geometry and fluid dynamics. A product of Prandtl
and Reynolds numbers is themal Peclet number, i.e., a ratio of advective
transport and thermal diffusion: Pe=RePr. In lumen and shell regions, Peclet
numbers are denoted as
(18)
and
(19)
respectively, where the kinematic viscosity is represented as a function of
temperature.
2.2.3. Solid and void membrane regions

Nusselt number
The ratio of convective to conductive heat transfer is represented as Nusselt

number:
(20)
where (by definition) h [W/m K] is the convective heat transfer coefficient, L

2
[m] is the characteristic length, and k
[W/mK] is the membrane heat
conductivity. Here, h can be interpreted by heat flux at the feed-membrane

interface per unit temperature and L is the outer diameter of the fiber, 2b.
c
The absence of the inner diameter 2a in the Nusselt number is due to

empirical correlations developed using conducts as shown in Fig. 1(b). In
cylindrical coordinate system (as applied to HFDCMD in this paper),
heat/mass flux defined as heat/mass transfer rate per unit surface area does
not provide consistent understanding of transfer phenomena in the radial
direction. The product hL as a whole (instead of h and L separately) better
c
indicates characteristics of HFDCMD heat transfer. The reason is as follows.

In the steady state, the heat equation in the radial direction is simplified to
(21)
which indicates that qr=constant. Conversely, q decreases with respect to r
from the inner (r=a) to outer (r=b) surfaces. Therefore, we let
(22)
where S is the longitudinal average of S (=rq), which represents the
q
amount of heat lost in the feed stream per unit fiber length per unit time.
Coefficient 2 in Eq. (22) doubles the radial distance, i.e., diameter as
corresponding to L =2b. In the steady state, S is equal to heat obtained by
c
the cold stream per unit fiber length per unit time. Here we define the
Nusselt number for HFDMCD as
(23)
Peclet number
The vapor molecules diffuse through tortuous membrane pores from the
feed-membrane to permeate-membrane interfaces. As they carry thermal
energy, Peclet number is defined as
(24)
where D is the diffusion coefficient of vapor molecules in the membrane

pore. Similar to hL of Eq. (22), J r is constant with respect to the radial
c
coordinate because of
(25)
Then, Peclet number is rewritten as
(26)
where F =J r indicates the amount of vapor mass transferred across the
w
membrane per unit fiber length per unit time.
Structural factor
A hollow fiber membrane used for DCMD can be physically characterized
using pore diameter (d ), thickness ( ), porosity (), tortuosity (), and
p
length (L). A thin HF membrane of high porosity, less tortuosity, and larger
diameter provides higher flux. Then, the structural factor can be defined as
(27)
where the tortuosity is usually a decreasing function of porosity [20]. We

used Beekmans tortuosity expression as a function of the porosity:
(28)
which is specifically developed for highly interconnected pores. Details of

tortuosity discussion can be found elsewhere [7, 20].
As described above, Reynolds and Prandtl numbers characterize phyiscal an
fluid dynamic properties in the lumen (Re
lmn
and Pr ) and shell (Re and Pr )

lmn
shl
shl
regions. Nusselt and Peclect numbers and structural factor are used only for
this membrane region so that no subscript is used for Nu, Pe, and SF.
2.3. Simulations
2.3.1. HFDCMD simulation software
The simulation software is named hfdcmd as a part of EnPhySoft
(Environmental Physics Software), which is written in FORTRAN90 using
intrinsic
object-oriented
programming
modules include calculation functions
features.
Specific
programming
and subroutines for Brownian,
Knudsen, and effective diffusion coefficients, physical properties (including

reference constants and temperature-dependent functions) of liquid water
and water vapor. GNU FORTRAN (gfortran version 4.8.2) was used to compile
the developed code set using GNU make utility. hfdcmd uses an input file to
read 11 parameters as listed in Table 1. The EnPhySoft package is GNU
licensed and open to public. GNU Octave (version 3.8.1) was used for postprocessing, 2D visualization, and image conversion. Among several releases
of Linux distribution, Ubuntu (the latest version 14.04 named trusty) was
selected for long-term, stable development. Other distributions include
RedHat, Fedora, CentOS, openSUSe, and scientific Linux. Each distribution
has its own advantages and disadvantages. Ubuntu developers are allowed
to upload personally developed packages to their own PPA. Public users can
freely download, install, and use necessary software programs under the
GNU free license agreement. In the same light of open software philosophy,
hfdcmd (as an independent application of EnPhySoft) is available by
accessing the corresponding authors PPA (see Appendix for details). In
addition, all the necessary software packages (called dependencies) are also
freely available (downloadable) using package management software of the
Ubuntu distribution. For easy access and use of hfdcmd, graphical user
interface (GUI) is developed and included in the EnPhySoft package, as
shown in Fig. 2(a). Tcl/Tk script is used to develop the hfdcmd GUI, which can
open, save, and save as an input file, and run, show, archive and clear
numerical and graphical results in a window. A help file explains physical
meanings of input and output variables. To expand the accessibility of this
EnPhySoft package, Oracle VirtualBox was used to run Ubuntu OS in
Windows, Mac, and (other) Linux OSs. In Linux, the EnPhySoft can be directly
installed, but the use of VirtualBox is also possible. Fig. 2(b) shows the
EnPhySoft VirtualBox installed in Mac. The general performance of Ubuntu OS
strongly depends on (graphical) desktop environment due to the limited
partial resources available from the main OS. GNOME and KDE are excellent
but their full implementation on a VirtualBox may significantly hamper the
performance of operating systems. Instead, we used Lubuntu, which uses a
fast and lightweight operating system with a minimal desktop, called LXDE.
2.3.2. Cases and result selection criteria

A single run of hfdcmd generates 20 files (8 image and 12 data files),
which are used by the Octave script to summarize and visualize simulation
results. The total size of all the 20 files is 1.2 MB. Two dimensional
temperature profiles in the lumen, membrane, and shell regions are stored in
3 different files (each having 300-400 KB) as big as image files. To run a
HFDCMD simulation, 11 parameters are required, as listed in Table 1. These
are temperatures and fluid velocities of the hot feed and cold permeate
streams (4); inner radius, thickness, pore diameter, porosity, length, and
thermal conductivity of the HF membrane (6); and (fiber) packing fraction
(1). The numbers within parentheses indicate the number of parameters in
each category. We set 46 representative cases of the first 10 parameters,
and 3 cases for the packing fraction. F and S are calculated using the
w
simulation program and converted later to their dimensionless forms of Pe

and Nu, respectively.
The total number of simulations is 11,059,200, which theoretically
requires as large as 13.2 TB to store all simulation results. This is out of the
typical hard drive size in modern PCs. Only core results can be saved in a
single output file, of which size is about 3 KB. If only the core files are saved,
then the 311 GB is required. Although this is a manageable disk size, the
large number of files significantly slows down the OS performance. Although
NTFS and ext2/ext3 (of Windows and Linux, respectively) can hold up to 4.3
billion files, Linux Shell performance is noticeably degraded by handling a

number of files, especially to sequentially read all files for statistical analysis.
Among the total 11,059,200 simulations, only physically meaningful
7,453,717 cases are selected for statistical analysis. If the HF membrane is
long and stream flows are slow, the hot feed stream looses significant
amount of thermal energy to the permeate stream of lower temperature as
the feed stream flows tangent to the fiber surface. A thermal equilibrium
region can be locally established where the transmembrane temperature
difference is negligible so that feed water evaporation is locally insignificant.
The HF membrane is only partially utilized for membrane distillation. When
the 11 parameters are given, the HFDCMD code can calculate the maximum
fiber length, L , within which the local equilibrium region does not form. If
max
the membrane length (as an input parameter) is longer than the (calculated)
maximum length, i.e., L>L , then the simulation stops and no other
max
quantities are calculated. We found that 3,605,483 cases fall into this
inefficient category and so discarded in the statistical analysis. After each
run of hfdcmd, a special script was used to extract core numerical results,
stored in a line of a minimum file size. This extracted information was
appended to a file that contains results of previous runs. The size of this
single file was reduced to 1.5 GB for the 7,453,717 physically meaningful
cases.
To fast finish all the runs, we used the basic parallel computing
schemes for optimal job allocation. No specific parallel libraries (such as
message passing interface (MPI) and open multiple processing (openMP)) are
used, but the large number of jobs are instead divided into three groups
based on the packing fraction values. Each group of simulations were run in
different computing nodes, of which node has two quad-cores (8 cores) of
Intel (R) Xeon (R) CPU E5345 2.33 GHz and 16 GB of random access memory.
The total 3 identical nodes (in terms of hardware configuration) were
available and so 24 (=38) hfdcmd simulations were simultaneously
conducted at a time. Each node took 17.5 hours with 8 simultaneous jobs
(i.e., one simulation job per core). This is equivalent to an elapsed time as
long as 17.538=420 hours (=17.5 days) of sequential run using a
single core. The 7,453,717 physically meaningful cases were separately
stored in three files (on three nodes), and then combined into a single text
file of 1.5 GB. Statistical analysis was done using this single data file,
containing core information of each run in each file-line.
2.4. Big Data Analysis

2.4.1. Input data preparation
Two data sets are prepared from the raw data for statistical analysis:
one using real physical variables and another using reduces, dimensionless
variables. In the physical set, the tortuosity () is calculated using Eq. (27) as
a function of the membrane porosity (). The tortuosity solely depends on
the porosity so that their correlation must be transparent and strong.
Because F is strongly influenced by the porosity and tortuosity (more
w
specifically, their ratio /), we intentionally included the tortuosity as the

12th input parameter. Two output parameters of this physical set are mass
and heat transfer rate per fiber length, F and S , respectively. In the
w
dimensionless set, 12 new reduced variables are proposed: reduced shell

temperature
(T /T ),
shl
lmn
temperature
polarization
(T /T
shl
lmn
1),
effective
porosity ( = /), structural factor (SF), and dimensionless numbers (Re

eff
and Pr of lumen and shell regions, Pe of all regions, and Nu of the membrane
region only). Among them, Pe and Nu of the membrane region (denoted
Pe_mbr and Nu_mbr, respectively) are set as two output variables, as
corresponding to F and S of the physical set, respectively.
w
On summary, the total number of input variables are 12 and 10 in the

physical and dimensionless data sets, respectively, which are used to predict
correlation levels to transmembrane mass and heat transfer rates. Selforganizing map (SOM) and multiple linear regression (MLR) were used to
analyze these two big data sets generated from a series of hfdcmd numerical
simulations. MATLAB version 7.6 and IBM SPSS Statistics version 21 were
used for SOM and MLR analyses, respectively, as follows.
2.4.2. Self-organizing map (SOM)

SOM effectively identifies patterns of and relationships between input
data using an unsupervised learning algorithm and extract information from
complex data sets by creating maps of the multi-dimensional and
sophisticated data [2123]. The SOM approximates the probability density
function of the input data and shows the data in a more comprehensive
fashion of fewer dimensions (2D or 3D). As the system stability is maintained
through convergence to an equilibrium map, the SOM provide great flexibility
in data presentation and comprehension [24]. SOM can be applied to various
engineering and science fields, and comprehensive review of SOM and its
wide applications can be found elsewhere [2529].
During the unsupervised learning, the input data are used to train a neural
network without any prior knowledge of data structure. The data are
naturally partitioned into different number of clusters depending on their
own characteristics. Suppose that a data set contains n factors (i.e., n
dimensions), where the density of factor i, is expressed as vector, x , which is
i
an input layer to the SOM. This input node i is connected to other nodes (j).
The connection weights, w (t), are updated adaptively at each iteration step
ij
t. This iterative process continues until the difference between input data x
and the weight w , defined as
ij
(29)
is fully minimized and so the convergence is reached. Initial weight values
are randomly assigned with small magnitudes. The neuron having the
shortest distance to the input vector is chosen as a winner. The winner and
neighboring neurons are allowed to change their weights to keep minimizing
the distances. At each iteration, the weighted vector is updated as follows:
(30)
where Z =1 for the winning and neighboring vectors, Z =0 for other
j
remaining vectors, and (t) is a fraction for corrected learning. In SOM

analysis, all input data (or variables) were normalized into the range [-1, 1]
to make each variable have the same impact on the neuron structure in a
two-dimensional hexagonal grid. The SOM map nodes and data were
processed by linear initialization and batch training algorithm, respectively
[30]. The profile of representative samples of a variable (i.e., its distribution
pattern) was finally visualized in individual component planes. The map size
of SOM was determined by 1010 neurons rather than the default size (
, see next section for details) to quickly investigate data patterns with small
numbers of samples or output neurons.
2.4.3. Multiple linear regression (MLR)

The multiple linear regression (MLR) is a statistical method to reveal
the linear relationships between a dependent variable (y) and multiple
independent variables (x , x , etc). MLR can generate an equation that best
1
describe linear combinations between inputs and an output [29]. In this

study, a logarithmic transformation was used to normalize the dependent
variable to satisfy the homoscedasticity requirement, i.e., the variances of
the output variable are normally distributed across the range of predictor
variables. Stepwise regression method was used to select important
predictor variables in the regression process using probability of 0.05 and 0.1
for entry and removal, respectively. The average inflation factor of the
physical and dimensionless data sets was calculated and the value was close
to 1.0. This indicates that multicollinearity (i.e., high correlation between
variables) did not affect the validity of the regression equation and
coefficients [28].
3. Results and Discussion

Fig. 3 shows spatial patterns of the 14 parameters of the physical data
set, analyzed using the SOM method. The 12 input data include the
calculated tortuosity, and two outputs are the heat and mass transfer rates
per finger length, F
and S , respectively. Each parameter map (i.e.,

q
component plane) represents the distribution of 100 samples, visually shown
in the 10 by 10 hexagonal grid. This grid is reduced from the total 7,453,717
original samples. Color bars indicate ranges of parameter values and similar
map patterns in general imply strong correlations. Here we first discuss the
data partitioning. As shown in the fifteenth panel (at the bottom row, second
to the right), the samples were partitioned into 6 different groups based on
the characteristic similarity of the input parameters. The sixteenth panel
showed Davies-Bouldin index versus the number of clusters tested. A lower
Davies-Bouldin index is a symmetric and non-native fiction of the ratio of
how clusters are scattered. A lower index indicates a better clustering. In
other words, the best clustering minimizes the Davies-Bouldin index. In our
study, six clusters will best explain the variation patterns of the HFDCMD
parameters. Some SOM component panels have interesting patterns of
parameter variables. Most parameter maps show gradient structures of
vertical zig-zag stripes, including 12 input and 2 output parameters.
In the first group, Tshl, Rinr, Rotr, d_pore, length, and KappaSolid have
this structural type from blue left to red right, which has close similarity to
Sqbar. In the second group, Tlmn and pacfrac have the vertical stripe
structures, but color variations look disorderly oscillating and are not as
consistent as others. This indicates that Tlmn and pacfrac are less correlated
to others. The third group consists of u_bar and v_bar, of which panel maps
are very similar to each other, having the red-left to blue-right gradient of
vertical zig-zag stripes. The trend of this third group is opposite to that of the
first group. The fourth group has the input parameter, porosity, and the
calculated parameter, tortuosity. The porosity map pattern is unique to have
the horizontal gradient of blue-top to red-bottom stripes. As the tortuosity is
inversely proportional to the porosity, the tortuosity pattern follows red-top
to blue-bottom horizontal stripes. These two parameters are least correlated
to others because they are very basic physical characteristics of membrane
material. As expected, the inverse relationship between the porosity and
tortuosity is clearly shown in the map patterns. The fifth groups is the output
group having Fw_bar and Sq_bar. Fw_bar has a unique map pattern, which
looks similar to the vertically flipped Sq_bar. No transparent relationship is
shown between Fw_bar and other parameters, this implies that Fw_bar
cannot be simply predicted using a few parameters. Even if a correlation is
made, its accuracy must be limited. Although both Fw_bar and Sq_bar have
the similar trend of blue-left to red-right vertical stripes, peak points are at
opposite locations, i.e., cluster 1 and 6 of Fw_bar and Sq_bar, respectively.
Two maps of Rinr and Rotr resemble that of Sq_bar, indicating very strong
correlations. There is not any specific map of the diagonal gradient from blue
left-top to red right-bottom, characterizing the map of Fw_bar. This implies
that certain sets of parameter values to maximize Fw_bar and Sq_bar are not
entirely consistent with each other. Increasing heat transfer rate does not
always guarantee higher mass transfer rate. Due to the complexity of
HFDCMD phenomena, the current nonlinear SOM analysis seems limited to
identifying interactions between parameters and correlations to final outputs.
Fig. 4 shows results of SOM analysis using the dimensionless data set of
total 12 variables (10 inputs and 2 outputs). Similar to Fig. 3, this large data
set is reduced to 100 representative samples presented in component planes
with color bars. We include Tshl_red, Tem_pol, and. Prandtl_shl in the first
analysis group of Fig. 4. Tshl_red and Tem_pol have the same gradient of of
the horizontal stripes: blue-top to red-bottom. Prandtl_shl also has the
horizontal characteristics of reverse color scheme. The maps of these three
parameters better resemble that of Peclet_mbr, representing mass transfer
rate across the HF membrane per unit length. Vertical gradients are clearly
shown in maps of Reynolds_lmn, Peclet_lmn, and Peclet_shl having the redright to blue-left gradient. Peclet_shl looks similar to the above three
paramets with a slight diagonal pattern. This trend is opposite to
Nusselt_mbr representing the heat transfer rate across the membrane. The
map of effective porosity, i.e., porosity divided by the tortuosity, has an
almost identical map pattern to that of Nusselt_mbr, indicating their strong
correlation. Prnadtl_lmn also has a vertical stripe pattern but the gradient
has a peak a away from left or right end. Unlike other physical dimensionless
numbers, Prandtl_shl has the gradient structure of horizontal stripes. We
believe this influences the semi-diagonal stripe structure of Reynolds_shl.
Structural factor has the vertical stripe pattern, but the peak region is
located near the center of the map. Unlike effective porosity, SF is less
correlated to others, especially Peclet_mbr and Nusselt_mbr. Other two less
correlated
parameters
are
Prandtl_lmn
and
Reynolds_shl.
Note
tha
Peclet_mbr is best correlated to Tshl_red and TemPol, and Nussekt_mbr shows

a specific similarity only to effective porosity. We think that this SOM analysis
of the dimensionless data set does not provide new specific information in
addition to those from the physical data set. We discuss this issue in the next
section using the MLR method.
The physical and dimensionless data sets are analyzed using the multiple
linear regression (MLR) method, which have 10 and 12 input parameters,
respectively, to predict two outcomes, respectively. In the physical data set
Fw_bar and Sq_bar are regressed using varying number of parameters from 1
to 12, and Peclet_mbr and Nusselt_mbr are regressed using 1 to 10
dimensionless variables. Fig. 5 shows accuracies of the MLR method in terms
of R values of each predicted variables. The R values of the four predicted
2
dependent variables (Fw_bar, Sq_bar, Peclet_mbr and Nusselt_mbr ) increase

as the number of dependent variables increases from 1 to 4. Using more
than 5 dependent variables does not significantly enhance the predictability
of the MLR method. At least, four variables are required to explain variations
of Fwbar, Sqbar, and PecletMbr, whereas three parameters are enough to
predict Nusselt_mbr.
Table 2 shows coefficients of four predicted variables in the form of
(31)
where B is a constant and B for i=14 are (unstandarized) coefficients of
0
indepdndent variables shown in Table 2 in sequence. In the physical data set,

Fwbar is reasonably well predicted using only four independent variables:
Tshl, d_pore, Rinr, and Rotr with R =0.7. Increasing the number of
2
independent
variables
does
not
significantly
increase
the
prediction
accuracy, as shown in Fig. 5. Sqbar has key parameters of Tshl, Rinr, Rotr,
and
Tlmn,
provideing
R =0.88,
2
which
is
almost
saturated.
The
unstandardized coefficients for individual variables in regression models

described the strength of the relationship between independent and
dependent variables. (For example, as the shell temperature Tshl increases
by one, (log-transformed) Fwbar increased by 1.8010
units.) Validity of
this prediction using unstandardized coefficients are quantified using

standard error (SE) values, which are much smaller than unstandardized
coefficients, B . On the other hand, the standardized coefficients for individual
i
variables provide standard units of measure, i.e., their significances in the

models as predictors. Although the unstandardized coefficients of Tshl for
Fwbar and Sqbar are small (1.8010
and 2.6210 , respectively), its

3
standardized coefficients are the highest (0.70) for Fwbar and the second
highest (0.67) for Sqbar. Signs of the unstandardized and standardized
coefficients are identical to each other. The negative sign indicates an
inverse relationship between dependent and independent variables: as Rotr
increases both Fwbar and Sqbar decreases, keeping other independent
variable fixed. This is because thicker HF membrane hinders both heat and
mass transfer rate. In the prediction of Fw_bar of the physical set, the most
significant parameters among all 12 are ranked Tshl, Rinr, Rotr, and d_pore in
order. Because the operation mode is HOCI, the shell temperature is higher
than the lumen temperature in a wide range of 1585C. A higher Tshl
definitely increases both mass and heat transfer rates with similar
significances, characterized by the standardized coefficients, 0.70 and 0.67,
respectively. The inner radius, Rinr, has stronger influence than the outer
radius, Rotr, in predicting both Fwbar and Sqbar, and the strongest to Sqbar
with =0.98. value differences between Rinr and Rotr must stem from the
cylindrical geometry of the azimuthal symmetry. The least influencing
parameters of Fwbar and Sqbar are d_pore and Tlmn, respectively. Mass
transfer rate per unit length, Fwbar, is primarily restricted by the vapor
evaporation rate at the interface between hot feed and (outer) membrane
surface; and it is inversely proportional to the membrane thickness, Rotr -
Rinr.
Given
that,
pore
spaces
of
an
order
of
(RotrRinr) geometrically determine the vapor transfer rate. d_pore is r

nked the fourth in the MLR of Fwbar. Interestingly, this indicates that th
lumen temperature is not as important as the pore size of HF membrane. The
least influencing parameter of Sqbar is the lumen temperature Tlmn with t
e negative coefficient. The heat tra nsfer rate directly proportional to the
emperature difference, i.e., TshlTlmn. Unlike Fwbar, HF geometry is more
important in Sqbar, followed by temperatures. d_pore in Fwbar disappears
in Sqbar, and replaced by Tlmn. Comparing the fourth parameters (d_pore and
lmn in predictions of Fwbar and Sqbar, respectively) shows physical insight t
at mass transfer is primarily initiated by the hot feed temperature an
As indicated above, Peclet_mbr and Nusselt_mbr represent dimensionless
forms of mass and heat transfer rates, Fwbar and Sqbar, respectively.
Interestingly, the dimensionless data set has four common, most significant
parameters, indirectly implying that the dimensionless analysis can provide
meaningful physical results using the compact forms. This parameters are
ranked in order: Tshl_red, porosity_eff, Peclet_lmn, and Peclet_shl for
predicting Peclert_mbr and porosity_eff, Peclet_shl, Peclet_lmn, and Tshl_red
for regressing Sqbar. Among them, Peclet_shl always has the negative sign,
ranked the first and third in terms of of Nusselt_mbr and Peclet_mbr,
respectively. Note that lumen and shell Pelcet numbers are based on thermal
diffusion and membrane Peclet number is based on vapor diffusion through
membrane pores.
In predicting Peclet_mbr, the reduced shell (feed) temperature Tshl_red is

found to be the most significant parameter with =0.74, which is highly
analogous to =0.70 of Tshl in Fwbar regression. Tshl_red has the least
impact on Nusselt_mbr prediction as expected from the fact that Tshl and
Tlmn in Sqbar regression are ranked the third and fourth, respectively.
Noticeably small value of Tshl_red explains the plateau trend of
Nusselt_mbr plot (in Fig. 5) after the number of independent variables more
than three. It is interesting that Tem_pol does not show up in predicting any
of membrane transport dimensionless numbers. Effective porosity appears
both in regression equations, and ranked third and fourth of membrane
Nusselt and Peclet numbers, which were absent in Fwbar and Sqbar
predictions. This effective porosity is as important as Peclet_lmn (=0.63) in
Nusselt_mbr prediction. Structural factor does not show up as important
input parameters in analysis of the dimensionless set. In both physical and
dimensionless sets, microstructure of the membrane is found to be
secondarily important as compared to macroscopic geometry and operation
conditions.
Lumen and shell (thermal) Peclet numbers appear to be key controlling
parameters of both membrane Peclet and Nusselt numbers. Lumen Peclet
number has stronger influence to transmembrane mass transfer than shell
Peclet number, but has less impact to transmembrane heat transfer than
shell Peclet number. Given conditions in the shell side, as lumen Peclet
number increases, mass and heat transfer rate increase because the lumen
velocity and hydraulic diameter are linearly proportional to the lumen Peclet
number. As the (only) lumen temperature increases the kinematic viscosity
decreases so that the lumen Peclet number increases. In most empirical
correlations developed or used for heat transfer phenomena of DCMD
processes, viscosity and density are often assumed to be constant or fixed at
specific temperatures. As these temperature-dependent fluid mechanics
properties are included in the dimensionless analysis of the current study,
implications easily become complex to understand. The lumen Peclet
number increases as the lumen velocity and lumen temperature increases:
the former reduces the temperature polarization on the lumen surface at r=
a, the later reduces temperature gradient followed by vapor concentration
gradient across the membrane. Contribution of the lumen velocity and
temperature to the mass and heat transfer through the lumen Peclet number
is not consistent to physical insight.
The same analysis can be applied to effects of shell Peclet number (of Eq.
19XX) to mass/heat transfer phenomena. Given the shell velocity, a higher
HF packing reduces volumetric flow rate in the shell side, allowing more
temperature polarization in both tangential and normal directions on the
outer membrane surface. Variation of the shell temperatures (4090C) is
much more than that of lumen temperature (525C). In this range of shell
temperature, the kinematic viscosity decreases below 50 percent. Similar to
lumen
Peclet
number,
faster
velocity,
low
packing
ratio,
and
high
temperature in the shell side will increase the shell Peclet number. A lower
packing ratio indicates higher volumetric flow rate in the shell region, which
diminish the temperature polarization on the outer membrane surface at r=
b. In general, increasing shell Peclet number should result in higher mass and
heat transfer rates. In this aspect, it loos appropriate that signs of of shell
and lumen Peclet numbers are exchanged.
This seemingly untenable analysis can be resolved by explicitly
considering
the
reduced
shell
temperature,
Tshll_red,
which
include
difference between the shell and lumen temperatures, i.e., driving force of
the transfer phenomena. There is no doubt that feed temperature and feeddistillate temperature difference play key roles in mass and heat transfer.
Dimensionless numbers such as Reynolds, Prandtl, and Peclet numbers
characteristic fluid properties, which are varying with temperature. Variation
of fluid density, viscosity, thermal diffusivity and conductivity influence the
lumen and shell dimensionless numbers. This must not be as important as
lumen and shell temperatures to transfer phenomena, but their trends are
counterintuitive.
We indicate that HFDCMD is purely multi-physical phenomena, in which
mass/heat/momentum transfer phenomena are strongly inter-correlated in a
sophisticated manner. Variable reduction and functional mapping might not
be straightforward to quantitatively predict the performance of DCMD. In the
physical data analysis, d_pore (used for Fwbar prediction) is the only
parameter
that
dimensionless
microscopically
analysis,
effective
characterizes
porosity
is
the
membrane.
commonly
In
the
included
for
membrane Peclet and Nusselt numbers. More importantly dimensionless

analysis uses the same set of four independent parameters. One can
systematically
compare
influences
of
the
four
dimensionless
input
parameters to mass and heat transfer rates using standardized coefficient.

Rough prediction of the transfer rates is more accurate using results of
physical analysis as d_pore of Fwbar prediction is replaced by Tlmn for Sqbar
regression.
4. Concluding Remarks
In every test run, .
Acknowledgement
This
research
was
supported
by
the
National
R&D
project
of
Development of Energy Utilization Technology of Deep Sea Water Resource

supported by the Ministry of Oceans and Fisheries of the Republic of Korea.
References
[1] M Khayet and T. Matsuura. Membrane Distillation: Principles and
Applications. Elsevier, New York, 2011.
[2] Dan Qu, Jun Wang, Deyin Hou, Zhaokun Luan, Bin Fan, and Changwei
Zhao. Experimental study of arsenic removal by direct contact
membrane distillation. Journal of Hazardous Materials, 163(2-3):874879,
April
2009.
ISSN
0304-3894.
URL
//www.sciencedirect.com/science/article/pii/S0304389408010716.
http:
[3] Deyin Hou, Jun Wang, Xiangcheng Sun, Zhaokun Luan, Changwei Zhao,
and Xiaojing Ren. Boron removal from aqueous solution by direct contact
membrane distillation. Journal of Hazardous Materials, 177(1-3):613619,
May
2010.
ISSN
0304-3894.
URL
http://www.
sciencedirect.com/science/article/pii/S0304389409020652.
[4] Kai Yu Wang, Tai-Shung Chung, and Marek Gryta. Hydrophobic PVDF
hollow fiber membranes with narrow pore size distribution and ultra-thin
skin for the fresh water production through membrane distillation.
Chemical
Engineering
00092509.
Science,
doi:
63(9):25872594,
May
2008.
10.1016/j.ces.2008.02.020.
ISSN
URL
http://linkinghub.elsevier.com/ retrieve/pii/S0009250908000900.
[5] Jung-Gil Lee and Woo-Seung Kim. Numerical modeling of the vacuum
membrane distillation process. Desalination, 331:4655, December
2013.
ISSN
00119164.
doi:
10.
1016/j.desal.2013.10.022.
URL
http://www.sciencedirect.com/science/article/pii/ S0011916413004931.
[6] Guoqiang Guan, Xing Yang, Rong Wang, Robert Field, and Anthony G.
Fane. Evaluation of hollow fiber-based direct contact and vacuum
membrane distillation systems using aspen process simulation. Journal of
Membrane Science, 464:127139, August 2014. ISSN 03767388. doi:
10.1016/j.memsci.2014.03.054.
URL
science/article/pii/S0376738814002403.
http://www.sciencedirect.com/
[7] Albert S Kim. Cylindrical cell model for direct contact membrane
distillation (DCMD) of densely packed hollow fibers. Journal of Membrane
Science, 455:168186, April 2014.
ISSN
03767388.
doi:
10.1016/j.memsci.2013.12.067.
URL
http://www.sciencedirect. com/science/article/pii/S0376738813010302.
[8] OBAMA ADMINISTRATION UNVEILS BIG DATA INITIATIVE: ANNOUNCES
$200
MILLION
IN
NEW
R&D
INVESTMENTS,
2012.
URL
http://www.whitehouse.gov/
sites/default/files/microsites/ostp/big_data_press_release_final_2.pdf.
[9] M Gryta, M Tomaszewska, and A W Morawski. Membrane distillation with
laminar flow. Separation and Purification Technology, 11(2):93101, June
1997.
ISSN
1383-5866.
URL
http://www.sciencedirect.com/science/article/pii/S1383586697000026.
[10] Marek Gryta and Maria Tomaszewska. Heat transport in the membrane
distillation process. Journal of Membrane Science, 144(1-2):211222, June
1998. ISSN
03767388. doi: 10.1016/ S0376-7388(98)00050-7. URL
http://www.sciencedirect.com/science/article/pii/ S0376738898000507.
[11] Young-Deuk Kim, Kyaw Thu, Noreddine Ghaffour, and Kim Choon Ng.
Performance investigation of a solar-assisted direct contact membrane
distillation system. Journal of Membrane Science, 427(0):345364, January
2013.
ISSN
0376-7388.
doi:
http://dx.doi.org/
10.1016/j.memsci.2012.10.008.
URL
http://www.sciencedirect.com/science/article/ pii/S0376738812007521.
[12] H.G. Groehn. Influence of the yaw angle on heat transfer and pressure
drop of tube bundle heat exchangers. Heat Transfer,, 6:203209, 1982.
[13] Felinia Edwie and Tai-Shung Chung. Development of hollow fiber
membranes for water and salt recovery from highly concentrated brine
via direct contact membrane distillation and crystallization. Journal of
Membrane Science, 421-422:111123, December 2012. ISSN 03767388.
doi: 10.1016/j.memsci.2012.07.001. URL http://linkinghub.elsevier.com/
retrieve/pii/S0376738812005108.
[14] J Phattaranawik, R Jiraratananon, and A G Fane. Effect of pore size
distribution and air flux on mass transport in direct contact membrane
distillation. Journal of Membrane Science, 215(1-2):7585, April 2003.
ISSN
03767388.
doi:
10.1016/S0376-7388(02)00603-8.
URL
[15] Chii-Dong Ho, Cheng-Hao Huang, Feng-Chi Tsai, and Wei-Ting Chen.
Performance improvement on distillate flux of countercurrent-flow direct
contact
membrane
distillation
32,April2014.ISSN00119164.doi:
systems.
Desalination,338:26
10.1016/j.desal.2014.01.023.
URL
[16]
Khayet. Membranes
and
theoretical
modeling
of
membrane
distillation: A review. Advances in Colloid and Interface Science, 164(1-
2):5688, 2011. ISSN 0001-8686. doi: DOI:10.1016/j.cis.2010.09.005. URL

[17] Albert S Kim. A two-interface transport model with pore-size distribution
for predicting the performance of direct contact membrane distillation
(DCMD). Journal of Membrane Science, 428:410424, February 2013. ISSN
03767388.
doi:
10.1016/j.memsci.2012.10.054.
URL
http://linkinghub.elsevier.com/retrieve/pii/S0376738812008010.
[18] C. H. Bosanquet. The optimum pressure for a diffusion separation plant.
British TA Report, page BR/507, 1944.
[19] Maria L V Ramires, Carlos A Nieto de Castro, Yuchi Nagasaka, Akira
Nagashima, Marc J Assael, and William A Wakeham. Standard Reference
Data for the Thermal Conductivity of Water. Journal of Physical and
Chemical
Reference
Data,
24(3):13771381,
1995.
doi:
10.1063/1.555963. URL http://link.aip.org/link/?JPR/24/1377/1.

[20] Behzad Ghanbarian, Allen G. Hunt, Robert P. Ewing, and Muhammad
Sahimi. Tortuosity in Porous Media: A Critical Review. Soil Science Society
of America Journal, 77(5):1461 1477, 2013. ISSN 0361-5995. doi:
10.2136/sssaj2012.0435.
URL
https://www.soils.org/
publications/sssaj/abstracts/77/5/1461.
[21] Teuvo Kohonen. Self-Organized Formation of Topologically Correct
Feature Maps. Biological Cybernetics, 43:5969, 1982.
[22] Teuvo Kohonen. Analysis of a simple self-organizing process. Biological

Cybernetics,
44(2):
135140,
1982.
ISSN
0340-1200.
doi:
10.1007/BF00317973. URL http://dx.doi.org/10. 1007/BF00317973.

[23] Teuvo Kohonen. Self-Organizing Maps. Springer, Berlin, 3rd edition,
2001.
[24] H Ritter and K Schulten. Convergence properties of Kohonens topology
conserving
Biological
maps:
fluctuations,
Cybernetics,
60(1):59
stability,
71,
and
1988.
dimension
ISSN
selection.
0340-1200.
doi:
10.1007/BF00205972. URL http://dx.doi.org/10.1007/ BF00205972.

[25] Schulten Ritter, H., Martinetz, T. Neural Computation and Self-Organizing
Maps: An Introduction. Addison-Wesley, MA, 1992.
[26] Samuel Kaski, Jari Kangas, and Teuvo Kohonen. Bibliograph y of SelfOrganizing Map (SOM) Papers: 1981-1997. Neural Computing Surveys,
1:102350, 1998.
[27] Seo Jin Ki, Joo-Hyon Kang, Seung Won Lee, Yun Seok Lee, Kyung Hwa
Cho, Kwang-Guk An, and Joon Ha Kim. Advancing assessment and design
of
stormwater
monitoring
programs
using
self-organizing
map:
characterization of trace metal concentration profiles in stormwater

runoff. Water research, 45(14):418397, August 2011. ISSN 1879-2448.
doi:
10.1016/j.watres.2011.05.021.
URL
[28] Andy Field. Discovering statistics using SPS. Sage Publications Ltd,
Thousand Oaks, CA, 2nd edition, 2005.
[29] A M Kalteh, P Hjorth, and R Berndtsson. Review of the Self-organizing
Map (SOM) Approach in Water Resources: Analysis, Modelling and
Application. Environ. Model. Softw., 23(7):835845, 2008. ISSN 1364-8152.
doi:
10.1016/j.envsoft.2007.10.001.
URL
http://dx.doi.org/10.1016/j.envsoft.2007.10.001.
[30] J Vesanto, J Himberg, E Alhoniemi, and J Parhankangas. SOM toolbox for
Matlab
5.
Technical
report,
Espoo,
Finland,
2000.
URL
http://www.cis.hut.fi/somtoolbox/package/ papers/techrep.pdf.
[31] S.C. McCutcheon, J. L. Martin, and T. O. Jr. Barnwell. Handbood of
Hydrology. McGraw-Hill, New York, 1st edition, 1993.
Captions for figures
Fig. 1. Schematic of (a) a hollow fiber located in a imaginary cylindrical cell,

which represents a number of densely packed fibers in a vessel and (b)
conducts of heat transfer consisting of a solid rod inside a cylindrical
impermeable pipe. Each has length L.
Fig. 2. Snapshots of (a) hfdcmd GUI and (b) Oracle VirtualBox.
Fig. 3. Variation of variables (i.e., 14 component planes) of representative
samples analyzed by a Self-Organizing Map in a 10-by-10 hexagonal grid. In
the figure, data partitioning (shown in the fifteenth panel) indicates
individual clusters that have similar characteristics which are divided by the
minimum Davies-Bouldin index of 6 clusters (shown in the sixteenth panel).
Fig. 4. Variation of variables (i.e., 12 component planes) of representative
samples analyzed by a Self-Organizing Map in a 10-by-10 hexagonal grid.
Unlike Fig. 1, various dimensionless numbers were used as input variables in
the analysis.
Fig. 5. The fraction of variance (i.e., coefficient of determination R ) explained
2
by the regression equation with different numbers of predictor variables.

Big Data Analysis of Hollow Fiber Direct Contact Membrane Distillation

Загружено:

Сведения о документе

Авторское право

Доступные форматы

Поделиться этим документом

Поделиться или встроить документ

Параметры публикации

Этот документ был вам полезен?

Это неприемлемый материал?

Авторское право:

Доступные форматы

Big Data Analysis of Hollow Fiber Direct Contact Membrane Distillation

Загружено:

Авторское право:

Доступные форматы

Big Data Analysis of Hollow Fiber Direct Contact

Membrane Distillation (HFDCMD) for Simulation-based

Albert S. Kim , Seo Jin Ki , and Hyeon-Ju Kim

Civil and Environmental Engineering, University of Hawaii at Manoa,

2540 Dole Street Holmes 383, Honolulu, Hawaii 96822, USA

Deep Ocean Water Application Research Center, Korea Research Institute

of Ships and Ocean Engineering, Goseong-gun, Gangwon-do 219-822,

Keywords: Self-Organizing Map, Multiple linear regression

mass transfer with an accompanied transformation between liquid and gas

HFs of 1 mm diameter are packed in a vessel of inner diameter 10 cm for

transfer in solid and void membrane regions. Characteristic parameters of

further to propose optimal operation conditions. Usually, this approach can

conduct this number of experiments within a limited amount of time to find

the relative significance of core parameters and their importance on

2. Theory and simulations

permeate stream of temperature T (525C) flows in the opposite direction

to the shell flow. This counter-current flow maintains the trans-membrane

The thermal conductivity of membrane materials for MD is of an order of

quantitatively using representative dimensionless numbers.

packed in a cylindrical vessel of radius R , then the packing fraction is

2.2. Dimensionless number analysis for HFDCMD

respectively. On the same line, T , T , and T , are temperature

interface and bulk permeate stream, respectively. As temperature is a point

quantities. Within the HF membrane, we have h =h +J H T , where h is

evaporation enthalpy of water. Phenomenologically, the vapor flux is

where C is the membrane mass transfer coefficient and P is the partial

J through h . In steady state, Q in each region should be equal to each other

correlation previously developed using heat transfer experiments (or

simulations) is well accepted in MD literature. These correlations represent

transfer across the HF membrane is predominant in the radial direction

where T is the absolute temperature within membrane, is the thermal

and D is the effective diffusion coefficient as a combination of Brownian and

Knudsen diffusivity using Bosanquets relationship [7, 17, 18]. General

2.2.2.Lumen and shell regions

conductivity (k ) as functions of temperature.

surface area by the liquid. Then, we calculate

shell and lumen regions are

dependence of on stream temperatures significantly influences the

Reynolds number includes hydrodynamic characteristics in the longitudinal

and Re , respectively. In the lumen region, U=

u, d =2a, and = (T ); and in the shell region, U=v, d = d , and =

and T are the lumen and shell temperatures, respectively.

Prandtl and Peclet numbers

heat at constant pressure, and k [W/mK] is the thermal conductivity of

water [19] (see Appendix for their dependence on temperature). Prandtl

2.2.3. Solid and void membrane regions

The ratio of convective to conductive heat transfer is represented as Nusselt

where (by definition) h [W/m K] is the convective heat transfer coefficient, L

[m] is the characteristic length, and k

[W/mK] is the membrane heat

conductivity. Here, h can be interpreted by heat flux at the feed-membrane

The absence of the inner diameter 2a in the Nusselt number is due to

indicates characteristics of HFDCMD heat transfer. The reason is as follows.

where D is the diffusion coefficient of vapor molecules in the membrane

Then, Peclet number is rewritten as

membrane per unit fiber length per unit time.

where the tortuosity is usually a decreasing function of porosity [20]. We

which is specifically developed for highly interconnected pores. Details of

and Pr ) and shell (Re and Pr )

modules include calculation functions

and subroutines for Brownian,

Knudsen, and effective diffusion coefficients, physical properties (including

2.3.2. Cases and result selection criteria

simulation program and converted later to their dimensionless forms of Pe