E-mail: cbernido.cvif@gmail.com
Abstract
We utilize a stochastic functional integral approach that forms a natural framework for analyzing
ubiquitous complex sequences of fluctuations with underlying non-Markovian stochastic process
beyond fractional Brownian motion. We demonstrate how Hida white noise calculus, guided by
mean square deviation (MSD) analysis of empirical data, allows derivation of single nucleotide
occurrence probability distributions for whole genomes of four significant species of bacteria:
(a) freshwater cyanobacteria Synechococcus elongatus PCC7942, 2.7 Mbp, (b) marine
cyanobacteria Prochlorococcus marinus subsp. marinus str. CCMP1375, 1.8 Mbp,
(c) pathogenic bacteria Staphylococcus aureus subsp. aureus NCTC 8325, 2.8 Mbp, and
(d) Staphylococcus aureus ILRI Eymole1/1, 2.9 Mbp. Here, the stochastic variable is chosen
to represent separation distances between succeeding identical single nucleotides where distance
is defined as the number of steps through intervening bases. The stochastic parameter set takes
values of nucleotide occurrence count along the genome length. The probability density function
(PDF) is derived in closed form for the associated stochastic process with exponentially damped
memory kernel, and is shown to satisfy a modified diffusion equation with a parameter-
dependent diffusion coefficient. The PDF yields an analytical result for MSDs that match
empirical plots, showing a rising nonlinear curve that flattens to a plateau starting close to 1 kb,
similar to restricted diffusion. The plots exhibit compliance with Chargaff’s second parity rule
for nucleotides. The same PDF describes occurrences of single nucleotides adenine, guanine,
cytosine, and thymine for all four bacterial genomes considered.
Keywords: non-Markovian stochastic process, white noise calculus, DNA sequence, S.
elongatus PCC7942, P. marinus subsp. marinus str. CCMP1375, S. aureus subsp. aureus NCTC
8325, S. aureus ILRI Eymole1/1
subsp. marinus str. CCMP1375 has 1 751 080 bp with 1882 coding genes and 94 non-coding genes.
Common infections are caused by the pathogenic bacteria S. aureus. Moreover, S. aureus subsp.
aureus NCTC 8325 is an exemplar for strains used in genetic manipulation. S. aureus ILRI
Eymole1/1 has 2 874 302 bp with a GC-content of 32.88% [19].
We investigate, in particular, the distribution of a given nucleotide in a DNA sequence. Mean
square displacement (MSD) analysis for the fluctuations in distribution of single nucleotides
along the bacterial genomes lends insight for the choice of underlying stochastic process for
the system. The genome-wide patterns in MSD plots generated deviate from linearity pointing to
non-Markovian stochasticity. Evaluation of the expectation of the constrained stochastic
variable over Hida white noise probability measure yields analytical results for the MSD that
match empirical plots, showing a rising nonlinear curve that flattens to a plateau starting
close to 1 kb, and following Chargaff’s second parity rule for nucleotides. The same PDF
describes occurrences of single nucleotides adenine (A), guanine (G), cytosine (C), and thymine
(T) for all four bacterial genomes considered.
In section 2, we present relevant essential points of the stochastic integral approach for
processes with memory using

where $b, c > 0$ are constants. The stochastic variable $x(L)$ represents fluctuating separation
distances between succeeding identical single nucleotides where distance is defined in terms of
steps through the intervening bases as shown in figure 1.
The stochastic parameter set takes values of nucleotide occurrences along the whole genome
length, typically over a million base pairs, so that we take $0 \le s \le L$ with $L = N - 1$,
where $N$ is the total number of the single nucleotide present in the genome strand. In
equation (1), $\exp[-(b/2)(L - s)]$ is a memory kernel characteristic of the system. The initial
value of the process is fixed at $x_0$. However, from equation (1), we see that the fluctuating
variable can take any value in the domain when the parameter takes the value $s = L$. We thus
introduce the constraint fixing the endpoint $x(L) = x_L$, using the Donsker delta functional,
$\delta(x(L) - x_L)$, the stochastic distributional analog of the Dirac delta function [20–22].
The delta functional constraint takes as nonvanishing paths only those with terminal values at
the specified point $x_L$. The PDF $P(x_L, L; x_0, 0)$ for fluctuations ending at $x_L$ with
initial value at $x_0$ is the expectation

$$P(x_L, L; x_0, 0) = \int \delta(x(L) - x_L)\, d\mu(\omega). \tag{2}$$
Phys. Scr. 94 (2019) 125006 R R Violanda et al
Figure 2. Separation distances between a nucleotide base and the identical single nucleotide immediately preceding it for S. elongatus
PCC7942. The occurrence number n is the nth nucleotide base of the same type encountered in the DNA sequence.
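The separation-distance series plotted in figure 2 can be sketched in a few lines. This is a minimal illustration, assuming that the "number of steps through intervening bases" equals the difference between the positions of two succeeding identical nucleotides; the function name `separation_distances` and the toy sequence are hypothetical and not taken from the paper.

```python
def separation_distances(sequence, base):
    """Distances between each occurrence of `base` and the occurrence before it.

    Assumption: distance = difference of 0-based positions, i.e. the number of
    steps needed to walk from one occurrence to the next.
    """
    positions = [i for i, nucleotide in enumerate(sequence) if nucleotide == base]
    return [later - earlier for earlier, later in zip(positions, positions[1:])]


# Toy strand (not genomic data): A occurs at positions 0, 4, 5 and 9.
toy_strand = "ATGCAAGTCA"
print(separation_distances(toy_strand, "A"))
```

The n-th entry of the returned list plays the role of the separation distance at occurrence number n.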
In equation (2), $d\mu(\omega) = N_\omega \exp\left[-(1/2)\int \omega(\tau)^2\, d\tau\right] d^{\infty}\omega$
is the Gaussian white noise probability measure where the exponential is responsible for the
Gaussian fall-off and $N_\omega$ is a normalization factor [4]. The $d\mu(\omega)$ is defined by
the characteristic functional

$$\int \exp\left[i\int_0^L \omega(\tau)\,\varepsilon(\tau)\, d\tau\right] d\mu(\omega) = \exp\left[-\frac{1}{2}\int_0^L \varepsilon(\tau)^2\, d\tau\right], \tag{3}$$

where $\varepsilon(s) \in S(\mathbb{R})$ with the Gel’fand triple,
$S(\mathbb{R}) \subset L^2(\mathbb{R}) \subset S'(\mathbb{R})$. Here $S(\mathbb{R})$,
$L^2(\mathbb{R})$, and $S'(\mathbb{R})$ are the Schwartz space of test functions, the Hilbert
space of square integrable functions, and the space of tempered distributions, respectively [4].
To evaluate equation (2), we write the Donsker delta functional in terms of its Fourier
representation to get

$$P(x_L, L; x_0, 0) = \frac{1}{2\pi}\int\!\!\int_{-\infty}^{+\infty} \exp\{ik[x(L) - x_L]\}\, dk\, d\mu(\omega). \tag{4}$$

Using equation (1) for $x(L)$ we have

$$P(x_L, L; x_0, 0) = \frac{1}{2\pi}\int_{-\infty}^{+\infty} dk\, \exp\{ik[x_0 - x_L]\} \times \int \exp\left\{ik\sqrt{bc}\int_0^L \exp\left[-\frac{b}{2}(L - s)\right]\omega(s)\, ds\right\} d\mu(\omega). \tag{5}$$

We now write $\varepsilon(s) = k\sqrt{bc}\,\exp[-(b/2)(L - s)]$, $\varepsilon \in S(\mathbb{R})$,
and integration over $d\mu(\omega)$ can be done using the characteristic functional, equation (3).
With equation (5), the PDF can then be written as

$$P(x_L, L; x_0, 0) = \frac{1}{2\pi}\int_{-\infty}^{+\infty} \exp\{ik[x_0 - x_L]\}\, \exp\left\{-\frac{k^2 bc}{2}\int_0^L \exp[-b(L - s)]\, ds\right\} dk. \tag{6}$$

We also remark that it is the pairing between the dual spaces through the triple
$S(\mathbb{R}) \subset L^2(\mathbb{R}) \subset S'(\mathbb{R})$, defined as the bilinear extension
of the inner product on $L^2$, that facilitates the treatment of memory functions in the
stochastic integral. In equation (6), the integral over $dk$ is a Gaussian integral which yields

$$P(x_L, L; x_0, 0) = \frac{1}{\sqrt{2\pi c[1 - \exp(-bL)]}}\, \exp\left\{\frac{-(x_L - x_0)^2}{2c[1 - \exp(-bL)]}\right\}, \tag{7}$$

where integration over $ds$ has been carried out. The PDF of equation (7) can be shown to satisfy
a modified diffusion equation with a parameter-dependent diffusion coefficient similar to other
stochastic processes with memory [23].
With equation (7), the MSD can be calculated, $\mathrm{MSD} = \langle (x - \langle x \rangle)^2 \rangle$,
where brackets $\langle \ldots \rangle$ denote an average.
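The Gaussian form of equation (7) implies a variance of c[1 - exp(-bL)] for the endpoint x(L). A rough Monte Carlo check is sketched below. Equation (1) is not reproduced in this excerpt, so the process is assumed here to be x(L) = x_0 + sqrt(bc) ∫ exp[-(b/2)(L - s)] ω(s) ds, as inferred from equation (5). The function name `simulate_endpoints`, the grid size, the number of paths, and the seed are illustrative choices only.

```python
import numpy as np


def simulate_endpoints(b, c, L, x0=0.0, n_paths=20000, n_steps=200, seed=0):
    """Sample x(L) for the process assumed from equation (5).

    Assumption: x(L) = x0 + sqrt(b*c) * integral_0^L exp[-(b/2)(L - s)] w(s) ds,
    with the white noise w(s) ds discretized as Gaussian increments of variance ds.
    """
    rng = np.random.default_rng(seed)
    ds = L / n_steps
    s = (np.arange(n_steps) + 0.5) * ds            # midpoints of the parameter grid
    kernel = np.exp(-(b / 2.0) * (L - s))          # exponentially damped memory kernel
    increments = rng.normal(0.0, np.sqrt(ds), size=(n_paths, n_steps))
    return x0 + np.sqrt(b * c) * (increments @ kernel)


# Parameter values quoted in the caption of figure 10 (T base of S. elongatus PCC7942).
b, c, L = 0.00354, 0.62, 500.0
endpoints = simulate_endpoints(b, c, L)
predicted_variance = c * (1.0 - np.exp(-b * L))    # variance read off equation (7)
print("sample variance:", endpoints.var(), "predicted:", predicted_variance)
```

The two printed numbers should agree to within sampling error, which is a consistency check on the reconstruction rather than a reproduction of the authors' calculation.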
Figure 3. A magnified view of figure 2 for the 400th–500th identical single nucleotide occurring in the DNA sequence of S. elongatus
PCC7942.
Figure 4. MSD for T, A, G, and C nucleotide bases of S. elongatus PCC7942. Empirical MSD (blue dots); theoretical fit (red curve) given by
equation (9).
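The excerpt does not state how the empirical MSD (blue dots) is estimated from a distance series. One common estimator is the time-averaged MSD over a lag in occurrence number, sketched below as an assumption rather than as the authors' procedure; `time_averaged_msd` is a hypothetical helper name.

```python
import numpy as np


def time_averaged_msd(x, max_lag):
    """Time-averaged MSD of a series x for lags 1..max_lag.

    Assumption: MSD(lag) = mean over n of (x[n + lag] - x[n])**2, where n is the
    occurrence number. Requires max_lag < len(x).
    """
    x = np.asarray(x, dtype=float)
    lags = np.arange(1, max_lag + 1)
    msd = np.array([np.mean((x[lag:] - x[:-lag]) ** 2) for lag in lags])
    return lags, msd


# Toy series: a steadily increasing sequence gives MSD(lag) = lag**2.
lags, msd = time_averaged_msd([0, 1, 2, 3, 4], max_lag=2)
print(dict(zip(lags.tolist(), msd.tolist())))
```

Applied to a separation-distance series, the resulting curve could be plotted against lag and compared with the fitted curve; whether this matches the estimator behind figure 4 is not confirmed by the text.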
Table 1. Values of parameters in equation (9) that give a good match between theoretical MSD
(red curve, figure 4) and genome-based MSD (blue dots, figure 4) for S. elongatus PCC7942.

        Base T     Base A     Base G     Base C
a       32.2       31.85      16.63      16.69
b       0.00354    0.00332    0.00312    0.00275
c       0.62       0.59       0.24       0.27

Table 2. Values of parameters in equation (9) that give a good match between theoretical MSD
(red curve, figure 5) and genome-based MSD (blue dots, figure 5) for P. marinus subsp. marinus
str. CCMP1375.

        Base T     Base A     Base G     Base C
a       15.82      15.59      53.92      53.93
b       0.00296    0.00424    0.00712    0.00833
c       0.18       0.19       1.28       1.15
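The table 1 parameters can be cross-checked against the (a − c) row that table 5 reports for S. elongatus PCC7942. The short sketch below only subtracts the tabulated numbers; the dictionary name `table1` is a hypothetical label for the values copied from table 1.

```python
# Parameters (a, b, c) per base, copied from table 1 (S. elongatus PCC7942).
table1 = {
    "T": (32.2, 0.00354, 0.62),
    "A": (31.85, 0.00332, 0.59),
    "G": (16.63, 0.00312, 0.24),
    "C": (16.69, 0.00275, 0.27),
}

# a - c for each base, rounded to the two decimals used in the tables.
a_minus_c = {base: round(a - c, 2) for base, (a, b, c) in table1.items()}
print(a_minus_c)
```

The values for T and A come out close to each other, as do those for G and C, which is the pairing that Chargaff’s second parity rule refers to.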
Figure 5. MSD for T, A, G, and C nucleotide bases of P. marinus subsp. marinus str. CCMP1375. Empirical MSD (blue dots); theoretical fit
(red curve) given by equation (9).
and T for the genome of P. marinus subsp. marinus str. CCMP1375, with parameter values in
table 2. The MSD plots are shown in figure 5.

4.3. Staphylococcus aureus subsp. aureus NCTC 8325

The MSD of the separation distances similar to figures 2 and 3 for nucleotides A, G, C, and T
for the bacterial species S. aureus subsp. aureus NCTC 8325 are obtained for parameter values
given in table 3. The corresponding MSD plots are shown in figure 6.

Table 3. Values of parameters in equation (9) that give a good match between theoretical MSD
(red curve, figure 6) and genome-based MSD (blue dots, figure 6) for S. aureus subsp. aureus
NCTC 8325.

        Base T     Base A     Base G     Base C
a       12.63      13.30      65.05      63.82
b       0.0027     0.00159    0.00246    0.00216
c       0.18       0.19       0.89       0.90

4.4. Staphylococcus aureus ILRI Eymole1/1

The MSD of the separation distances (similar to figures 2 and 3) for nucleotides A, G, C, and T
for the genome of S. aureus ILRI Eymole1/1 are plotted with parameter values given in table 4.
The MSD plots are shown in figure 7.

5. Discussion

As shown in figures 4–7, a good match between theoretical and genome-based MSD is obtained for
each of the four bacterial genomes. This indicates that varying separation distances between
neighboring nucleotides of the same type are described by a non-Markovian stochastic process
characterized by equations (7)–(9). The theoretical MSD calculated as an ensemble average
closely matches the empirically derived MSD especially at large occurrence numbers (see
figures 4–7). Noting that increasing occurrence numbers correspond to the role of increasing
time for fluctuations measured in a time series, one could say that the system is ergodic for
large occurrence numbers. For small occurrence numbers, however, a slight deviation is observed
between the theoretical and empirical MSD. A non-ergodic behavior is therefore manifested for
small occurrence numbers, or when a series of similar nucleotides are relatively close to each
other. We now discuss the significance of parameters a, b, and c of equation (9), as well as the
PDF obtained for the different genomes.
Figure 6. MSD for A, G, C, and T nucleotide bases of S. aureus subsp. aureus NCTC 8325. Empirical MSD (blue dots); theoretical fit (red
curve) given by equation (9).
Table 4. Values of parameters in equation (9) that give a good match between theoretical MSD
(red curve, figure 7) and genome-based MSD (blue dots, figure 7) for S. aureus ILRI Eymole1/1.

        Base T     Base A     Base G     Base C
a       12.42      13.54      70.46      58.61
b       0.00289    0.00179    0.00253    0.00192
c       0.20       0.19       0.93       0.92

PDF and MSD, equations (7) and (8), respectively, i.e.
$\exp(-bL) \approx 1 - bL + (b^2 L^2/2!) + \ldots$, one sees that the stochastic process
approaches the mathematical form of ordinary Brownian motion when $b \ll 1$. The only physical
difference is that the ‘step length’ or distances between succeeding similar nucleotides can
vary (see, also, figure 9). Note that here, L is the occurrence number which corresponds to
time T in discussions of Brownian fluctuations.
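The small-b limit can be written out explicitly. Equation (8) is not reproduced in this excerpt; the sketch below assumes it is the variance of the Gaussian in equation (7), MSD = c[1 - exp(-bL)], and keeps only the leading term of the expansion.

```latex
\mathrm{MSD}(L) = c\left[1 - e^{-bL}\right]
  = c\left[bL - \frac{b^{2}L^{2}}{2!} + \cdots\right]
  \approx bc\,L \qquad (bL \ll 1).
```

Under this assumption the MSD grows linearly in the occurrence number L, as for ordinary Brownian motion with a constant diffusion coefficient bc/2.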
Figure 7. MSD for A, G, C, and T nucleotide bases of S. aureus ILRI Eymole1/1. Empirical MSD (blue dots); theoretical fit (red curve)
given by equation (9).
Table 5. Chargaff’s second parity rule may be verified if (a − c) for A and T have
approximately the same values. The same holds for G and C.

S. elongatus PCC7942    T        A        G        C
(a − c)                 31.58    31.26    16.39    16.42

5.3. MSD for large occurrence numbers

As observed in figures 4–7, the MSD plateaus at a roughly horizontal line as occurrence numbers
become large. This behavior is similar to confined or restricted diffusion (see, e.g. [27]). The
approximate MSD values at large occurrence numbers when the graph becomes approximately flat are
given by parameter a as summarized in table 6. We can also compare the genome-based MSD with an
MSD arising from purely random arrangement of the four nucleotides. This is shown in figure 9
and reflected in table 6 for comparison. As expected, the genomic nucleotide sequence is far
from random.
We note that the randomly distributed A, G, C, and T of figure 9 do not represent an MSD typical
of ordinary Brownian motion. In the usual Brownian motion, or random walk, the step lengths are
normally equal although directions are unpredictable. In figure 9, the distances between similar
nucleotides, or ‘step lengths,’ can be very different from each other.

5.4. Probability density function

We could also extract information from the probability of occurrence of a given nucleotide from
the PDF. This may be illustrated in figure 10 where empirical and theoretical graphs
(equation (7)) of the PDF are shown as a function of $(x - x_0)$ and the occurrence number L.
Both empirical and theoretical graphs in figure 10 show a peak at $x = x_0$. Recall that x
represents separation distance of a single nucleotide from an identical nucleotide immediately
preceding it in the sequence (see figure 1). Hence, given any nucleotide separated by a distance
$x_0$ from the identical neighboring nucleotide, it is most likely that the fluctuation of
distance values would end up with a nucleotide characterized by a distance $x = x_0$ from a
nucleotide of the same type preceding it.
We note that the PDF, equation (7), is a solution of a modified diffusion equation of the
form [6]

$$\frac{\partial P(x, s; x_0, 0)}{\partial s} = \left[\frac{bc}{2\exp(bs)}\right] \frac{\partial^2 P(x, s; x_0, 0)}{\partial x^2}, \tag{11}$$

with a parameter-dependent diffusion coefficient D(s). Such kinetic equations are of recent
interest due to the wide
Figure 8. MSD versus occurrence number for nucleotides T (blue), A (red), G (green), and C (black) for S. elongatus PCC7942.
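That the Gaussian PDF of equation (7) satisfies equation (11) can be checked in two lines. The sketch below writes σ²(s) for the variance appearing in equation (7) with L replaced by s (a symbol introduced here for convenience) and uses the standard identity for a Gaussian, ∂P/∂σ² = (1/2) ∂²P/∂x².

```latex
\sigma^{2}(s) = c\left[1 - e^{-bs}\right], \qquad
\frac{\partial P}{\partial s}
  = \frac{d\sigma^{2}}{ds}\,\frac{\partial P}{\partial \sigma^{2}}
  = bc\,e^{-bs}\cdot\frac{1}{2}\,\frac{\partial^{2} P}{\partial x^{2}}
\;\Longrightarrow\;
D(s) = \frac{bc}{2\exp(bs)} .
```

The diffusion coefficient therefore starts at bc/2 and decays exponentially with s, which is consistent with an MSD that rises and then flattens to a plateau.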
Table 6. MSD values at large occurrence numbers given by parameter a. The last row shows the
exact MSD value for randomly placed nucleotides corresponding to figure 9.

MSD at Large Occurrence Numbers
                                            Base T    Base A    Base G    Base C
S. elongatus PCC7942                        32.2      31.85     16.63     16.69
P. marinus subsp. marinus str. CCMP1375     15.82     15.59     53.92     53.93
S. aureus subsp. aureus NCTC 8325           12.63     13.30     65.05     63.82
S. aureus ILRI Eymole1/1                    12.42     13.54     70.46     58.61
Random locations of A, G, C, and T          23        23        23        23
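One way to quantify how closely the plateau values in table 6 pair up (A with T, G with C) is a relative difference. The sketch below uses only the numbers in table 6; the measure `relative_gap` is an illustrative choice, not a statistic used in the paper.

```python
# Plateau values (parameter a) copied from table 6.
plateau = {
    "S. elongatus PCC7942": {"T": 32.2, "A": 31.85, "G": 16.63, "C": 16.69},
    "P. marinus subsp. marinus str. CCMP1375": {"T": 15.82, "A": 15.59, "G": 53.92, "C": 53.93},
    "S. aureus subsp. aureus NCTC 8325": {"T": 12.63, "A": 13.30, "G": 65.05, "C": 63.82},
    "S. aureus ILRI Eymole1/1": {"T": 12.42, "A": 13.54, "G": 70.46, "C": 58.61},
}


def relative_gap(u, v):
    """Absolute difference divided by the mean of the two values."""
    return abs(u - v) / ((u + v) / 2.0)


for genome, a in plateau.items():
    at_gap = relative_gap(a["A"], a["T"])
    gc_gap = relative_gap(a["G"], a["C"])
    print(f"{genome}: A/T gap {at_gap:.3f}, G/C gap {gc_gap:.3f}")
```

With these numbers the pairs agree to within a few percent for most entries, while the G and C values for S. aureus ILRI Eymole1/1 (70.46 versus 58.61) differ noticeably more.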
Figure 9. MSD versus occurrence number for randomly placed T, A, G, and C nucleotide bases. The graph significantly differs from genome-
based MSD of figures 4–8.
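A random baseline of the kind shown in figure 9 can be mimicked by drawing each base independently and uniformly from A, C, G, and T, which is an assumption about how "randomly placed" is meant. In that case the separation distances follow a geometric distribution with success probability 1/4, whose mean is 4 and variance is 12. The function name `random_gaps`, the sequence length, and the seed below are illustrative.

```python
import numpy as np


def random_gaps(base="A", length=200_000, seed=1):
    """Separation distances of `base` in a uniformly random ACGT sequence.

    Assumption: every position is drawn independently with probability 1/4 per base.
    """
    rng = np.random.default_rng(seed)
    sequence = rng.choice(np.array(list("ACGT")), size=length)
    positions = np.flatnonzero(sequence == base)
    return np.diff(positions)


gaps = random_gaps()
print("mean gap:", gaps.mean(), "variance:", gaps.var())
```

The wide spread of these gaps illustrates why the "step lengths" between similar nucleotides can differ strongly from one another even in a purely random arrangement.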
Since $R_x(L > 0) \ll 1$, the graph removes the $R_x(0)$ from the plot to emphasize the
exponential decay. Similar graphs are exhibited by the other nucleotides of other bacterial
species.
In figure 11, a weak but positive correlation for small lag values is observed with the
autocorrelation function dropping close to zero starting around occurrence number 1000. Note
that occurrence numbers less than 1000 refer to nucleotides nearer to each other as compared to
those with occurrence numbers beyond 1000 where nucleotides have much larger separations.
Considering that the initial or starting point of the occurrence number (e.g. the first A in
figure 1) could be any nucleotide in the circular genome of the bacteria, a possible source for
the positive correlation could be a functionally related group of nucleotides beyond which a
drop in correlation may develop. For example, a bacterial gene may consist of around a thousand
or more nucleotides depending on its function. Moreover an operon, which is a cluster of
adjacent genes with a common control mechanism [32], may also be responsible for this positive
correlation.

6. Conclusion

Predicting patterns in DNA sequences, in general, is made challenging by the high degree of
complexity and variability of nucleotide combinations in genomes. With the rapid increase in the
number of bacterial genomes being sequenced and the challenge of looking for similarities and
differences that cut across diverse bacterial communities [33, 34], a non-Markovian stochastic
process as presented in this paper could provide a framework and analytical tool for revealing
information encoded in these complex genome sequences. Moreover, knowing the appropriate PDF and
the modified diffusion equation it obeys provides a novel perspective and analytical tool. Such
stochastic framework may find future use not only in bacterial taxonomy [33], but also in the
design and synthesis of DNA sequences [35, 36].
Here we note that, in evaluating nucleotide separation distances, no distinction has been made
yet between coding and non-coding regions of the DNA sequence [9]. Most bacterial genomes have
10%–20% noncoding DNA. It would thus be interesting to apply the stochastic functional integral
method to investigate spatial distributions of and correlations between special clusters in
genomic strands. It is of interest to determine whether trends and relations between motifs will
emerge in MSD plots of spatial distributions of dyads, triads and tetrads of nucleotides.
Nonlinear behavior of the MSD could shed light on functional separation distances for related
genes clustered together such as an operon. This would be of immediate consequence to studies of
genetic maps and patterns in repeated sequences (see, e.g. [37]) and reserved for future work.
Moreover, applications to other biopolymers can be explored. For instance, the sequence of amino
acids in a protein and its interaction with a solvent play an important
Figure 10. Probability density function (PDF) as a function of occurrence number L and distance, x - x 0 , for S. elongatus PCC7942. Top
graph (empirical); bottom graph theoretical, equation (7), for b = 0.003 54; c = 0.62.
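The theoretical surface in figure 10 is an evaluation of equation (7). A minimal sketch follows, using the parameter values from the caption (b = 0.00354, c = 0.62); the grid limits, the chosen occurrence numbers, and the function name `pdf` are illustrative assumptions.

```python
import numpy as np


def pdf(x, L, x0=0.0, b=0.00354, c=0.62):
    """Equation (7): Gaussian in (x - x0) with variance c * (1 - exp(-b * L))."""
    variance = c * (1.0 - np.exp(-b * L))
    return np.exp(-((x - x0) ** 2) / (2.0 * variance)) / np.sqrt(2.0 * np.pi * variance)


x = np.linspace(-6.0, 6.0, 2401)      # grid in x - x0
dx = x[1] - x[0]
for L in (100, 500, 2000):
    p = pdf(x, L)
    area = np.sum(p) * dx             # should be close to 1 (normalization)
    peak_location = x[np.argmax(p)]   # should sit at x = x0
    print(L, "area:", round(float(area), 4), "peak at:", round(float(peak_location), 6))
```

The peak stays at x = x_0 for every L while its height drops as L grows, because the variance increases toward the limiting value c.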
role in the crystallization of proteins needed to determine protein structures [38–40]. Given a
protein, the distribution properties of identified residues significant in a crystallization
process can be investigated using the method in this paper. Recent advances in computational
analysis [41] could help deal with a sequence of twenty amino acids comprising a protein instead
of sequences of four nucleotides discussed in this paper. Possible reduction could be done if
distributions of
Figure 11. Autocorrelation function versus occurrence number for the T-base of S. elongatus PCC7942 (Empirical: blue dots; Theoretical:
solid red curve). The fluctuation shows a weak but positive correlation at small lag values indicating that distance fluctuation is weakly
persistent only for small occurrence numbers.
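An empirical autocorrelation curve like the blue dots in figure 11 can be estimated from a separation-distance series as sketched below. The excerpt does not give the authors' estimator, so the normalized sample autocorrelation used here (with R(0) = 1) is an assumption, and `autocorrelation` is a hypothetical helper name.

```python
import numpy as np


def autocorrelation(x, max_lag):
    """Normalized sample autocorrelation R(lag) for lag = 0..max_lag.

    Assumption: deviations from the series mean are used and the sum over
    overlapping pairs is divided by the lag-0 sum, so that R(0) = 1.
    """
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    denominator = np.sum(d * d)
    n = len(d)
    return np.array([np.sum(d[: n - lag] * d[lag:]) / denominator
                     for lag in range(max_lag + 1)])


# Toy series: strict alternation is strongly anti-correlated at lag 1.
toy = [1.0, -1.0] * 50
print(autocorrelation(toy, max_lag=2))
```

For a genomic distance series, values that stay slightly above zero at small lags and then decay would correspond to the weak persistence described in the caption.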
‘patches’ or clusters [40] rather than single amino acids are considered.

Acknowledgments

The authors thank Benjamin E Rubin, Rev R Aure, Hyunjin Shim, and Victor Sojo for helpful
discussions. R R V wishes to acknowledge support from the Commission on Higher Education.

ORCID iDs

Christopher C Bernido https://orcid.org/0000-0002-9329-214X

References

[1] Shapiro B J, Leducq J-B and Mallet J 2016 What is speciation? PLoS Genet. 12 e1005860
[2] Sojo V, Pomiankowski A and Lane N 2014 A bioenergetic basis for membrane divergence in
archaea and bacteria PLoS Biol. 12 e1001926
[3] Rubin B E, Wetmore K M, Price M N, Diamond S, Shultzaberger R K, Lowe L C, Curtin G,
Arkin A P, Deutschbauer A and Golden S S 2015 The essential gene set of a photosynthetic
organism PNAS 112 E6634–43
[4] Hida T, Kuo H H, Potthoff J and Streit L 1993 White Noise: An Infinite Dimensional Calculus
(Dordrecht: Kluwer)
[5] Bernido C C and Carpio-Bernido M V 2012 White noise analysis: some applications in complex
systems, biophysics and quantum mechanics Int. J. Mod. Phys. B 26 1230014
[6] Bernido C C and Carpio-Bernido M V 2014 Methods and Applications of White Noise Analysis in
Interdisciplinary Sciences (Singapore: World Scientific)
[7] Afreixo V, Rodrigues J M, Bastos C A and Silva R M 2016 The exceptional genomic word
symmetry along DNA sequences BMC Bioinform. 17 59
[8] Kuruoglu E E and Arndt P F 2017 The information capacity of the genetic code: is the natural
code optimal? J. Theor. Biol. 419 227–37
[9] Hart A and Martinez S 2014 Markovianness and conditional independence in annotated bacterial
DNA Stat. Appl. Genet. Mol. Biol. 13 693–716
[10] Vergne N 2008 Drifting Markov models with polynomial drift and applications to DNA
sequences Stat. Appl. Genet. Mol. Biol. 7 6
[11] Bai F L, Liu Y Z and Wang T M 2007 A representation of DNA primary sequences by random walk
Math. Biosci. 209 282–91
[12] Hérisson J, Payen G and Gherbi R 2007 A 3D pattern matching algorithm for DNA sequences
Bioinformatics 23 680–6
[13] Peng C-K, Buldyrev S V, Goldberger A L, Havlin S, Sciortino F, Simons M and Stanley H E
1992 Long-range correlations in nucleotide sequences Nature 356 168–70
[14] Churchill G A 1989 Stochastic models for heterogeneous DNA sequences Bull. Math. Biol. 51
79–94
[15] Burks C and Farmer D 1984 Towards modeling DNA sequences as automata Physica D 10 157–67
[16] Cohen S E and Golden S S 2015 Circadian rhythms in cyanobacteria Microbiol. Mol. Biol. Rev.
79 373–85
[17] Cohen J E et al 2017 An innovative biologic system for photon-powered myocardium in the
ischemic heart Sci. Adv. 3 e1603078
[18] Hugler M and Sievert S M 2011 Beyond the Calvin cycle: autotrophic carbon fixation in the
ocean Annu. Rev. Mar. Sci. 3 261–89
[19] Zubair S, Fischer A et al 2015 Complete genome sequence of Staphylococcus aureus, strain
ILRI Eymole1/1, isolated from a Kenyan dromedary camel Stand. Genomic Sci. 10 109
[20] Kuo H-H 1983 Donsker’s delta function as a generalized Brownian functional and its
application Theory and Application of Random Fields. Lect. Notes Control Inf. Sci. vol 49
(Berlin: Springer) pp 167–78
[21] Lascheck A, Leukert P, Streit L and Westerkamp W 1994 More about Donsker’s delta function
Soochow J. Math. 20 401–18
[22] Nunno G D, Øksendal B and Proske F (ed) 2009 The Donsker delta function and applications
Malliavin Calculus for Lévy Processes with Applications to Finance (Berlin: Springer)
[23] Carpio-Bernido M V, Barredo W I and Bernido C C 2017 On time-dependent diffusion
coefficients arising from stochastic processes with memory Structure, Function and Dynamics
from nm to Gm ed C D Villagonzalo et al (New York: American Institute of Physics) p 050004
[24] Molina-Garcia D, Sandev T, Safdari H, Pagnini G, Chechkin A and Metzler R 2018 Crossover
from anomalous to normal diffusion: truncated power-law noise correlations and applications to
dynamics in lipid bilayers New J. Phys. 20 103027
[25] Meerschaert M M and Sabzikar F 2013 Tempered fractional Brownian motion Stat. Probab. Lett.
83 2269–75
[26] Sobottka M and Hart A G 2011 A model capturing novel strand symmetries in bacterial DNA
Biochem. Biophys. Res. Commun. 410 823–8
[27] Burov S, Jeon J-H, Metzler R and Barkai E 2011 Single particle tracking in systems showing
anomalous diffusion: the role of weak ergodicity breaking Phys. Chem. Chem. Phys. 13 1800–12
[28] Yamanaka K, Narumi T, Hashiguchi M, Okabe H, Hara K and Hidaka Y 2018 Time-dependent
diffusion coefficients for chaotic advection due to fluctuations of convective rolls Fluids 3 99
[29] Cheng-Wu L, Hong-Lai X, Cheng G and Wen-biao L 2018 Modeling and experiments for the
time-dependent diffusion coefficient during methane desorption from coal J. Geophys. Eng. 15
315–29
[30] Barredo W, Bernido C C, Carpio-Bernido M V and Bornales J B 2018 Modelling non-Markovian
fluctuations in intracellular biomolecular transport Math. Biosci. 297 27–31
[31] Schulz J H P, Chechkin A V and Metzler R 2013 Correlated continuous time random walks:
combining scale-invariance with long-range memory for spatial and temporal dynamics J. Phys. A:
Math. Theor. 46 475001
[32] Ermolaeva M D, White O and Salzberg S 2001 Prediction of operons in microbial genomes
Nucleic Acids Res. 21 1216–21
[33] Land M et al 2015 Insights from 20 years of bacterial genome sequencing Funct. Integr.
Genomics 15 141–61
[34] Sumner J G, Jarvis P D and Francis A R 2017 A representation-theoretic approach to the
calculation of evolutionary distance in bacteria J. Phys. A: Math. Theor. 50 335601
[35] Inouye M, Ishida Y and Inouye K 2017 Designing of a single gene encoding four functional
proteins J. Theor. Biol. 419 266–8
[36] Jiménez-Sánchez A 2017 Bacterial cell cycle classification. Application to DNA synthesis
and DNA content at any cell age J. Theor. Biol. 419 8–12
[37] Avershina E and Rudi K 2015 Dominant short repeated sequences in bacterial genomes Genomics
105 175–81
[38] Kurgan L et al 2009 CRYSTALP2: sequence-based protein crystallization propensity prediction
BMC Struct. Biol. 9 50
[39] Fusco D, Barnum T J, Bruno A E, Luft J R, Snell E H, Mukherjee S and Charbonneau P 2014
Statistical analysis of crystallization database links protein physico-chemical features with
crystallization mechanisms PLoS One 9 e101123
[40] Abramo M C, Caccamo C, Calvo M, Conti Nibali V, Costa D, Giordano R, Pellicane G, Ruberto R
and Wanderlingh U 2011 Molecular dynamics and small-angle neutron scattering of lysozyme aqueous
solutions Philo. Mag. 91 2066
[41] Shim H 2019 Feature learning of virus genome evolution with the nucleotide skip-gram neural
network Evol. Bioinform. 15 1–10