Revolution Computing
New Haven, CT USA
Rmetrics 2009
Outline
iterators
foreach
foreach(iterator,...) %dopar% {
statements
}
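A minimal sketch of this pattern, assuming the `iterators` package supplies the iterator and using `registerDoSEQ()` (the sequential backend shipped with foreach) so it runs without any cluster setup. The matrix `m` and the names `it`, `first_row`, and `row_sums` are illustrative and not taken from the slides.

```r
library(foreach)
library(iterators)

registerDoSEQ()  # sequential backend; register a parallel backend to distribute work

m <- matrix(1:6, nrow = 2)   # illustrative data: rows are (1, 3, 5) and (2, 4, 6)

# An iterator hands out one element at a time; here, one row per call.
it <- iter(m, by = "row")
first_row <- nextElem(it)

# foreach pulls values from a fresh iterator and evaluates the body once per value.
row_sums <- foreach(r = iter(m, by = "row"), .combine = c) %dopar% {
  sum(r)
}
```

Because the body has no side effects, the same loop can be handed to a parallel backend without changing the code.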
Example
[[2]]
[1] 2
[[3]]
[1] 3
[[4]]
[1] 4
Examples
> z <- 2
> f <- function (x) { sqrt (x + z) }
> foreach (j=1:4, .combine=c) %dopar% { f (j) }
[1] 1.732051 2.000000 2.236068 2.449490
List comprehension
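A minimal sketch of a list comprehension written with foreach, assuming the `when()` filter and the `%:%` operator from the foreach package. The variable name `even_squares` and the choice of input are illustrative only.

```r
library(foreach)

# Squares of the even numbers in 1..10, written as a filtered foreach.
# when() keeps only the iterations for which the condition is TRUE.
even_squares <- foreach(x = 1:10, .combine = c) %:% when(x %% 2 == 0) %do% {
  x^2
}
```

The filter runs before the body, so iterations that fail the condition contribute nothing to the combined result.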
M <- 100
S <- matrix(0,M,M)
for (j in 1:(M-1)) {
for (k in min ((j+2),M):M) {
R <- simpleRule (Cl (MSFT),j,k,9, Ra, Rb)
Dt <- na.omit (R - Rb)
S[j,k] <- mean (Dt)/sd(Dt)
}
}
Now in parallel, by rows...
M <- 100
S <- foreach (j=1:(M-1), .combine=rbind,
.packages=c ("xts","TTR")) %dopar% {
x <- rep (0,M)
for (k in min ((j+2),M):M) {
R <- simpleRule (Cl (MSFT),j,k,9,Ra,Rb)
Dt <- na.omit (R - Rb)
x[k] <- mean (Dt)/sd( Dt)
}
x
}
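The code above depends on `simpleRule`, `Ra`, `Rb`, and `MSFT`, which are not defined in these slides. The sketch below swaps in a hypothetical `score(j, k)` function purely to show that the row-wise foreach reproduces the nested loop. It also assumes a sequential backend via `registerDoSEQ()` and a small `M` so it runs quickly.

```r
library(foreach)
registerDoSEQ()  # sequential backend so the sketch runs anywhere

# Hypothetical stand-in for mean(Dt)/sd(Dt) computed from simpleRule(); not the real rule.
score <- function(j, k) (k - j) / (j + k)

M <- 6

# Sequential version: fill an M x M matrix in place.
S_seq <- matrix(0, M, M)
for (j in 1:(M - 1)) {
  for (k in min(j + 2, M):M) {
    S_seq[j, k] <- score(j, k)
  }
}

# Parallel-ready version: each iteration returns one row, rbind stacks them.
S_par <- foreach(j = 1:(M - 1), .combine = rbind) %dopar% {
  x <- rep(0, M)
  for (k in min(j + 2, M):M) {
    x[k] <- score(j, k)
  }
  x
}

# rbind yields M - 1 rows (j never reaches M), so append the all-zero last row
# and drop any row names before comparing with the sequential result.
S_par <- unname(rbind(S_par, rep(0, M)))
same <- isTRUE(all.equal(S_seq, S_par))
```

Returning a whole row per iteration keeps each task independent, which is what lets `rbind` assemble the result without any shared state.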
Parallelizing parts of an existing package
Basic idea
- Profile code with Rprof (profr is a nice wrapper that visualizes the results)
- Examine bottlenecks for apply-like statements and for loops with independent code blocks
- Rewrite for loops without side-effects as required (may require a custom combine function)
- Unlock the namespace, provisionally replace target function(s) and experiment (a nice trick)
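The profiling and namespace steps above can be sketched as follows. This is an illustration under assumptions, not the procedure used in the talk: `slow_stat` is a hypothetical workload, `stats::fivenum` is an arbitrary stand-in for a package function under study, and the replacement only counts calls where a foreach-based rewrite would go.

```r
# Step 1: profile. slow_stat() is a hypothetical stand-in for package code.
slow_stat <- function(n) {
  out <- numeric(n)
  for (i in seq_len(n)) out[i] <- stats::sd(stats::rnorm(1000))
  out
}

prof_file <- tempfile(fileext = ".out")
Rprof(prof_file)
t0 <- proc.time()[["elapsed"]]
# Run for about half a second so the profiler collects some samples.
while (proc.time()[["elapsed"]] - t0 < 0.5) invisible(slow_stat(200))
Rprof(NULL)
prof <- summaryRprof(prof_file)   # prof$by.total ranks functions by time spent

# Step 2: unlock the namespace binding and swap in an experimental version.
ns <- asNamespace("stats")
target <- "fivenum"               # arbitrary example target
original <- get(target, envir = ns, inherits = FALSE)

calls <- 0L
experimental <- function(...) {
  calls <<- calls + 1L            # a parallel rewrite of the function would go here
  original(...)
}

unlockBinding(target, ns)
assign(target, experimental, envir = ns)
patched_result <- stats::fivenum(1:9)   # resolved through the namespace, so it hits the replacement

# Step 3: restore the original and relock the binding.
assign(target, original, envir = ns)
lockBinding(target, ns)
```

Patching a namespace this way is suited to interactive experiments only; restoring the original binding afterwards keeps the session consistent.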
Example: ipred
# An example
mapReduce (cyl, mean(mpg), mean(hp),
data=mtcars, applyfun=sapply)
# With multicore:
require (multicore)
mapReduce (cyl, mean(mpg), mean(hp),
data=mtcars, applyfun=mclapply)
# With foreach:
require (foreach)
fapply <- function (A,B,C) {
foreach (j=A, .combine=cbind) %dopar% B(j, C) }
mapReduce (cyl, mean(mpg), mean(hp),
data=mtcars, applyfun=fapply)
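A quick way to check that `fapply` behaves like `sapply` for the simple case is sketched below. It assumes a sequential backend via `registerDoSEQ()`. The helper `scale_by` and its inputs are hypothetical, chosen only to make the comparison easy to read.

```r
library(foreach)
registerDoSEQ()  # sequential backend so the sketch runs anywhere

# Same wrapper as on the slide: A is the input, B the function, C an extra argument.
fapply <- function(A, B, C) {
  foreach(j = A, .combine = cbind) %dopar% B(j, C)
}

scale_by <- function(x, factor) x * factor   # hypothetical function to apply

res_foreach <- fapply(1:3, scale_by, 10)     # cbind gives a 1 x 3 matrix
res_sapply  <- sapply(1:3, scale_by, 10)     # plain numeric vector

agree <- isTRUE(all.equal(as.vector(res_foreach), res_sapply))
```

The two results differ in shape (`cbind` returns a matrix), so callers that expect a plain vector may need `as.vector` or a different `.combine`.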