27/12/2014

How to trim leading and trailing whitespace in R? - Stack Overflow

How to trim leading and trailing whitespace in R?

I am having some trouble with leading and trailing whitespace in a data.frame. E.g. I would like to take a look at a specific row in a data.frame based on a certain condition:

> myDummy[myDummy$country == c("Austria"),c(1,2,3:7,19)]
[1] codeHelper     country        dummyLI        dummyLMI       dummyUMI
[6] dummyHInonOECD dummyHIOECD    dummyOECD
<0 rows> (or 0-length row.names)

I was wondering why I didn't get the expected output, since the country Austria obviously existed in my data.frame. After looking through my code history and trying to figure out what went wrong, I tried:

> myDummy[myDummy$country == c("Austria "),c(1,2,3:7,19)]
   codeHelper  country dummyLI dummyLMI dummyUMI dummyHInonOECD dummyHIOECD
18        AUT Austria        0        0        0              0           1
   dummyOECD
18         1

All I have changed in the command is an additional whitespace after Austria.

Further annoying problems obviously arise, e.g. when I want to merge two frames based on the country column. One data.frame uses "Austria " while the other frame has "Austria". The matching doesn't work.
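
The mismatch can be reproduced with a minimal sketch. The two data frames and their values below are hypothetical stand-ins, not the asker's data; only the trailing space in "Austria " mirrors the situation described above.

```r
# Hypothetical frames: `left` has a trailing space in "Austria ", `right` does not.
left <- data.frame(country = c("Austria ", "Germany"),
                   value = c(1, 2),
                   stringsAsFactors = FALSE)
right <- data.frame(country = c("Austria", "Germany"),
                    code = c("AUT", "DEU"),
                    stringsAsFactors = FALSE)

# String comparison is exact, so the trailing space makes the keys differ.
same_key <- "Austria " == "Austria"   # FALSE

# merge() defaults to an inner join, so the unmatched Austria row is dropped.
merged <- merge(left, right, by = "country")
print(merged)
```

Only the Germany row survives the merge, and no warning is raised, which is why this kind of problem tends to go unnoticed.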

1. Is there a nice way to 'show' the whitespace on my screen so that I am aware of the problem?

2. And can I remove the leading and trailing whitespace in R?

So far I have written a simple Perl script which removes the whitespace, but it would be nice if I could somehow do it inside R.

r

asked Feb 14 '10 at 12:44
Jeromy Anglim 8,990 3 49 105
mropa 3,078 2 17 24

I just saw that sub() uses the Perl notation as well. Sorry about that. I am going to try to use the function. But for my first question I don't have a solution yet. – mropa Feb 14 '10 at 12:50

As hadley pointed out, the regex "^\\s+|\\s+$" will identify leading and trailing whitespace, so x <- gsub("^\\s+|\\s+$", "", x). Many of R's read functions also have this option: strip.white=FALSE – Jay Feb 14 '10 at 15:11


5 Answers

Probably the best way is to handle the trailing whitespace when you read your data file. If you use read.csv or read.table you can set the parameter strip.white=TRUE.
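
As a rough illustration of strip.white, the sketch below reads the same small CSV twice. The CSV text and the names csv_text, as_read and stripped are made up for this example; note that strip.white only affects unquoted fields.

```r
# Hypothetical CSV content with a trailing and a leading space in `country`.
csv_text <- "country,code\nAustria ,AUT\n Germany,DEU"

# Default (strip.white = FALSE): surrounding whitespace is kept.
as_read <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# strip.white = TRUE: whitespace around unquoted fields is removed on input.
stripped <- read.csv(text = csv_text, strip.white = TRUE,
                     stringsAsFactors = FALSE)

print(as_read$country)
print(stripped$country)
```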

If you want to clean strings afterwards you could use one of these functions:

# returns string w/o leading whitespace
trim.leading <- function (x) sub("^\\s+", "", x)

# returns string w/o trailing whitespace
trim.trailing <- function (x) sub("\\s+$", "", x)

# returns string w/o leading or trailing whitespace
trim <- function (x) gsub("^\\s+|\\s+$", "", x)

To use one of these functions on myDummy$country:

myDummy$country <- trim(myDummy$country)
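
If several columns are affected, the same idea can be applied to every character column at once. This is only a sketch under assumptions: the data frame df and its values are hypothetical, and trim is redefined here so the block runs on its own. One caveat: gsub() returns a character vector, so applying trim to a factor column would turn it into character.

```r
# Same definition as above, repeated so this block is self-contained.
trim <- function(x) gsub("^\\s+|\\s+$", "", x)

# Hypothetical data frame with stray whitespace in two character columns.
df <- data.frame(country = c("Austria ", " Germany"),
                 code = c("AUT", "DEU "),
                 n = c(1, 2),
                 stringsAsFactors = FALSE)

# Find the character columns and trim only those; numeric columns are untouched.
is_chr <- vapply(df, is.character, logical(1))
df[is_chr] <- lapply(df[is_chr], trim)

print(df)
```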

To 'show' the whitespace you could use:

paste(myDummy$country)

which will show you the strings surrounded by quotation marks ("), making whitespace easier to spot.
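
Two other quick checks may help as well; this is a sketch with a hypothetical vector country, not the asker's data. Wrapping each value in brackets makes the edges visible, and nchar() reveals values that are longer than they look.

```r
# Hypothetical values: one trailing space, one clean, one leading space.
country <- c("Austria ", "Germany", " France")

# Brackets mark exactly where each string starts and ends.
bracketed <- sprintf("[%s]", country)
print(bracketed)   # "[Austria ]" "[Germany]" "[ France]"

# Character counts expose the extra whitespace.
n_chars <- nchar(country)
print(n_chars)     # 8 7 7
```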

edited Feb 14 '10 at 15:52
answered Feb 14 '10 at 13:13
f3lix 14.6k 7 37 64

@f3lix oh those are some nice tips! thanks! – mropa Feb 14 '10 at 13:43

3 As hadley pointed out, the regex "^\\s+|\\s+$" will identify leading and trailing whitespace, so x <- gsub("^\\s+|\\s+$", "", x). Many of R's read functions also have this option: strip.white=FALSE – Jay Feb 14 '10 at 15:10

@Jay: Thanks for the hint. I changed the regexps in my answer to use the shorter "\\s" instead of "[ \t]". – f3lix Feb 14 '10 at 15:46

8 See also str_trim in the stringr package. – Richie Cotton Feb 16 '10 at 15:35

1 Plus one for "Trim function now stored for future use" - thanks! – Chris Beeley Jan 17 '12 at 9:56


To manipulate the whitespace, use str_trim() in the stringr package. The package has a manual dated Feb 15, 2013 and is on CRAN. The function can also handle string vectors.

install.packages("stringr", dependencies=TRUE)
require(stringr)
example(str_trim)

d4$clean2 <- str_trim(d4$V2)

(credit goes to commenter: R. Cotton)
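
str_trim() also takes a side argument ("both", "left" or "right"). The sketch below assumes stringr is installed and skips the calls otherwise; the vector x is hypothetical.

```r
# Hypothetical input with leading spaces, trailing spaces and a trailing tab.
x <- c("  Austria ", "Germany\t", " France")

# Guarded so the block still runs where stringr is not installed.
if (requireNamespace("stringr", quietly = TRUE)) {
  both  <- stringr::str_trim(x)                  # default: side = "both"
  left  <- stringr::str_trim(x, side = "left")   # leading whitespace only
  right <- stringr::str_trim(x, side = "right")  # trailing whitespace only
  print(both)
  print(left)
  print(right)
}
```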

answered Feb 21 '13 at 16:30
user56 1 785 10 23

1 +1 For best practice, most easy, most convenient solution! – petermeissner Oct 16 at 12:24

ad 1) To see whitespace you could directly call print.data.frame with modified arguments:

print(head(iris), quote=TRUE)

#   Sepal.Length Sepal.Width Petal.Length Petal.Width  Species
# 1        "5.1"       "3.5"        "1.4"       "0.2" "setosa"
# 2        "4.9"       "3.0"        "1.4"       "0.2" "setosa"
# 3        "4.7"       "3.2"        "1.3"       "0.2" "setosa"
# 4        "4.6"       "3.1"        "1.5"       "0.2" "setosa"
# 5        "5.0"       "3.6"        "1.4"       "0.2" "setosa"
# 6        "5.4"       "3.9"        "1.7"       "0.4" "setosa"

See also ?print.data.frame for other options.

answered Feb 15 '10 at 10:00
Marek 19.5k 5 38 62

A simple function to remove leading and trailing whitespace:

trim <- function( x ) {
  gsub("(^[[:space:]]+|[[:space:]]+$)", "", x)
}

Usage:

> text = "   foo bar  baz 3 "
> trim(text)
[1] "foo bar  baz 3"

answered Feb 19 at 13:37
Bernhard Kausler 1,541 10 23

Use grep or grepl to find observations with whitespace and sub to get rid of it.

> names <- c("Ganga Din\t", "Shyam Lal", "Bulbul ")
> grep("[[:space:]]+$", names)
[1] 1 3
> grepl("[[:space:]]+$", names)
[1]  TRUE FALSE  TRUE
> sub("[[:space:]]+$", "", names)
[1] "Ganga Din" "Shyam Lal" "Bulbul"

answered Feb 14 '10 at 14:13
Jyotirmoy Bhattacharya 2,939 1 13 25

4 Or, a little more succinctly, "^\\s+|\\s+$" – hadley Feb 14 '10 at 14:45

1 Just wanted to point out that one will have to use gsub instead of sub with hadley's regexp. With sub it will strip trailing whitespace only if there is no leading whitespace. – f3lix Feb 14 '10 at 15:50
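
The difference is easy to see on a small example (the input strings are made up for illustration). sub() replaces only the first match of the alternation, while gsub() replaces every match.

```r
pattern <- "^\\s+|\\s+$"
x <- "  foo bar  "   # whitespace on both sides

# sub() stops after the first match, which is the leading whitespace.
only_first <- sub(pattern, "", x)     # "foo bar  "

# gsub() removes both the leading and the trailing match.
all_matches <- gsub(pattern, "", x)   # "foo bar"

# Without leading whitespace, the first match is the trailing run,
# so sub() is enough in that case.
trailing_only <- sub(pattern, "", "foo bar  ")   # "foo bar"
```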

Didn't know you could use \s etc. with perl=FALSE. The docs say that POSIX syntax is used in that case, but the syntax accepted is actually a superset defined by the TRE regex library laurikari.net/tre/documentation/regex-syntax – Jyotirmoy Bhattacharya Feb 14 '10 at 18:37
