Regex between > and < in R with stringr
How can I capture the string between > and < in R?
    library(stringr)
    d <- "\"id/56771\" target=\"_self\">children- , adolescents</a></li>\n\t\t\t<li><"
    str_extract(d, ">+(.*?)+<")

gives me

    >children- , adolescents</a></li>\n\t\t\t<li><

I guess another string command would do the trick, but I thought there might be a more direct way...
You can use str_extract, but str_match may be better suited:

    str_extract(d, ">.*?<")
    [1] ">children- , adolescents<"

The trick here is the ? modifier, which tells the regex not to be greedy. Regex matching is greedy by default, which means it matches the longest string that fits the pattern.
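To make the greedy versus lazy distinction concrete, here is a minimal sketch. The sample string `s` is a hypothetical example, not taken from the question, chosen so that more than one "<" follows the first ">".

```r
library(stringr)

# Hypothetical sample string with several "<" after the first ">".
s <- "<a>first</a><b>second</b>"

# Greedy: .* takes as much as it can, so the match runs to the last "<".
greedy <- str_extract(s, ">.*<")

# Lazy: .*? takes as little as it can, so the match stops at the first "<".
lazy <- str_extract(s, ">.*?<")

print(greedy)  # ">first</a><b>second<"
print(lazy)    # ">first<"
```

Both patterns start matching at the same ">"; only the amount consumed by the quantifier differs.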
The str_extract result still leaves a bit of work to do, i.e. removing the first and last character. One can do this with vector subsetting, or it might be easier to use str_match instead. It returns all of the pattern matches in an array:
    str_match(d, ">(.*?)<")
         [,1]                        [,2]
    [1,] ">children- , adolescents<" "children- , adolescents"

(The two matches are 1. the entire matched string, and 2. the pattern inside the brackets.)
This means it's a simple matter of returning the second element:

    str_match(d, ">(.*?)<")[2]
    [1] "children- , adolescents"
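For comparison, here is a sketch of three equivalent ways to get the inner text, assuming the same `d` as in the question. The variable names are illustrative only. The lookaround variant is an alternative technique that the answer above does not use; it relies on the lookbehind and lookahead support in the ICU regex engine that stringr is built on.

```r
library(stringr)

d <- "\"id/56771\" target=\"_self\">children- , adolescents</a></li>\n\t\t\t<li><"

# Column 2 of the str_match result holds the first capture group.
# Indexing with [, 2] also works when the input is a vector of strings.
inner_match <- str_match(d, ">(.*?)<")[, 2]

# Alternative: drop the first and last character of the str_extract result.
inner_sub <- str_sub(str_extract(d, ">.*?<"), 2, -2)

# Alternative: lookarounds require the delimiters to be present
# without including them in the match.
inner_look <- str_extract(d, "(?<=>).*?(?=<)")

print(inner_match)
print(inner_sub)
print(inner_look)
```

All three should give "children- , adolescents" for this input. The `[, 2]` form is probably the safest choice when `d` holds more than one string, since plain `[2]` indexes the matrix as a flat vector.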