regex - Use lapply on a subset of list elements and return list of same length as original in R


I want to apply a regex operation to a subset of list elements (which are character strings) using lapply, and return a list of the same length as the original. The list elements are long strings (derived from reading in long text files and collapsing the paragraphs into a single string). The regex operation is only valid for a subset of the list elements/strings. I want the non-subsetted list elements (character strings) to be returned in their original state.

The regex operation is str_extract from the stringr package, i.e. I want to extract a substring from a longer string. I want to subset the list elements based on a regex pattern in the filename.

An example with simplified data:

    library(stringr)
    texts <- as.list(c("abcdefghijkl", "mnopqrstuvwxyz", "ghijklmnopqrs", "uvwxyzabcdef"))
    filenames <- c("ab1997r.txt", "bg2000s.txt", "mn1999r.txt", "dc1997s.txt")
    names(texts) <- filenames
    regexp <- "abcdef"

I know in advance which strings I want to apply the regex operation to, and hence I want to subset these strings. That is, I don't want to run the regex on all elements in the list, as doing so would return some invalid results (which is not apparent in this simplified example).

I've made a few naive efforts, e.g.:

    x <- lapply(texts[str_detect(names(texts), "1997")], str_extract, regexp)
    > x
    $ab1997r.txt
    [1] "abcdef"

    $dc1997s.txt
    [1] "abcdef"

which returns a reduced-length list containing only the substrings found. The results I want are:

    > x
    $ab1997r.txt
    [1] "abcdef"

    $bg2000s.txt
    [1] "mnopqrstuvwxyz"

    $mn1999r.txt
    [1] "ghijklmnopqrs"

    $dc1997s.txt
    [1] "abcdef"

where the strings not containing the regex pattern are returned in their original state.

I have informed myself about stringr, lapply and llply (in the plyr package), but many operations are illustrated using dataframes as examples, not lists, and they don't involve regex operations on character strings. I can achieve my goal using a loop, but I'm trying to get away from that, as often advised, and get better at using the apply class of functions.

You can use the subset assignment operator [<-, which replaces only the selected elements and leaves the rest of the list unchanged:

    x <- texts
    is1997 <- str_detect(names(texts), "1997")
    x[is1997] <- lapply(texts[is1997], str_extract, regexp)
    x
    # $ab1997r.txt
    # [1] "abcdef"
    #
    # $bg2000s.txt
    # [1] "mnopqrstuvwxyz"
    #
    # $mn1999r.txt
    # [1] "ghijklmnopqrs"
    #
    # $dc1997s.txt
    # [1] "abcdef"
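If you would rather build the result in a single pass, the same idea can be sketched with Map, which walks the list and a logical selector together. This is only an illustration under some assumptions: it uses base R instead of stringr, and extract_first is a hypothetical helper (not part of any package) that stands in for str_extract by returning the first match or NA.

```r
# Same simplified data as in the question
texts <- as.list(c("abcdefghijkl", "mnopqrstuvwxyz", "ghijklmnopqrs", "uvwxyzabcdef"))
names(texts) <- c("ab1997r.txt", "bg2000s.txt", "mn1999r.txt", "dc1997s.txt")
regexp <- "abcdef"

# Hypothetical helper: first match of `pattern` in the single string `s`,
# or NA_character_ when there is no match
extract_first <- function(s, pattern) {
  m <- regmatches(s, regexpr(pattern, s))
  if (length(m) == 0) NA_character_ else m
}

# TRUE for the elements whose filename contains "1997"
is1997 <- grepl("1997", names(texts), fixed = TRUE)

# Apply the extraction only where the selector is TRUE; pass the rest through
x <- Map(function(s, sel) if (sel) extract_first(s, regexp) else s,
         texts, is1997)
```

Map keeps the names of its first argument, so x has the same length and names as texts. The [<- version above is shorter; this form may be handy when the selected and unselected elements each need their own transformation.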
