dataframe - Data importing delimiter issue in R


I am trying to import a text file into R and put it into a data frame, along with other data.

My delimiter is "|", and a sample of my data is here:

|painless check-in. 2 legs of 3 on ac: ac105, yyz-yvr. roomy , clean a321 fantastic crew. ac33: yvr-syd, light load , had 3 seats myself. enthusiastic , friendly crew usual on transpacific route take several times year. arrived 20 min ahead of schedule. expected high level of service our flag carrier, air canada. altitude elite member. |we returned dublin toronto, on winnipeg. other cutting close due limited staffing in toronto our flight excellent. due rush in toronto 1 of our carry ones placed go in cargo hold. when arrived in winnipeg stayed in toronto, helpful , kind @ winnipeg airport, , received 3 phone calls following day in regards misplaced bag , delivered our home. thankful , more appreciative of service received great end wonderful holiday. |flew toronto heathrow. worse flight on way out. paid hefty fee exit seats had no storage whatsoever, , not room under seats. ridiculous. crew poor, not friendly. 1 older male member of staff quite attitudinal, acting though doing huge favour serving them. reasonable dinner breakfast measly piece of banana loaf. that's it! worst airline breakfast have had.

[screenshot]

As you can see, there are many "|" characters, yet as the screenshot shows, when I imported the data into R it was separated only once, instead of 152 times.

How can I get each individual piece of text into a different column inside the data frame? I would like a data frame of length 152, not 2.

Edit: my code lines are:

    mydata <- read.table("c:/users/norbert/desktop/research/important files/airline reviews/reviews/air_can_review.txt",
                         sep = "|", quote = NULL, comment = '', fill = TRUE, header = FALSE)
    length(mydata)
    [1] 2
    class(mydata)
    [1] "data.frame"
    str(mydata)
    'data.frame':   1244 obs. of  2 variables:
     $ V1: Factor w/ 1093 levels "","'delayed' on departure (i reference flights between march 2014 , january 2015 in regard: denver, sfo,",..: 210 367 698 853 1 344 483 87 757 52 ...
     $ V2: Factor w/ 154 levels ""," hotel","5/9/2014, lhr vancouver, ac855. 23/9/2014, vancouver lhr, ac854. economy leg room ok compared to",..: 1 1 1 1 78 1 1 1 1 1 ...

    mydataframe <- data.frame(text = mydata, othervar2 = 1, othervar2 = "blue", stringsAsFactors = FALSE)
    str(mydataframe)
    'data.frame':   531 obs. of  3 variables:
     $ text       : chr  "bru-yul, may 26th, a330-300. departed on-time, landed 30 minutes late due strong winds, nice flight, food" "excellent, cabin-crew smiling , attentive except 1 old lady throwing meal trays boomerangs. seat-" "pitch generous, comfortable seat,  ife bit outdated selection okay. air canadas problem is\nthat new pro"| __truncated__ "" ...
     $ othervar2  : num  1 1 1 1 1 1 1 1 1 1 ...
     $ othervar2.1: chr  "blue" "blue" "blue" "blue" ...
    length(mydataframe)
    [1] 3

A better way is to read in the text using scan(), and then put it into a data frame with your other variables (here I made some up). Note that I took your text above and pasted it into a file called sample.txt, after removing the starting "|".

    mydata <- scan("sample.txt", what = "character", sep = "|")
    mydataframe <- data.frame(text = mydata, othervar2 = 1, othervar2 = "blue",
                              stringsAsFactors = FALSE)
    str(mydataframe)
    ## 'data.frame':    3 obs. of  3 variables:
    ##  $ text       : chr  "painless check-in. 2 legs of 3 on ac: ac105, yyz-yvr. roomy , clean a321 fantastic crew. ac33: yvr-syd, light loa"| __truncated__ "we returned dublin toronto, on winnipeg. other cutting close due limited staffing in toront"| __truncated__ "flew toronto heathrow. worse flight on way out. paid hefty fee exit seats had no storage "| __truncated__
    ##  $ othervar2  : num  1 1 1
    ##  $ othervar2.1: Factor w/ 1 level "blue": 1 1 1

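The othervar2 columns are placeholders for your own variables, since you said you wanted a data.frame with other variables. (The code names both of them othervar2, so R renames the second one to othervar2.1.) I chose a numeric variable and a text variable, and because each is specified as a single value, it gets recycled for all observations in the dataset (in the example, 3).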
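To try this without the original file, here is a minimal, self-contained sketch of the same idea. The three short reviews and the column names `rating` and `colour` are made up for illustration, and `quote = ""` is an extra setting I am assuming you would want so that apostrophes in the reviews are not treated as quoting characters.

```r
# Write a small pipe-delimited sample to a temporary file (hypothetical data).
tmp <- tempfile(fileext = ".txt")
writeLines("first review, with a comma|second review isn't long|third review", tmp)

# what = "character" reads every item as text; sep = "|" splits on the pipe.
# quote = "" disables quote handling; quiet = TRUE hides the "Read N items" message.
reviews <- scan(tmp, what = "character", sep = "|", quote = "", quiet = TRUE)

# The single values 1 and "blue" are recycled to one per review.
reviews_df <- data.frame(text = reviews, rating = 1, colour = "blue",
                         stringsAsFactors = FALSE)
str(reviews_df)

unlink(tmp)  # remove the temporary file
```

One thing to keep in mind: scan() also treats a line break as the end of an item, so a review that is wrapped over several lines in the file will come back as several pieces.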

I realize that your question asks how to get each text into a different column, but that is not a good way to use a data.frame, since data.frames are designed to hold variables in columns. (With one text per column, you cannot add other variables.)

If you really want to do that, you have to coerce the data after transposing it, as follows:
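As a small illustration of why one text per row is the more convenient layout, the sketch below adds a per-review variable in a single line. The two texts and the column name `n_chars` are made up for the example.

```r
# One review per row (hypothetical texts).
reviews_df <- data.frame(text = c("short", "a longer review"),
                         stringsAsFactors = FALSE)

# A new variable is just another column, computed once per review.
reviews_df$n_chars <- nchar(reviews_df$text)
reviews_df
```

With one text per column there would be no natural place to store such a variable.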

    mydataframe <- as.data.frame(t(data.frame(text = mydata, stringsAsFactors = FALSE)),
                                 stringsAsFactors = FALSE)
    str(mydataframe)
    ## 'data.frame':    1 obs. of  3 variables:
    ##  $ V1: chr "painless check-in. 2 legs of 3 on ac: ac105, yyz-yvr. roomy , clean a321 fantastic crew. ac33: yvr-syd, light loa"| __truncated__
    ##  $ V2: chr "we returned dublin toronto, on winnipeg. other cutting close due limited staffing in toront"| __truncated__
    ##  $ V3: chr "flew toronto heathrow. worse flight on way out. paid hefty fee exit seats had no storage "| __truncated__
    length(mydataframe)
    ## [1] 3
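The same transposition can be checked on a tiny made-up vector, which makes the resulting shape easier to see. The vector `texts` and the name `wide` are placeholders for this sketch only.

```r
# Three short texts standing in for the reviews (hypothetical data).
texts <- c("first review", "second review", "third review")

# Build a one-column data frame, transpose it to a 1 x 3 character matrix,
# then convert that matrix back into a data frame.
wide <- as.data.frame(t(data.frame(text = texts, stringsAsFactors = FALSE)),
                      stringsAsFactors = FALSE)

dim(wide)     # rows, then columns
length(wide)  # number of columns
```

Each text now sits in its own column of a single row, so with the full file you should get one column per review.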

"Measly banana loaf"? Must have been economy class.

