How to read a tab-delimited file in R when data rows contain an extra separator at the end of the line?
I'm trying to read a file that has an annoying problem. The header has 5 columns, but the data has 6 due to a tab character at the end of the line in the data rows. This confused R, and it put the item code in as the row name, and the data is now shifted by 1 position.
> items <- read.csv("http://download.bls.gov/pub/time.series/cu/cu.item", sep = "\t")
> items[1,]
           item_code item_name display_level selectable sort_sequence
aa0 items - old base         0          TRUE          2            NA
> row.names(items[1,])
[1] "aa0"
Any idea how to fix this? If I specify row.names = NULL, it reads in the item code as a "row.names" column, but the data is still shifted.
> items <- read.csv("http://download.bls.gov/pub/time.series/cu/cu.item", sep = "\t", row.names = NULL)
> items[1,]
  row.names        item_code item_name display_level selectable sort_sequence
1       aa0 items - old base         0          TRUE          2            NA
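The shift can be reproduced offline with a tiny made-up sample. This is only a sketch: the rows, the variable names `txt`, `n_fields` and `shifted` are invented for illustration and are not taken from the BLS file. It relies on the rule documented in `?read.table`: when there is a header and the first row contains one fewer field than the number of columns, the first column of the input is used for the row names.

```r
# Made-up sample: 3 header fields, but each data row ends with an extra tab,
# so every data row has 4 fields (the last one empty).
txt <- paste(
  "item_code\titem_name\tdisplay_level",
  "X1\tfirst item\t0\t",
  "X2\tsecond item\t1\t",
  sep = "\n"
)

# Count the fields per line to confirm the mismatch.
con <- textConnection(txt)
n_fields <- count.fields(con, sep = "\t")
close(con)
print(n_fields)

# Header is one field short, so the first column becomes the row names
# and the remaining values slide one position to the left.
shifted <- read.csv(text = txt, sep = "\t")
print(row.names(shifted))
print(shifted)
```

Running `count.fields()` on the first few lines of a real file is a quick way to check whether the header and the data rows disagree before reading it.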
As mentioned in the comment, you can try to read in the file without the first line and add the headers in later.
Something like:
read.table("http://download.bls.gov/pub/time.series/cu/cu.item",
           header = FALSE, skip = 1, sep = "\t",
           col.names = c(
             scan("http://download.bls.gov/pub/time.series/cu/cu.item",
                  what = "", n = 5),
             "xxxxx"))[-6]
The [-6] is there to drop the column of NA values created by the trailing tab.
Here's the head of the above:
head(
  read.table("http://download.bls.gov/pub/time.series/cu/cu.item",
             header = FALSE, skip = 1, sep = "\t",
             col.names = c(
               scan("http://download.bls.gov/pub/time.series/cu/cu.item",
                    what = "", n = 5),
               "xxxxx"))[-6])
# Read 5 items
#   item_code                                      item_name display_level
# 1       aa0                               items - old base             0
# 2      aa0r purchasing power of consumer dollar - old base             0
# 3       sa0                                          items             0
# 4      sa0e                                         energy             1
# 5     sa0l1                                items less food             1
# 6    sa0l12                      items less food , shelter             1
#   selectable sort_sequence
# 1       TRUE             2
# 2       TRUE           399
# 3       TRUE             1
# 4       TRUE           374
# 5       TRUE           358
# 6       TRUE           361
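A different route, not the one used above, is to strip the trailing tab from each line before parsing, so the header and the data rows agree on the number of fields. The sketch below uses made-up rows shaped like the file so it runs offline; `txt_lines`, `cleaned` and `items2` are illustrative names, and with a real file the lines would come from `readLines()` on the path or URL instead.

```r
# Made-up lines shaped like the file: 5 header fields, 6 fields per data row
# because of the tab at the end of each data line.
txt_lines <- c(
  "item_code\titem_name\tdisplay_level\tselectable\tsort_sequence",
  "X1\tfirst item\t0\tT\t2\t",
  "X2\tsecond item\t1\tT\t374\t"
)

# Remove exactly one tab at the end of each line (the header has none, so it
# is left unchanged).
cleaned <- sub("\t$", "", txt_lines)

# read.delim() defaults to header = TRUE and sep = "\t".
items2 <- read.delim(text = cleaned, stringsAsFactors = FALSE)
print(items2)
```

This reads the file contents once and keeps the original header, at the cost of holding the raw lines in memory; it assumes every data row really does end with the extra tab.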