mysql - Failing to successfully implement a left join - columns remain null -
i'm working price data , have 2 databases. in one, have daily prices , on other have quarterly fundamental data. , stocks in each database have different identifiers got linking table . want join price database linking table matching identifier each stock . final goal join modified price table fundamental data table ( i'll make join on identifier got linking table).
however , columns want fill : dados_crsp.gvkey , linkdate , linkenddate returned null. seems no modifications made price table. first time working sql i'm doing stupid! , sorry bad formatting.
the code , example of data follow below :
dados crsp (price table) :
+--------------+--------+---------+-------+----------+-------------+ | ret_date | permco | price | gvkey | linkdate | linkenddate | +--------------+--------+---------+-------+----------+-------------+ | '1986-02-04' | 7976 | -17.625 | null | null | null | | '1986-02-05' | 7976 | -18.375 | null | null | null | +--------------+--------+---------+-------+----------+-------------+
linktable :
+-------+--------+--------------+--------------+ | gvkey | permco | linkdt | linkenddt | +-------+--------+--------------+--------------+ | 1004 | 20000 | '1972-04-24' | '2014-12-31' | | 1010 | 22156 | 1962-01-31', | '1984-06-28' | +-------+--------+--------------+--------------+
code :
create table dados_crsp ( ret_date date, -- date permco integer, -- price identifier price float, gvkey integer , -- starts empty linkdate date , -- starts empty linkenddate date -- starts empty ) ; load data local infile filename table dados_crsp columns terminated ',' ignore 1 lines; create table link_table ( gvkey integer, -- fundamental data identifier permco integer, linkdate date, -- date when gvkey permco mapping starts given stock linkenddate date -- date when gvkey permco mapping end given stock ) ; load data local infile filename table link_table columns terminated ',' ignore 1 lines; update link_table set linkenddate = '2014-12-31' linkenddate = '0000-00-00'; --added indexes make join faster alter table link_table add index (permco); alter table link_table add index (linkdate); alter table link_table add index (linkenddate); alter table dados_crsp add index (permco); alter table dados_crsp add index (ret_date); --i make join on matching identifiers , when date between date of --activity given identifier. because same company may change mapping through time` update dados_crsp left join link_table on dados_crsp.permco = link_table.permco set dados_crsp.gvkey = link_table.gvkey , dados_crsp.linkdate = link_table.linkdate , dados_crsp.linkenddate = link_table.linkenddate dados_crsp.ret_date >= link_table.linkdate , dados_crsp.ret_date <= link_table.linkenddate;
your clause doesn't handle nulls, it's creating implicit inner join. also, you're update syntax wrong. list of columns after set separated commas, not ands.
you'll need following:
update dados_crsp left join link_table on dados_crsp.permco = link_table.permco set dados_crsp.gvkey = link_table.gvkey, dados_crsp.linkdate = link_table.linkdate, dados_crsp.linkenddate = link_table.linkenddate (dados_crsp.ret_date >= link_table.linkdate or link_table.linkdate null , (dados_crsp.ret_date <= link_table.linkenddate or link_table.linkenddate null);
alternately, specify condition join condition. don't need handle nulls there because, well, join hasn't happened yet.
update dados_crsp left join link_table on dados_crsp.permco = link_table.permco , dados_crsp.ret_date >= link_table.linkdate , dados_crsp.ret_date <= link_table.linkenddate set dados_crsp.gvkey = link_table.gvkey, dados_crsp.linkdate = link_table.linkdate, dados_crsp.linkenddate = link_table.linkenddate;
that's actual design issue see, that's assuming want left join here.
Comments
Post a Comment