Order by and Join in SQL or spark or mapreduce -


i have 2 tables content below.

table 1: id1     id2     id3    id4                name    descr   status        date    1    -12134    17773    8001300701101    name1    descr1    inactive    20121203 2    -12136    17773    8001300701101    name1    descr1    inactive     20121202 3    -12138    17785    9100000161822    name3    descr3    inactive    20121201 4    -12140    17785    9100000161822    name3   descr3    active        20121130 5    -12142    17787    8000500039106    name4    descr4    active        20121129  table2: id1    id2         id3        id4       name    descr   0    17781    17773    8001300701101    name1    descr1 0    17783    17783    8001300060109    name2    descr2 0    17785    17785    9100000161822    name3    descr3 0    17787    17787    8000500039106    name4    descr4 0    17789    17789    0000080052364    name5    descr5 

i trying below result.

id3        id4            name     descr      status        date         17773    8001300701101    name1    descr1      inactive    20121202 17783    8001300060109    name2    descr2      null        null 17785    9100000161822    name3    descr3      active      20121201 17787    8000500039106    name4    descr4      active      20121129 17789    0000080052364    name5    descr5      null        null 

as per above i/p , o/p, 2 tables should joined based on columns id3, id4, name , desc. if active record exists, should return active record. if inactive record exists, oldest inactive record should joined.

i tried different queries no longer near answer wanted. 4 columns joined non primary fields not nulls. there can 1 many or many many relationship between 2 tables.

i working on apache phoenix , if solution in hadoop mapreduce or in apache spark ok.

a sample query have written follows.

select table2.*, table1.status, table1.date  table1 right outer join table2 on table1.id3 = table2.id3              ,  table1.id4 = table2.id4              , table1.name = table2.name              , table1.descr = table2.descr  order (status) , order (date) 

can 1 me please?

you cannot straight join against table 1. instead, have join against multiple queries of table 1, joined together. count, going have do:

  1. a query find minimum date active records in table 1 per id3, id4, etc.
  2. a query find minimum date inactive records in table 1
  3. a full outer join between above 2 queries
  4. coalesce select active versus inactive fields.

something this:

val cookedtable1 = table1.filter(   $"status" === "active" ).groupby(   $"id3", $"id4", $"name", $"descr", $"status" ).agg(   $"id3", $"id4", $"name", $"descr", $"status", min($"date") "date" ).join(   table1.filter(     $"status" === "inactive"   ).groupby(     $"id3", $"id4", $"name", $"descr", $"status"   ).agg(     $"id3", $"id4", $"name", $"descr", $"status", min($"date") "date"   ).select(     $"id3" "id3r", $"id4" "id4r", $"name" "namer", $"descr" "descrr",     $"status" "statusr", $"date" "dater"   ),   $"id3" === $"id3r" , $"id4" === $"id4r" , $"name" === $"namer" , $"descr" === $"descrr",    "full_outer" ) .select(   coalesce($"id3", $"id3r") "id3",    coalesce($"id4",$"id4r") "id4",    coalesce($"name", $"namer") "name",   coalesce($"descr", $"descrr") "descr",   coalesce($"status", $"statusr") "status",   coalesce($"date", $"dater") "date" ) 

given have in above table 1, result like:

cookedtable1.show  id3   id4           name  descr  status   date 17785 9100000161822 name3 descr3 active   20121130 17787 8000500039106 name4 descr4 active   20121129 17773 8001300701101 name1 descr1 inactive 20121202 

now, using cookedtable1 in place of table1, same query did before:

cookedtable1.registertemptable("cookedtable1") val results = sqlcontext("select table2.*, cookedtable1.status, cookedtable1.date   cookedtable1 right outer join table2 on cookedtable1.id3 = table2.id3     , cookedtable1.id4 = table2.id4     , cookedtable1.name = table2.name     andcookedtable1.descr = table2.descr" ) 

this should results looking for.


Comments

Popular posts from this blog

node.js - Using Node without global install -

How to access a php class file from PHPFox framework into javascript code written in simple HTML file? -

java - Null response to php query in android, even though php works properly -