Spark SQL: handling NULL with IF

I am trying to apply IF (as a stand-in for Spark's coalesce) to the output of a left outer join, but NULL does not seem to be handled as expected. Here are my base tables, sample query, actual output, and expected output.

Base tables:

t1:
a,100
b,101
c,102

t2:
101

Query:

select a.x, a.x1, IF(b.x1 is NULL,a.x1,b.x1) from t1 a LEFT OUTER JOIN t2 b on a.x1=b.x1;

Output:

a,100,null
b,101,101
c,102,null

Expected:

a,100,100
b,101,101
c,102,102

I have also tried wrapping the above query in an outer query and applying the IF on top of it, but with no success. Please suggest what I am missing.

2 solutions

#1

This seems to work:

File: tbl1

1   a
2   b
3   c

File: tbl2

1   c
3   d

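// Run in a Spark 1.x shell, where sc and sqlContext are already defined.
// On Spark 1.0-1.2 the implicit RDD-to-table conversion needs: import sqlContext.createSchemaRDD
// (on 1.3+, import sqlContext.implicits._ and call .toDF() before registerTempTable).

case class c_tbl1(c1: String, c2: String)

// Parse the tab-separated file and register it as a temp table
sc.textFile("tbl1").map { row =>
  val parts = row.split("\t")
  c_tbl1(parts(0), parts(1))
}.registerTempTable("t_tbl1")

case class c_tbl2(c1: String, c2: String)

sc.textFile("tbl2").map { row =>
  val parts = row.split("\t")
  c_tbl2(parts(0), parts(1))
}.registerTempTable("t_tbl2")

// IF yields 1 where the left outer join found no match in t_tbl2, 2 otherwise
sqlContext.sql(
  """select t.c1, t.c2, IF(t2.c1 is null, 1, 2), t2.c2
    |from t_tbl1 t left outer join t_tbl2 t2 on t.c1 = t2.c1""".stripMargin
).collect.foreach(println)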


[1,a,2,c]
[2,b,1,null]
[3,c,2,d]
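
For what it's worth, the IF in the question can also be written with COALESCE, which returns its first non-NULL argument. Below is a minimal sketch on the question's data. It assumes a Spark 1.3 to 1.6 shell (so sc and sqlContext exist and toDF comes from sqlContext.implicits._); the column names x and x1 are taken from the query, the inline data stands in for the real tables, and the alias x1_filled is made up for illustration.

```scala
// Assumption: run inside spark-shell (Spark 1.3 to 1.6), where sc and sqlContext are predefined.
import sqlContext.implicits._

// Inline stand-ins for the question's t1 and t2 (not the asker's real tables)
sc.parallelize(Seq(("a", 100), ("b", 101), ("c", 102)))
  .toDF("x", "x1")
  .registerTempTable("t1")
sc.parallelize(Seq(Tuple1(101)))
  .toDF("x1")
  .registerTempTable("t2")

// COALESCE(b.x1, a.x1) falls back to a.x1 when the join finds no match in t2
val rows = sqlContext.sql(
  """SELECT a.x, a.x1, COALESCE(b.x1, a.x1) AS x1_filled
    |FROM t1 a LEFT OUTER JOIN t2 b ON a.x1 = b.x1""".stripMargin
).collect()

rows.foreach(println)
```

With this data the third column should come back as 100, 101 and 102 (row order is not guaranteed). If the original IF still returns null on your setup, comparing it against this form may help narrow down whether the problem lies in IF itself or in how the tables were registered.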
