I ran into a situation where a LEFT JOIN returns duplicate values. I think this is the expected behavior, but it is not what I want.

I have three tables: person, department and contact.

person :

id bigint,
person_name character varying(255)

department :

person_id bigint,
department_name character varying(255)

contact :

person_id bigint,
phone_number character varying(255)

SQL query:

SELECT p.id, p.person_name, d.department_name, c.phone_number 
FROM person p
  LEFT JOIN department d 
    ON p.id = d.person_id
  LEFT JOIN contact c 
    ON p.id = c.person_id;

Result :

id|person_name|department_name|phone_number
--+-----------+---------------+------------
1 |"John"     |"Finance"      |"023451"
1 |"John"     |"Finance"      |"99478"
1 |"John"     |"Finance"      |"67890"
1 |"John"     |"Marketing"    |"023451"
1 |"John"     |"Marketing"    |"99478"
1 |"John"     |"Marketing"    |"67890"
2 |"Barbara"  |"Finance"      |""
3 |"Michelle" |""             |"005634"

I know this is what joins do: every matching row is combined with every other matching row. But the result makes it look as if phone numbers 023451, 99478 and 67890 belong to both departments, while they are only related to the person John. The unnecessary repeated values will make the problem worse with a larger data set.
So, here is what I want:

id|person_name|department_name|phone_number
--+-----------+---------------+------------
1 |"John"     |"Finance"      |"023451"
1 |"John"     |"Marketing"    |"99478"
1 |"John"     |""             |"67890"
2 |"Barbara"  |"Finance"      |""
3 |"Michelle" |""             |"005634"

This is a simplified sample of my situation; I am working with a large set of tables and queries, so I need a generic solution.

5 solutions

#1


10

I like to call this problem "cross join by proxy". Since there is no information (WHERE or JOIN condition) about how the tables department and contact are supposed to match up, they are cross-joined via the proxy table person, giving you the Cartesian product of the two for each person. Very similar to this one:

  • Two SQL LEFT JOINS produce incorrect result

More explanation there.
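
To make the multiplication concrete, here is a minimal sketch that rebuilds the sample data and runs the original query. It uses Python's built-in sqlite3 module as a stand-in for the real database (an assumption; the question does not say how the query is executed), and the rows are inferred from the result shown in the question. John has 2 departments and 3 phone numbers, so he comes back 2 x 3 = 6 times.

```python
import sqlite3

# In-memory SQLite database as a stand-in; column types are simplified.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person     (id INTEGER, person_name TEXT);
CREATE TABLE department (person_id INTEGER, department_name TEXT);
CREATE TABLE contact    (person_id INTEGER, phone_number TEXT);

-- Sample rows inferred from the result table in the question.
INSERT INTO person     VALUES (1, 'John'), (2, 'Barbara'), (3, 'Michelle');
INSERT INTO department VALUES (1, 'Finance'), (1, 'Marketing'), (2, 'Finance');
INSERT INTO contact    VALUES (1, '023451'), (1, '99478'), (1, '67890'),
                              (3, '005634');
""")

# The original query: department and contact are each joined to person,
# but nothing relates them to each other.
rows = conn.execute("""
SELECT p.id, p.person_name, d.department_name, c.phone_number
FROM   person p
LEFT   JOIN department d ON d.person_id = p.id
LEFT   JOIN contact    c ON c.person_id = p.id
""").fetchall()

john_rows = [r for r in rows if r[0] == 1]
n_departments = conn.execute(
    "SELECT count(*) FROM department WHERE person_id = 1").fetchone()[0]
n_phones = conn.execute(
    "SELECT count(*) FROM contact WHERE person_id = 1").fetchone()[0]

# Rows returned for John equal departments times phone numbers.
print(len(john_rows), n_departments * n_phones)
```

Every additional one-to-many table joined this way multiplies the row count per person again, which is why the effect gets worse as more tables are added.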

Solution for your query:

SELECT p.id, p.person_name, d.department_name, c.phone_number
FROM   person p
LEFT   JOIN (
  SELECT person_id, min(department_name) AS department_name
  FROM   department
  GROUP  BY person_id
  ) d ON d.person_id = p.id
LEFT   JOIN (
  SELECT person_id, min(phone_number) AS phone_number
  FROM   contact
  GROUP  BY person_id
  ) c ON c.person_id = p.id;

You did not define which department or phone number to pick, so I arbitrarily chose the minimum (the alphabetically first value) per person. You can have it any other way ...
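
One such "other way" comes closer to the output requested in the question: number the departments and the phone numbers per person and pair them up by that number, so each value appears once. The following is only a sketch under assumptions, not part of the answer above: it runs on Python's sqlite3 module (SQLite 3.25 or later for window functions), the CTE name slots is made up for illustration, and the pairing order (alphabetical within each person) is arbitrary because the tables define no order. The pairing therefore carries no meaning; it only lines values up side by side.

```python
import sqlite3

# Same sample data as in the question, loaded into in-memory SQLite.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person     (id INTEGER, person_name TEXT);
CREATE TABLE department (person_id INTEGER, department_name TEXT);
CREATE TABLE contact    (person_id INTEGER, phone_number TEXT);
INSERT INTO person     VALUES (1, 'John'), (2, 'Barbara'), (3, 'Michelle');
INSERT INTO department VALUES (1, 'Finance'), (1, 'Marketing'), (2, 'Finance');
INSERT INTO contact    VALUES (1, '023451'), (1, '99478'), (1, '67890'),
                              (3, '005634');
""")

query = """
WITH d AS (   -- departments numbered 1..n per person (assumed order)
  SELECT person_id, department_name,
         row_number() OVER (PARTITION BY person_id
                            ORDER BY department_name) AS rn
  FROM   department
), c AS (     -- phone numbers numbered 1..n per person (assumed order)
  SELECT person_id, phone_number,
         row_number() OVER (PARTITION BY person_id
                            ORDER BY phone_number) AS rn
  FROM   contact
), slots AS ( -- hypothetical helper: every (person, position) in use
  SELECT person_id, rn FROM d
  UNION
  SELECT person_id, rn FROM c
)
SELECT p.id, p.person_name, d.department_name, c.phone_number
FROM   person p
LEFT   JOIN slots s ON s.person_id = p.id
LEFT   JOIN d ON d.person_id = s.person_id AND d.rn = s.rn
LEFT   JOIN c ON c.person_id = s.person_id AND c.rn = s.rn
ORDER  BY p.id, s.rn
"""

paired = conn.execute(query).fetchall()
for row in paired:
    print(row)
```

Missing values come back as NULL (None in Python) rather than empty strings. In PostgreSQL the slots step could likely be replaced by a FULL JOIN between the two numbered subqueries; the UNION form is used here only because older SQLite versions lack FULL JOIN.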
