I have two tables: order_details, which has 100,000 rows, and outbound, which has 10,000 rows.

I need to join them on a column called order_number, which is a VARCHAR(50) on both. order_number is not unique in the outbound table.

CREATE TABLE `outbound` (
    `outbound_id` int(12) NOT NULL,
    `order_number` varchar(50) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE `order_details` (
    `order_details_id` int(12) NOT NULL,
    `order_number` varchar(50) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

This is my initial query, and it takes well over 60 seconds to run:

SELECT o.order_number
FROM outbound o
INNER JOIN order_details od
    ON o.order_number = od.order_number

This query gets the same results and takes less than a second to run:

SELECT o.order_number
FROM outbound o
INNER JOIN
(
    SELECT order_number
    FROM order_details
) od
ON (o.order_number = od.order_number)

This is surprising to me because usually sub-queries are significantly slower.

Running EXPLAIN (which I'm still learning to read) shows that the sub-query version uses a <derived2> table, that it uses an index, and that the index is auto_key0. I don't know how to interpret this well enough to understand why it makes such a significant difference.

I am running these queries from the command line.

I am running MySQL Ver 14.14 Distrib 5.6.35, for Linux (x86_64) CentOS.

In summary:

Why is this simple join query significantly quicker with a sub-query?

1 Solution

#1


My knowledge of MySQL is very limited, but these are my thoughts:

Your tables don't have indexes on order_number. Without one, the join has to read the entire second table for each row of the first table in order to compare the values.

The subquery version reads the second table once, stores the result as a derived table, and creates an index on it (the auto_key0 that your EXPLAIN shows). After that it doesn't need to read the entire second table for each row of the first table. It only has to check the index, which is much faster.
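
To make that idea concrete, here is a rough manual analogue of what I think is happening. It is only a sketch: the temporary table name od_tmp is made up, and I am not claiming this is exactly what MySQL does internally, only that it follows the same "copy once, index, then look up" pattern.

```sql
-- Hypothetical temporary table standing in for the derived table.
-- The KEY on order_number plays the role of auto_key0.
CREATE TEMPORARY TABLE od_tmp (
    order_number varchar(50) NOT NULL,
    KEY (order_number)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- Read order_details once and copy the join column.
INSERT INTO od_tmp (order_number)
SELECT order_number
FROM order_details;

-- Each outbound row can now be matched through the index
-- instead of a full scan of order_details.
SELECT o.order_number
FROM outbound o
INNER JOIN od_tmp od
    ON o.order_number = od.order_number;

DROP TEMPORARY TABLE od_tmp;
```

If this runs about as fast as your sub-query version, that would support the explanation above.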

To verify whether I'm right, try creating indexes on the order_number column in your two tables (CREATE INDEX ... ) and run these two queries again. Your first query should then take less than a second instead of over a minute.
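
A minimal sketch of that check, assuming the table definitions from the question. The index names are made up, so pick whatever fits your naming scheme.

```sql
-- Index names (idx_...) are hypothetical.
CREATE INDEX idx_outbound_order_number
    ON outbound (order_number);
CREATE INDEX idx_order_details_order_number
    ON order_details (order_number);

-- Re-check the plan for the original join; the key column
-- should now show one of the new indexes instead of NULL.
EXPLAIN
SELECT o.order_number
FROM outbound o
INNER JOIN order_details od
    ON o.order_number = od.order_number;
```

Strictly speaking, an index on just one side of the join may already be enough, since the optimizer only needs an index on the table it looks up for each row. Indexing both lets it choose the join order.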
