实际项目中有如下SQL, 发现效率很低,用时超过1分钟
select TaskID, StartDate = min(UpdateTime), EndDate = max(UpdateTime)
from BFDB_Code.dbo.packcodeinfo p
inner join Task t on p.TaskID = t.ID
where datediff(day, t.EndDate, getdate()) < @day
and t.Status in(4,5)
group by TaskID
通过查询计划可知上面语句进行了全表扫描,所以效率很低
单个查询时并不慢,因为在TaskID上已经建立索引
select TaskID, StartDate = min(UpdateTime), EndDate = max(UpdateTime)
from BFDB_Code.dbo.packcodeinfo p
where p.TaskID in (2488,2499)
group by TaskID
但如果是这样查询,仍然会全表扫描
select TaskID, StartDate = min(UpdateTime), EndDate = max(UpdateTime)
from BFDB_Code.dbo.packcodeinfo p
where p.TaskID in (
SELECT id FROM task t where t.Status in(2,3)
)
group by TaskID
优化方案
去除inner join,写一个函数返回类似(2488,2499)值,然后动态构造SQL语句执行。
函数定义get_begin_task
create function [dbo].[get_begin_task]()
returns varchar(1000) as
begin DECLARE @csv VARCHAR(1000) SELECT @csv = COALESCE(@csv + ',', '') + Convert(varchar(10), ID)
FROM task t
where t.Status in(2,3)
and createdate >= '2016-05-01'; Return @csv end;
修改存储过程如:
declare @sql varchar(8000);
set @sql= '
select TaskID, StartDate = min(UpdateTime), EndDate = max(UpdateTime)
from BFDB_Code.dbo.packcodeinfo p
where p.TaskID in (' + dbo.get_begin_task() +')
group by TaskID
' exec(@sql);
小插曲:
因为用到了跨数据库查询,动态执行SQL语句需要执行 EXEC sp_addlinkedserver ‘BFDB_Code’命令。