Friday, February 11, 2011

SQL Server 2008: Why table scaning when another logical condition is satisfied first?

Consider following piece of code:

declare @var bit = 0

select * from tableA as A
where
1=
(case when @var = 0 then 1
      when exists(select null from tableB as B where A.id=B.id) 
      then 1
      else 0
end)

Since variable @var is set to 0, then the result of evaluating searched case operator is 1. In the documentation of case it is written that it is evaluated until first WHEN is TRUE. But when I look at execution plan, I see that tableB is scanned as well.

Does anybody know why this happens? Probably there are ways how one can avoid second table scan when another logical condition is evaluated to TRUE?

  • Because the plan that is compiled and cached needs to work for all possible values of @var

    You would need to use something like

    if (@var = 0)
    select * from tableA 
    else
    select * from tableA as A
    where exists(select * from tableB as B where A.id=B.id) 
    

    Even OPTION RECOMPILE doesn't look like it would help actually. It still doesn't give you the plan you would have got with a literal 0=0

    declare @var bit = 0
    
    select * from 
    master.dbo.spt_values  as A
    where
    1=
    (case when 0 = @var then 1
          when exists(select null from master.dbo.spt_values as B where A.number=B.number) 
          then 1
          else 0
    end)
    option(recompile)
    

    Plan

    select * from 
    master.dbo.spt_values  as A
    where
    1=
    (case when 0 = 0 then 1
          when exists(select null from master.dbo.spt_values as B where A.number=B.number) 
          then 1
          else 0
    end)
    

    Plan

    RE: Question in comments. Try the following with the "Include Actual Execution Plan" option enabled.

    declare @var bit = datepart(second,GETDATE())%2
    
    print @var
    
    if (@var = 0)
    select * from 
    master.dbo.spt_values  --8BA71BA5-3025-4967-A0C8-38B9FBEF8BAD
    else
    select * from 
    master.dbo.spt_values  as A --8BA71BA5-3025-4967-A0C8-38B9FBEF8BAD
    where exists(select null from master.dbo.spt_values as B where A.number=B.number) 
    

    Then try

    SELECT usecounts, cacheobjtype, objtype, text, query_plan
    FROM sys.dm_exec_cached_plans 
    CROSS APPLY sys.dm_exec_sql_text(plan_handle) 
    CROSS APPLY sys.dm_exec_query_plan(plan_handle) 
    where text like '%8BA71BA5-3025-4967-A0C8-38B9FBEF8BAD%'
    

    The Compiled Plan will look like

    Plan

    The Actual Execution Plan will show only one path was actually executed though.

    Tim : In the case when @var=0 it still be scanned?
    Tim : right I have already used such approach, but wanted to avoid code duplication.
    KM : @Tim, in sql go with fast index using queries and forget about code duplication, the only thing that matters is speed. making it "pretty" often kills your execution plan and performance goes down the drain.
    Tim : @KM, completely agree
    Tim : @Martin, nice investigations :)
    Tim : @Martin, The same situation with WHERE 0=0 OR exists() where short-circuit is really applied..
    Tim : Hm, strange, but if one uses approach with IF ELSE, then scan of should appear as well in order to guarantee when @var<>0 that exists() is considered... Or in fact plan is recompiled?
    Martin Smith : @Tim - See edit.
    Tim : @Martin, super, that explains the situation, thanks for the answer!
  • if tableB has few rows, a table scan is the fastest way to go.

    best source for dynamic search conditions:

    Dynamic Search Conditions in T-SQL by Erland Sommarskog

    there are a lot of subtle implications on how you do this as to if an index can be used or not. If you are on the proper release of SQL Server 2008 you can just add OPTION (RECOMPILE) to the query and the local variable's value at run time is used for the optimizations.

    Consider this, OPTION (RECOMPILE) will take this code (where no index can be used with this mess of ORs):

    WHERE
        (@search1 IS NULL or Column1=@Search1)
        AND (@search2 IS NULL or Column2=@Search2)
        AND (@search3 IS NULL or Column3=@Search3)
    

    and optimize it at run time to be (provided that only @Search2 was passed in with a value):

    WHERE
        Column2=@Search2
    

    and an index can be used (if you have one defined on Column2)

    Tim : You are right, but unfortunately a table scan is not an issue here..
    Tim : +1 for link about dynamic search conditions.
    From KM

0 comments:

Post a Comment