I want to find the highest AutoIncremented value from a field. (its not being fetched after an insert where I can use @@SCOPE_IDENTITY etc)
Which of these two queries would run faster or gives better performance.
Id is the primary key and autoincrement field for Table1. And this is for Sql Server 2005.
SELECT MAX(Id) FROM Table1
SELECT TOP 1 Id FROM Table1 ORDER BY Id DESC
[Edit]
Yes in this case Id is the field on which I have defined the clustered index.
If the index is ID DESC then what..
And yes it would be nice to know how the performance would be affected if
1. Id is a clustered index + primary key.
2. Id is a clustered index and not primary key.
3. Id is a non clustered index ASC + primary key.
4. Id is a non clustered index ASC and not primary key.
5. Id is a non clustered index DESC + primary key.
6. Id is a non clustered index DESC and not primary key.
7. Id is just AutoIncrement
Hope its not a tall order!
-
In theory, they will use same plans and run almost same time.
In practice,
SELECT TOP 1 Id FROM Table1 ORDER BY Id DESCwill more probably use a
PRIMARY KEY INDEX.Also, this one is more extendable if you will decide to select some else column along with
id.An actual plan on
MAX()says:SELECT <- AGGREGATE <- TOP <- CLUSTERED INDEX SCAN, while plan for
TOP 1says:SELECT <- TOP <- CLUSTERED INDEX SCAN, i. e.
aggregateis omitted.Aggregate actually won't do anything here, as there is but one row.
P. S. As
@Mehrdad Afshariand@John Sansomnoted, on a non-indexed fieldMAXis slightly faster (of course not20times as optimizer says):-- 18,874,368 rows SET LANGUAGE ENGLISH SET STATISTICS TIME ON SET STATISTICS IO ON PRINT 'MAX' SELECT MAX(id) FROM master PRINT 'TOP 1' SELECT TOP 1 id FROM master ORDER BY id DESC PRINT 'MAX' SELECT MAX(id) FROM master PRINT 'TOP 1' SELECT TOP 1 id FROM master ORDER BY id DESC PRINT 'MAX' SELECT MAX(id) FROM master PRINT 'TOP 1' SELECT TOP 1 id FROM master ORDER BY id DESC PRINT 'MAX' SELECT MAX(id) FROM master PRINT 'TOP 1' SELECT TOP 1 id FROM master ORDER BY id DESC PRINT 'MAX' SELECT MAX(id) FROM master PRINT 'TOP 1' SELECT TOP 1 id FROM master ORDER BY id DESC Changed language setting to us_english. SQL Server Execution Times: CPU time = 0 ms, elapsed time = 1 ms. SQL Server Execution Times: CPU time = 0 ms, elapsed time = 1 ms. SQL Server Execution Times: CPU time = 0 ms, elapsed time = 1 ms. MAX SQL Server Execution Times: CPU time = 0 ms, elapsed time = 20 ms. (строк обработано: 1) Table 'master'. Scan count 3, logical reads 32655, physical reads 0, read-ahead reads 447, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. SQL Server Execution Times: CPU time = 5452 ms, elapsed time = 2766 ms. TOP 1 SQL Server Execution Times: CPU time = 0 ms, elapsed time = 1 ms. (строк обработано: 1) Table 'master'. Scan count 3, logical reads 32655, physical reads 0, read-ahead reads 2, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. SQL Server Execution Times: CPU time = 6813 ms, elapsed time = 3449 ms. MAX SQL Server Execution Times: CPU time = 0 ms, elapsed time = 1 ms. (строк обработано: 1) Table 'master'. Scan count 3, logical reads 32655, physical reads 0, read-ahead reads 44, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. SQL Server Execution Times: CPU time = 5359 ms, elapsed time = 2714 ms. TOP 1 SQL Server Execution Times: CPU time = 0 ms, elapsed time = 1 ms. (строк обработано: 1) Table 'master'. Scan count 3, logical reads 32655, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. SQL Server Execution Times: CPU time = 6766 ms, elapsed time = 3379 ms. MAX SQL Server Execution Times: CPU time = 0 ms, elapsed time = 1 ms. (строк обработано: 1) Table 'master'. Scan count 3, logical reads 32655, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. SQL Server Execution Times: CPU time = 5406 ms, elapsed time = 2726 ms. TOP 1 SQL Server Execution Times: CPU time = 0 ms, elapsed time = 1 ms. (строк обработано: 1) Table 'master'. Scan count 3, logical reads 32655, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. SQL Server Execution Times: CPU time = 6780 ms, elapsed time = 3415 ms. MAX SQL Server Execution Times: CPU time = 0 ms, elapsed time = 1 ms. (строк обработано: 1) Table 'master'. Scan count 3, logical reads 32655, physical reads 0, read-ahead reads 85, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. SQL Server Execution Times: CPU time = 5392 ms, elapsed time = 2709 ms. TOP 1 SQL Server Execution Times: CPU time = 0 ms, elapsed time = 1 ms. (строк обработано: 1) Table 'master'. Scan count 3, logical reads 32655, physical reads 0, read-ahead reads 10, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. SQL Server Execution Times: CPU time = 6766 ms, elapsed time = 3387 ms. MAX SQL Server Execution Times: CPU time = 0 ms, elapsed time = 1 ms. (строк обработано: 1) Table 'master'. Scan count 3, logical reads 32655, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. SQL Server Execution Times: CPU time = 5374 ms, elapsed time = 2708 ms. TOP 1 SQL Server Execution Times: CPU time = 0 ms, elapsed time = 1 ms. (строк обработано: 1) Table 'master'. Scan count 3, logical reads 32655, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. SQL Server Execution Times: CPU time = 6797 ms, elapsed time = 3494 ms.
Mehrdad Afshari : Query optimizer is deterministic. How come something will "more probably" use an index?Quassnoi : Optimizer's decisions depend on table statistics which depend on table data which are probabilistic. An optimizer MAY decide to sort the table instead of using and INDEX SCAN along with TOP. -
Just compare execution plans and you'll see (press
Ctrl+Min Management Studio when editing a query). My wild guess will be that these queries are equally performant provided there is a (clustered) index onIdcolumn.However, this as a whole is a very bad idea.
Learning : What is a very bad idea?Anton Gogolev : Using additional queries to retrieve the ID of a newly added row is a very bad idea. Quite a lot of database operations may occur between these two requests.Learning : Oh indeed. Quite true. +1Binoj Antony : @Anton please read the question carefully I have clearly stated that "(its not being fetched after an insert where I can use @@SCOPE_IDENTITY etc)" -
MAXis generally faster.Both queries will use the index on the column if exists.
If no index exists on the column, the
TOP 1query will use aTop N Sortoperator to sort the table instead of stream aggregation, which makes it slower.MAXalso provides better readability.Side Note: while
MAXwill use a stream aggregate operator in the execution plan in the indexed case, it doesn't have any specific cost as it's just processing a single row (Actual Rows = 1). You can compare queries by running them in a single batch and see the relative cost. In the indexed case, both queries will cost 50%. I tested the non-indexed case on a table with about 7000 rows and TOP will cost 65% in comparison to MAX that costs 35%.Quassnoi : TOP 1 will not sort the query. The question explicitly notes that ID is a PRIMARY KEY.Mehrdad Afshari : And I think I explicitly mentioned "if no index exists on the column". Read more carefully.Quassnoi : An index always exists on a primary key.Mehrdad Afshari : Yes it does, but remember, SO is something more or less like wikipedia. It never hurts to answer more generally. I think I first answered the question with "Both queries will use the index on the column if exists." and after that clarified the general case to avoid pitfalls.Quassnoi : Maybe it's a communication fault from my side, but as I see your answer, is says that TOP 1 will always sort the table, even if it uses the index. That's not true.Mehrdad Afshari : Yes, I didn't mean to say that. I'll fix the wording to make it more clear. -
If there is a clustered index there is virtually no difference in performance between the two queries.
This is becuase both will perform a Clustered Index Scan that will bear 100% of the query cost.
Performing the two queries on a column that does not have an index results in 3 operators being used in both execution plans.
The Top clause uses the Sort operator and the Max function uses a Stream Aggregate operator.
When there is no index, the MAX() function provides better performance.
Proof of concept can be found and full walkthrough of a test scenario can be found here
Performance Comparison Top 1 Verses MAX() Funciton
Mehrdad Afshari : he indexed case, the stream aggregate operator will have "Actual Rows = 1" which basically costs nothing. If you don't have an index, the execution plan for TOP 1 will have a "Top N" sort which will make it slower than the "Stream Aggregate" that MAX uses.John Sansom : @Mehrdad: Indeed it does, full details will be added to my blog.Binoj Antony : @John, use the permanent link to this question on your blog http://stackoverflow.com/questions/590079/John Sansom : @Binoj: Will do, thanks. -
I've just tested the two SQL statements you provided against a typical dataset:
SELECT MAX(Id) FROM Table1 SELECT TOP 1 Id FROM Table1 ORDER BY Id DESCAnd
SELECT TOP 1 Id FROM Table1 ORDER BY Id DESCis marginally faster because it has one last step in the execution plan. Here are the execution plans each query carries out:SELECT MAX(Id) FROM Table1
Clustered Index Scan >> Top >> Stream Aggregate >> Select
SELECT TOP 1 Id FROM Table1 ORDER BY Id DESC
Clustered Index Scan >> Top >> Select
-
Yes in this case Id is the field on which I have defined the clustered index. If the index is ID DESC then what.. And yes it would be nice to know how the performance would be affected if
- Id is a clustered index + primary key.
- Id is a clustered index and not primary key.
- Id is a non clustered index ASC + primary key.
- Id is a non clustered index ASC and not primary key.
- Id is a non clustered index DESC + primary key.
- Id is a non clustered index DESC and not primary key.
- Id is just AutoIncrement
For Cases 1 and 2, both will perform a clustered index scan that returns a single record. There is no IO difference between the two queries.
For Cases 3, 4, 5 and 6, both will perform an index scan that returns a single record. There is no IO difference between the two queries.
For Case 7, both will perform a table scan. There is no difference in the IO cost.
Summary: Case 1-6 are made of win! If you're in Case 7, then you've already lost from an IO standpoint.
You can measure IO by using SQL's Query analyzer. Run this before your query.
SET STATISTICS IO ON -
Nobody mentioned IDENT_CURRENT('Table1') - blows them all away - of course it only works on identity columns, but that was the question...
Matthew PK : Voted up, but not *always* applicable because `IDENT_CURRENT` isn't necessarily in scope. To be fastest and safest the transaction should be in a stored procedure and use `SCOPE_IDENTITY()`
0 comments:
Post a Comment