It's a commonly held misconception, even among some knowledgeable SQL folks, that the SQL Server query optimizer evaluates different ways of executing a query and goes with the query plan that yields the lowest estimated cost. As is often the case with misconceptions, this one is not so much wrong as a bit oversimplified. The SQL Server query optimizer breaks queries up into subsets called "groups" and optimizes the qroups separately. It then assembles the optimized groups into a single query that, when executed, returns the rows the query writer asked for. (Which, hopefully, are also the rows that he or she actually wanted.)
In most cases the query optimizer does a very good job on its own. In some cases, examining the query plan in the Management Studio can provide insights into ways SQL Server might be provided with additional resources that might help it execute a query faster.
Today we'll take a look at some techniques that are of no practical value whatsoever. Unless, of course, you are curious about digging deeper into SQL Server to see how it works.
Some Undocumented Query HintsIn the world of Microsoft "undocumented" usually means something more like "not officially documented". In any case, there are a few query hints that let us examine more closely what the query optimizer has accomplished in preparing an execution plan.
We'll start by using the same query used to examine batch processing in the blog XXXXXXXXXXXX.
DBCCTRACEON(3604)
SELECTd.CalendarMonth
,SUM(fs.SalesAmount)
-,SUM(fs.ReturnAmount)
-,SUM(fs.SalesQuantity)
-,SUM(fs.ReturnQuantity)
FROMDimDate d
JOINFactSales fs
ONfs.DateKey =d.DateKey
GROUPBYd.CalendarMonth
OPTION(RECOMPILE,QUERYTRACEON 8605);
For the output of these traces, the magical DBCC TRACEON (3604) must be executed. Some people include QUERYTRACEON 3604 as a query hint, but since DBCC TRACEON (3604) has session-level scope, we need only run it once and it will remain set for the duration of our session.
Note that the query hint includes "RECOMPILE". If we did not do this, the trace would only generate information the first time the query was run. On subsequent executions, the query would be cached and there would be no optimizer output.
Trace 8605The output for trace 8605 shows the input tree, which lays out the logical operations implied gy the query itself.
*** Converted Tree: ***
LogOp_Project QCOL: [d].CalendarMonth COL: Expr1002
LogOp_GbAgg OUT(QCOL: [d].CalendarMonth,COL: Expr1002 ,) BY(QCOL: [d].CalendarMonth,)
LogOp_Project
LogOp_Join
LogOp_Get TBL: DimDate(alias TBL: d) DimDate TableID=610101214 TableReferenceID=0 IsRow: COL: IsBaseRow1000
LogOp_Get TBL: FactSales(alias TBL: fs) FactSales TableID=1602104748 TableReferenceID=0 IsRow: COL: IsBaseRow1001
ScaOp_Comp x_cmpEq
ScaOp_Identifier QCOL: [fs].DateKey
ScaOp_Identifier QCOL: [d].Datekey
AncOp_PrjList
AncOp_PrjList
AncOp_PrjEl COL: Expr1002
ScaOp_AggFunc stopSum Transformed
ScaOp_Identifier QCOL: [fs].SalesAmount
AncOp_PrjList
TRACE 8606Trace 8606 shows intermediate steps in the processing of the input tree
*** Input Tree: ***
LogOp_Project QCOL: [d].CalendarMonth COL: Expr1002
LogOp_GbAgg OUT(QCOL: [d].CalendarMonth,COL: Expr1002 ,) BY(QCOL: [d].CalendarMonth,)
LogOp_Project
LogOp_Select
LogOp_Join
LogOp_Get TBL: DimDate(alias TBL: d) DimDate TableID=610101214 TableReferenceID=0 IsRow: COL: IsBaseRow1000
LogOp_Get TBL: FactSales(alias TBL: fs) FactSales TableID=1602104748 TableReferenceID=0 IsRow: COL: IsBaseRow1001
ScaOp_Const TI(bit,ML=1) XVAR(bit,Not Owned,Value=1)
ScaOp_Comp x_cmpEq
ScaOp_Identifier QCOL: [fs].DateKey
ScaOp_Identifier QCOL: [d].Datekey
AncOp_PrjList
AncOp_PrjList
AncOp_PrjEl COL: Expr1002
ScaOp_AggFunc stopSum Transformed
ScaOp_Identifier QCOL: [fs].SalesAmount
AncOp_PrjList
*******************
*** Simplified Tree: ***
LogOp_GbAgg OUT(QCOL: [d].CalendarMonth,COL: Expr1002 ,) BY(QCOL: [d].CalendarMonth,)
LogOp_Join
LogOp_Get TBL: DimDate(alias TBL: d) DimDate TableID=610101214 TableReferenceID=0 IsRow: COL: IsBaseRow1000
LogOp_Get TBL: FactSales(alias TBL: fs) FactSales TableID=1602104748 TableReferenceID=0 IsRow: COL: IsBaseRow1001
ScaOp_Comp x_cmpEq
ScaOp_Identifier QCOL: [d].Datekey
ScaOp_Identifier QCOL: [fs].DateKey
AncOp_PrjList
AncOp_PrjEl COL: Expr1002
ScaOp_AggFunc stopAccum
ScaOp_Identifier QCOL: [fs].SalesAmount
*******************
*** Join-collapsed Tree: ***
LogOp_GbAgg OUT(QCOL: [d].CalendarMonth,COL: Expr1002 ,) BY(QCOL: [d].CalendarMonth,)
LogOp_Join
LogOp_Get TBL: DimDate(alias TBL: d) DimDate TableID=610101214 TableReferenceID=0 IsRow: COL: IsBaseRow1000
LogOp_Get TBL: FactSales(alias TBL: fs) FactSales TableID=1602104748 TableReferenceID=0 IsRow: COL: IsBaseRow1001
ScaOp_Comp x_cmpEq
ScaOp_Identifier QCOL: [d].Datekey
ScaOp_Identifier QCOL: [fs].DateKey
AncOp_PrjList
AncOp_PrjEl COL: Expr1002
ScaOp_AggFunc stopAccum
ScaOp_Identifier QCOL: [fs].SalesAmount
*******************
*** Tree Before Project Normalization ***
LogOp_GbAgg OUT(QCOL: [d].CalendarMonth,COL: Expr1002 ,) BY(QCOL: [d].CalendarMonth,)
LogOp_Join
LogOp_Get TBL: DimDate(alias TBL: d) DimDate TableID=610101214 TableReferenceID=0 IsRow: COL: IsBaseRow1000
LogOp_Get TBL: FactSales(alias TBL: fs) FactSales TableID=1602104748 TableReferenceID=0 IsRow: COL: IsBaseRow1001
ScaOp_Comp x_cmpEq
ScaOp_Identifier QCOL: [d].Datekey
ScaOp_Identifier QCOL: [fs].DateKey
AncOp_PrjList
AncOp_PrjEl COL: Expr1002
ScaOp_AggFunc stopAccum
ScaOp_Identifier QCOL: [fs].SalesAmount
*****************************************
*** Tree After Project Normalization ***
LogOp_GbAgg OUT(QCOL: [d].CalendarMonth,COL: Expr1002 ,) BY(QCOL: [d].CalendarMonth,)
LogOp_Join
LogOp_Get TBL: DimDate(alias TBL: d) DimDate TableID=610101214 TableReferenceID=0 IsRow: COL: IsBaseRow1000
LogOp_Get TBL: FactSales(alias TBL: fs) FactSales TableID=1602104748 TableReferenceID=0 IsRow: COL: IsBaseRow1001
ScaOp_Comp x_cmpEq
ScaOp_Identifier QCOL: [d].Datekey
ScaOp_Identifier QCOL: [fs].DateKey
AncOp_PrjList
AncOp_PrjEl COL: Expr1002
ScaOp_AggFunc stopAccum
ScaOp_Identifier QCOL: [fs].SalesAmount
TRACE 8607Trace 8607 shows the resulting output tree. Note that the operations labeled "LogOp" have disappeared and "PhyOp" operations have taken their place.
*** Output Tree: ***
Exchange Start
PhyOp_ExecutionModeAdapter(BatchToRow)
PhyOp_HashGbAgg(batch) Eager Distinct
PhyOp_Concat (batch)
OUTPUT(COL: IsBaseRow1000 QCOL: [d].Datekey QCOL: [d].CalendarMonth QCOL: [fs].DateKey COL: partialagg1003 )
CHILD(COL: IsBaseRow1000 QCOL: [d].Datekey QCOL: [d].CalendarMonth QCOL: [fs].DateKey COL: partialagg1003 )
CHILD(COL: IsBaseRow1000 QCOL: [d].Datekey QCOL: [d].CalendarMonth QCOL: [fs].DateKey COL: partialagg1003 )
PhyOp_HashJoinx_jtInner (batch)(QCOL: [fs].DateKey)
= (QCOL: [d].Datekey)
PhyOp_HashGbAgg(batch) Eager Distinct
PhyOp_Range TBL: FactSales(alias TBL: fs)(1) ASC Bmk ( QCOL: [fs].SalesKey) IsRow: COL: IsBaseRow1001
AncOp_PrjList
AncOp_PrjEl COL: partialagg1003
ScaOp_AggFunc stopAccum
ScaOp_Identifier QCOL: [fs].SalesAmount
PhyOp_Range TBL: DimDate(alias TBL: d)(1) ASC Bmk ( QCOL: [d].Datekey) IsRow: COL: IsBaseRow1000
ScaOp_Comp x_cmpEq
ScaOp_Identifier QCOL: [d].Datekey
ScaOp_Identifier QCOL: [fs].DateKey
PhyOp_ExecutionModeAdapter(RowToBatch)
PhyOp_Apply lookup TBL: DimDate (0) (x_jtInner)
PhyOp_ExecutionModeAdapter(BatchToRow)
PhyOp_HashGbAgg(batch) Eager Distinct [** 4 **]
PhyOp_Range TBL: DimDate(alias TBL: d)(1) ASC Bmk ( QCOL: [d].Datekey) IsRow: COL: IsBaseRow1000
ScaOp_Comp x_cmpEq[** 11 **]
AncOp_PrjList
AncOp_PrjEl COL: Expr1002
ScaOp_AggFunc stopAccumNull
ScaOp_Identifier COL: partialagg1003
TRACE 8675Trace 8675 doesn't provide much insight into the optimization process itself, but the report does show the estimated cost of alternative query plans.
End of simplification, time: 0 net: 0 total: 0 net: 0
end exploration, tasks: 133 no total cost time: 0.001 net: 0.001 total: 0 net: 0.001
end exploration, tasks: 134 no total cost time: 0 net: 0 total: 0 net: 0.001
end exploration, tasks: 659 no total cost time: 0.007 net: 0.007 total: 0 net: 0.008
end exploration, tasks: 662 no total cost time: 0 net: 0 total: 0 net: 0.009
end exploration, tasks: 674 no total cost time: 0 net: 0 total: 0 net: 0.009
end search(1), cost: 29.225tasks: 674 time: 0 net: 0 total: 0 net: 0.009
end exploration, tasks: 675 Cost = 29.225 time: 0 net: 0 total: 0 net: 0.009
end exploration, tasks: 676 Cost = 29.225 time: 0 net: 0 total: 0 net: 0.009
end exploration, tasks: 1353 Cost = 29.225 time: 0.002 net: 0.002 total: 0 net: 0.011
end exploration, tasks: 1356 Cost = 29.225 time: 0 net: 0 total: 0 net: 0.011
end exploration, tasks: 1422 Cost = 29.225 time: 0 net: 0 total: 0 net: 0.011
end search(1), cost: 17.1178tasks: 1422 time: 0 net: 0 total: 0 net: 0.011
End of post optimization rewrite, time: 0 net: 0 total: 0 net: 0.012
End of query plan compilation, time: 0 net: 0 total: 0 net: 0.012
Note that two different estimated costs have been highlighted in red. Not surprisingly, when we look at the actual execution plan, it it the smaller of the two that appears as the estimated subtree cost for the SELECT icon.
[Link]
Some Other Trace Flags2372
2373
These traces provide information on the memory usage of the query tree logical optimizations:
8608
8609
Trace flags 8608 and 8609 provide further information about the query optimizer "memo". The memo is nothing more that a chalkboard on which SQL Server writes pertinent information as it progresses through the steps of query optimization.
ConclusionsSQL Server continues to provide developers with more and deeper information on the query optimizer. This allows developers to gain insight, not to mention have more fun.
Attachments
- Original document
- Permalink
Disclaimer
Learning Tree International Inc. published this content on 12 October 2021 and is solely responsible for the information contained therein. Distributed by Public, unedited and unaltered, on 12 October 2021 19:11:07 UTC.