It's a commonly held misconception, even among some knowledgeable SQL folks, that the SQL Server query optimizer evaluates different ways of executing a query and goes with the query plan that yields the lowest estimated cost. As is often the case with misconceptions, this one is not so much wrong as a bit oversimplified. The SQL Server query optimizer breaks queries up into subsets called "groups" and optimizes the qroups separately. It then assembles the optimized groups into a single query that, when executed, returns the rows the query writer asked for. (Which, hopefully, are also the rows that he or she actually wanted.)

In most cases the query optimizer does a very good job on its own. In some cases, examining the query plan in the Management Studio can provide insights into ways SQL Server might be provided with additional resources that might help it execute a query faster.

Today we'll take a look at some techniques that are of no practical value whatsoever. Unless, of course, you are curious about digging deeper into SQL Server to see how it works.

Some Undocumented Query Hints

In the world of Microsoft "undocumented" usually means something more like "not officially documented". In any case, there are a few query hints that let us examine more closely what the query optimizer has accomplished in preparing an execution plan.

We'll start by using the same query used to examine batch processing in the blog XXXXXXXXXXXX.

DBCCTRACEON(3604)

SELECTd.CalendarMonth

,SUM(fs.SalesAmount)

-,SUM(fs.ReturnAmount)

-,SUM(fs.SalesQuantity)

-,SUM(fs.ReturnQuantity)

FROMDimDate d

JOINFactSales fs

ONfs.DateKey =d.DateKey

GROUPBYd.CalendarMonth

OPTION(RECOMPILE,QUERYTRACEON 8605);

For the output of these traces, the magical DBCC TRACEON (3604) must be executed. Some people include QUERYTRACEON 3604 as a query hint, but since DBCC TRACEON (3604) has session-level scope, we need only run it once and it will remain set for the duration of our session.

Note that the query hint includes "RECOMPILE". If we did not do this, the trace would only generate information the first time the query was run. On subsequent executions, the query would be cached and there would be no optimizer output.

Trace 8605

The output for trace 8605 shows the input tree, which lays out the logical operations implied gy the query itself.

*** Converted Tree: ***

LogOp_Project QCOL: [d].CalendarMonth COL: Expr1002

LogOp_GbAgg OUT(QCOL: [d].CalendarMonth,COL: Expr1002 ,) BY(QCOL: [d].CalendarMonth,)

LogOp_Project

LogOp_Join

LogOp_Get TBL: DimDate(alias TBL: d) DimDate TableID=610101214 TableReferenceID=0 IsRow: COL: IsBaseRow1000

LogOp_Get TBL: FactSales(alias TBL: fs) FactSales TableID=1602104748 TableReferenceID=0 IsRow: COL: IsBaseRow1001

ScaOp_Comp x_cmpEq

ScaOp_Identifier QCOL: [fs].DateKey

ScaOp_Identifier QCOL: [d].Datekey

AncOp_PrjList

AncOp_PrjList

AncOp_PrjEl COL: Expr1002

ScaOp_AggFunc stopSum Transformed

ScaOp_Identifier QCOL: [fs].SalesAmount

AncOp_PrjList

TRACE 8606

Trace 8606 shows intermediate steps in the processing of the input tree

*** Input Tree: ***

LogOp_Project QCOL: [d].CalendarMonth COL: Expr1002

LogOp_GbAgg OUT(QCOL: [d].CalendarMonth,COL: Expr1002 ,) BY(QCOL: [d].CalendarMonth,)

LogOp_Project

LogOp_Select

LogOp_Join

LogOp_Get TBL: DimDate(alias TBL: d) DimDate TableID=610101214 TableReferenceID=0 IsRow: COL: IsBaseRow1000

LogOp_Get TBL: FactSales(alias TBL: fs) FactSales TableID=1602104748 TableReferenceID=0 IsRow: COL: IsBaseRow1001

ScaOp_Const TI(bit,ML=1) XVAR(bit,Not Owned,Value=1)

ScaOp_Comp x_cmpEq

ScaOp_Identifier QCOL: [fs].DateKey

ScaOp_Identifier QCOL: [d].Datekey

AncOp_PrjList

AncOp_PrjList

AncOp_PrjEl COL: Expr1002

ScaOp_AggFunc stopSum Transformed

ScaOp_Identifier QCOL: [fs].SalesAmount

AncOp_PrjList

*******************

*** Simplified Tree: ***

LogOp_GbAgg OUT(QCOL: [d].CalendarMonth,COL: Expr1002 ,) BY(QCOL: [d].CalendarMonth,)

LogOp_Join

LogOp_Get TBL: DimDate(alias TBL: d) DimDate TableID=610101214 TableReferenceID=0 IsRow: COL: IsBaseRow1000

LogOp_Get TBL: FactSales(alias TBL: fs) FactSales TableID=1602104748 TableReferenceID=0 IsRow: COL: IsBaseRow1001

ScaOp_Comp x_cmpEq

ScaOp_Identifier QCOL: [d].Datekey

ScaOp_Identifier QCOL: [fs].DateKey

AncOp_PrjList

AncOp_PrjEl COL: Expr1002

ScaOp_AggFunc stopAccum

ScaOp_Identifier QCOL: [fs].SalesAmount

*******************

*** Join-collapsed Tree: ***

LogOp_GbAgg OUT(QCOL: [d].CalendarMonth,COL: Expr1002 ,) BY(QCOL: [d].CalendarMonth,)

LogOp_Join

LogOp_Get TBL: DimDate(alias TBL: d) DimDate TableID=610101214 TableReferenceID=0 IsRow: COL: IsBaseRow1000

LogOp_Get TBL: FactSales(alias TBL: fs) FactSales TableID=1602104748 TableReferenceID=0 IsRow: COL: IsBaseRow1001

ScaOp_Comp x_cmpEq

ScaOp_Identifier QCOL: [d].Datekey

ScaOp_Identifier QCOL: [fs].DateKey

AncOp_PrjList

AncOp_PrjEl COL: Expr1002

ScaOp_AggFunc stopAccum

ScaOp_Identifier QCOL: [fs].SalesAmount

*******************

*** Tree Before Project Normalization ***

LogOp_GbAgg OUT(QCOL: [d].CalendarMonth,COL: Expr1002 ,) BY(QCOL: [d].CalendarMonth,)

LogOp_Join

LogOp_Get TBL: DimDate(alias TBL: d) DimDate TableID=610101214 TableReferenceID=0 IsRow: COL: IsBaseRow1000

LogOp_Get TBL: FactSales(alias TBL: fs) FactSales TableID=1602104748 TableReferenceID=0 IsRow: COL: IsBaseRow1001

ScaOp_Comp x_cmpEq

ScaOp_Identifier QCOL: [d].Datekey

ScaOp_Identifier QCOL: [fs].DateKey

AncOp_PrjList

AncOp_PrjEl COL: Expr1002

ScaOp_AggFunc stopAccum

ScaOp_Identifier QCOL: [fs].SalesAmount

*****************************************

*** Tree After Project Normalization ***

LogOp_GbAgg OUT(QCOL: [d].CalendarMonth,COL: Expr1002 ,) BY(QCOL: [d].CalendarMonth,)

LogOp_Join

LogOp_Get TBL: DimDate(alias TBL: d) DimDate TableID=610101214 TableReferenceID=0 IsRow: COL: IsBaseRow1000

LogOp_Get TBL: FactSales(alias TBL: fs) FactSales TableID=1602104748 TableReferenceID=0 IsRow: COL: IsBaseRow1001

ScaOp_Comp x_cmpEq

ScaOp_Identifier QCOL: [d].Datekey

ScaOp_Identifier QCOL: [fs].DateKey

AncOp_PrjList

AncOp_PrjEl COL: Expr1002

ScaOp_AggFunc stopAccum

ScaOp_Identifier QCOL: [fs].SalesAmount

TRACE 8607

Trace 8607 shows the resulting output tree. Note that the operations labeled "LogOp" have disappeared and "PhyOp" operations have taken their place.

*** Output Tree: ***

Exchange Start

PhyOp_ExecutionModeAdapter(BatchToRow)

PhyOp_HashGbAgg(batch) Eager Distinct

PhyOp_Concat (batch)

OUTPUT(COL: IsBaseRow1000 QCOL: [d].Datekey QCOL: [d].CalendarMonth QCOL: [fs].DateKey COL: partialagg1003 )

CHILD(COL: IsBaseRow1000 QCOL: [d].Datekey QCOL: [d].CalendarMonth QCOL: [fs].DateKey COL: partialagg1003 )

CHILD(COL: IsBaseRow1000 QCOL: [d].Datekey QCOL: [d].CalendarMonth QCOL: [fs].DateKey COL: partialagg1003 )

PhyOp_HashJoinx_jtInner (batch)(QCOL: [fs].DateKey)

= (QCOL: [d].Datekey)

PhyOp_HashGbAgg(batch) Eager Distinct

PhyOp_Range TBL: FactSales(alias TBL: fs)(1) ASC Bmk ( QCOL: [fs].SalesKey) IsRow: COL: IsBaseRow1001

AncOp_PrjList

AncOp_PrjEl COL: partialagg1003

ScaOp_AggFunc stopAccum

ScaOp_Identifier QCOL: [fs].SalesAmount

PhyOp_Range TBL: DimDate(alias TBL: d)(1) ASC Bmk ( QCOL: [d].Datekey) IsRow: COL: IsBaseRow1000

ScaOp_Comp x_cmpEq

ScaOp_Identifier QCOL: [d].Datekey

ScaOp_Identifier QCOL: [fs].DateKey

PhyOp_ExecutionModeAdapter(RowToBatch)

PhyOp_Apply lookup TBL: DimDate (0) (x_jtInner)

PhyOp_ExecutionModeAdapter(BatchToRow)

PhyOp_HashGbAgg(batch) Eager Distinct [** 4 **]

PhyOp_Range TBL: DimDate(alias TBL: d)(1) ASC Bmk ( QCOL: [d].Datekey) IsRow: COL: IsBaseRow1000

ScaOp_Comp x_cmpEq[** 11 **]

AncOp_PrjList

AncOp_PrjEl COL: Expr1002

ScaOp_AggFunc stopAccumNull

ScaOp_Identifier COL: partialagg1003

TRACE 8675

Trace 8675 doesn't provide much insight into the optimization process itself, but the report does show the estimated cost of alternative query plans.

End of simplification, time: 0 net: 0 total: 0 net: 0

end exploration, tasks: 133 no total cost time: 0.001 net: 0.001 total: 0 net: 0.001

end exploration, tasks: 134 no total cost time: 0 net: 0 total: 0 net: 0.001

end exploration, tasks: 659 no total cost time: 0.007 net: 0.007 total: 0 net: 0.008

end exploration, tasks: 662 no total cost time: 0 net: 0 total: 0 net: 0.009

end exploration, tasks: 674 no total cost time: 0 net: 0 total: 0 net: 0.009

end search(1), cost: 29.225tasks: 674 time: 0 net: 0 total: 0 net: 0.009

end exploration, tasks: 675 Cost = 29.225 time: 0 net: 0 total: 0 net: 0.009

end exploration, tasks: 676 Cost = 29.225 time: 0 net: 0 total: 0 net: 0.009

end exploration, tasks: 1353 Cost = 29.225 time: 0.002 net: 0.002 total: 0 net: 0.011

end exploration, tasks: 1356 Cost = 29.225 time: 0 net: 0 total: 0 net: 0.011

end exploration, tasks: 1422 Cost = 29.225 time: 0 net: 0 total: 0 net: 0.011

end search(1), cost: 17.1178tasks: 1422 time: 0 net: 0 total: 0 net: 0.011

End of post optimization rewrite, time: 0 net: 0 total: 0 net: 0.012

End of query plan compilation, time: 0 net: 0 total: 0 net: 0.012

Note that two different estimated costs have been highlighted in red. Not surprisingly, when we look at the actual execution plan, it it the smaller of the two that appears as the estimated subtree cost for the SELECT icon.

[Link]

Some Other Trace Flags

2372

2373

These traces provide information on the memory usage of the query tree logical optimizations:

8608

8609

Trace flags 8608 and 8609 provide further information about the query optimizer "memo". The memo is nothing more that a chalkboard on which SQL Server writes pertinent information as it progresses through the steps of query optimization.

Conclusions

SQL Server continues to provide developers with more and deeper information on the query optimizer. This allows developers to gain insight, not to mention have more fun.

Attachments

  • Original document
  • Permalink

Disclaimer

Learning Tree International Inc. published this content on 12 October 2021 and is solely responsible for the information contained therein. Distributed by Public, unedited and unaltered, on 12 October 2021 19:11:07 UTC.