Operator of the Day: Segment

I previously blogged about the sequence project operator, which I helped architect/develop while working on the SQL Server QP (here).  In that same post, I promised to tell you about the Segment operator in more detail as well.

Segment is largely a “helper” operator in SQL Server.  It does very little and there isn’t a direct way to create one.  They are often used to help facilitate more complex operations.

Every operator in SQL Server’s QP takes rows of input and produces rows of output.  Some have multiple children (joins), while others are just unary.  Segment is a unary operator that takes rows of input and determines if some set of columns have changed from the previous row.  Usually, the input on those columns are ordered or grouped to make this meaningful.  An extra column is created, like in a compute scalar operation, which records if the value has changed.  That computed value is then used to drive other physical operators.

You can see what columns are considered in each segment by looking at showplan_xml (from the ranking query in my previous post):

                <RelOp NodeId=”1″ PhysicalOp=”Segment” LogicalOp=”Segment” EstimateRows=”1000″ EstimateIO=”0″ EstimateCPU=”2e-005″ AvgRowSize=”23″ EstimatedTotalSubtreeCost=”0.0328114″ Parallel=”0″ EstimateRebinds=”0″ EstimateRewinds=”0″>
                  <OutputList>
                    <ColumnReference Database=”[t1]” Schema=”[dbo]” Table=”[z]” Column=”col1″ />
                    <ColumnReference Database=”[t1]” Schema=”[dbo]” Table=”[z]” Column=”col2″ />
                    <ColumnReference Column=”Segment1005″ />
                    <ColumnReference Column=”Segment1006″ />
                  </OutputList>
                  <Segment>
                    <GroupBy>
                      <ColumnReference Database=”[t1]” Schema=”[dbo]” Table=”[z]” Column=”col2″ />
                    </GroupBy>
                    <SegmentColumn>
                      <ColumnReference Column=”Segment1006″ />
                    </SegmentColumn>

The grouping columns are listed, and you can see these “Segment100x” columns that get fabricated along the way.  These are not surfaced outside the query, so you can’t see the values of this directly.  However, you can see if the grouping columns have changed.

So, in this plan we have two segment columns on different grouping column sets, both of these passed into the sequence project (you can only actually see that they stop being propogated in the showplan_xml, but the QP is usually smart enough to not pass things along unnecessarily, so that usually means that the buck stops where it stops for a reason :).

Remember, all of the operators can change from release to release.  I don’t believe any of this changed substantally from 2005 to 2008 based on what I can see in the plans.

That concludes our daily episode of the operator of the day.  You guys have a good weekend.  I am enjoying warmth and sunshine in Austin in February, which I can tell you is a substantial change from my life while in Seattle.

Other articles

New blog location

Here is the new blog  – please update those readers. http://blogs.msdn.com/conor_cunningham_msft/default.aspx I’m getting settled into working for Microsoft again – it’s basically like drinking from

Explore

The Trouble with Triggers

(Apologies to Star Trek). I received a question about trigger performance, especially the do’s and dont’s about how they are used. Let’s first start by

Explore

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.