Wednesday, March 21, 2012

Performance Problems On Recursive Table

I have a table that has a parent-child relation to itself. (see SQL at end)
Basically, records of type 0 are related to type 1, and type 1 to type 2.
This table has about 2 million records right now. When I do this:
INNER JOIN
ExpSrvLog I ON E.ExpSrvID = I.ExpSrvID AND I.ItemType = 2
LEFT OUTER JOIN
ExpSrvLog B ON I.LogID = B.ParentID AND B.ItemType = 1
It seems to consume an inordinate amount of time. Any way to optimize this
behavior? Thanks.
CREATE TABLE [ExpSrvLog] (
[LogID] [bigint] IDENTITY (1, 1) NOT NULL ,
[ExpSrvID] [int] NOT NULL ,
[CalendarDate] [smalldatetime] NOT NULL ,
[ItemType] [tinyint] NOT NULL ,
[VendorInvoiceID] [int] NULL ,
[ClientInvoiceID] [int] NULL ,
[PaymentID] [int] NULL ,
[BillingAmount] [money] NULL ,
[ActualQty] [decimal](8, 2) NULL ,
[ActualRate] [money] NULL ,
[ParentID] [bigint] NULL ,
CONSTRAINT [PK_ExpSrvLog] PRIMARY KEY CLUSTERED
(
[LogID] DESC
) WITH FILLFACTOR = 90 ON [PRIMARY] ,
CONSTRAINT [FK_ExpSrvLog_ClientInvoice] FOREIGN KEY
(
[ClientInvoiceID]
) REFERENCES [ClientInvoice] (
[InvoiceID]
),
CONSTRAINT [FK_ExpSrvLog_ExpSrvLog] FOREIGN KEY
(
[ParentID]
) REFERENCES [ExpSrvLog] (
[LogID]
),
CONSTRAINT [FK_ExpSrvLog_VendorInvoice] FOREIGN KEY
(
[VendorInvoiceID]
) REFERENCES [VendorInvoice] (
[InvoiceID]
)
) ON [PRIMARY]
ENDApart from the primary key, are there any other indexes on that table?
Foreign key columns should at least be indexed.
ML|||"ML" wrote:
> Apart from the primary key, are there any other indexes on that table?
Yes. One over ExpSrvID, CalendarDate and Itemtype. One over ParentID. (The
VendorInvoiceID and CLientInvoiceID fields are also indexes as they are FKs.
)
I've also tried an index over ParentID and LogID, to no help.
- alphadog|||Paul, While recursive joins are elegant, they are notoriously slow on
recursive tables with > 1 million rows. You might want try to use mutiples
queries and(or) tables to accomplish your task instead of the recursive join
s.
"Paul Tiseo" wrote:

> "ML" wrote:
> Yes. One over ExpSrvID, CalendarDate and Itemtype. One over ParentID. (The
> VendorInvoiceID and CLientInvoiceID fields are also indexes as they are FK
s.)
> I've also tried an index over ParentID and LogID, to no help.
> - alphadog|||Really? Damn. It works so well that way. Oh well, thanks for the info, Frank
.
So, what are all my options? Physically splitting the table is something I
have planned, but at this point in developement is not an option. I guess
it'll have to be temp tables.
"frank chang" wrote:
> Paul, While recursive joins are elegant, they are notoriously slow on
> recursive tables with > 1 million rows. You might want try to use mutiples
> queries and(or) tables to accomplish your task instead of the recursive jo
ins.
> "Paul Tiseo" wrote:
>|||Maybe this example can help you find a way to navigate the
ancestor/descendant axes in your hierarchy:
http://milambda.blogspot.com/2005/0...or-monkeys.html
ML|||>> I have a table that has a parent-child relation to itself. (see SQL at end) Basical
ly, records [sic] of type 0 are related to type 1, and type 1 to type 2. This table h
as about 2 million records [sic] right now. When I do this: <<
No relational key, IDENTITY, too many NULL-able columns, money and
bigint proprierary datatypes and you don't know that rows and records
are not the same. .And there are not specs.
behavior? <<
Change the DDL. Does this hierarchy only go down three levels? Can
level 0 have more than one level 1 subordinate? Can a level 1 have
more than one level 2 subordinate? If I assume not, then:
CREATE TABLE ExpsrvLog
(expsrv_id INTEGER NOT NULL,
hierarchy_level INTEGER DEFAULT 0 NOT NULL
CHECK (hierarchy_level IN (0, 1, 2)),
payment_date DATETIME NOT NULL,
vendor_invoice_nbr INTEGER NOT NULL,
REFERENCES VendorInvoices (invoice_nbr)
ON UPDATE CASCADE,
client_invoice_nbr INTEGER NOT NULL
REFERENCES ClientInvoices (invoice_nbr)
ON UPDATE CASCADE,
payment_nbr INTEGER NOT NULL,
billing_amount DECIMAL(8,2) NOT NULL,
actual_qty DECIMAL(8,2) NOT NULL,
actual_rate DECIMAL(8,2) NULL,
PRIMARY KEY (expsrv_id, hierarchy_level));
If the asumption was wrong, we can move onto the nested sets model.
You might want to get a copy of TREES & HIERARCHIES IN SQL along with a
basic data modeling book.|||"--CELKO--" wrote:
> No relational key,
Of what type? I have a primary key (see first post), an alternate key (not
in the original DDL, but basically CREATE UNIQUE INDEX AK_ExpSrvLog ON
dbo.ExpSrvLog([ExpSrvID], [CalendarDate] DESC , [ItemType]) ) and some
primary-foreign key relationships. What exactly is missing? Do you mean a
natural key? If so, it's the AK that I did not include.

> IDENTITY
Why? How does IDENTITY affect my performance question?

> too many NULL-able columns,
The Real World intrudes into the World of Relational Model Puritans
sometimes. <shrug> Again, how do the NULLable columns affect my performance
problem?

> money and bigint proprierary datatypes
Not a problem for this project.

> and you don't know that rows and records are not the same.
Do you know what they say about "ASS-U-ME"? I use common idioms. Forgive me
for not abiding by your stricter one.

> And there are not specs.
I'm sorry, but I don't understand this comment. What specs do you need?

> Change the DDL. Does this hierarchy only go down three levels? Can
> level 0 have more than one level 1 subordinate? Can a level 1 have
> more than one level 2 subordinate? If I assume not, then:
Yes (three and only three levels), yes and yes.

> CREATE TABLE ExpsrvLog
> (expsrv_id INTEGER NOT NULL,
> hierarchy_level INTEGER DEFAULT 0 NOT NULL
> CHECK (hierarchy_level IN (0, 1, 2)),
> payment_date DATETIME NOT NULL,
> vendor_invoice_nbr INTEGER NOT NULL,
> REFERENCES VendorInvoices (invoice_nbr)
> ON UPDATE CASCADE,
> client_invoice_nbr INTEGER NOT NULL
> REFERENCES ClientInvoices (invoice_nbr)
> ON UPDATE CASCADE,
> payment_nbr INTEGER NOT NULL,
> billing_amount DECIMAL(8,2) NOT NULL,
> actual_qty DECIMAL(8,2) NOT NULL,
> actual_rate DECIMAL(8,2) NULL,
> PRIMARY KEY (expsrv_id, hierarchy_level));
> If the asumption was wrong, we can move onto the nested sets model.
So, you changed the structure of the table in a way that doesn't address my
original problem (in fact you seem to have removed the recursive
relationship) but makes it more purist-friendly in terms of datatypes and
nullability. Your "SQL standardization" and "Relational Model" efforts are
appreciated (although your zeal blinded you to my actual question) but it's
not a solution to my immediate problem.
I don't need to know *how* to model or query the hierarchy. I need to know
*why* it isn't performing properly when I use the self-join.
Thanks.sql

No comments:

Post a Comment