Monday, March 26, 2012

Performance question.

Hello everybody,
I have a table with around 30 million rows.
The table structure is as follows:
CREATE TABLE TestTable
(Id INT, -- primary key
EmpId INT, -- has a nonclustered index
DeptName VARCHAR(50),
Hours NUMERIC(5,2),
Tdate DATETIME,
ProjectNumber SMALLINT,
-- ... and a few more columns
)
I have the following query, which takes around 1 minute 10 seconds to run:
SELECT
DeptName,
EmpId,
SUM(CASE WHEN ProjectNumber = 11 THEN Hours ELSE 0
END) AS FinHours,
SUM(CASE WHEN ProjectId = 12 THEN Hours ELSE 0
END) AS HrHours,
SUM(CASE WHEN ProjectId = 13 THEN Hours ELSE 0
END) AS TaxHours
FROM TestTable WHERE Tdate BETWEEN @Date1 AND @Date2
GROUP BY
DeptName,
EmpId
I do not have an index on the ProjectNumber column because it will have
only 200 distinct values.
If I create an index on the GROUP BY columns, would it improve performance?
Please let me know how I can improve performance.
Thanks.|||I don't know about the rest of your queries or your usage patterns, but the
most obvious choice in this case is to make the PK nonclustered and create a
clustered index on the Tdate column to support your WHERE clause.
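As a rough sketch of that suggestion, assuming the existing primary key is clustered and its constraint is named PK_TestTable (a hypothetical name; the original post does not give one), the change might look like this. The index name CIX_TestTable_Tdate is also made up for illustration.

```sql
-- Assumption: PK_TestTable is a hypothetical constraint name; look up the
-- real one (for example with sp_helpconstraint 'TestTable') first.
-- Assumption: Id is declared NOT NULL, as a primary key requires.
ALTER TABLE TestTable DROP CONSTRAINT PK_TestTable;

-- Cluster the table on Tdate so the rows for a date range sit together.
CREATE CLUSTERED INDEX CIX_TestTable_Tdate ON TestTable (Tdate);

-- Re-create the primary key as a nonclustered index.
ALTER TABLE TestTable
    ADD CONSTRAINT PK_TestTable PRIMARY KEY NONCLUSTERED (Id);
```

On 30 million rows each of these steps rewrites the table or its indexes, so it is worth trying on a test copy first.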
Adam Machanic
Pro SQL Server 2005, available now
http://www.apress.com/book/bookDisplay.html?bID=457
--
|||Also, you may try an index on all the columns involved in the query:
Tdate first if the interval is narrow; DeptName, EmpId first if the
interval is wide.|||Your query indicates that you are using the following columns: DeptName,
EmpId, Hours, ProjectNumber, ProjectId, Tdate. This means that SQL Server will have
to look at all of the rows being returned regardless of whether an index
exists on your grouped columns. This will be the case unless you were to
create a covering index for all of the columns being returned. In your case,
that's a lot of columns so I don't recommend it.
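For reference, a covering index for this query might look like one of the sketches below. The index names are hypothetical, and the INCLUDE clause assumes SQL Server 2005 or later (on SQL Server 2000 every column would have to be a key column). If ProjectId is a real column rather than a typo for ProjectNumber, it would need to be added as well.

```sql
-- Assumption: SQL Server 2005 or later; index names are made up.

-- Option 1, for narrow date ranges: seek on Tdate and carry the other
-- columns the query reads at the leaf level.
CREATE NONCLUSTERED INDEX IX_TestTable_Tdate_Covering
    ON TestTable (Tdate)
    INCLUDE (DeptName, EmpId, ProjectNumber, Hours);

-- Option 2, for wide date ranges: lead with the GROUP BY columns so the
-- rows arrive already ordered for grouping.
CREATE NONCLUSTERED INDEX IX_TestTable_Dept_Emp_Covering
    ON TestTable (DeptName, EmpId, Tdate)
    INCLUDE (ProjectNumber, Hours);
```

Either index duplicates a sizeable part of the table, which is the cost the post above is warning about.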
According to your DDL, I don't see an index on Tdate. I would actually start
with that. Depending on the number of rows returned by your query, the
optimizer may or may not choose to use that index (due to the expense of
bookmark lookups). Still, I would attempt a nonclustered index on Tdate
first.
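A minimal sketch of that first step, with a hypothetical index name and made-up date values, could be the following. Comparing the logical reads reported by SET STATISTICS IO before and after creating the index gives a quick indication of whether the optimizer is using it.

```sql
-- Assumption: IX_TestTable_Tdate is a hypothetical index name.
CREATE NONCLUSTERED INDEX IX_TestTable_Tdate ON TestTable (Tdate);

-- Report logical reads for the statements that follow.
SET STATISTICS IO ON;

DECLARE @Date1 DATETIME, @Date2 DATETIME;
SET @Date1 = '20050101';  -- hypothetical range start
SET @Date2 = '20050131';  -- hypothetical range end

-- Simplified version of the query from the original post.
SELECT DeptName, EmpId, SUM(Hours) AS TotalHours
FROM TestTable
WHERE Tdate BETWEEN @Date1 AND @Date2
GROUP BY DeptName, EmpId;

SET STATISTICS IO OFF;
```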
Assuming you're not using the data for anything else (or much else), Adam's
method could be the best route. However, it would make every nonclustered
index on this table larger.
Since all nonclustered indexes also include the clustered index key, and
your key is going from a 4-byte data type (INT) to an 8-byte data type
(DATETIME), this would add about 114 MB to each nonclustered index on your
table (30 million rows x 4 extra bytes). Combined with the fact that a
"uniquifier" is applied to all non-unique clustered indexes, this could add
another 4 bytes to each row and bring the increase for each nonclustered
index up to 228 MB.
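The arithmetic behind those figures can be reproduced with a short query. This is only a back-of-the-envelope sketch: it uses the 30 million row count from the original post, ignores page and row overhead, and the second figure assumes a 4-byte uniquifier on every row, as the post above does.

```sql
DECLARE @Rows BIGINT;
SET @Rows = 30000000;  -- row count from the original post

SELECT
    -- Key widens from INT (4 bytes) to DATETIME (8 bytes): 4 extra bytes per row.
    (@Rows * 4) / 1048576.0 AS ExtraMB_WiderKey,
    -- Same, plus a 4-byte uniquifier on every row: 8 extra bytes per row.
    (@Rows * 8) / 1048576.0 AS ExtraMB_WithUniquifier;
```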
|||"Jared Ko" <JaredKo05@.sanbeiji.removethispart.com> wrote in message
news:uPOE8AZBGHA.736@.TK2MSFTNGP10.phx.gbl...
> Since all nonclustered indexes also include the clustered index key, and
> your key is going from a 4-byte data type (INT) to an 8-byte data type
> (DATETIME), this would add about 114 MB to each nonclustered index on your
> table. Combined with the fact that a "uniquifier" is applied to all
> non-unique clustered indexes, this could add another 4 bytes to each row
> and bring the increase for each nonclustered index up to 228 MB.
Slight correction: The uniquifier is only added to non-unique rows, not
every row. So if the majority are unique (which we might expect from a
DATETIME column), the uniquifier will add very little overhead.
Adam Machanic
--|||Thanks Adam. I was frantically searching for that information while I was
writing my response. I was thinking that was the case but a couple of web
sites I hit suggested otherwise.
|||Best is to test for yourself. Have two similar tables and populate them with
a bunch of rows: one with unique values, the other with the same value in
every row. Check the size of the index. That is how I concluded that the
uniquifier is only added for the duplicates.
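A test along those lines could be sketched as follows. The table names, column names, index names and row count are all made up for illustration, and the loop is written for clarity rather than speed.

```sql
-- Two identical tables, each with a non-unique clustered index on k.
CREATE TABLE dbo.UniqTest_Distinct (k INT NOT NULL, filler CHAR(20) NOT NULL DEFAULT 'x');
CREATE TABLE dbo.UniqTest_Same     (k INT NOT NULL, filler CHAR(20) NOT NULL DEFAULT 'x');

CREATE CLUSTERED INDEX CIX_UniqTest_Distinct ON dbo.UniqTest_Distinct (k);
CREATE CLUSTERED INDEX CIX_UniqTest_Same     ON dbo.UniqTest_Same (k);

-- Fill one table with distinct key values and the other with a single value.
DECLARE @i INT;
SET @i = 1;
WHILE @i <= 100000
BEGIN
    INSERT INTO dbo.UniqTest_Distinct (k) VALUES (@i);
    INSERT INTO dbo.UniqTest_Same (k) VALUES (1);
    SET @i = @i + 1;
END;

-- Compare the space used by the two tables.
EXEC sp_spaceused 'dbo.UniqTest_Distinct';
EXEC sp_spaceused 'dbo.UniqTest_Same';

-- Clean up.
DROP TABLE dbo.UniqTest_Distinct;
DROP TABLE dbo.UniqTest_Same;
```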
Tibor Karaszi, SQL Server MVP
http://www.karaszi.com/sqlserver/default.asp
http://www.solidqualitylearning.com/
Blog: http://solidqualitylearning.com/blogs/tibor/
|||"Tibor Karaszi" <tibor_please.no.email_karaszi@.hotmail.nomail.com> wrote in
message news:OK1JSoZBGHA.3472@.TK2MSFTNGP09.phx.gbl...
> Best is to test for yourself. Have two similar tables and populate them
> with a bunch of rows: one with unique values, the other with the same
> value in every row. Check the size of the index. That is how I concluded
> that the uniquifier is only added for the duplicates.
I took the lazy way out. _Inside SQL Server 2000_, page 412:
"If your clustered index was not created with the UNIQUE property, SQL
Server adds a 4-byte field when necessary to make each key unique."
Adam Machanic
--|||Just a question: wouldn't horizontal partitioning be useful in this case,
apart from an index on the Tdate column, of course?
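To make the question concrete, here is one way date-based partitioning could be sketched, assuming SQL Server 2005 or later and a table that has no clustered index yet. The function, scheme and index names and the boundary dates are hypothetical, and whether partitioning is available depends on the SQL Server version and edition in use.

```sql
-- Assumption: SQL Server 2005 or later; all names and dates are made up.

-- One partition per year; RANGE RIGHT puts each boundary date in the
-- partition to its right.
CREATE PARTITION FUNCTION pfTdateByYear (DATETIME)
    AS RANGE RIGHT FOR VALUES ('20040101', '20050101', '20060101');

-- Map every partition to the PRIMARY filegroup to keep the sketch simple.
CREATE PARTITION SCHEME psTdateByYear
    AS PARTITION pfTdateByYear ALL TO ([PRIMARY]);

-- Build the clustered index on the partition scheme, which partitions
-- the table itself by Tdate.
CREATE CLUSTERED INDEX CIX_TestTable_Tdate
    ON TestTable (Tdate)
    ON psTdateByYear (Tdate);
```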
Peter
