Showing posts with label structure. Show all posts
Showing posts with label structure. Show all posts

Monday, March 26, 2012

Performance question.

Hello Everybody,
I have a table which has arround 30 millions rows.
Table structure is as following..
CREATE TABLE TestTable
(Id INT, --which is PK,
EmpId INT, --There is a non cluster index on it.
DeptName VARCHAR(50),
Hours NUMERIC(5,2),
Tdate DATETIME,
ProjectNumber smallint,
.and few more columns
.
.
)
And i have following query, which is taking arround 1 minute 10 sec to run.
SELECT
DeptName,
EmpId,
SUM(CASE WHEN ProjectNumber = 11 THEN Hours ELSE 0
END) AS FinHours,
SUM(CASE WHEN ProjectId = 12 THEN Hours ELSE 0
END) AS HrHours,
SUM(CASE WHEN ProjectId = 13 THEN Hours ELSE 0
END) AS TaxHours,
FROM TestTable WHERE Tdate between @.Date1 and @.Date2
GROUP BY
DeptName,
EmpId
I do not have index on ProjectNumber column because this column will have
only
200 distinct values.
If i create index on Group by Columns, would it improve performance ?
Pls let me know, how can i imporve performance ?
Thanks.I don't know about the rest of your queries or your usage patterns, but the
most obvious choice in this case is to make the PK nonclustered and create a
clustered index on the Tdate column to support your WHERE clause.
Adam Machanic
Pro SQL Server 2005, available now
http://www.apress.com/book/bookDisplay.html?bID=457
--
"mvp" <mvp@.discussions.microsoft.com> wrote in message
news:569B7FD3-452C-4B8F-9056-06798A32B3EA@.microsoft.com...
> Hello Everybody,
> I have a table which has arround 30 millions rows.
> Table structure is as following..
> CREATE TABLE TestTable
> (Id INT, --which is PK,
> EmpId INT, --There is a non cluster index on it.
> DeptName VARCHAR(50),
> Hours NUMERIC(5,2),
> Tdate DATETIME,
> ProjectNumber smallint,
> .and few more columns
> .
> .
> )
> And i have following query, which is taking arround 1 minute 10 sec to
> run.
>
> SELECT
> DeptName,
> EmpId,
> SUM(CASE WHEN ProjectNumber = 11 THEN Hours ELSE 0
> END) AS FinHours,
> SUM(CASE WHEN ProjectId = 12 THEN Hours ELSE 0
> END) AS HrHours,
> SUM(CASE WHEN ProjectId = 13 THEN Hours ELSE 0
> END) AS TaxHours,
> FROM TestTable WHERE Tdate between @.Date1 and @.Date2
> GROUP BY
> DeptName,
> EmpId
> I do not have index on ProjectNumber column because this column will have
> only
> 200 distinct values.
> If i create index on Group by Columns, would it improve performance ?
> Pls let me know, how can i imporve performance ?
>
> Thanks.|||also you may try an index on all the columns involved in the query,
Tdate first if the interval is narrow, DeptName, EmpId first if the
interval is wide|||Your query indicates that you are using the following columns: DeptId,
EmpId, ProjectNumber, ProjectId, TDate. This means that SQL Server will have
to look at all of the rows being returned regardless of whether an index
exists on your grouped columns. This will be the case unless you were to
create a covering index for all of the columns being returned. In your case,
that's a lot of columns so I don't recommend it.
According to your DDL, I don't see an index on Tdate. I would actually start
with that. However, depending on the number of rows that are being returned
from your query, the optimizer may or may not even choose to use that index
(due to the expense of bookmark lookup). However, I would attempt a
non-clustered index on Tdate first.
Assuming you're not using the data for anything else (or much else), Adam's
method could be the best route. However, this would result in larger indexes
for all of the nonclustered indexes on this table.
Since all nonclustered indexes also include the clustered index key, and
your key is going from a 4-byte data type to an 8-byte data type and add up
114MB to each index on your table. Combined with the fact that a
"uniquifier" is applied to all non-unique clustered indexes, could add
another 4 bytes to your rows and bring each nonclustered index up to 228 MB.
"Adam Machanic" <amachanic@.hotmail._removetoemail_.com> wrote in message
news:%237v%239uXBGHA.892@.TK2MSFTNGP12.phx.gbl...
>I don't know about the rest of your queries or your usage patterns, but the
>most obvious choice in this case is to make the PK nonclustered and create
>a clustered index on the Tdate column to support your WHERE clause.
>
> --
> Adam Machanic
> Pro SQL Server 2005, available now
> http://www.apress.com/book/bookDisplay.html?bID=457
> --
>
> "mvp" <mvp@.discussions.microsoft.com> wrote in message
> news:569B7FD3-452C-4B8F-9056-06798A32B3EA@.microsoft.com...
>|||"Jared Ko" <JaredKo05@.sanbeiji.removethispart.com> wrote in message
news:uPOE8AZBGHA.736@.TK2MSFTNGP10.phx.gbl...
> Since all nonclustered indexes also include the clustered index key, and
> your key is going from a 4-byte data type to an 8-byte data type and add
> up 114MB to each index on your table. Combined with the fact that a
> "uniquifier" is applied to all non-unique clustered indexes, could add
> another 4 bytes to your rows and bring each nonclustered index up to 228
> MB.
Slight correction: The uniquifier is only added to non-unique rows, not
every row. So if the majority are unique (which we might expect from a
DATETIME column), the uniquifier will add very little overhead.
Adam Machanic
Pro SQL Server 2005, available now
http://www.apress.com/book/bookDisplay.html?bID=457
--|||Thanks Adam. I was frantically searching for that information while I was
writing my response. I was thinking that was the case but a couple of web
sites I hit suggested otherwise.
"Adam Machanic" <amachanic@.hotmail._removetoemail_.com> wrote in message
news:OybGaZZBGHA.2356@.tk2msftngp13.phx.gbl...
> "Jared Ko" <JaredKo05@.sanbeiji.removethispart.com> wrote in message
> news:uPOE8AZBGHA.736@.TK2MSFTNGP10.phx.gbl...
> Slight correction: The uniquifier is only added to non-unique rows, not
> every row. So if the majority are unique (which we might expect from a
> DATETIME column), the uniquifier will add very little overhead.
>
> --
> Adam Machanic
> Pro SQL Server 2005, available now
> http://www.apress.com/book/bookDisplay.html?bID=457
> --
>
>|||Best is to test for yourself. Have to similar tables, populate them with bun
ch of rows. One unique,
the other all with same value. Check size of the index. that is how I conclu
ded that uniqifier is
only added for the duplicates.
Tibor Karaszi, SQL Server MVP
http://www.karaszi.com/sqlserver/default.asp
http://www.solidqualitylearning.com/
Blog: http://solidqualitylearning.com/blogs/tibor/
"Jared Ko" <JaredKo05@.sanbeiji.removethispart.com> wrote in message
news:uyOPGjZBGHA.3984@.TK2MSFTNGP14.phx.gbl...
> Thanks Adam. I was frantically searching for that information while I was
writing my response. I
> was thinking that was the case but a couple of web sites I hit suggested o
therwise.
> "Adam Machanic" <amachanic@.hotmail._removetoemail_.com> wrote in message
> news:OybGaZZBGHA.2356@.tk2msftngp13.phx.gbl...
>|||"Tibor Karaszi" <tibor_please.no.email_karaszi@.hotmail.nomail.com> wrote in
message news:OK1JSoZBGHA.3472@.TK2MSFTNGP09.phx.gbl...
> Best is to test for yourself. Have to similar tables, populate them with
> bunch of rows. One unique, the other all with same value. Check size of
> the index. that is how I concluded that uniqifier is only added for the
> duplicates.
I took the lazy way out. _Inside SQL Server 2000_, page 412:
"If your clustered index was not created with the UNIQUE property, SQL
Server adds a 4-byte field when necessary to make each key unique."
Adam Machanic
Pro SQL Server 2005, available now
http://www.apress.com/book/bookDisplay.html?bID=457
--|||Just a question - wouldn't horizontal partitioning be useful in this case?
of course apart from index on the Tdate field.
Peter

Tuesday, March 20, 2012

Performance problem with Partition tables in SQL Server 2005

I create partition table and regular table, both with the exact structure and
data.
The aim was to check whether there is an improvement in the performance or
not.
I execute the following queries (myPartTable is the partition table):
select * from myTable where instance_id = 3
select * from myPartTable where instance_id = 3
select * from myPartTable where instance_id = 3 and $partition.IntRangePF(3)
= 3
The result:
- Significant difference in the performance of the second and the third query.
- No major difference between the first and second query. In most of the
cases, the first even show better performance.
My question:
1) Why there is a significant improvement between the second and the third
query?
Isn’t SQL Server supposed to recognize and access the partition without
specify its number?
2) Shouldn't I get some improvement in the performance when using partition
table?
Attached a sample code to create the test.
Create Partition Function:
CREATE PARTITION FUNCTION IntRangePF(int)
AS RANGE left FOR VALUES (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
Create Partition Scheme:
CREATE PARTITION SCHEME IntRangePScheme AS PARTITION IntRangePF
ALL TO ([PRIMARY])
Create the tables:
CREATE TABLE myPartTable (
idint NOT NULL,
instance_idint NOT NULL,
descriptionvarchar(32) NOT NULL,
temp1floatNULL,
temp2floatNULL,
temp3floatNULL,
constraint table1_P_PK primary key nonclustered(instance_id, id) on
[PRIMARY] ,
) ON IntRangePScheme (instance_id)
CREATE CLUSTERED INDEX index_myPartTable ON myPartTable(
[instance_id] ASC,
[id] ASC )
CREATE TABLE myTable (
idintNOT NULL,
instance_idint NOT NULL,
descriptionvarchar(32) NOT NULL,
temp1floatNULL,
temp2floatNULL,
temp3floatNULL,
constraint myTable_PK primary key nonclustered(instance_id, id) on
[PRIMARY] ,
)
CREATE CLUSTERED INDEX index_myTable ON myTable(
[instance_id] ASC,
[id] ASC )
Insert Data
DECLARE @.i int
SET @.i = 1;
WHILE @.i < 100
BEGIN
SET @.i = @.i + 1;
insert into myTable values (@.i, 1, 'ddddd', 1, 1, 1);
insert into myTable values (@.i, 2, 'ddddd', 1, 1, 1);
insert into myTable values (@.i, 3, 'ddddd', 1, 1, 1);
insert into myTable values (@.i, 4, 'ddddd', 1, 1, 1);
insert into myTable values (@.i, 5, 'ddddd', 1, 1, 1);
insert into myTable values (@.i, 6, 'ddddd', 1, 1, 1);
insert into myTable values (@.i, 7, 'ddddd', 1, 1, 1);
insert into myTable values (@.i, 8, 'ddddd', 1, 1, 1);
insert into myTable values (@.i, 9, 'ddddd', 1, 1, 1);
insert into myTable values (@.i, 10, 'ddddd', 1, 1, 1);
END
insert into myPartTable select * from myTable
"?" <@.discussions.microsoft.com> wrote in message
news:B2FD061B-378D-4E4D-BC2A-04D5E3B9371A@.microsoft.com...
>I create partition table and regular table, both with the exact structure
>and
> data.
> The aim was to check whether there is an improvement in the performance or
> not.
> I execute the following queries (myPartTable is the partition table):
> select * from myTable where instance_id = 3
> select * from myPartTable where instance_id = 3
> select * from myPartTable where instance_id = 3 and
> $partition.IntRangePF(3)
> = 3
> The result:
> - Significant difference in the performance of the second and the third
> query.
> - No major difference between the first and second query. In most of the
> cases, the first even show better performance.
> My question:
> 1) Why there is a significant improvement between the second and the third
> query?
> Isn't SQL Server supposed to recognize and access the partition without
> specify its number?
> 2) Shouldn't I get some improvement in the performance when using
> partition
> table?
> Attached a sample code to create the test.
> Create Partition Function:
> CREATE PARTITION FUNCTION IntRangePF(int)
> AS RANGE left FOR VALUES (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
> Create Partition Scheme:
> CREATE PARTITION SCHEME IntRangePScheme AS PARTITION IntRangePF
> ALL TO ([PRIMARY])
> Create the tables:
> CREATE TABLE myPartTable (
> id int NOT NULL,
> instance_id int NOT NULL,
> description varchar(32) NOT NULL,
> temp1 float NULL,
> temp2 float NULL,
> temp3 float NULL,
> constraint table1_P_PK primary key nonclustered(instance_id, id) on
> [PRIMARY] ,
> ) ON IntRangePScheme (instance_id)
> CREATE CLUSTERED INDEX index_myPartTable ON myPartTable(
> [instance_id] ASC,
> [id] ASC )
>
When you create a clustered index on a table you move the table into the
filegroup or partition scheme for the clustered index. So when you created
a clustered index index_myPartTable on the primary filegroup instead of on
your partition scheme you "unpartitioned" the table.
Change like this:
CREATE TABLE myPartTable (
id int NOT NULL,
instance_id int NOT NULL,
description varchar(32) NOT NULL,
temp1 float NULL,
temp2 float NULL,
temp3 float NULL,
constraint table1_P_PK primary key nonclustered(instance_id, id) on
[PRIMARY] ,
) ON IntRangePScheme (instance_id)
CREATE CLUSTERED INDEX index_myPartTable ON myPartTable(
[instance_id] ASC,
[id] ASC ) ON IntRangePScheme (instance_id)
Or, all in one statement
CREATE TABLE myPartTable (
id int NOT NULL,
instance_id int NOT NULL,
description varchar(32) NOT NULL,
temp1 float NULL,
temp2 float NULL,
temp3 float NULL,
constraint table1_P_PK
primary key clustered(instance_id, id)
ON IntRangePScheme (instance_id) ,
)
Then you will see that the partition elimination reduces the IO's required
for your queries.
David
|||I change the clustered index (the other one automatically created on the
partition scheme), but the performance are still the same.
When running the 3 query together, I am getting 43% query cost (in the
execution plan panel) for the first query, 42% for the second and 15% for the
last.
The main question is why do I get difference between the 2nd and 3rd query.
"David Browne" wrote:

> "?" <@.discussions.microsoft.com> wrote in message
> news:B2FD061B-378D-4E4D-BC2A-04D5E3B9371A@.microsoft.com...
> When you create a clustered index on a table you move the table into the
> filegroup or partition scheme for the clustered index. So when you created
> a clustered index index_myPartTable on the primary filegroup instead of on
> your partition scheme you "unpartitioned" the table.
>
> Change like this:
> CREATE TABLE myPartTable (
> id int NOT NULL,
> instance_id int NOT NULL,
> description varchar(32) NOT NULL,
> temp1 float NULL,
> temp2 float NULL,
> temp3 float NULL,
> constraint table1_P_PK primary key nonclustered(instance_id, id) on
> [PRIMARY] ,
> ) ON IntRangePScheme (instance_id)
> CREATE CLUSTERED INDEX index_myPartTable ON myPartTable(
> [instance_id] ASC,
> [id] ASC ) ON IntRangePScheme (instance_id)
>
> Or, all in one statement
> CREATE TABLE myPartTable (
> id int NOT NULL,
> instance_id int NOT NULL,
> description varchar(32) NOT NULL,
> temp1 float NULL,
> temp2 float NULL,
> temp3 float NULL,
> constraint table1_P_PK
> primary key clustered(instance_id, id)
> ON IntRangePScheme (instance_id) ,
> )
>
> Then you will see that the partition elimination reduces the IO's required
> for your queries.
> David
>
>
|||does you % are real execution or estimated plan?
when you explicitly reference a partition SQL know which partition to read
directly.
when you go through the entire table, SQL doesn't know which partition is
really used until you run it.
but you are right, sometimes I found a query like your slow. SQL scan all
the partitions instead-of scanning only the required partitions.
This completly slow down the query!
"?" <@.discussions.microsoft.com> wrote in message
news:E515D9C3-B0A6-4E37-BCB6-9128EC4D9BCC@.microsoft.com...[vbcol=seagreen]
>I change the clustered index (the other one automatically created on the
> partition scheme), but the performance are still the same.
> When running the 3 query together, I am getting 43% query cost (in the
> execution plan panel) for the first query, 42% for the second and 15% for
> the
> last.
> The main question is why do I get difference between the 2nd and 3rd
> query.
> "David Browne" wrote:
|||"?" <@.discussions.microsoft.com> wrote in message
news:E515D9C3-B0A6-4E37-BCB6-9128EC4D9BCC@.microsoft.com...
>I change the clustered index (the other one automatically created on the
> partition scheme), but the performance are still the same.
> When running the 3 query together, I am getting 43% query cost (in the
> execution plan panel) for the first query, 42% for the second and 15% for
> the
> last.
> The main question is why do I get difference between the 2nd and 3rd
> query.
>
The second and third both hit a single partition for me. Look at the actual
execution plan. Hover the index scan and you should see a PartitionID
or save the plan and view it with a text editor. You should see the
PartitionId uder the IndexScan, something like
<IndexScan Ordered="true" ScanDirection="FORWARD"
ForcedIndex="false" NoExpandHint="false">
<DefinedValues> . . . </DefinedValues>
<Object Database="[Demo]" Schema="[dbo]"
Table="[myPartTable]" Index="[table1_P_PK]" />
<SeekPredicates>
<SeekPredicate>
<Prefix ScanType="EQ">
<RangeColumns>
<ColumnReference Database="[Demo]" Schema="[dbo]"
Table="[myPartTable]" Column="instance_id" />
</RangeColumns>
<RangeExpressions>
<ScalarOperator ScalarString="(3)">
<Const ConstValue="(3)" />
</ScalarOperator>
</RangeExpressions>
</Prefix>
</SeekPredicate>
</SeekPredicates>
<PartitionId>
<ColumnReference Column="ConstExpr1007">
<ScalarOperator ScalarString="(3)">
<Const ConstValue="(3)" />
</ScalarOperator>
</ColumnReference>
</PartitionId>
</IndexScan>
David
|||I can see that it scan the partition, but i'm not getting the same performance.
"David Browne" wrote:

> "?" <@.discussions.microsoft.com> wrote in message
> news:E515D9C3-B0A6-4E37-BCB6-9128EC4D9BCC@.microsoft.com...
> The second and third both hit a single partition for me. Look at the actual
> execution plan. Hover the index scan and you should see a PartitionID
> or save the plan and view it with a text editor. You should see the
> PartitionId uder the IndexScan, something like
> <IndexScan Ordered="true" ScanDirection="FORWARD"
> ForcedIndex="false" NoExpandHint="false">
> <DefinedValues> . . . </DefinedValues>
> <Object Database="[Demo]" Schema="[dbo]"
> Table="[myPartTable]" Index="[table1_P_PK]" />
> <SeekPredicates>
> <SeekPredicate>
> <Prefix ScanType="EQ">
> <RangeColumns>
> <ColumnReference Database="[Demo]" Schema="[dbo]"
> Table="[myPartTable]" Column="instance_id" />
> </RangeColumns>
> <RangeExpressions>
> <ScalarOperator ScalarString="(3)">
> <Const ConstValue="(3)" />
> </ScalarOperator>
> </RangeExpressions>
> </Prefix>
> </SeekPredicate>
> </SeekPredicates>
> <PartitionId>
> <ColumnReference Column="ConstExpr1007">
> <ScalarOperator ScalarString="(3)">
> <Const ConstValue="(3)" />
> </ScalarOperator>
> </ColumnReference>
> </PartitionId>
> </IndexScan>
> David
>
>
|||"?" <@.discussions.microsoft.com> wrote in message
news:987784C1-C165-40A9-B76E-B7C1F4E99F07@.microsoft.com...
>I can see that it scan the partition, but i'm not getting the same
>performance.
>
What performance difference are you expecting? And how are you measuring it?
Partitioning is fundamentally a performance feature, but it's designed for
tables a lot of rows. Partitioning makes it practical to have have tables
with billions of rows. Small databases you won't always see a large
difference with partitioning.
David
|||When I run the queries, I include Actual Execution plan, and then I examine
the performance in the Execution Plan tab.
I’m expecting to get better performance when accessing the partition table
as oppose to the regular table.
I run the same test on tables with 1,000,000 records per instance, and the
results are the same – the performance are the usually the same.
The biggest question is why I get difference between these 2 queries (when
accessing the partition table):
(Query 1) select * from myPartTable where instance_id = 3
(Query 2) select * from myPartTable where instance_id = 3 and
$partition.IntRangePF(3) = 3
When running these queries on partition table with 7251452 records, I get
the following performance:
Query 1: Query Cost = 76%, Subtree Cost = 4.26687
Plan:
Clustered Index Seek, Estimate I/O Cost = 3.46905
Query 2: Query Cost = 24%, Subtree Cost = 1.34275
Plan:
Clustered Index Seek, Estimate I/O Cost = 1.04239; Subtree Cost = 1.28184
Filter, Subtree Cost = 1.34275
*** In both cases the plan show access to partition 3.
Why adding the statement “$partition.IntRangePF(3) = 3” had such an effect
on the performance, while this statement is not suppose to effect filter in
the index. I will get the same performance if I will add
“$partition.IntRangePF(1) = 1” on the same query.
Another improvement I get with the following query:
(Query 3)select * from myPartTable where instance_id = 3 and
$partition.IntRangePF(instance_id) = $partition.IntRangePF(3)
Running this query with the other show the following cost:
Query 1: Query Cost = 70%, Subtree Cost = 4.26687
Plan:
Clustered Index Seek, Estimate I/O Cost = 3.46905
Query 2: Query Cost = 22%, Subtree Cost = 1.34275
Plan:
Clustered Index Seek, Estimate I/O Cost = 1.04239; Subtree Cost = 1.28184
Filter (Cost 5%), Subtree Cost = 1.34275
Query 3: Query Cost = 8%, Subtree Cost = 0.501189
Plan:
Clustered Index Seek, Estimate I/O Cost = 0.317536; Subtree Cost = 0.390208
Filter (Cost 22%), Subtree Cost = 0.500957
"David Browne" wrote:

> "?" <@.discussions.microsoft.com> wrote in message
> news:987784C1-C165-40A9-B76E-B7C1F4E99F07@.microsoft.com...
>
> What performance difference are you expecting? And how are you measuring it?
> Partitioning is fundamentally a performance feature, but it's designed for
> tables a lot of rows. Partitioning makes it practical to have have tables
> with billions of rows. Small databases you won't always see a large
> difference with partitioning.
> David
>
>
|||"?" <@.discussions.microsoft.com> wrote in message
news:34C036B2-A6C7-48C4-BF41-C9BA15C94E82@.microsoft.com...
> When I run the queries, I include Actual Execution plan, and then I
> examine
> the performance in the Execution Plan tab.
> I'm expecting to get better performance when accessing the partition table
> as oppose to the regular table.
> I run the same test on tables with 1,000,000 records per instance, and the
> results are the same - the performance are the usually the same.
> The biggest question is why I get difference between these 2 queries (when
> accessing the partition table):
> (Query 1) select * from myPartTable where instance_id = 3
> (Query 2) select * from myPartTable where instance_id = 3 and
> $partition.IntRangePF(3) = 3
> When running these queries on partition table with 7251452 records, I get
> the following performance:
> Query 1: Query Cost = 76%, Subtree Cost = 4.26687
> Plan:
> Clustered Index Seek, Estimate I/O Cost = 3.46905
> Query 2: Query Cost = 24%, Subtree Cost = 1.34275
> Plan:
> Clustered Index Seek, Estimate I/O Cost = 1.04239; Subtree Cost =
> 1.28184
> Filter, Subtree Cost = 1.34275
>
Ok I see what you mean.
The estimated cost of the second query is incorrect, leading to an apparent
performance difference. The difference is not real. It is only in the
estimated cost of the queries. In particular the "Estimated Number of Rows"
for the Clustered Index Seek operator in the second query is not very
accurate. The actual cost of the queries is almost identical.
Run this to see that the actual cost of the queries is similar:
set statistics IO on
set statistics time on
select * from myPartTable where instance_id = 3
select * from myPartTable where instance_id = 3 and $partition.IntRangePF(3)
= 3
David
|||test
*** Sent via Developersdex http://www.codecomments.com ***

Performance problem with Partition tables in SQL Server 2005

I create partition table and regular table, both with the exact structure and
data.
The aim was to check whether there is an improvement in the performance or
not.
I execute the following queries (myPartTable is the partition table):
select * from myTable where instance_id = 3
select * from myPartTable where instance_id = 3
select * from myPartTable where instance_id = 3 and $partition.IntRangePF(3)
= 3
The result:
- Significant difference in the performance of the second and the third query.
- No major difference between the first and second query. In most of the
cases, the first even show better performance.
My question:
1) Why there is a significant improvement between the second and the third
query?
Isnâ't SQL Server supposed to recognize and access the partition without
specify its number?
2) Shouldn't I get some improvement in the performance when using partition
table?
---
Attached a sample code to create the test.
Create Partition Function:
CREATE PARTITION FUNCTION IntRangePF(int)
AS RANGE left FOR VALUES (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
Create Partition Scheme:
CREATE PARTITION SCHEME IntRangePScheme AS PARTITION IntRangePF
ALL TO ([PRIMARY])
Create the tables:
CREATE TABLE myPartTable (
id int NOT NULL,
instance_id int NOT NULL,
description varchar(32) NOT NULL,
temp1 float NULL,
temp2 float NULL,
temp3 float NULL,
constraint table1_P_PK primary key nonclustered(instance_id, id) on
[PRIMARY] ,
) ON IntRangePScheme (instance_id)
CREATE CLUSTERED INDEX index_myPartTable ON myPartTable(
[instance_id] ASC,
[id] ASC )
CREATE TABLE myTable (
id int NOT NULL,
instance_id int NOT NULL,
description varchar(32) NOT NULL,
temp1 float NULL,
temp2 float NULL,
temp3 float NULL,
constraint myTable_PK primary key nonclustered(instance_id, id) on
[PRIMARY] ,
)
CREATE CLUSTERED INDEX index_myTable ON myTable(
[instance_id] ASC,
[id] ASC )
Insert Data
DECLARE @.i int
SET @.i = 1;
WHILE @.i < 100
BEGIN
SET @.i = @.i + 1;
insert into myTable values (@.i, 1, 'ddddd', 1, 1, 1);
insert into myTable values (@.i, 2, 'ddddd', 1, 1, 1);
insert into myTable values (@.i, 3, 'ddddd', 1, 1, 1);
insert into myTable values (@.i, 4, 'ddddd', 1, 1, 1);
insert into myTable values (@.i, 5, 'ddddd', 1, 1, 1);
insert into myTable values (@.i, 6, 'ddddd', 1, 1, 1);
insert into myTable values (@.i, 7, 'ddddd', 1, 1, 1);
insert into myTable values (@.i, 8, 'ddddd', 1, 1, 1);
insert into myTable values (@.i, 9, 'ddddd', 1, 1, 1);
insert into myTable values (@.i, 10, 'ddddd', 1, 1, 1);
END
insert into myPartTable select * from myTable"?" <@.discussions.microsoft.com> wrote in message
news:B2FD061B-378D-4E4D-BC2A-04D5E3B9371A@.microsoft.com...
>I create partition table and regular table, both with the exact structure
>and
> data.
> The aim was to check whether there is an improvement in the performance or
> not.
> I execute the following queries (myPartTable is the partition table):
> select * from myTable where instance_id = 3
> select * from myPartTable where instance_id = 3
> select * from myPartTable where instance_id = 3 and
> $partition.IntRangePF(3)
> = 3
> The result:
> - Significant difference in the performance of the second and the third
> query.
> - No major difference between the first and second query. In most of the
> cases, the first even show better performance.
> My question:
> 1) Why there is a significant improvement between the second and the third
> query?
> Isn't SQL Server supposed to recognize and access the partition without
> specify its number?
> 2) Shouldn't I get some improvement in the performance when using
> partition
> table?
> ---
> Attached a sample code to create the test.
> Create Partition Function:
> CREATE PARTITION FUNCTION IntRangePF(int)
> AS RANGE left FOR VALUES (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
> Create Partition Scheme:
> CREATE PARTITION SCHEME IntRangePScheme AS PARTITION IntRangePF
> ALL TO ([PRIMARY])
> Create the tables:
> CREATE TABLE myPartTable (
> id int NOT NULL,
> instance_id int NOT NULL,
> description varchar(32) NOT NULL,
> temp1 float NULL,
> temp2 float NULL,
> temp3 float NULL,
> constraint table1_P_PK primary key nonclustered(instance_id, id) on
> [PRIMARY] ,
> ) ON IntRangePScheme (instance_id)
> CREATE CLUSTERED INDEX index_myPartTable ON myPartTable(
> [instance_id] ASC,
> [id] ASC )
>
When you create a clustered index on a table you move the table into the
filegroup or partition scheme for the clustered index. So when you created
a clustered index index_myPartTable on the primary filegroup instead of on
your partition scheme you "unpartitioned" the table.
Change like this:
CREATE TABLE myPartTable (
id int NOT NULL,
instance_id int NOT NULL,
description varchar(32) NOT NULL,
temp1 float NULL,
temp2 float NULL,
temp3 float NULL,
constraint table1_P_PK primary key nonclustered(instance_id, id) on
[PRIMARY] ,
) ON IntRangePScheme (instance_id)
CREATE CLUSTERED INDEX index_myPartTable ON myPartTable(
[instance_id] ASC,
[id] ASC ) ON IntRangePScheme (instance_id)
Or, all in one statement
CREATE TABLE myPartTable (
id int NOT NULL,
instance_id int NOT NULL,
description varchar(32) NOT NULL,
temp1 float NULL,
temp2 float NULL,
temp3 float NULL,
constraint table1_P_PK
primary key clustered(instance_id, id)
ON IntRangePScheme (instance_id) ,
)
Then you will see that the partition elimination reduces the IO's required
for your queries.
David|||I change the clustered index (the other one automatically created on the
partition scheme), but the performance are still the same.
When running the 3 query together, I am getting 43% query cost (in the
execution plan panel) for the first query, 42% for the second and 15% for the
last.
The main question is why do I get difference between the 2nd and 3rd query.
"David Browne" wrote:
> "?" <@.discussions.microsoft.com> wrote in message
> news:B2FD061B-378D-4E4D-BC2A-04D5E3B9371A@.microsoft.com...
> >I create partition table and regular table, both with the exact structure
> >and
> > data.
> > The aim was to check whether there is an improvement in the performance or
> > not.
> >
> > I execute the following queries (myPartTable is the partition table):
> > select * from myTable where instance_id = 3
> > select * from myPartTable where instance_id = 3
> > select * from myPartTable where instance_id = 3 and
> > $partition.IntRangePF(3)
> > = 3
> >
> > The result:
> > - Significant difference in the performance of the second and the third
> > query.
> > - No major difference between the first and second query. In most of the
> > cases, the first even show better performance.
> >
> > My question:
> > 1) Why there is a significant improvement between the second and the third
> > query?
> > Isn't SQL Server supposed to recognize and access the partition without
> > specify its number?
> > 2) Shouldn't I get some improvement in the performance when using
> > partition
> > table?
> >
> > ---
> > Attached a sample code to create the test.
> >
> > Create Partition Function:
> > CREATE PARTITION FUNCTION IntRangePF(int)
> > AS RANGE left FOR VALUES (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
> >
> > Create Partition Scheme:
> > CREATE PARTITION SCHEME IntRangePScheme AS PARTITION IntRangePF
> > ALL TO ([PRIMARY])
> >
> > Create the tables:
> > CREATE TABLE myPartTable (
> > id int NOT NULL,
> > instance_id int NOT NULL,
> > description varchar(32) NOT NULL,
> > temp1 float NULL,
> > temp2 float NULL,
> > temp3 float NULL,
> > constraint table1_P_PK primary key nonclustered(instance_id, id) on
> > [PRIMARY] ,
> > ) ON IntRangePScheme (instance_id)
> >
> > CREATE CLUSTERED INDEX index_myPartTable ON myPartTable(
> > [instance_id] ASC,
> > [id] ASC )
> >
> When you create a clustered index on a table you move the table into the
> filegroup or partition scheme for the clustered index. So when you created
> a clustered index index_myPartTable on the primary filegroup instead of on
> your partition scheme you "unpartitioned" the table.
>
> Change like this:
> CREATE TABLE myPartTable (
> id int NOT NULL,
> instance_id int NOT NULL,
> description varchar(32) NOT NULL,
> temp1 float NULL,
> temp2 float NULL,
> temp3 float NULL,
> constraint table1_P_PK primary key nonclustered(instance_id, id) on
> [PRIMARY] ,
> ) ON IntRangePScheme (instance_id)
> CREATE CLUSTERED INDEX index_myPartTable ON myPartTable(
> [instance_id] ASC,
> [id] ASC ) ON IntRangePScheme (instance_id)
>
> Or, all in one statement
> CREATE TABLE myPartTable (
> id int NOT NULL,
> instance_id int NOT NULL,
> description varchar(32) NOT NULL,
> temp1 float NULL,
> temp2 float NULL,
> temp3 float NULL,
> constraint table1_P_PK
> primary key clustered(instance_id, id)
> ON IntRangePScheme (instance_id) ,
> )
>
> Then you will see that the partition elimination reduces the IO's required
> for your queries.
> David
>
>|||does you % are real execution or estimated plan?
when you explicitly reference a partition SQL know which partition to read
directly.
when you go through the entire table, SQL doesn't know which partition is
really used until you run it.
but you are right, sometimes I found a query like your slow. SQL scan all
the partitions instead-of scanning only the required partitions.
This completly slow down the query!
"?" <@.discussions.microsoft.com> wrote in message
news:E515D9C3-B0A6-4E37-BCB6-9128EC4D9BCC@.microsoft.com...
>I change the clustered index (the other one automatically created on the
> partition scheme), but the performance are still the same.
> When running the 3 query together, I am getting 43% query cost (in the
> execution plan panel) for the first query, 42% for the second and 15% for
> the
> last.
> The main question is why do I get difference between the 2nd and 3rd
> query.
> "David Browne" wrote:
>> "?" <@.discussions.microsoft.com> wrote in message
>> news:B2FD061B-378D-4E4D-BC2A-04D5E3B9371A@.microsoft.com...
>> >I create partition table and regular table, both with the exact
>> >structure
>> >and
>> > data.
>> > The aim was to check whether there is an improvement in the performance
>> > or
>> > not.
>> >
>> > I execute the following queries (myPartTable is the partition table):
>> > select * from myTable where instance_id = 3
>> > select * from myPartTable where instance_id = 3
>> > select * from myPartTable where instance_id = 3 and
>> > $partition.IntRangePF(3)
>> > = 3
>> >
>> > The result:
>> > - Significant difference in the performance of the second and the third
>> > query.
>> > - No major difference between the first and second query. In most of
>> > the
>> > cases, the first even show better performance.
>> >
>> > My question:
>> > 1) Why there is a significant improvement between the second and the
>> > third
>> > query?
>> > Isn't SQL Server supposed to recognize and access the partition without
>> > specify its number?
>> > 2) Shouldn't I get some improvement in the performance when using
>> > partition
>> > table?
>> >
>> > ---
>> > Attached a sample code to create the test.
>> >
>> > Create Partition Function:
>> > CREATE PARTITION FUNCTION IntRangePF(int)
>> > AS RANGE left FOR VALUES (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
>> >
>> > Create Partition Scheme:
>> > CREATE PARTITION SCHEME IntRangePScheme AS PARTITION IntRangePF
>> > ALL TO ([PRIMARY])
>> >
>> > Create the tables:
>> > CREATE TABLE myPartTable (
>> > id int NOT NULL,
>> > instance_id int NOT NULL,
>> > description varchar(32) NOT NULL,
>> > temp1 float NULL,
>> > temp2 float NULL,
>> > temp3 float NULL,
>> > constraint table1_P_PK primary key nonclustered(instance_id, id) on
>> > [PRIMARY] ,
>> > ) ON IntRangePScheme (instance_id)
>> >
>> > CREATE CLUSTERED INDEX index_myPartTable ON myPartTable(
>> > [instance_id] ASC,
>> > [id] ASC )
>> >
>> When you create a clustered index on a table you move the table into the
>> filegroup or partition scheme for the clustered index. So when you
>> created
>> a clustered index index_myPartTable on the primary filegroup instead of
>> on
>> your partition scheme you "unpartitioned" the table.
>>
>> Change like this:
>> CREATE TABLE myPartTable (
>> id int NOT NULL,
>> instance_id int NOT NULL,
>> description varchar(32) NOT NULL,
>> temp1 float NULL,
>> temp2 float NULL,
>> temp3 float NULL,
>> constraint table1_P_PK primary key nonclustered(instance_id, id) on
>> [PRIMARY] ,
>> ) ON IntRangePScheme (instance_id)
>> CREATE CLUSTERED INDEX index_myPartTable ON myPartTable(
>> [instance_id] ASC,
>> [id] ASC ) ON IntRangePScheme (instance_id)
>>
>> Or, all in one statement
>> CREATE TABLE myPartTable (
>> id int NOT NULL,
>> instance_id int NOT NULL,
>> description varchar(32) NOT NULL,
>> temp1 float NULL,
>> temp2 float NULL,
>> temp3 float NULL,
>> constraint table1_P_PK
>> primary key clustered(instance_id, id)
>> ON IntRangePScheme (instance_id) ,
>> )
>>
>> Then you will see that the partition elimination reduces the IO's
>> required
>> for your queries.
>> David
>>|||"?" <@.discussions.microsoft.com> wrote in message
news:E515D9C3-B0A6-4E37-BCB6-9128EC4D9BCC@.microsoft.com...
>I change the clustered index (the other one automatically created on the
> partition scheme), but the performance are still the same.
> When running the 3 query together, I am getting 43% query cost (in the
> execution plan panel) for the first query, 42% for the second and 15% for
> the
> last.
> The main question is why do I get difference between the 2nd and 3rd
> query.
>
The second and third both hit a single partition for me. Look at the actual
execution plan. Hover the index scan and you should see a PartitionID
or save the plan and view it with a text editor. You should see the
PartitionId uder the IndexScan, something like
<IndexScan Ordered="true" ScanDirection="FORWARD"
ForcedIndex="false" NoExpandHint="false">
<DefinedValues> . . . </DefinedValues>
<Object Database="[Demo]" Schema="[dbo]"
Table="[myPartTable]" Index="[table1_P_PK]" />
<SeekPredicates>
<SeekPredicate>
<Prefix ScanType="EQ">
<RangeColumns>
<ColumnReference Database="[Demo]" Schema="[dbo]"
Table="[myPartTable]" Column="instance_id" />
</RangeColumns>
<RangeExpressions>
<ScalarOperator ScalarString="(3)">
<Const ConstValue="(3)" />
</ScalarOperator>
</RangeExpressions>
</Prefix>
</SeekPredicate>
</SeekPredicates>
<PartitionId>
<ColumnReference Column="ConstExpr1007">
<ScalarOperator ScalarString="(3)">
<Const ConstValue="(3)" />
</ScalarOperator>
</ColumnReference>
</PartitionId>
</IndexScan>
David|||I can see that it scan the partition, but i'm not getting the same performance.
"David Browne" wrote:
> "?" <@.discussions.microsoft.com> wrote in message
> news:E515D9C3-B0A6-4E37-BCB6-9128EC4D9BCC@.microsoft.com...
> >I change the clustered index (the other one automatically created on the
> > partition scheme), but the performance are still the same.
> >
> > When running the 3 query together, I am getting 43% query cost (in the
> > execution plan panel) for the first query, 42% for the second and 15% for
> > the
> > last.
> >
> > The main question is why do I get difference between the 2nd and 3rd
> > query.
> >
> The second and third both hit a single partition for me. Look at the actual
> execution plan. Hover the index scan and you should see a PartitionID
> or save the plan and view it with a text editor. You should see the
> PartitionId uder the IndexScan, something like
> <IndexScan Ordered="true" ScanDirection="FORWARD"
> ForcedIndex="false" NoExpandHint="false">
> <DefinedValues> . . . </DefinedValues>
> <Object Database="[Demo]" Schema="[dbo]"
> Table="[myPartTable]" Index="[table1_P_PK]" />
> <SeekPredicates>
> <SeekPredicate>
> <Prefix ScanType="EQ">
> <RangeColumns>
> <ColumnReference Database="[Demo]" Schema="[dbo]"
> Table="[myPartTable]" Column="instance_id" />
> </RangeColumns>
> <RangeExpressions>
> <ScalarOperator ScalarString="(3)">
> <Const ConstValue="(3)" />
> </ScalarOperator>
> </RangeExpressions>
> </Prefix>
> </SeekPredicate>
> </SeekPredicates>
> <PartitionId>
> <ColumnReference Column="ConstExpr1007">
> <ScalarOperator ScalarString="(3)">
> <Const ConstValue="(3)" />
> </ScalarOperator>
> </ColumnReference>
> </PartitionId>
> </IndexScan>
> David
>
>|||"?" <@.discussions.microsoft.com> wrote in message
news:987784C1-C165-40A9-B76E-B7C1F4E99F07@.microsoft.com...
>I can see that it scan the partition, but i'm not getting the same
>performance.
>
What performance difference are you expecting? And how are you measuring it?
Partitioning is fundamentally a performance feature, but it's designed for
tables a lot of rows. Partitioning makes it practical to have have tables
with billions of rows. Small databases you won't always see a large
difference with partitioning.
David|||When I run the queries, I include Actual Execution plan, and then I examine
the performance in the Execution Plan tab.
Iâ'm expecting to get better performance when accessing the partition table
as oppose to the regular table.
I run the same test on tables with 1,000,000 records per instance, and the
results are the same â' the performance are the usually the same.
The biggest question is why I get difference between these 2 queries (when
accessing the partition table):
(Query 1) select * from myPartTable where instance_id = 3
(Query 2) select * from myPartTable where instance_id = 3 and
$partition.IntRangePF(3) = 3
When running these queries on partition table with 7251452 records, I get
the following performance:
Query 1: Query Cost = 76%, Subtree Cost = 4.26687
Plan:
Clustered Index Seek, Estimate I/O Cost = 3.46905
Query 2: Query Cost = 24%, Subtree Cost = 1.34275
Plan:
Clustered Index Seek, Estimate I/O Cost = 1.04239; Subtree Cost = 1.28184
Filter, Subtree Cost = 1.34275
*** In both cases the plan show access to partition 3.
Why adding the statement â'$partition.IntRangePF(3) = 3â' had such an effect
on the performance, while this statement is not suppose to effect filter in
the index. I will get the same performance if I will add
â'$partition.IntRangePF(1) = 1â' on the same query.
Another improvement I get with the following query:
(Query 3)select * from myPartTable where instance_id = 3 and
$partition.IntRangePF(instance_id) = $partition.IntRangePF(3)
Running this query with the other show the following cost:
Query 1: Query Cost = 70%, Subtree Cost = 4.26687
Plan:
Clustered Index Seek, Estimate I/O Cost = 3.46905
Query 2: Query Cost = 22%, Subtree Cost = 1.34275
Plan:
Clustered Index Seek, Estimate I/O Cost = 1.04239; Subtree Cost = 1.28184
Filter (Cost 5%), Subtree Cost = 1.34275
Query 3: Query Cost = 8%, Subtree Cost = 0.501189
Plan:
Clustered Index Seek, Estimate I/O Cost = 0.317536; Subtree Cost = 0.390208
Filter (Cost 22%), Subtree Cost = 0.500957
"David Browne" wrote:
> "?" <@.discussions.microsoft.com> wrote in message
> news:987784C1-C165-40A9-B76E-B7C1F4E99F07@.microsoft.com...
> >I can see that it scan the partition, but i'm not getting the same
> >performance.
> >
>
> What performance difference are you expecting? And how are you measuring it?
> Partitioning is fundamentally a performance feature, but it's designed for
> tables a lot of rows. Partitioning makes it practical to have have tables
> with billions of rows. Small databases you won't always see a large
> difference with partitioning.
> David
>
>|||"?" <@.discussions.microsoft.com> wrote in message
news:34C036B2-A6C7-48C4-BF41-C9BA15C94E82@.microsoft.com...
> When I run the queries, I include Actual Execution plan, and then I
> examine
> the performance in the Execution Plan tab.
> I'm expecting to get better performance when accessing the partition table
> as oppose to the regular table.
> I run the same test on tables with 1,000,000 records per instance, and the
> results are the same - the performance are the usually the same.
> The biggest question is why I get difference between these 2 queries (when
> accessing the partition table):
> (Query 1) select * from myPartTable where instance_id = 3
> (Query 2) select * from myPartTable where instance_id = 3 and
> $partition.IntRangePF(3) = 3
> When running these queries on partition table with 7251452 records, I get
> the following performance:
> Query 1: Query Cost = 76%, Subtree Cost = 4.26687
> Plan:
> Clustered Index Seek, Estimate I/O Cost = 3.46905
> Query 2: Query Cost = 24%, Subtree Cost = 1.34275
> Plan:
> Clustered Index Seek, Estimate I/O Cost = 1.04239; Subtree Cost => 1.28184
> Filter, Subtree Cost = 1.34275
>
Ok I see what you mean.
The estimated cost of the second query is incorrect, leading to an apparent
performance difference. The difference is not real. It is only in the
estimated cost of the queries. In particular the "Estimated Number of Rows"
for the Clustered Index Seek operator in the second query is not very
accurate. The actual cost of the queries is almost identical.
Run this to see that the actual cost of the queries is similar:
set statistics IO on
set statistics time on
select * from myPartTable where instance_id = 3
select * from myPartTable where instance_id = 3 and $partition.IntRangePF(3)
= 3
David

Performance problem with Partition tables in SQL Server 2005

I create partition table and regular table, both with the exact structure an
d
data.
The aim was to check whether there is an improvement in the performance or
not.
I execute the following queries (myPartTable is the partition table):
select * from myTable where instance_id = 3
select * from myPartTable where instance_id = 3
select * from myPartTable where instance_id = 3 and $partition.IntRangePF(3)
= 3
The result:
- Significant difference in the performance of the second and the third quer
y.
- No major difference between the first and second query. In most of the
cases, the first even show better performance.
My question:
1) Why there is a significant improvement between the second and the third
query?
Isn’t SQL Server supposed to recognize and access the partition without
specify its number?
2) Shouldn't I get some improvement in the performance when using partition
table?
---
Attached a sample code to create the test.
Create Partition Function:
CREATE PARTITION FUNCTION IntRangePF(int)
AS RANGE left FOR VALUES (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
Create Partition Scheme:
CREATE PARTITION SCHEME IntRangePScheme AS PARTITION IntRangePF
ALL TO ([PRIMARY])
Create the tables:
CREATE TABLE myPartTable (
id int NOT NULL,
instance_id int NOT NULL,
description varchar(32) NOT NULL,
temp1 float NULL,
temp2 float NULL,
temp3 float NULL,
constraint table1_P_PK primary key nonclustered(instance_id, id) on
[PRIMARY] ,
) ON IntRangePScheme (instance_id)
CREATE CLUSTERED INDEX index_myPartTable ON myPartTable(
[instance_id] ASC,
[id] ASC )
CREATE TABLE myTable (
id int NOT NULL,
instance_id int NOT NULL,
description varchar(32) NOT NULL,
temp1 float NULL,
temp2 float NULL,
temp3 float NULL,
constraint myTable_PK primary key nonclustered(instance_id, id) on
[PRIMARY] ,
)
CREATE CLUSTERED INDEX index_myTable ON myTable(
[instance_id] ASC,
[id] ASC )
Insert Data
DECLARE @.i int
SET @.i = 1;
WHILE @.i < 100
BEGIN
SET @.i = @.i + 1;
insert into myTable values (@.i, 1, 'ddddd', 1, 1, 1);
insert into myTable values (@.i, 2, 'ddddd', 1, 1, 1);
insert into myTable values (@.i, 3, 'ddddd', 1, 1, 1);
insert into myTable values (@.i, 4, 'ddddd', 1, 1, 1);
insert into myTable values (@.i, 5, 'ddddd', 1, 1, 1);
insert into myTable values (@.i, 6, 'ddddd', 1, 1, 1);
insert into myTable values (@.i, 7, 'ddddd', 1, 1, 1);
insert into myTable values (@.i, 8, 'ddddd', 1, 1, 1);
insert into myTable values (@.i, 9, 'ddddd', 1, 1, 1);
insert into myTable values (@.i, 10, 'ddddd', 1, 1, 1);
END
insert into myPartTable select * from myTable"?" <@.discussions.microsoft.com> wrote in message
news:B2FD061B-378D-4E4D-BC2A-04D5E3B9371A@.microsoft.com...
>I create partition table and regular table, both with the exact structure
>and
> data.
> The aim was to check whether there is an improvement in the performance or
> not.
> I execute the following queries (myPartTable is the partition table):
> select * from myTable where instance_id = 3
> select * from myPartTable where instance_id = 3
> select * from myPartTable where instance_id = 3 and
> $partition.IntRangePF(3)
> = 3
> The result:
> - Significant difference in the performance of the second and the third
> query.
> - No major difference between the first and second query. In most of the
> cases, the first even show better performance.
> My question:
> 1) Why there is a significant improvement between the second and the third
> query?
> Isn't SQL Server supposed to recognize and access the partition without
> specify its number?
> 2) Shouldn't I get some improvement in the performance when using
> partition
> table?
> ---
> Attached a sample code to create the test.
> Create Partition Function:
> CREATE PARTITION FUNCTION IntRangePF(int)
> AS RANGE left FOR VALUES (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
> Create Partition Scheme:
> CREATE PARTITION SCHEME IntRangePScheme AS PARTITION IntRangePF
> ALL TO ([PRIMARY])
> Create the tables:
> CREATE TABLE myPartTable (
> id int NOT NULL,
> instance_id int NOT NULL,
> description varchar(32) NOT NULL,
> temp1 float NULL,
> temp2 float NULL,
> temp3 float NULL,
> constraint table1_P_PK primary key nonclustered(instance_id, id) on
> [PRIMARY] ,
> ) ON IntRangePScheme (instance_id)
> CREATE CLUSTERED INDEX index_myPartTable ON myPartTable(
> [instance_id] ASC,
> [id] ASC )
>
When you create a clustered index on a table you move the table into the
filegroup or partition scheme for the clustered index. So when you created
a clustered index index_myPartTable on the primary filegroup instead of on
your partition scheme you "unpartitioned" the table.
Change like this:
CREATE TABLE myPartTable (
id int NOT NULL,
instance_id int NOT NULL,
description varchar(32) NOT NULL,
temp1 float NULL,
temp2 float NULL,
temp3 float NULL,
constraint table1_P_PK primary key nonclustered(instance_id, id) on
[PRIMARY] ,
) ON IntRangePScheme (instance_id)
CREATE CLUSTERED INDEX index_myPartTable ON myPartTable(
[instance_id] ASC,
[id] ASC ) ON IntRangePScheme (instance_id)
Or, all in one statement
CREATE TABLE myPartTable (
id int NOT NULL,
instance_id int NOT NULL,
description varchar(32) NOT NULL,
temp1 float NULL,
temp2 float NULL,
temp3 float NULL,
constraint table1_P_PK
primary key clustered(instance_id, id)
ON IntRangePScheme (instance_id) ,
)
Then you will see that the partition elimination reduces the IO's required
for your queries.
David|||I change the clustered index (the other one automatically created on the
partition scheme), but the performance are still the same.
When running the 3 query together, I am getting 43% query cost (in the
execution plan panel) for the first query, 42% for the second and 15% for th
e
last.
The main question is why do I get difference between the 2nd and 3rd query.
"David Browne" wrote:

> "?" <@.discussions.microsoft.com> wrote in message
> news:B2FD061B-378D-4E4D-BC2A-04D5E3B9371A@.microsoft.com...
> When you create a clustered index on a table you move the table into the
> filegroup or partition scheme for the clustered index. So when you create
d
> a clustered index index_myPartTable on the primary filegroup instead of on
> your partition scheme you "unpartitioned" the table.
>
> Change like this:
> CREATE TABLE myPartTable (
> id int NOT NULL,
> instance_id int NOT NULL,
> description varchar(32) NOT NULL,
> temp1 float NULL,
> temp2 float NULL,
> temp3 float NULL,
> constraint table1_P_PK primary key nonclustered(instance_id, id) on
> [PRIMARY] ,
> ) ON IntRangePScheme (instance_id)
> CREATE CLUSTERED INDEX index_myPartTable ON myPartTable(
> [instance_id] ASC,
> [id] ASC ) ON IntRangePScheme (instance_id)
>
> Or, all in one statement
> CREATE TABLE myPartTable (
> id int NOT NULL,
> instance_id int NOT NULL,
> description varchar(32) NOT NULL,
> temp1 float NULL,
> temp2 float NULL,
> temp3 float NULL,
> constraint table1_P_PK
> primary key clustered(instance_id, id)
> ON IntRangePScheme (instance_id) ,
> )
>
> Then you will see that the partition elimination reduces the IO's required
> for your queries.
> David
>
>|||does you % are real execution or estimated plan?
when you explicitly reference a partition SQL know which partition to read
directly.
when you go through the entire table, SQL doesn't know which partition is
really used until you run it.
but you are right, sometimes I found a query like your slow. SQL scan all
the partitions instead-of scanning only the required partitions.
This completly slow down the query!
"?" <@.discussions.microsoft.com> wrote in message
news:E515D9C3-B0A6-4E37-BCB6-9128EC4D9BCC@.microsoft.com...[vbcol=seagreen]
>I change the clustered index (the other one automatically created on the
> partition scheme), but the performance are still the same.
> When running the 3 query together, I am getting 43% query cost (in the
> execution plan panel) for the first query, 42% for the second and 15% for
> the
> last.
> The main question is why do I get difference between the 2nd and 3rd
> query.
> "David Browne" wrote:
>|||"?" <@.discussions.microsoft.com> wrote in message
news:E515D9C3-B0A6-4E37-BCB6-9128EC4D9BCC@.microsoft.com...
>I change the clustered index (the other one automatically created on the
> partition scheme), but the performance are still the same.
> When running the 3 query together, I am getting 43% query cost (in the
> execution plan panel) for the first query, 42% for the second and 15% for
> the
> last.
> The main question is why do I get difference between the 2nd and 3rd
> query.
>
The second and third both hit a single partition for me. Look at the actual
execution plan. Hover the index scan and you should see a PartitionID
or save the plan and view it with a text editor. You should see the
PartitionId uder the IndexScan, something like
<IndexScan Ordered="true" ScanDirection="FORWARD"
ForcedIndex="false" NoExpandHint="false">
<DefinedValues> . . . </DefinedValues>
<Object Database="[Demo]" Schema="[dbo]"
Table="[myPartTable]" Index="[table1_P_PK]" />
<SeekPredicates>
<SeekPredicate>
<Prefix ScanType="EQ">
<RangeColumns>
<ColumnReference Database="[Demo]" Schema="[dbo]"
Table="[myPartTable]" Column="instance_id" />
</RangeColumns>
<RangeExpressions>
<ScalarOperator ScalarString="(3)">
<Const ConstValue="(3)" />
</ScalarOperator>
</RangeExpressions>
</Prefix>
</SeekPredicate>
</SeekPredicates>
<PartitionId>
<ColumnReference Column="ConstExpr1007">
<ScalarOperator ScalarString="(3)">
<Const ConstValue="(3)" />
</ScalarOperator>
</ColumnReference>
</PartitionId>
</IndexScan>
David|||I can see that it scan the partition, but i'm not getting the same performan
ce.
"David Browne" wrote:

> "?" <@.discussions.microsoft.com> wrote in message
> news:E515D9C3-B0A6-4E37-BCB6-9128EC4D9BCC@.microsoft.com...
> The second and third both hit a single partition for me. Look at the actu
al
> execution plan. Hover the index scan and you should see a PartitionID
> or save the plan and view it with a text editor. You should see the
> PartitionId uder the IndexScan, something like
> <IndexScan Ordered="true" ScanDirection="FORWARD"
> ForcedIndex="false" NoExpandHint="false">
> <DefinedValues> . . . </DefinedValues>
> <Object Database="[Demo]" Schema="[dbo]"
> Table="[myPartTable]" Index="[table1_P_PK]" />
> <SeekPredicates>
> <SeekPredicate>
> <Prefix ScanType="EQ">
> <RangeColumns>
> <ColumnReference Database="[Demo]" Schema="
1;dbo]"
> Table="[myPartTable]" Column="instance_id" />
> </RangeColumns>
> <RangeExpressions>
> <ScalarOperator ScalarString="(3)">
> <Const ConstValue="(3)" />
> </ScalarOperator>
> </RangeExpressions>
> </Prefix>
> </SeekPredicate>
> </SeekPredicates>
> <PartitionId>
> <ColumnReference Column="ConstExpr1007">
> <ScalarOperator ScalarString="(3)">
> <Const ConstValue="(3)" />
> </ScalarOperator>
> </ColumnReference>
> </PartitionId>
> </IndexScan>
> David
>
>|||"?" <@.discussions.microsoft.com> wrote in message
news:987784C1-C165-40A9-B76E-B7C1F4E99F07@.microsoft.com...
>I can see that it scan the partition, but i'm not getting the same
>performance.
>
What performance difference are you expecting? And how are you measuring it?
Partitioning is fundamentally a performance feature, but it's designed for
tables a lot of rows. Partitioning makes it practical to have have tables
with billions of rows. Small databases you won't always see a large
difference with partitioning.
David|||When I run the queries, I include Actual Execution plan, and then I examine
the performance in the Execution Plan tab.
I’m expecting to get better performance when accessing the partition table
as oppose to the regular table.
I run the same test on tables with 1,000,000 records per instance, and the
results are the same – the performance are the usually the same.
The biggest question is why I get difference between these 2 queries (when
accessing the partition table):
(Query 1) select * from myPartTable where instance_id = 3
(Query 2) select * from myPartTable where instance_id = 3 and
$partition.IntRangePF(3) = 3
When running these queries on partition table with 7251452 records, I get
the following performance:
Query 1: Query Cost = 76%, Subtree Cost = 4.26687
Plan:
Clustered Index Seek, Estimate I/O Cost = 3.46905
Query 2: Query Cost = 24%, Subtree Cost = 1.34275
Plan:
Clustered Index Seek, Estimate I/O Cost = 1.04239; Subtree Cost = 1.28184
Filter, Subtree Cost = 1.34275
*** In both cases the plan show access to partition 3.
Why adding the statement “$partition.IntRangePF(3) = 3” had such an effe
ct
on the performance, while this statement is not suppose to effect filter in
the index. I will get the same performance if I will add
“$partition.IntRangePF(1) = 1” on the same query.
Another improvement I get with the following query:
(Query 3)select * from myPartTable where instance_id = 3 and
$partition.IntRangePF(instance_id) = $partition.IntRangePF(3)
Running this query with the other show the following cost:
Query 1: Query Cost = 70%, Subtree Cost = 4.26687
Plan:
Clustered Index Seek, Estimate I/O Cost = 3.46905
Query 2: Query Cost = 22%, Subtree Cost = 1.34275
Plan:
Clustered Index Seek, Estimate I/O Cost = 1.04239; Subtree Cost = 1.28184
Filter (Cost 5%), Subtree Cost = 1.34275
Query 3: Query Cost = 8%, Subtree Cost = 0.501189
Plan:
Clustered Index Seek, Estimate I/O Cost = 0.317536; Subtree Cost = 0.39020
8
Filter (Cost 22%), Subtree Cost = 0.500957
"David Browne" wrote:

> "?" <@.discussions.microsoft.com> wrote in message
> news:987784C1-C165-40A9-B76E-B7C1F4E99F07@.microsoft.com...
>
> What performance difference are you expecting? And how are you measuring i
t?
> Partitioning is fundamentally a performance feature, but it's designed for
> tables a lot of rows. Partitioning makes it practical to have have tables
> with billions of rows. Small databases you won't always see a large
> difference with partitioning.
> David
>
>|||"?" <@.discussions.microsoft.com> wrote in message
news:34C036B2-A6C7-48C4-BF41-C9BA15C94E82@.microsoft.com...
> When I run the queries, I include Actual Execution plan, and then I
> examine
> the performance in the Execution Plan tab.
> I'm expecting to get better performance when accessing the partition table
> as oppose to the regular table.
> I run the same test on tables with 1,000,000 records per instance, and the
> results are the same - the performance are the usually the same.
> The biggest question is why I get difference between these 2 queries (when
> accessing the partition table):
> (Query 1) select * from myPartTable where instance_id = 3
> (Query 2) select * from myPartTable where instance_id = 3 and
> $partition.IntRangePF(3) = 3
> When running these queries on partition table with 7251452 records, I get
> the following performance:
> Query 1: Query Cost = 76%, Subtree Cost = 4.26687
> Plan:
> Clustered Index Seek, Estimate I/O Cost = 3.46905
> Query 2: Query Cost = 24%, Subtree Cost = 1.34275
> Plan:
> Clustered Index Seek, Estimate I/O Cost = 1.04239; Subtree Cost =
> 1.28184
> Filter, Subtree Cost = 1.34275
>
Ok I see what you mean.
The estimated cost of the second query is incorrect, leading to an apparent
performance difference. The difference is not real. It is only in the
estimated cost of the queries. In particular the "Estimated Number of Rows"
for the Clustered Index Seek operator in the second query is not very
accurate. The actual cost of the queries is almost identical.
Run this to see that the actual cost of the queries is similar:
set statistics IO on
set statistics time on
select * from myPartTable where instance_id = 3
select * from myPartTable where instance_id = 3 and $partition.IntRangePF(3)
= 3
David|||test
*** Sent via Developersdex http://www.codecomments.com ***