Monday, March 12, 2012

Performance Optimization

I have one problem while optimizing the sql query .For a few rows the
query works perfectly ,but as the number of rows increases it works but
gives wrong result.
I am using nested queries W/O aliasing . So what I assume is that Query
Optimizer is trying to flatten the query (converting it into joins) and
in the process ,because of no Alias Name takes a long time .
Something like this :--
Select * From EmpMst Where deptid in
( Select deptid from deptmst where deptname='acc')
--
into This --
--
Select * From EmpMst EM
Inner Join DeptMst DM On Dm.deptid = EM.deptid and DM.deptname='acc'
What might be the reasons for performance debacle? Following is what I
am using
1. Views (using *) --
2. Indexing (Clustered)
3. History Data (2 Billion Rows)
4. No Indexed views
5. Scalar Functions ( a bit For format Checking like All
alphabets,digits etc)
6. No Cursor
7. Updating a permanent temp table for intermediate results.
With Warm regards
Jatinder SinghIt is not clear to me what your question is? Is your question about
performance, or about getting a queries that always returns correct
results?
Which query are you currently using? The first or the second?
If you replace "*" with "EM.*" in the second query, then it is still not
(necessarily) equivalent to the first. If table DeptMst contains
duplicate deptid values, then the second query will return more rows
than the first. This effect can be cancelled out by adding the DISTINCT
keyword (and selection only from the EmpMst table).
Also, if column DeptMst.deptid contains NULLs, then the queries will
return different results.
If you are using SQL2K SP4, and the queries does not perform well, then
maybe you don't have proper indexes in place, or maybe your statistics
are not up to date. You did not post any DDL, so you indexing strategy
is unknown to us.
Gert-Jan
jsfromynr wrote:
> I have one problem while optimizing the sql query .For a few rows the
> query works perfectly ,but as the number of rows increases it works but
> gives wrong result.
> I am using nested queries W/O aliasing . So what I assume is that Query
> Optimizer is trying to flatten the query (converting it into joins) and
> in the process ,because of no Alias Name takes a long time .
> Something like this :--
> Select * From EmpMst Where deptid in
> ( Select deptid from deptmst where deptname='acc')
> --
> into This --
> --
> Select * From EmpMst EM
> Inner Join DeptMst DM On Dm.deptid = EM.deptid and DM.deptname='acc'
> What might be the reasons for performance debacle? Following is what I
> am using
> 1. Views (using *) --
> 2. Indexing (Clustered)
> 3. History Data (2 Billion Rows)
> 4. No Indexed views
> 5. Scalar Functions ( a bit For format Checking like All
> alphabets,digits etc)
> 6. No Cursor
> 7. Updating a permanent temp table for intermediate results.
> With Warm regards
> Jatinder Singh|||Hi There,
Sorry for not making myself clear . I know you can solve / suggest
some new points.
I am using First Query.
The example I had taken is where empid is PK in EmpMst and deptid is PK
in DeptMst so they cannot be null.
(if column DeptMst.deptid contains NULLs, then the queries will return
different results.
And If table DeptMst contains duplicate deptid values, then the second
query will return more rows than the first)
Select * From EmpMst Where deptid in
( Select deptid from deptmst where deptname='acc')
--
into This --
--
Select * From EmpMst EM
Inner Join DeptMst DM On Dm.deptid = EM.deptid and DM.deptname='acc'
That is the way I thought (I can be wrong) Query Optimizer will Convert
the queries with IN (Subquery...)
What might be the reasons for performance debacle (in Case I am using
Subqueries/Corelated Subqueries)?I assume here that Subqueries are
executed using tempdb.
Following is what I am using
1. Views (using *) --
2. Indexing (Clustered)
3. History Data (2 Billion Rows)
4. No Indexed views
5. Scalar Functions ( a bit For format Checking like All
alphabets,digits etc)
6. No Cursor
7. Updating a permanent temp table for intermediate results.
With Warm regards
Jatinder Singh|||You can look at the execution plan to see if the optimizer transforms
the query the way you describe. Press CTRL+G in Query Analyser and then
run the query to see the execution plan.
How the optimizer will execute your query depends on the relative table
sizes, the selectivity of the column you join on, and the 'width' of the
available indexes.
In this case the optimizer can choose between three strategies:
1a) use EmpMst as outer table and perform lookups for each relevant
deptid in DeptMst
1b) use DeptMst as outer table and perform lookups for each relevant
deptid in EmpMst (provided that you have an index on EmpMst(deptid))
(unlikely in your case)
2) scan both tables and join the result with the hashing algorithm
3) use a merge join to simulateneously scan both tables and join them
(unlikely in your case)
Under normal circumstances it is not necessary to manually rewrite the
query, because the optimizer will figure out the best plan.
However, if DeptMst and/or EmpMst are views, then no one can tell
anything about it, because you haven't posted any view definition, or
any DDL, indexes, etc. This is important information for performance
related questions. All 7 point you mention can affect performance, but
without addition information...
Maybe you can find some useful information on
http://www.sql-server-performance.com/
If not, then please provide more information (see
http://www.aspfaq.com/etiquette.asp?id=5006)
Gert-Jan
jsfromynr wrote:
> Hi There,
> Sorry for not making myself clear . I know you can solve / suggest
> some new points.
> I am using First Query.
> The example I had taken is where empid is PK in EmpMst and deptid is PK
> in DeptMst so they cannot be null.
> (if column DeptMst.deptid contains NULLs, then the queries will return
> different results.
> And If table DeptMst contains duplicate deptid values, then the second
> query will return more rows than the first)
> Select * From EmpMst Where deptid in
> ( Select deptid from deptmst where deptname='acc')
> --
> into This --
> --
> Select * From EmpMst EM
> Inner Join DeptMst DM On Dm.deptid = EM.deptid and DM.deptname='acc'
> That is the way I thought (I can be wrong) Query Optimizer will Convert
> the queries with IN (Subquery...)
> What might be the reasons for performance debacle (in Case I am using
> Subqueries/Corelated Subqueries)?I assume here that Subqueries are
> executed using tempdb.
> Following is what I am using
> 1. Views (using *) --
> 2. Indexing (Clustered)
> 3. History Data (2 Billion Rows)
> 4. No Indexed views
> 5. Scalar Functions ( a bit For format Checking like All
> alphabets,digits etc)
> 6. No Cursor
> 7. Updating a permanent temp table for intermediate results.
> With Warm regards
> Jatinder Singh|||Hi There,
Thanks for your input. The database design you are asking for is too
big to post . Your explaination is excellent. I should say with minimal
information ,you gave me geniune help. I hope someday , I will be
helping others as you do.
Yeah www.sql-server-performance.com is the site I regulary visit , not
to forget dbazine.com
I wish , I could learn how to flatten the correlated subqueries ; The
cursor based mindset push me there . Do you know some reference
articles / sites which can provide help on this topic ( Performance ).
Thanks Again.
With Warm regards
Jatinder Singh

No comments:

Post a Comment