Showing posts with label product. Show all posts
Showing posts with label product. Show all posts

Monday, March 26, 2012

Performance question

Hello,
Our main product catalog is approx. 3.1 million rows, with a full-text index
on 3 (varchar) columns. For the past year as our catalog has grown, we have
experienced continuing performance degredation, to the point that we are
looking at biting the bullet and migrating this application to Oracle Text,
which from our initial testing is several orders of magnitude faster (we're
going to stick with SQL Server for everything else). Obviously we'd like to
avoid that due to cost issues. However, this application is very important
and if we've reached the limit of SQL Server then so be it. Our problem is
that when a customer searches our catalog, we sort the search results based
on their sales rank which does not allow us to use the "top_n" parameter of
containstable or freetexttable. For example, say a customer searches our
catalog for a relatively common word that results in around 72,000 results
(takes approx 11 sec on subsequent runs...over 1 min on first run, which is
the most important statistic). If we were to use (say) n=2000 for the top_n
parameter, our best selling products would not be returned from the FTS
engine. We need to return all the results so we can sort them by sales rank
and display them to customers.
The server is running Windows 2003, with 6GB RAM, 16 x 15,000K RPM SCSI
drive in a RAID 10, in a dual opteron configuration the with transaction log
on a seperate RAID volume. SQL is SQL Server 2000, SP4 (I've included the
output of @.@.version below). Perfmon shows that the server isn't sweating at
all during these queries from a disk, memory, or CPU standpoint, so that
leaves SQL as the performance bottleneck. Our most recent population was
around a month ago, so the catalog is relatively up to date.
We've fooled around with increasing the memory available to FTS, but that
did not seem to make a difference. Perhaps we did not do it right since
mssearch.exe is still only showing about 49,000K in memory--but since we
are using AWE this could be distorted.
We're going to make one last gasp at improving the performance here before
dumping SQL Server and moving to Oracle. Help!
John
@.@.Version:
Microsoft SQL Server 2000 - 8.00.760 (Intel X86) Dec 17 2002 14:22:05
Copyright (c) 1988-2003 Microsoft Corporation Enterprise Edition on Windows
NT 5.2 (Build 3790: )
Sample query:
SELECT distinct <field list>
FROM containstable(<ft-table>, <ft-field>,'"<common term>"') as ct
JOIN <ft-table> t with (nolock) on t.<PK>=ct.[key] //tables are 100%
readonly except during monthly updates, hence the nolocks
join salesRank sr with (nolock) on sr.<PK>=ct.[key]
order by sr.SalesRank
The approach you take for problems like this is to partition your tables,
perhaps in your case by sales rank. For instance you might want to break
your tables into 10 sub tables. One from 1-10, another from 11-20, etc.
Then limit each results set to 100 and union the results. This might end up
more expensive than what you are currently experiencing.
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
"John" <john36356@.community.nospam> wrote in message
news:ebYiMcaWFHA.2572@.TK2MSFTNGP14.phx.gbl...
> Hello,
> Our main product catalog is approx. 3.1 million rows, with a full-text
index
> on 3 (varchar) columns. For the past year as our catalog has grown, we
have
> experienced continuing performance degredation, to the point that we are
> looking at biting the bullet and migrating this application to Oracle
Text,
> which from our initial testing is several orders of magnitude faster
(we're
> going to stick with SQL Server for everything else). Obviously we'd like
to
> avoid that due to cost issues. However, this application is very
important
> and if we've reached the limit of SQL Server then so be it. Our problem
is
> that when a customer searches our catalog, we sort the search results
based
> on their sales rank which does not allow us to use the "top_n" parameter
of
> containstable or freetexttable. For example, say a customer searches our
> catalog for a relatively common word that results in around 72,000 results
> (takes approx 11 sec on subsequent runs...over 1 min on first run, which
is
> the most important statistic). If we were to use (say) n=2000 for the
top_n
> parameter, our best selling products would not be returned from the FTS
> engine. We need to return all the results so we can sort them by sales
rank
> and display them to customers.
> The server is running Windows 2003, with 6GB RAM, 16 x 15,000K RPM SCSI
> drive in a RAID 10, in a dual opteron configuration the with transaction
log
> on a seperate RAID volume. SQL is SQL Server 2000, SP4 (I've included the
> output of @.@.version below). Perfmon shows that the server isn't sweating
at
> all during these queries from a disk, memory, or CPU standpoint, so that
> leaves SQL as the performance bottleneck. Our most recent population was
> around a month ago, so the catalog is relatively up to date.
> We've fooled around with increasing the memory available to FTS, but that
> did not seem to make a difference. Perhaps we did not do it right since
> mssearch.exe is still only showing about 49,000K in memory--but since we
> are using AWE this could be distorted.
> We're going to make one last gasp at improving the performance here before
> dumping SQL Server and moving to Oracle. Help!
> John
> @.@.Version:
> Microsoft SQL Server 2000 - 8.00.760 (Intel X86) Dec 17 2002 14:22:05
> Copyright (c) 1988-2003 Microsoft Corporation Enterprise Edition on
Windows
> NT 5.2 (Build 3790: )
> Sample query:
> SELECT distinct <field list>
> FROM containstable(<ft-table>, <ft-field>,'"<common term>"') as ct
> JOIN <ft-table> t with (nolock) on t.<PK>=ct.[key] //tables are 100%
> readonly except during monthly updates, hence the nolocks
> join salesRank sr with (nolock) on sr.<PK>=ct.[key]
> order by sr.SalesRank
>

Friday, March 23, 2012

Performance question

We have a server product which talks to SQL Server. One of the tables
we maintain is a session table. So a session row is added when login
occurs, it is looked up on subsequent requests and at certain points
may be updated to reflect recent activity (meaning the session's
lifetime is extended).
So what we essentially have is a fairly dynamic table of data whose
value is limited to however long the server is running. If the server
or DBMS machine is restarted, there is no particular need to retain
the contents of this table.
While this is currently a standard database table, I wondered whether
might be a more performant way to hold this information. For
instance, I note in Kalen Delaney's "Inside Microsoft SQL Server 2000"
that because of reduced logging requirements, "...data modification
operations on tables in tempdb can be up to four times faster than the
same operations in other databases." (p.310)
However reading more on the web I get the impression that some feel
tempdb operations can be less performant - the fear seems to relate to
contention.
Can anyone offer suggestions, particularly with regard to the scenario
I outlined above?
Thanks in advance.
Ed BarrettI do wonder though what happens if a client connection bombs out for some
reason. For example, if the client becomes disconnected in a non-proper
manner, e.g., network wire disconnected, blue screen. What happens if that
occurrs while you're updating this temp table? Will you get bad data? I
assume you already know to keep your transactions short and to update and
access the tables always in the same order whereby you can reduce the
probablility of contention.
Perhaps if you're looking for performance you can look to hardware. Since
high IOs are important, have you considered moving your special database (or
TEMPDB if you so choose) to an array of numerous spindles? If it's
available to you, go with a vacant BUS, private RAID CARD, and new drives
(RAID 10 recommended) for the best IO you can achieve. Testing of course is
recommended, but I am sure you know that.
Moving TEMPDB to its own RAID array is always recommended anyway.
hth
Eric
"Ed Barrett" <ebarrett@.metastorm.com> wrote in message
news:4026318.0310200720.21613799@.posting.google.com...
> We have a server product which talks to SQL Server. One of the tables
> we maintain is a session table. So a session row is added when login
> occurs, it is looked up on subsequent requests and at certain points
> may be updated to reflect recent activity (meaning the session's
> lifetime is extended).
> So what we essentially have is a fairly dynamic table of data whose
> value is limited to however long the server is running. If the server
> or DBMS machine is restarted, there is no particular need to retain
> the contents of this table.
> While this is currently a standard database table, I wondered whether
> might be a more performant way to hold this information. For
> instance, I note in Kalen Delaney's "Inside Microsoft SQL Server 2000"
> that because of reduced logging requirements, "...data modification
> operations on tables in tempdb can be up to four times faster than the
> same operations in other databases." (p.310)
> However reading more on the web I get the impression that some feel
> tempdb operations can be less performant - the fear seems to relate to
> contention.
> Can anyone offer suggestions, particularly with regard to the scenario
> I outlined above?
> Thanks in advance.
> Ed Barrett