Monday, March 26, 2012

Performance Question

Hi All
I have a database of students which contains a table StoredCvs. This
contains all of their CVs (Resume's) and has a full text index stored in a
"StoredFilesCatalog". It is unlikely that this table will grow to more than
1 million rows although it is possible.
I now want to start storing files (for searching) for other types of records
in our system i.e. company files, project files, other candidate files etc.
etc. I am now faced with a few choices and wandered which would be the best
one. I have decided not to create another catalog at this stage as I feel
this could be done later if performance is really bad. However, I was
wandering what the difference would be between: a) Creating another table
(StoredFiles), which uses the "StoredFilesCatalog", but would have a
seperate fulltextindex. b) Add a FileTypeID column to my StoredCVs table
(would rename this to StoredFiles) and store all files in the one table
(with one fulltextindex).
The main function of the system is to search Candidate Cvs (Resume's) so I
was wandering if there is any performance advantage of having 2 tables or if
the searches would be the same as they both use the same
"StoredFilesCatalog" anyway.
Hope this makes sense and really appreciate any advice.
Cheers
Joe
You will get better performance with two catalogs - one for each table. You
could add a separate column for FileType but this could be problematic if
you are using top_n_by_Rank
For instance suppose you do a query like this
Select * from StoredCvs join containstable(StoredCvs,*,'microsoft',200) as T
on T.[Key]=StoredCvs.PK
where filetype='resume'
order by rank desc
If the first 200 hits returned were all not of the fileType resume, you
would get no hits, even though there could be matches that might not occur
in the first 200 hits.
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
"Joe Zammit" <zammit_joe@.hotmail.com> wrote in message
news:u619ySFBFHA.824@.TK2MSFTNGP11.phx.gbl...
> Hi All
> I have a database of students which contains a table StoredCvs. This
> contains all of their CVs (Resume's) and has a full text index stored in a
> "StoredFilesCatalog". It is unlikely that this table will grow to more
than
> 1 million rows although it is possible.
> I now want to start storing files (for searching) for other types of
records
> in our system i.e. company files, project files, other candidate files
etc.
> etc. I am now faced with a few choices and wandered which would be the
best
> one. I have decided not to create another catalog at this stage as I feel
> this could be done later if performance is really bad. However, I was
> wandering what the difference would be between: a) Creating another table
> (StoredFiles), which uses the "StoredFilesCatalog", but would have a
> seperate fulltextindex. b) Add a FileTypeID column to my StoredCVs table
> (would rename this to StoredFiles) and store all files in the one table
> (with one fulltextindex).
> The main function of the system is to search Candidate Cvs (Resume's) so I
> was wandering if there is any performance advantage of having 2 tables or
if
> the searches would be the same as they both use the same
> "StoredFilesCatalog" anyway.
> Hope this makes sense and really appreciate any advice.
> Cheers
> Joe
>
|||Joe,
So, I can be sure of your environment, could you also post the full output
of -- SELECT @.@.version -- as this is most helpful in understanding your
environment. As I understand it you have one table: "StoredCvs with <1
million rows and one FT Catalog:StoredFilesCatalog", and you've decided not
to create another FT Catalog. Note, you can only have one FT Catalog defined
per table, but that one FT Catalog can support multiple column per table as
well as multiple tables.
Your decision is between the two option (a or b) below. Correct?
a) Creating another table (StoredFiles), which uses the
"StoredFilesCatalog", but would have a separate fulltextindex
b) Add a FileTypeID column to my StoredCVs table (would rename this to
StoredFiles) and store all files in the one table (with one fulltextindex).
If I have your environment correct, I'd recommend option a - create another
table (StoredFiles). Primarily because, SQL Server 2000 FT Catalogs start to
have performance issues with SQL Server 2000 tables at approx. 1 million
rows (still functional, but just need performance tuning) and adding
addition file types and larger files to your existing table (StoredCvs) will
cause it to grow above the 1 million row threshold. See SQL Server 2000 BOL
title "Full-text Search Recommendations" for more information on performance
tuning FT Catalogs on tables with more than 1 million rows.
Another issue/question that you did not mention is whether or not these two
tables will be often (always, sometimes, never) joined together in common
queries or in common FTS queries. If they are seldom or never joined in
frequently used queries, then it makes more sense for their to be separate
tables.
Hope that helps!
John
SQL Full Text Search Blog
http://spaces.msn.com/members/jtkane/
"Joe Zammit" <zammit_joe@.hotmail.com> wrote in message
news:u619ySFBFHA.824@.TK2MSFTNGP11.phx.gbl...
> Hi All
> I have a database of students which contains a table StoredCvs. This
> contains all of their CVs (Resume's) and has a full text index stored in a
> "StoredFilesCatalog". It is unlikely that this table will grow to more
than
> 1 million rows although it is possible.
> I now want to start storing files (for searching) for other types of
records
> in our system i.e. company files, project files, other candidate files
etc.
> etc. I am now faced with a few choices and wandered which would be the
best
> one. I have decided not to create another catalog at this stage as I feel
> this could be done later if performance is really bad. However, I was
> wandering what the difference would be between: a) Creating another table
> (StoredFiles), which uses the "StoredFilesCatalog", but would have a
> seperate fulltextindex. b) Add a FileTypeID column to my StoredCVs table
> (would rename this to StoredFiles) and store all files in the one table
> (with one fulltextindex).
> The main function of the system is to search Candidate Cvs (Resume's) so I
> was wandering if there is any performance advantage of having 2 tables or
if
> the searches would be the same as they both use the same
> "StoredFilesCatalog" anyway.
> Hope this makes sense and really appreciate any advice.
> Cheers
> Joe
>
|||Good Point!
I was thinking along the 2 table line anyway so thanks for your help.
"Hilary Cotter" <hilary.cotter@.gmail.com> wrote in message
news:OJX5EQIBFHA.1392@.TK2MSFTNGP14.phx.gbl...
> You will get better performance with two catalogs - one for each table.
You
> could add a separate column for FileType but this could be problematic if
> you are using top_n_by_Rank
> For instance suppose you do a query like this
> Select * from StoredCvs join containstable(StoredCvs,*,'microsoft',200) as
T[vbcol=seagreen]
> on T.[Key]=StoredCvs.PK
> where filetype='resume'
> order by rank desc
> If the first 200 hits returned were all not of the fileType resume, you
> would get no hits, even though there could be matches that might not occur
> in the first 200 hits.
> --
> Hilary Cotter
> Looking for a SQL Server replication book?
> http://www.nwsu.com/0974973602.html
> "Joe Zammit" <zammit_joe@.hotmail.com> wrote in message
> news:u619ySFBFHA.824@.TK2MSFTNGP11.phx.gbl...
a[vbcol=seagreen]
> than
> records
> etc.
> best
feel[vbcol=seagreen]
table[vbcol=seagreen]
I[vbcol=seagreen]
or
> if
>
sql

No comments:

Post a Comment