1 18th March 12:43
henry
new Noise words that don't exist!? how do i know what to parse out?



Hello, I have a search page that parses out the default US English noise
words. I then perform an inflectional search on each remaining word within
the CONTAINSTABLE predicate.

When I search for "new used", the search page runs this query:
SELECT a.category_id, a.categoryName
FROM category a
INNER JOIN storecategory b ON b.category_id = a.category_id
INNER JOIN CONTAINSTABLE(category, *, 'FORMSOF(INFLECTIONAL, "new") AND FORMSOF(INFLECTIONAL, "used")', 10) c
    ON c.[key] = b.category_id AND c.[key] = a.category_id
WHERE b.store_id = 4

It returns:
91 New & Used Goods

Now here is the problem.

When I search for "new & used", the search page runs this query:
SELECT a.category_id, a.categoryName
FROM category a
INNER JOIN storecategory b ON b.category_id = a.category_id
INNER JOIN CONTAINSTABLE(category, *, 'FORMSOF(INFLECTIONAL, "new") AND FORMSOF(INFLECTIONAL, "&") AND FORMSOF(INFLECTIONAL, "used")', 10) c
    ON c.[key] = b.category_id AND c.[key] = a.category_id
WHERE b.store_id = 4

It returns:
Execution of a full-text operation failed. A clause of the query contained
only ignored words.

I searched for the phantom noise word "&" but could not find it in the
noise.enu and noise.dat files under \Config. I even looked at the copies
under \system32 and could not find "&" there either. I then ran some further
tests and found that other special characters that are not in the noise word
list, such as "!" and "^", produce the same error. Please help. I am stuck.
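
For illustration, here is a minimal Python sketch of how a search page might build that search condition, assuming it simply splits the input on whitespace and drops noise words. The function name and the tiny noise-word set are hypothetical and not taken from the actual page. It shows why "&" ends up as its own FORMSOF clause:

```python
# Hypothetical stand-in for the noise.enu word list; the real file is much longer.
NOISE_WORDS = {"a", "an", "and", "the", "of"}


def naive_search_condition(user_input):
    """Split on whitespace, drop noise words, wrap each remaining term in FORMSOF."""
    terms = [t for t in user_input.split() if t.lower() not in NOISE_WORDS]
    return " AND ".join('FORMSOF(INFLECTIONAL, "%s")' % t for t in terms)


# "&" is not in the noise-word list, so it survives as a search term of its own.
print(naive_search_condition("new & used"))
```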
2 18th March 12:43
john kane



Henry,
We've seen this issue before with other US_English words, specifically when
FORMSOF(INFLECTIONAL, ...) is used with words that may not have true
inflectional variations. In one case it was proper nouns such as "hydrocylone"
or "sealant"; in another, the word "best" generated Error Msg 7619, "A clause
of the query contained only ignored words", with
'formsof (inflectional, best)', even though neither "best" nor "better" is
considered a US_English noise word. I believe that you are also hitting this
issue with the words "new" and "used", neither of which is a US_English noise
word, through your use of FORMSOF(INFLECTIONAL, ...).

Additionally, in a past thread Microsoft identified this issue as a "DOC bug",
so hopefully it will be documented in a future release of either the BOL or
SQL Server. Below is from that thread in this fulltext newsgroup:

I've found the past thread with a similar DOC "bug" for inflectional use of
the root word form "well", as "well" is the root form of "better" and "best".
This *might* also be related to what linguists call "blocking", i.e., the
principle that forbids a rule from applying to a word if the word already has
a corresponding irregular form. For example, the existence of "came" blocks
the rule for adding "ed" to "come", thereby preempting "comed".

Regards,
John
3 18th March 12:43
hilary cotter


I am unable to reproduce your results on my NT 4 machine or on my Win2k
machine.

I do get your error on my Win2003 server, though.

Is there any way you can remove the boolean symbols from your search?

4 18th March 12:43
henry


I am using

Microsoft SQL Server 2000 - 8.00.760 (Intel X86)
Dec 17 2002 14:22:05
Copyright (c) 1988-2003 Microsoft Corporation
Personal Edition on Windows NT 5.1 (Build 2600: Service Pack 1)

I found that it's not just boolean symbols but any high-ASCII character.
I have successfully worked around the problem by finding and removing any
non-word characters, using the regular expression "\W", in addition to the
ignored words.
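
A minimal sketch of that workaround in Python, offered as one possible shape rather than the actual search page code. The function names and the small noise-word set are hypothetical placeholders. Every run of non-word characters is replaced with a space, noise words are dropped, and only then is the search condition built:

```python
import re

# Hypothetical stand-in for the noise.enu word list; the real file is much longer.
NOISE_WORDS = {"a", "an", "and", "the", "of"}


def clean_terms(user_input):
    """Replace each run of non-word characters with a space, then drop noise words."""
    words = re.sub(r"\W+", " ", user_input).split()
    return [w for w in words if w.lower() not in NOISE_WORDS]


def search_condition(user_input):
    """Build the CONTAINSTABLE search condition, or return None if nothing is left."""
    terms = clean_terms(user_input)
    if not terms:
        return None  # nothing searchable: the caller should skip the full-text query
    return " AND ".join('FORMSOF(INFLECTIONAL, "%s")' % t for t in terms)


print(search_condition("new & used"))
```

Returning None when no terms remain lets the caller avoid sending an empty search condition at all. Because the regular expression also strips quote characters, the terms cannot break out of the double-quoted FORMSOF arguments.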
5 18th March 12:43
john kane


Sorry, Henry,
I missed the FORMSOF(INFLECTIONAL, "&") in your 2nd query!
However, I did some additional testing of a couple of punctuation characters
on Win2003, with the following results:

use pubs
go
select @@version
go
select @@language
go
select * from CONTAINSTABLE(pub_info, *, 'FORMSOF(INFLECTIONAL, ''"book"'')')
-- select * from CONTAINSTABLE(pub_info, *, 'FORMSOF(INFLECTIONAL, ''book'')') -- returns same results
-- select * from CONTAINSTABLE(pub_info, *, 'FORMSOF(INFLECTIONAL, ') -- returns same results
/* -- returns:
Microsoft SQL Server 2000 - 8.00.760 (Intel X86)
Dec 17 2002 14:22:05
Copyright (c) 1988-2003 Microsoft Corporation
Enterprise Edition on Windows NT 5.2 (Build 3790: )

us_english

KEY RANK
---- -----------
9952 48
0736 1000
(2 row(s) affected)
*/

-- However, I get different results (a syntax error or Error Msg 7619),
-- depending upon the punctuation character used:
-- Using double quotes within single quotes and various punctuation
-- characters on Win2003 RTM with langw
select * from CONTAINSTABLE(pub_info, *, 'FORMSOF(INFLECTIONAL, ''"@"'')')
-- select * from CONTAINSTABLE(pub_info, *, 'FORMSOF(INFLECTIONAL, ''@'')') -- returns Error Msg 7619
-- select * from CONTAINSTABLE(pub_info, *, 'FORMSOF(INFLECTIONAL, -- returns Error Msg 7619
/* -- returns:
Server: Msg 7619, Level 16, State 1, Line 1
Execution of a full-text operation failed. A clause of the query contained
only ignored words.
*/

select * from CONTAINSTABLE(pub_info, *, 'FORMSOF(INFLECTIONAL, ''"#"'')')
-- select * from CONTAINSTABLE(pub_info, *, 'FORMSOF(INFLECTIONAL, ''#'')') -- returns Error Msg 7619
-- select * from CONTAINSTABLE(pub_info, *, 'FORMSOF(INFLECTIONAL, -- returns Error Msg 7619
/* -- returns:
Server: Msg 7619, Level 16, State 1, Line 1
Execution of a full-text operation failed. A clause of the query contained
only ignored words.
*/


select * from CONTAINSTABLE(pub_info, *, 'FORMSOF(INFLECTIONAL, ''"&"'')')
-- select * from CONTAINSTABLE(pub_info, *, 'FORMSOF(INFLECTIONAL, ''&'')') -- returns Error Msg 7631
-- select * from CONTAINSTABLE(pub_info, *, 'FORMSOF(INFLECTIONAL, -- returns Error Msg 7631
/* -- returns:
Server: Msg 7631, Level 15, State 1, Line 1
Syntax error occurred near '&'. Expected '',', ')'' in search condition
'FORMSOF(INFLECTIONAL, '"&"')'.
*/


Hilary, he's using WinXP (Windows NT 5.1 (Build 2600: Service Pack 1)), which
uses the same new wordbreaker, langwrbk.dll, as Win2003. That could explain
why you could not repro this on your NT 4.0 or Win2K machines, as those OS
platforms use the infosoft.dll wordbreaker. I guess it's another "level" to add
to that grid of OS/SQL Server/single letters & punctuation characters!!

Regards,
John