MSSQL+Unicode+CONTAINS: Your Help

Dear all, I have a problem here.

I have a Unicode database on MS SQL Server 2000. The characters stored in this database are non-Latin (Arabic) characters. I have enabled the Full-Text Search service and used the CONTAINS function to search the database. Unfortunately, the query result is incorrect: it contains extra records that are not valid. The word breakers do not seem to be defined appropriately, so records with similar, but not exact, words are listed in the result set. I have tried setting the Language Neutral Full-Text option to both 0 and 1. However, the undesired records still appear in the result set.

Looking forward to your help.
 
Can you show me an example of where this happens? I have never seen this before and need to make sure I understand what is happening.
 
My problem is that the records with similar, but not exact, words are listed in the result set using the Full-Text Search Index service. The CONTAINS function returns extra records other than the desired ones.

My analysis is that the word breakers for Unicode text are not well defined in my environment.

For example, if you search for "smith", records returned will be those containing "smith", "smooth", "smth"... However, this only happens with Arabic strings and not English strings. The example above is just for illustration.
 
Two Unicode encoding formats matter here: UTF-8 and UCS-2. Web browsers speak UTF-8. On the other hand, most DBMSes speak UCS-2. In MS SQL 2000 Server, searching and string comparison work robustly only if Unicode data is stored in UCS-2 format. However, Internet Explorer can display Unicode data only in UTF-8.

Thus, the requirement is to convert between UTF-8 and UCS-2. Fortunately, in a Microsoft-based web application, IIS can handle the Unicode format conversion if you add "<% Session.Codepage=65001 %>" or "<%@ CodePage=65001 %>" to the server-side ASP script responsible for database manipulation.
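
To make the encoding difference concrete, here is a small Python sketch (not part of the original setup, which is ASP and SQL Server). The sample word is a hypothetical stand-in for the real data, "utf-16-le" is used as the closest Python codec to UCS-2 for these characters, and "latin-1" stands in for whatever single-byte code page the server might wrongly assume. It shows that the same Arabic word has different bytes in UTF-8 and UCS-2, and that UTF-8 bytes read without conversion turn into a different, longer string, which may be one way unconverted data ends up matching badly in searches.

```python
# Hypothetical sample word (Arabic letters seen, lam, alef, meem).
word = "\u0633\u0644\u0627\u0645"

# What a browser would typically send: UTF-8 bytes.
utf8_bytes = word.encode("utf-8")

# 2-byte code units, as in UCS-2; utf-16-le is an assumption that
# matches UCS-2 for characters in the Basic Multilingual Plane.
ucs2_bytes = word.encode("utf-16-le")

print(utf8_bytes.hex())   # d8b3d984d8a7d985
print(ucs2_bytes.hex())   # 3306440627064506

# If the UTF-8 bytes are read as a single-byte code page (latin-1 is
# used here purely for illustration), each byte becomes its own
# character and the word is no longer the same string.
misread = utf8_bytes.decode("latin-1")
print(len(word), len(misread))   # 4 8

# Decoding with the correct code page (65001 is the Windows identifier
# for UTF-8) recovers the original word.
recovered = utf8_bytes.decode("utf-8")
print(recovered == word)   # True
```

If the stored data looks like the "misread" string rather than the original word, the conversion step is likely missing somewhere between the browser and the database.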

For more details, please visit
INF: Storing UTF-8 Data in SQL Server
http://support.microsoft.com/default.aspx?scid=kb;en-us;232580&Product=sql2k

Thanks for taking care
 
From what I am reading there, it looks like you could add one of the strings they mention to each page where the conversion is needed, or you could put it in an include file and pull that include into every ASP page.

In other words, it looks to be more of a code issue than a server issue from what I can see.
 