In Oracle Text, letters with diacritics are treated as distinct characters by default. For example, searching for the word Daniel will not find Daniél. The full-text index can be configured to map letters with diacritics to their base letters (for example, é to e) by enabling a setting on the lexer.
Note |
---|
|
By default, Blueriq uses CTXSYS.DEFAULT_LEXER, which comes pre-configured with settings that depend on the language used when the database was installed. For example, if the Oracle database was installed with the Dutch language, the default lexer has composite indexing and alternate spelling enabled for the Dutch language. If you define a custom lexer, make sure to carry over any settings from the default lexer that you would like to keep. |
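Before defining a custom lexer, it can help to list the attributes currently set on the default lexer so that none are dropped by accident. The query below is a sketch only. It assumes the session is allowed to read the Oracle Text dictionary view CTX_PREFERENCE_VALUES (typically as CTXSYS or a suitably privileged user), and the rows returned depend on the installation language.

```sql
-- List the attributes configured on the system default lexer.
-- Assumption: the current user can read ctxsys.ctx_preference_values.
select prv_attribute, prv_value
from ctxsys.ctx_preference_values
where prv_owner = 'CTXSYS'
and prv_preference = 'DEFAULT_LEXER';
```

Each attribute returned here that should be kept would need a matching ctx_ddl.set_attribute call on the custom lexer.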
Creating the lexer
A custom lexer that maps letters with diacritics to their base letters can be created as follows:
Code Block |
---|
|
begin
ctx_ddl.create_preference('example_lexer', 'BASIC_LEXER');
ctx_ddl.set_attribute('example_lexer', 'base_letter', 'yes'); -- yes = transform diacritics, no = do not transform diacritics
end; |
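To check that the preference was created with the intended attribute, the current user's preferences can be queried. This is a minimal sketch, assuming the lexer was created by the current user under the name example_lexer as in the block above; Oracle Text stores preference names in uppercase.

```sql
-- Show the attributes set on the custom lexer owned by the current user.
-- Assumption: the preference name is example_lexer, as in the example above.
select prv_attribute, prv_value
from ctx_user_preference_values
where prv_preference = 'EXAMPLE_LEXER';
```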
Setting the lexer at index creation time
The custom lexer is specified as an index parameter:
Code Block |
---|
|
drop index aq_fulltext_index;
create index aq_fulltext_index on aq_fulltext(text)
indextype is ctxsys.context
parameters ('datastore aq_fulltext_uds lexer example_lexer sync(every "sysdate+1/24")'); |
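Once the index has been created with the custom lexer, the effect can be checked with a simple full-text query. The statement below is a sketch that assumes the indexed content contains the word Daniél, as in the example at the top of this page. What is actually indexed is determined by the aq_fulltext_uds datastore, so the result depends on your data.

```sql
-- With base_letter enabled on the lexer, a search for Daniel is expected
-- to match rows whose indexed content contains Daniél as well.
select count(*)
from aq_fulltext
where contains(text, 'Daniel') > 0;
```

Because the index is synchronized on a schedule (every hour in the example above), rows added after the last synchronization may not appear in the result yet.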
Changing the lexer without recreating the index
The lexer can also be changed without dropping the index first:
...
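The original statement for this step is not shown here. As a hedged sketch, Oracle Text supports rebuilding an existing index with a replaced preference through ALTER INDEX ... REBUILD PARAMETERS. The index and lexer names below are taken from the earlier examples; this is not necessarily the exact statement the page originally contained.

```sql
-- Rebuild the existing index with the custom lexer, without dropping it.
-- Assumption: aq_fulltext_index and example_lexer exist as created above.
alter index aq_fulltext_index rebuild
parameters ('replace lexer example_lexer');
```

The rebuild re-indexes the content, so it can take a while on large tables. Check the Oracle Text documentation for your database version to confirm how the other index settings, such as the datastore and sync schedule, are handled during a rebuild.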
...
...