9.13. 文字檢索函式及運算子
Table 9.40,Table 9.41andTable 9.42summarize the functions and operators that are provided for full text searching. SeeChapter 12for a detailed explanation ofPostgreSQL’s text search facility.
Table 9.40. Text Search Operators
Operator | Return Type | Description | Example | Result | |||||||
---|---|---|---|---|---|---|---|---|---|---|---|
@@ |
boolean |
tsvector matchestsquery ? |
to_tsvector('fat cats ate rats') @@ to_tsquery('cat & rat') |
t |
|||||||
@@@ |
boolean |
deprecated synonym for@@ |
to_tsvector('fat cats ate rats') @@@ to_tsquery('cat & rat') |
t |
|||||||
` | ` | tsvector |
concatenatetsvector s |
`‘a:1 b:2’::tsvector | ‘c:1 d:2 b:3’::tsvector` | 'a':1 'b':2,5 'c':3 'd':4 |
|||||
&& |
tsquery |
ANDtsquery s together |
`‘fat | rat’::tsquery && ‘cat’::tsquery` | `( ‘fat’ | ‘rat’ ) & ‘cat’` | |||||
` | ` | tsquery |
ORtsquery s together |
`‘fat | rat’::tsquery | ‘cat’::tsquery` | `( ‘fat’ | ‘rat’ ) | ‘cat’` | ||
!! |
tsquery |
negate atsquery |
!! 'cat'::tsquery |
!'cat' |
|||||||
<-> |
tsquery |
tsquery followed bytsquery |
to_tsquery('fat') <-> to_tsquery('rat') |
'fat' <-> 'rat' |
|||||||
@> |
boolean |
tsquery contains another ? |
'cat'::tsquery @> 'cat & rat'::tsquery |
f |
|||||||
<@ |
boolean |
tsquery is contained in ? |
'cat'::tsquery <@ 'cat & rat'::tsquery |
t |
Note
Thetsquery
containment operators consider only the lexemes listed in the two queries, ignoring the combining operators.
In addition to the operators shown in the table, the ordinary B-tree comparison operators (=
,<
, etc) are defined for typestsvector
andtsquery
. These are not very useful for text searching but allow, for example, unique indexes to be built on columns of these types.
Table 9.41. Text Search Functions
Function | Return Type | Description | Example | Result | ||
---|---|---|---|---|---|---|
array_to_tsvector(text[] ) |
tsvector |
convert array of lexemes totsvector |
array_to_tsvector('{fat,cat,rat}'::text[]) |
'cat' 'fat' 'rat' |
||
get_current_ts_config() |
regconfig |
get default text search configuration | get_current_ts_config() |
english |
||
length(tsvector ) |
integer |
number of lexemes intsvector |
length('fat:2,4 cat:3 rat:5A'::tsvector) |
3 |
||
numnode(tsquery ) |
integer |
number of lexemes plus operators intsquery |
`numnode(‘(fat & rat) | cat’::tsquery)` | 5 |
|
plainto_tsquery([configregconfig ,]querytext ) |
tsquery |
producetsquery ignoring punctuation |
plainto_tsquery('english', 'The Fat Rats') |
'fat' & 'rat' |
||
phraseto_tsquery([configregconfig ,]querytext ) |
tsquery |
producetsquery that searches for a phrase, ignoring punctuation |
phraseto_tsquery('english', 'The Fat Rats') |
'fat' <-> 'rat' |
||
querytree(querytsquery ) |
text |
get indexable part of atsquery |
querytree('foo & ! bar'::tsquery) |
'foo' |
||
setweight(vectortsvector ,weight"char" ) |
tsvector |
assignweight _to each element ofvector _ |
setweight('fat:2,4 cat:3 rat:5B'::tsvector, 'A') |
'cat':3A 'fat':2A,4A 'rat':5A |
||
setweight(vectortsvector ,weight"char" ,lexemestext[] ) |
tsvector |
assignweight _to elements ofvector that are listed inlexemes _ |
setweight('fat:2,4 cat:3 rat:5B'::tsvector, 'A', '{cat,rat}') |
'cat':3A 'fat':2,4 'rat':5A |
||
strip(tsvector ) |
tsvector |
remove positions and weights fromtsvector |
strip('fat:2,4 cat:3 rat:5A'::tsvector) |
'cat' 'fat' 'rat' |
||
to_tsquery([configregconfig ,]querytext ) |
tsquery |
normalize words and convert totsquery |
to_tsquery('english', 'The & Fat & Rats') |
'fat' & 'rat' |
||
to_tsvector([configregconfig ,]documenttext ) |
tsvector |
reduce document text totsvector |
to_tsvector('english', 'The Fat Rats') |
'fat':2 'rat':3 |
||
to_tsvector([configregconfig ,]documentjson(b) ) |
tsvector |
reduce each string value in the document to atsvector , and then concatentate those in document order to produce a singletsvector |
to_tsvector('english', '{"a": "The Fat Rats"}'::json) |
'fat':2 'rat':3 |
||
ts_delete(vectortsvector ,lexemetext ) |
tsvector |
remove givenlexeme _fromvector _ |
ts_delete('fat:2,4 cat:3 rat:5A'::tsvector, 'fat') |
'cat':3 'rat':5A |
||
ts_delete(vectortsvector ,lexemestext[] ) |
tsvector |
remove any occurrence of lexemes inlexemes _fromvector _ |
ts_delete('fat:2,4 cat:3 rat:5A'::tsvector, ARRAY['fat','rat']) |
'cat':3 |
||
ts_filter(vectortsvector ,weights"char"[] ) |
tsvector |
select only elements with givenweights _fromvector _ |
ts_filter('fat:2,4 cat:3b rat:5A'::tsvector, '{a,b}') |
'cat':3B 'rat':5A |
||
ts_headline([configregconfig ,]documenttext ,querytsquery [,optionstext ]) |
text |
display a query match | ts_headline('x y z', 'z'::tsquery) |
x y <b>z</b> |
||
ts_headline([configregconfig ,]documentjson(b) ,querytsquery [,optionstext ]) |
text |
display a query match | ts_headline('{"a":"x y z"}'::json, 'z'::tsquery) |
{"a":"x y <b>z</b>"} |
||
ts_rank([weightsfloat4[] ,]vectortsvector ,querytsquery [,normalizationinteger ]) |
float4 |
rank document for query | ts_rank(textsearch, query) |
0.818 |
||
ts_rank_cd([weightsfloat4[] ,]vectortsvector ,querytsquery [,normalizationinteger ]) |
float4 |
rank document for query using cover density | ts_rank_cd('{0.1, 0.2, 0.4, 1.0}', textsearch, query) |
2.01317 |
||
ts_rewrite(querytsquery ,targettsquery ,substitutetsquery ) |
tsquery |
replacetarget _withsubstitute _within query |
`ts_rewrite(‘a & b’::tsquery, ‘a’::tsquery, ‘foo | bar’::tsquery)` | `‘b’ & ( ‘foo’ | ‘bar’ )` |
ts_rewrite(querytsquery ,selecttext ) |
tsquery |
replace using targets and substitutes from aSELECT command |
SELECT ts_rewrite('a & b'::tsquery, 'SELECT t,s FROM aliases') |
`‘b’ & ( ‘foo’ | ‘bar’ )` | |
tsquery_phrase(query1tsquery ,query2tsquery ) |
tsquery |
make query that searches forquery1 _followed byquery2 _(same as<-> operator) |
tsquery_phrase(to_tsquery('fat'), to_tsquery('cat')) |
'fat' <-> 'cat' |
||
tsquery_phrase(query1tsquery ,query2tsquery ,distanceinteger ) |
tsquery |
make query that searches forquery1 _followed byquery2 at distancedistance _ |
tsquery_phrase(to_tsquery('fat'), to_tsquery('cat'), 10) |
'fat' <10> 'cat' |
||
tsvector_to_array(tsvector ) |
text[] |
converttsvector to array of lexemes |
tsvector_to_array('fat:2,4 cat:3 rat:5A'::tsvector) |
{cat,fat,rat} |
||
tsvector_update_trigger() |
trigger |
trigger function for automatictsvector column update |
CREATE TRIGGER ... tsvector_update_trigger(tsvcol, 'pg_catalog.swedish', title, body) |
|||
tsvector_update_trigger_column() |
trigger |
trigger function for automatictsvector column update |
CREATE TRIGGER ... tsvector_update_trigger_column(tsvcol, configcol, title, body) |
|||
unnest(tsvector , OUTlexemetext , OUTpositionssmallint[] , OUTweightstext ) |
setof record |
expand a tsvector to a set of rows | unnest('fat:2,4 cat:3 rat:5A'::tsvector) |
(cat,{3},{D}) ... |
Note
All the text search functions that accept an optionalregconfig
argument will use the configuration specified bydefault_text_search_configwhen that argument is omitted.
The functions inTable 9.42are listed separately because they are not usually used in everyday text searching operations. They are helpful for development and debugging of new text search configurations.
Table 9.42. Text Search Debugging Functions
Function | Return Type | Description | Example | Result |
---|---|---|---|---|
ts_debug([configregconfig ,]documenttext , OUTaliastext , OUTdescriptiontext , OUTtokentext , OUTdictionariesregdictionary[] , OUTdictionaryregdictionary , OUTlexemestext[] ) |
setof record |
test a configuration | ts_debug('english', 'The Brightest supernovaes') |
(asciiword,"Word, all ASCII",The,{english_stem},english_stem,{}) ... |
ts_lexize(dictregdictionary ,tokentext ) |
text[] |
test a dictionary | ts_lexize('english_stem', 'stars') |
{star} |
ts_parse(parser_nametext ,documenttext , OUTtokidinteger , OUTtokentext ) |
setof record |
test a parser | ts_parse('default', 'foo - bar') |
(1,foo) ... |
ts_parse(parser_oidoid ,documenttext , OUTtokidinteger , OUTtokentext ) |
setof record |
test a parser | ts_parse(3722, 'foo - bar') |
(1,foo) ... |
ts_token_type(parser_nametext , OUTtokidinteger , OUTaliastext , OUTdescriptiontext ) |
setof record |
get token types defined by parser | ts_token_type('default') |
(1,asciiword,"Word, all ASCII") ... |
ts_token_type(parser_oidoid , OUTtokidinteger , OUTaliastext , OUTdescriptiontext ) |
setof record |
get token types defined by parser | ts_token_type(3722) |
(1,asciiword,"Word, all ASCII") ... |
ts_stat(sqlquerytext , [weightstext ,] OUTwordtext , OUTndocinteger , OUTnentryinteger ) |
setof record |
get statistics of atsvector column |
ts_stat('SELECT vector from apod') |
(foo,10,15) ... |