Text Search Functions and Operators

Text Search Operators

  • @@

    Description: Tests whether a tsvector matches a tsquery.

    Example:

    postgres=# SELECT to_tsvector('fat cats ate rats') @@ to_tsquery('cat & rat') AS RESULT;
    result
    --------
    t
    (1 row)
  • @@@

    Description: Synonym for @@.

    Example:

    postgres=# SELECT to_tsvector('fat cats ate rats') @@@ to_tsquery('cat & rat') AS RESULT;
    result
    --------
    t
    (1 row)
  • ||

    Description: Concatenates two tsvector values.

    Example:

    postgres=# SELECT 'a:1 b:2'::tsvector || 'c:1 d:2 b:3'::tsvector AS RESULT;
    result
    ---------------------------
    'a':1 'b':2,5 'c':3 'd':4
    (1 row)
  • &&

    Description: ANDs two tsquery values together.

    Example:

    postgres=# SELECT 'fat | rat'::tsquery && 'cat'::tsquery AS RESULT;
    result
    ---------------------------
    ( 'fat' | 'rat' ) & 'cat'
    (1 row)
  • ||

    Description: ORs two tsquery values together.

    Example:

    postgres=# SELECT 'fat | rat'::tsquery || 'cat'::tsquery AS RESULT;
    result
    ---------------------------
    ( 'fat' | 'rat' ) | 'cat'
    (1 row)
  • !!

    Description: Negates a tsquery.

    Example:

    postgres=# SELECT !! 'cat'::tsquery AS RESULT;
    result
    --------
    !'cat'
    (1 row)
  • @>

    Description: Tests whether one tsquery contains another.

    Example:

    postgres=# SELECT 'cat'::tsquery @> 'cat & rat'::tsquery AS RESULT;
    result
    --------
    f
    (1 row)
  • <@

    Description: Tests whether one tsquery is contained in another.

    Example:

    postgres=# SELECT 'cat'::tsquery <@ 'cat & rat'::tsquery AS RESULT;
    result
    --------
    t
    (1 row)

In addition to the operators above, the ordinary B-tree comparison operators (=, <, and so on) are defined for the tsvector and tsquery types.
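The operators are usually combined in a WHERE clause rather than used one at a time. The following is a minimal sketch, assuming a PostgreSQL-compatible server with the english configuration available; the docs table and its columns are hypothetical and exist only for illustration.

```sql
-- Hypothetical table used only for this illustration.
CREATE TEMPORARY TABLE docs (id integer, body text);
INSERT INTO docs VALUES
    (1, 'fat cats ate rats'),
    (2, 'a thin dog slept');

-- Build the query by AND-ing two tsquery values with &&, then match with @@.
SELECT id
FROM docs
WHERE to_tsvector('english', body)
      @@ (to_tsquery('english', 'cat') && to_tsquery('english', 'rat'));

-- Concatenation shifts the positions of the right-hand tsvector by the
-- largest position in the left-hand one (2 here), so b:3 becomes b:5.
SELECT 'a:1 b:2'::tsvector || 'c:1 d:2 b:3'::tsvector AS shifted;

-- B-tree comparison operators also work on tsvector values.
SELECT 'a:1 b:2'::tsvector = 'b:2 a:1'::tsvector AS same_vector;
```

Only the row with id 1 should satisfy the first query, because the second document contains neither lexeme. The last query compares stored tsvector values, which are kept in sorted order regardless of input order.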

Text Search Functions

  • get_current_ts_config()

    Description: Returns the default text search configuration.

    Return type: regconfig

    Example:

    postgres=# SELECT get_current_ts_config();
    get_current_ts_config
    -----------------------
    english
    (1 row)
  • length(tsvector)

    Description: Returns the number of lexemes in a tsvector.

    Return type: integer

    Example:

    postgres=# SELECT length('fat:2,4 cat:3 rat:5A'::tsvector);
    length
    --------
    3
    (1 row)
  • numnode(tsquery)

    Description: Returns the number of lexemes plus operators in a tsquery.

    Return type: integer

    Example:

    postgres=# SELECT numnode('(fat & rat) | cat'::tsquery);
    numnode
    ---------
    5
    (1 row)
  • plainto_tsquery([ config regconfig , ] query text)

    Description: Produces a tsquery from plain text, ignoring punctuation.

    Return type: tsquery

    Example:

    postgres=# SELECT plainto_tsquery('english', 'The Fat Rats');
    plainto_tsquery
    -----------------
    'fat' & 'rat'
    (1 row)
  • querytree(query tsquery)

    Description: Returns the indexable part of a tsquery.

    Return type: text

    Example:

    postgres=# SELECT querytree('foo & ! bar'::tsquery);
    querytree
    -----------
    'foo'
    (1 row)
  • setweight(tsvector, "char")

    Description: Assigns the given weight to every element of a tsvector.

    Return type: tsvector

    Example:

    postgres=# SELECT setweight('fat:2,4 cat:3 rat:5B'::tsvector, 'A');
    setweight
    -------------------------------
    'cat':3A 'fat':2A,4A 'rat':5A
    (1 row)
  • strip(tsvector)

    Description: Removes positions and weights from a tsvector.

    Return type: tsvector

    Example:

    postgres=# SELECT strip('fat:2,4 cat:3 rat:5A'::tsvector);
    strip
    -------------------
    'cat' 'fat' 'rat'
    (1 row)
  • to_tsquery([ config regconfig , ] query text)

    Description: Normalizes words and converts them to a tsquery.

    Return type: tsquery

    Example:

    postgres=# SELECT to_tsquery('english', 'The & Fat & Rats');
    to_tsquery
    ---------------
    'fat' & 'rat'
    (1 row)
  • to_tsvector([ config regconfig , ] document text)

    Description: Reduces document text to a tsvector.

    Return type: tsvector

    Example:

    postgres=# SELECT to_tsvector('english', 'The Fat Rats');
    to_tsvector
    -----------------
    'fat':2 'rat':3
    (1 row)
  • ts_headline([ config regconfig, ] document text, query tsquery [, options text ])

    Description: Highlights the parts of a document that match a query.

    Return type: text

    Example:

    postgres=# SELECT ts_headline('x y z', 'z'::tsquery);
    ts_headline
    --------------
    x y <b>z</b>
    (1 row)
  • ts_rank([ weights float4[], ] vector tsvector, query tsquery [, normalization integer ])

    Description: Ranks a document for a query.

    Return type: float4

    Example:

    postgres=# SELECT ts_rank('hello world'::tsvector, 'world'::tsquery);
    ts_rank
    ----------
    .0607927
    (1 row)
  • ts_rank_cd([ weights float4[], ] vector tsvector, query tsquery [, normalization integer ])

    Description: Ranks a document for a query using cover density.

    Return type: float4

    Example:

    postgres=# SELECT ts_rank_cd('hello world'::tsvector, 'world'::tsquery);
    ts_rank_cd
    ------------
    .1
    (1 row)
  • ts_rewrite(query tsquery, target tsquery, substitute tsquery)

    Description: Replaces target with substitute within query.

    Return type: tsquery

    Example:

    postgres=# SELECT ts_rewrite('a & b'::tsquery, 'a'::tsquery, 'foo|bar'::tsquery);
    ts_rewrite
    -------------------------
    'b' & ( 'foo' | 'bar' )
    (1 row)
  • ts_rewrite(query tsquery, select text)

    Description: Replaces parts of query using the targets and substitutes returned by a SELECT command.

    Return type: tsquery

    Example:

    postgres=# SELECT ts_rewrite('world'::tsquery, 'select ''world''::tsquery, ''hello''::tsquery');
    ts_rewrite
    ------------
    'hello'
    (1 row)
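The functions above are typically chained: build a weighted tsvector, match it against a tsquery, then rank and highlight the hits. The following is a minimal sketch of that workflow, assuming a PostgreSQL-compatible server with the english configuration; the articles table, its columns, and the weight assignment (A for titles, B for bodies) are hypothetical choices made for illustration.

```sql
-- Hypothetical table used only for this illustration.
CREATE TEMPORARY TABLE articles (id integer, title text, body text);
INSERT INTO articles VALUES
    (1, 'Fat rats', 'The cats ate well.'),
    (2, 'Cats',     'Fat rats ran away from the cats.');

-- Weight title lexemes as A and body lexemes as B, concatenate them,
-- keep only documents matching the query, and order by rank.
SELECT id,
       ts_rank(doc, query) AS rank,
       ts_headline('english', body, query) AS snippet
FROM (
    SELECT id, body,
           setweight(to_tsvector('english', title), 'A') ||
           setweight(to_tsvector('english', body), 'B') AS doc
    FROM articles
) AS weighted,
     to_tsquery('english', 'rat & cat') AS query
WHERE doc @@ query
ORDER BY rank DESC;
```

Both sample rows contain the lexemes rat and cat once the title and body are combined, so both should be returned. The exact rank values depend on the server's ranking defaults, so none are shown here.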

Text Search Debugging Functions

  • ts_debug([ config regconfig, ] document text, OUT alias text, OUT description text, OUT token text, OUT dictionaries regdictionary[], OUT dictionary regdictionary, OUT lexemes text[])

    Description: Tests a configuration.

    Return type: setof record

    Example:

    postgres=# SELECT ts_debug('english', 'The Brightest supernovaes');
    ts_debug
    -----------------------------------------------------------------------------------
    (asciiword,"Word, all ASCII",The,{english_stem},english_stem,{})
    (blank,"Space symbols"," ",{},,)
    (asciiword,"Word, all ASCII",Brightest,{english_stem},english_stem,{brightest})
    (blank,"Space symbols"," ",{},,)
    (asciiword,"Word, all ASCII",supernovaes,{english_stem},english_stem,{supernova})
    (5 rows)
  • ts_lexize(dict regdictionary, token text)

    Description: Tests a dictionary.

    Return type: text[]

    Example:

    postgres=# SELECT ts_lexize('english_stem', 'stars');
    ts_lexize
    -----------
    {star}
    (1 row)
  • ts_parse(parser_name text, document text, OUT tokid integer, OUT token text)

    Description: Tests a parser, identified by name.

    Return type: setof record

    Example:

    postgres=# SELECT ts_parse('default', 'foo - bar');
    ts_parse
    -----------
    (1,foo)
    (12," ")
    (12,"- ")
    (1,bar)
    (4 rows)
  • ts_parse(parser_oid oid, document text, OUT tokid integer, OUT token text)

    Description: Tests a parser, identified by OID.

    Return type: setof record

    Example:

    postgres=# SELECT ts_parse(3722, 'foo - bar');
    ts_parse
    -----------
    (1,foo)
    (12," ")
    (12,"- ")
    (1,bar)
    (4 rows)
  • ts_token_type(parser_name text, OUT tokid integer, OUT alias text, OUT description text)

    Description: Returns the token types defined by a parser, identified by name.

    Return type: setof record

    Example:

    postgres=# SELECT ts_token_type('default');
    ts_token_type
    --------------------------------------------------------------
    (1,asciiword,"Word, all ASCII")
    (2,word,"Word, all letters")
    (3,numword,"Word, letters and digits")
    (4,email,"Email address")
    (5,url,URL)
    (6,host,Host)
    (7,sfloat,"Scientific notation")
    (8,version,"Version number")
    (9,hword_numpart,"Hyphenated word part, letters and digits")
    (10,hword_part,"Hyphenated word part, all letters")
    (11,hword_asciipart,"Hyphenated word part, all ASCII")
    (12,blank,"Space symbols")
    (13,tag,"XML tag")
    (14,protocol,"Protocol head")
    (15,numhword,"Hyphenated word, letters and digits")
    (16,asciihword,"Hyphenated word, all ASCII")
    (17,hword,"Hyphenated word, all letters")
    (18,url_path,"URL path")
    (19,file,"File or path name")
    (20,float,"Decimal notation")
    (21,int,"Signed integer")
    (22,uint,"Unsigned integer")
    (23,entity,"XML entity")
    (23 rows)
  • ts_token_type(parser_oid oid, OUT tokid integer, OUT alias text, OUT description text)

    Description: Returns the token types defined by a parser, identified by OID.

    Return type: setof record

    Example:

    postgres=# SELECT ts_token_type(3722);
    ts_token_type
    --------------------------------------------------------------
    (1,asciiword,"Word, all ASCII")
    (2,word,"Word, all letters")
    (3,numword,"Word, letters and digits")
    (4,email,"Email address")
    (5,url,URL)
    (6,host,Host)
    (7,sfloat,"Scientific notation")
    (8,version,"Version number")
    (9,hword_numpart,"Hyphenated word part, letters and digits")
    (10,hword_part,"Hyphenated word part, all letters")
    (11,hword_asciipart,"Hyphenated word part, all ASCII")
    (12,blank,"Space symbols")
    (13,tag,"XML tag")
    (14,protocol,"Protocol head")
    (15,numhword,"Hyphenated word, letters and digits")
    (16,asciihword,"Hyphenated word, all ASCII")
    (17,hword,"Hyphenated word, all letters")
    (18,url_path,"URL path")
    (19,file,"File or path name")
    (20,float,"Decimal notation")
    (21,int,"Signed integer")
    (22,uint,"Unsigned integer")
    (23,entity,"XML entity")
    (23 rows)
  • ts_stat(sqlquery text, [ weights text, ] OUT word text, OUT ndoc integer, OUT nentry integer)

    Description: Returns statistics of a tsvector column.

    Return type: setof record

    Example:

    postgres=# SELECT ts_stat('select ''hello world''::tsvector');
    ts_stat
    -------------
    (world,1,1)
    (hello,1,1)
    (2 rows)
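The examples above call the set-returning debugging functions in the select list, which packs each row into a single composite value. Calling them in the FROM clause instead exposes the OUT parameters as ordinary columns, which is often easier to read and filter. The following is a minimal sketch, assuming the server accepts these functions in FROM as PostgreSQL does; the column selections are illustrative only.

```sql
-- Show only the token type, the token, and the resulting lexemes.
SELECT alias, token, lexemes
FROM ts_debug('english', 'The Brightest supernovaes');

-- List tokens produced by the default parser, one column per OUT parameter.
SELECT tokid, token
FROM ts_parse('default', 'foo - bar');

-- Word statistics as columns, sorted alphabetically.
SELECT word, ndoc, nentry
FROM ts_stat('select ''hello world''::tsvector')
ORDER BY word;
```

With the columns separated, a WHERE clause such as alias <> 'blank' could be added to the first query to hide whitespace tokens.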