[Groonga-commit] groonga/groonga at a44c8de [master] doc reference: add token filters

naoa null+****@clear*****
Sun Oct 26 09:57:29 JST 2014


naoa	2014-10-26 09:57:29 +0900 (Sun, 26 Oct 2014)

  New Revision: a44c8de55b0674d05f99cbf93df359ee59f04cb6
  https://github.com/groonga/groonga/commit/a44c8de55b0674d05f99cbf93df359ee59f04cb6

  Merged 145db6d: Merge pull request #227 from naoa/doc-add-token-filters

  Message:
    doc reference: add token filters

  Added files:
    doc/source/example/reference/token_filters/example-table-create.log
    doc/source/example/reference/token_filters/stem.log
    doc/source/example/reference/token_filters/stop_word.log
    doc/source/reference/token_filters.rst
  Modified files:
    doc/locale/en/LC_MESSAGES/reference.po
    doc/locale/ja/LC_MESSAGES/reference.po
    doc/source/reference.rst

  Modified: doc/locale/en/LC_MESSAGES/reference.po (+70 -0)
===================================================================
--- doc/locale/en/LC_MESSAGES/reference.po    2014-10-26 08:57:04 +0900 (453e3e9)
+++ doc/locale/en/LC_MESSAGES/reference.po    2014-10-26 09:57:29 +0900 (1593b77)
@@ -14559,6 +14559,76 @@ msgstr ""
 "database (sharding) or reduce each key size to handle 4GiB or more larger "
 "total key size."
 
+msgid "Token filters"
+msgstr "Token filters"
+
+msgid "Groonga has token filter module that some processes tokenized token."
+msgstr "Groonga has token filter module that some processes tokenized token."
+
+msgid "Token filter module can be added as a plugin."
+msgstr "Token filter module can be added as a plugin."
+
+msgid ""
+"You can customize tokenized token by registering your token filters plugins "
+"to Groonga."
+msgstr ""
+"You can customize tokenized token by registering your token filters plugins "
+"to Groonga."
+
+msgid ""
+"A token filter module is attached to a table. A token filter can have zero "
+"or N token filter module. You can attach a normalizer module to a table "
+"``token_filters`` option in :doc:`/reference/commands/table_create`."
+msgstr ""
+"A token filter module is attached to a table. A token filter can have zero "
+"or N token filter module. You can attach a normalizer module to a table "
+"``token_filters`` option in :doc:`/reference/commands/table_create`."
+
+msgid ""
+"Here is an example ``table_create`` that uses ``TokenFilterStopWord`` token "
+"filter module:"
+msgstr ""
+"Here is an example ``table_create`` that uses ``TokenFilterStopWord`` token "
+"filter module:"
+
+msgid "Available token filters"
+msgstr "Available token filters"
+
+msgid "Here are the list of available token filters:"
+msgstr "Here are the list of available token filters:"
+
+msgid "``TokenFilterStopWord``"
+msgstr ""
+
+msgid "``TokenFilterStem``"
+msgstr ""
+
+msgid ""
+"``TokenFilterStopWord`` removes stopword from tokenized token in searching "
+"the documents."
+msgstr ""
+"``TokenFilterStopWord`` removes stopword from tokenized token in searching "
+"the documents."
+
+msgid ""
+"``TokenFilterStopWord`` can specify stopword after adding the documents, "
+"because It removes token in searching the documents."
+msgstr ""
+"``TokenFilterStopWord`` can specify stopword after adding the documents, "
+"because It removes token in searching the documents."
+
+msgid "The stopword is specified ``stopword`` column on lexicon table."
+msgstr "The stopword is specified ``stopword`` column on lexicon table."
+
+msgid "Here is an example that uses ``TokenFilterStopWord`` token filter:"
+msgstr "Here is an example that uses ``TokenFilterStopWord`` token filter:"
+
+msgid "``TokenFilterStem`` stemming tokenized token."
+msgstr "``TokenFilterStem`` stemming tokenized token."
+
+msgid "Here is an example that uses ``TokenFilterStem`` token filter:"
+msgstr "Here is an example that uses ``TokenFilterStem`` token filter:"
+
 msgid "Tokenizers"
 msgstr "Tokenizers"
 

  Modified: doc/locale/ja/LC_MESSAGES/reference.po (+74 -0)
===================================================================
--- doc/locale/ja/LC_MESSAGES/reference.po    2014-10-26 08:57:04 +0900 (3b7521d)
+++ doc/locale/ja/LC_MESSAGES/reference.po    2014-10-26 09:57:29 +0900 (4650225)
@@ -13797,6 +13797,80 @@ msgstr ""
 "テーブルを分割したり、データベースを分割したり(シャーディング)、それぞれの"
 "キーのサイズを減らしてください。"
 
+msgid "Token filters"
+msgstr "Token filters"
+
+msgid "Groonga has token filter module that some processes tokenized token."
+msgstr ""
+"Groongaにはトークナイズされたトークンに所定の処理を行うトークンフィルター"
+"モジュールがあります。"
+
+msgid "Token filter module can be added as a plugin."
+msgstr "トークンフィルターモジュールはプラグインとして追加できます。"
+
+msgid ""
+"You can customize tokenized token by registering your token filters plugins "
+"to Groonga."
+msgstr ""
+"トークンフィルタープラグインをGroongaに追加することでトークナイズされたトークン"
+"をカスタマイズできます。"
+
+msgid ""
+"A token filter module is attached to a table. A token filter can have zero "
+"or N token filter module. You can attach a normalizer module to a table "
+"``token_filters`` option in :doc:`/reference/commands/table_create`."
+msgstr ""
+"トークンフィルターモジュールはテーブルに関連付いています。テーブルは0個かN個"
+"のトークンフィルターモジュールを持つことができます。"
+":doc:`/reference/commands/table_create` の ``token_filters`` オプションで"
+"テーブルにトークンフィルターオプションを関連付けることができます。"
+
+msgid ""
+"Here is an example ``table_create`` that uses ``TokenFilterStopWord`` token "
+"filter module:"
+msgstr ""
+"以下は ``TokenFilterStopWord`` トークンフィルターモジュールを使う"
+" ``table_create`` の例です。"
+
+msgid "Available token filters"
+msgstr "利用可能なトークンフィルター"
+
+msgid "Here are the list of available token filters:"
+msgstr "以下は組み込みのトークンフィルターのリストです。"
+
+msgid "``TokenFilterStopWord``"
+msgstr ""
+
+msgid "``TokenFilterStem``"
+msgstr ""
+
+msgid ""
+"``TokenFilterStopWord`` removes stopword from tokenized token in searching "
+"the documents."
+msgstr ""
+"``TokenFilterStopWord`` は、文書を検索する時にトークナイズされたトークンから"
+"ストップワードを除去します。"
+
+msgid ""
+"``TokenFilterStopWord`` can specify stopword after adding the documents, "
+"because It removes token in searching the documents."
+msgstr ""
+"``TokenFilterStopWord`` は、文書を検索する時のみトークン除去するため、文書"
+"を追加した後でストップワードを指定することもできます。"
+
+msgid "The stopword is specified ``stopword`` column on lexicon table."
+msgstr ""
+"ストップワードは、語彙表の ``stopword`` カラムで指定します。"
+
+msgid "Here is an example that uses ``TokenFilterStopWord`` token filter:"
+msgstr "以下は ``TokenFilterStopWord`` トークンフィルターを使う例です。"
+
+msgid "``TokenFilterStem`` stemming tokenized token."
+msgstr "``TokenFilterStem`` は、トークナイズされたトークンをステミングします。"
+
+msgid "Here is an example that uses ``TokenFilterStem`` token filter:"
+msgstr "以下は ``TokenFilterStem`` トークンフィルターを使う例です。"
+
 msgid "Tokenizers"
 msgstr ""
 

  Added: doc/source/example/reference/token_filters/example-table-create.log (+7 -0) 100644
===================================================================
--- /dev/null
+++ doc/source/example/reference/token_filters/example-table-create.log    2014-10-26 09:57:29 +0900 (ac93153)
@@ -0,0 +1,7 @@
+Execution example::
+
+  table_create Terms TABLE_PAT_KEY ShortText \
+    --default_tokenizer TokenBigram \
+    --normalizer NormalizerAuto \
+    --token_filters TokenFilterStopWord
+  # [[0,0.0,0.0],true]
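  Note: ``--token_filters`` takes a list of token filter names, so one
  lexicon can combine several filters. A minimal sketch (an assumption based
  on the option's comma-separated list syntax; both plugins must be
  registered first, as in the examples below)::

    register token_filters/stop_word
    register token_filters/stem
    table_create Terms TABLE_PAT_KEY ShortText \
      --default_tokenizer TokenBigram \
      --normalizer NormalizerAuto \
      --token_filters TokenFilterStopWord,TokenFilterStem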

  Added: doc/source/example/reference/token_filters/stem.log (+59 -0) 100644
===================================================================
--- /dev/null
+++ doc/source/example/reference/token_filters/stem.log    2014-10-26 09:57:29 +0900 (790bf4b)
@@ -0,0 +1,59 @@
+Execution example::
+
+  register token_filters/stem
+  # [[0,0.0,0.0],true]
+  table_create Memos TABLE_NO_KEY
+  # [[0,0.0,0.0],true]
+  column_create Memos content COLUMN_SCALAR ShortText
+  # [[0,0.0,0.0],true]
+  table_create Terms TABLE_PAT_KEY ShortText \
+    --default_tokenizer TokenBigram \
+    --normalizer NormalizerAuto \
+    --token_filters TokenFilterStem
+  # [[0,0.0,0.0],true]
+  column_create Terms memos_content COLUMN_INDEX|WITH_POSITION Memos content
+  # [[0,0.0,0.0],true]
+  load --table Memos
+  [
+  {"content": "I develop Groonga"},
+  {"content": "I'm developing Groonga"},
+  {"content": "I developed Groonga"}
+  ]
+  # [[0,0.0,0.0],3]
+  select Memos --match_columns content --query "develops"
+  # [
+  #   [
+  #     0,
+  #     0.0,
+  #     0.0
+  #   ],
+  #   [
+  #     [
+  #       [
+  #         3
+  #       ],
+  #       [
+  #         [
+  #           "_id",
+  #           "UInt32"
+  #         ],
+  #         [
+  #           "content",
+  #           "ShortText"
+  #         ]
+  #       ],
+  #       [
+  #         1,
+  #         "I develop Groonga"
+  #       ],
+  #       [
+  #         2,
+  #         "I'm developing Groonga"
+  #       ],
+  #       [
+  #         3,
+  #         "I developed Groonga"
+  #       ]
+  #     ]
+  #   ]
+  # ]
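  The query "develops" matches all three records because ``TokenFilterStem``
  reduces each inflected form to the same stem at index time and at search
  time. A hedged sketch of a follow-up query (expected behavior, assuming the
  stemmer maps "develop", "developing", "developed", and "develops" to one
  stem, as Snowball-style English stemmers do; not a captured log)::

    select Memos --match_columns content --query "developed"
    # expected: the same three records match again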

  Added: doc/source/example/reference/token_filters/stop_word.log (+62 -0) 100644
===================================================================
--- /dev/null
+++ doc/source/example/reference/token_filters/stop_word.log    2014-10-26 09:57:29 +0900 (7084ec2)
@@ -0,0 +1,62 @@
+Execution example::
+
+  register token_filters/stop_word
+  # [[0,0.0,0.0],true]
+  table_create Memos TABLE_NO_KEY
+  # [[0,0.0,0.0],true]
+  column_create Memos content COLUMN_SCALAR ShortText
+  # [[0,0.0,0.0],true]
+  table_create Terms TABLE_PAT_KEY ShortText \
+    --default_tokenizer TokenBigram \
+    --normalizer NormalizerAuto \
+    --token_filters TokenFilterStopWord
+  # [[0,0.0,0.0],true]
+  column_create Terms memos_content COLUMN_INDEX|WITH_POSITION Memos content
+  # [[0,0.0,0.0],true]
+  column_create Terms is_stop_word COLUMN_SCALAR Bool
+  # [[0,0.0,0.0],true]
+  load --table Terms
+  [
+  {"_key": "and", "is_stop_word": true}
+  ]
+  # [[0,0.0,0.0],1]
+  load --table Memos
+  [
+  {"content": "Hello"},
+  {"content": "Hello and Good-bye"},
+  {"content": "Good-bye"}
+  ]
+  # [[0,0.0,0.0],3]
+  select Memos --match_columns content --query "Hello and"
+  # [
+  #   [
+  #     0,
+  #     0.0,
+  #     0.0
+  #   ],
+  #   [
+  #     [
+  #       [
+  #         2
+  #       ],
+  #       [
+  #         [
+  #           "_id",
+  #           "UInt32"
+  #         ],
+  #         [
+  #           "content",
+  #           "ShortText"
+  #         ]
+  #       ],
+  #       [
+  #         1,
+  #         "Hello"
+  #       ],
+  #       [
+  #         2,
+  #         "Hello and Good-bye"
+  #       ]
+  #     ]
+  #   ]
+  # ]
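  Because ``TokenFilterStopWord`` removes tokens only at search time, a stop
  word can be added after documents are loaded, with no re-indexing. A
  minimal sketch continuing the database above ("hello" is lowercase because
  ``NormalizerAuto`` normalizes tokens before they are stored in ``Terms``)::

    load --table Terms
    [
    {"_key": "hello", "is_stop_word": true}
    ]
    # later searches skip the token "hello"; the existing
    # index entries for it are left as-is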

  Modified: doc/source/reference.rst (+1 -0)
===================================================================
--- doc/source/reference.rst    2014-10-26 08:57:04 +0900 (80a5e87)
+++ doc/source/reference.rst    2014-10-26 09:57:29 +0900 (2055254)
@@ -17,6 +17,7 @@ Reference manual
    reference/column
    reference/normalizers
    reference/tokenizers
+   reference/token_filters
    reference/query_expanders
    reference/pseudo_column
    reference/grn_expr

  Added: doc/source/reference/token_filters.rst (+96 -0) 100644
===================================================================
--- /dev/null
+++ doc/source/reference/token_filters.rst    2014-10-26 09:57:29 +0900 (bbe9c02)
@@ -0,0 +1,96 @@
+.. -*- rst -*-
+
+.. highlightlang:: none
+
+.. groonga-command
+.. database: token_filters
+
+Token filters
+=============
+
+Summary
+-------
+Groonga has token filter modules that process tokenized tokens.
+
+Token filter modules can be added as plugins.
+
+You can customize tokenized tokens by registering your token filter plugins to Groonga.
+
+Token filter modules are attached to a table. A table can have zero or
+more token filter modules. You can attach token filter modules to a table
+with the ``token_filters`` option in :doc:`/reference/commands/table_create`.
+
+Here is an example ``table_create`` that uses the ``TokenFilterStopWord``
+token filter module:
+
+.. groonga-command
+.. include:: ../example/reference/token_filters/example-table-create.log
+.. table_create Terms TABLE_PAT_KEY ShortText --default_tokenizer TokenBigram --normalizer NormalizerAuto --token_filters TokenFilterStopWord
+
+Available token filters
+-----------------------
+
+Here is the list of available token filters:
+
+* ``TokenFilterStopWord``
+* ``TokenFilterStem``
+
+``TokenFilterStopWord``
+^^^^^^^^^^^^^^^^^^^^^^^
+
+``TokenFilterStopWord`` removes stop words from tokenized tokens
+when searching documents.
+
+Stop words for ``TokenFilterStopWord`` can be specified after adding
+documents, because it removes tokens only when searching documents.
+
+Stop words are specified with the ``is_stop_word`` column on the lexicon table.
+
+Here is an example that uses the ``TokenFilterStopWord`` token filter:
+
+.. groonga-command
+.. include:: ../example/reference/token_filters/stop_word.log
+.. register token_filters/stop_word
+.. table_create Memos TABLE_NO_KEY
+.. column_create Memos content COLUMN_SCALAR ShortText
+.. table_create Terms TABLE_PAT_KEY ShortText   --default_tokenizer TokenBigram   --normalizer NormalizerAuto   --token_filters TokenFilterStopWord
+.. column_create Terms memos_content COLUMN_INDEX|WITH_POSITION Memos content
+.. column_create Terms is_stop_word COLUMN_SCALAR Bool
+.. load --table Terms
+.. [
+.. {"_key": "and", "is_stop_word": true}
+.. ]
+.. load --table Memos
+.. [
+.. {"content": "Hello"},
+.. {"content": "Hello and Good-bye"},
+.. {"content": "Good-bye"}
+.. ]
+.. select Memos --match_columns content --query "Hello and"
+
+``TokenFilterStem``
+^^^^^^^^^^^^^^^^^^^
+
+``TokenFilterStem`` stems tokenized tokens.
+
+Here is an example that uses the ``TokenFilterStem`` token filter:
+
+.. groonga-command
+.. include:: ../example/reference/token_filters/stem.log
+.. register token_filters/stem
+.. table_create Memos TABLE_NO_KEY
+.. column_create Memos content COLUMN_SCALAR ShortText
+.. table_create Terms TABLE_PAT_KEY ShortText   --default_tokenizer TokenBigram   --normalizer NormalizerAuto   --token_filters TokenFilterStem
+.. column_create Terms memos_content COLUMN_INDEX|WITH_POSITION Memos content
+.. load --table Memos
+.. [
+.. {"content": "I develop Groonga"},
+.. {"content": "I'm developing Groonga"},
+.. {"content": "I developed Groonga"}
+.. ]
+.. select Memos --match_columns content --query "develops"
+
+See also
+--------
+
+* :doc:`/reference/commands/table_create`