Adding Pinyin Search Support to Elasticsearch

彭世瑜 Published: 2019-11-19 23:29:24

A good resource: ELASTIC 搜索开发实战

I. Install the plugin

Install the pinyin analysis plugin, elasticsearch-analysis-pinyin.

Documentation: https://github.com/medcl/elasticsearch-analysis-pinyin
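After installation it can be useful to confirm that every node actually loaded the plugin. The following is a minimal sketch, assuming the JSON output of `GET /_cat/plugins?format=json` (a list of rows with `name` and `component` keys) and assuming the plugin reports itself under the component name `analysis-pinyin`. The helper name and the sample rows are hypothetical and only illustrate the check.

```python
def nodes_missing_plugin(cat_plugins_rows, all_nodes, component="analysis-pinyin"):
    """Return the sorted node names that did not report the given plugin component.

    cat_plugins_rows: parsed JSON rows from GET /_cat/plugins?format=json
    all_nodes: names of all nodes expected in the cluster
    component: plugin component name (assumed value, verify against your cluster)
    """
    nodes_with_plugin = {
        row["name"] for row in cat_plugins_rows if row.get("component") == component
    }
    return sorted(set(all_nodes) - nodes_with_plugin)


# Made-up sample rows, shaped like the _cat/plugins JSON output.
sample_rows = [
    {"name": "node-1", "component": "analysis-pinyin", "version": "6.8.0"},
    {"name": "node-2", "component": "analysis-ik", "version": "6.8.0"},
]

print(nodes_missing_plugin(sample_rows, ["node-1", "node-2"]))
```

An empty result means every listed node reported the plugin. Any node that shows up needs the plugin installed and a restart before the index settings below will be accepted.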

II. Create a new index with pinyin support

Replace <index> with the actual index name and <type> with the actual mapping type.

PUT /<index>
{
  "settings": {
    "analysis": {
      "analyzer": {
        "pinyin_analyzer": {
          "tokenizer": "my_pinyin"
        }
      },
      "tokenizer": {
        "my_pinyin": {
          "type": "pinyin",
          "keep_first_letter": false,
          "keep_separate_first_letter": false,
          "keep_full_pinyin": true,
          "keep_original": false,
          "limit_first_letter_length": 16,
          "lowercase": true
        }
      }
    }
  },
  "mappings": {
    "<type>": {
      "properties": {
        "name": {
          "type": "text",
          "index": true,
          "fields": {
            "pinyin": {
              "type": "text",
              "analyzer": "pinyin_analyzer"
            }
          }
        },
        "link": {
          "type": "keyword",
          "index": false
        },
        "id": {
          "type": "long"
        },
        "update_time": {
          "type": "date",
          "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
        }
      }
    }
  }
}
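When the same request has to be issued for several indices, the body can be generated instead of copied. This is a minimal sketch that rebuilds the request body above as a Python dict. The function names and the `first_letters` switch are hypothetical; the switch simply toggles `keep_first_letter` and `keep_separate_first_letter` together, which is an illustration choice rather than anything the plugin requires.

```python
import json


def pinyin_analysis(first_letters=False):
    """Build the "analysis" settings: one analyzer backed by a pinyin tokenizer."""
    return {
        "analyzer": {"pinyin_analyzer": {"tokenizer": "my_pinyin"}},
        "tokenizer": {
            "my_pinyin": {
                "type": "pinyin",
                "keep_first_letter": first_letters,
                "keep_separate_first_letter": first_letters,
                "keep_full_pinyin": True,
                "keep_original": False,
                "limit_first_letter_length": 16,
                "lowercase": True,
            }
        },
    }


def index_body(type_name, first_letters=False):
    """Build the full create-index body: analysis settings plus the field mappings."""
    properties = {
        "name": {
            "type": "text",
            "index": True,
            # Sub-field analyzed with the pinyin analyzer, queried as name.pinyin.
            "fields": {"pinyin": {"type": "text", "analyzer": "pinyin_analyzer"}},
        },
        "link": {"type": "keyword", "index": False},
        "id": {"type": "long"},
        "update_time": {
            "type": "date",
            "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis",
        },
    }
    return {
        "settings": {"analysis": pinyin_analysis(first_letters)},
        "mappings": {type_name: {"properties": properties}},
    }


# "doc" is a placeholder type name for the demo.
print(json.dumps(index_body("doc"), indent=2, ensure_ascii=False))
```

The printed JSON can be pasted into the console as the body of the `PUT` request, or sent with any HTTP client.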

Test the analyzer

GET /<index>/_analyze
{
  "field": "name.pinyin",
  "text": "内蒙古"
}

Response:
{
  "tokens": [
    {
      "token": "nei",
      "start_offset": 0,
      "end_offset": 1,
      "type": "word",
      "position": 0
    },
    {
      "token": "meng",
      "start_offset": 1,
      "end_offset": 2,
      "type": "word",
      "position": 1
    },
    {
      "token": "gu",
      "start_offset": 2,
      "end_offset": 3,
      "type": "word",
      "position": 2
    }
  ]
}
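
To turn this check into an automated one, the token strings can be pulled out of the `_analyze` response and compared against what you expect. A minimal sketch, using the response shown above as sample data; the helper name is hypothetical.

```python
def token_strings(analyze_response):
    """Return the token texts from an _analyze response, ordered by position."""
    tokens = sorted(analyze_response["tokens"], key=lambda t: t["position"])
    return [t["token"] for t in tokens]


# The response shown above for the text "内蒙古".
sample_response = {
    "tokens": [
        {"token": "nei", "start_offset": 0, "end_offset": 1, "type": "word", "position": 0},
        {"token": "meng", "start_offset": 1, "end_offset": 2, "type": "word", "position": 1},
        {"token": "gu", "start_offset": 2, "end_offset": 3, "type": "word", "position": 2},
    ]
}

print(token_strings(sample_response))  # ['nei', 'meng', 'gu']
```

Each Chinese character maps to one full-pinyin token here because `keep_full_pinyin` is the only "keep" option enabled in the tokenizer settings.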

III. Add pinyin support to an existing index

1. Create the index

PUT /<index>
{
  "mappings": {
    "<type>": {
      "properties": {
        "name": {
          "type": "keyword",
          "index": true
        },
        "link": {
          "type": "keyword",
          "index": false
        },
        "id": {
          "type": "long"
        },
        "update_time": {
          "type": "date",
          "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
        }
      }
    }
  }
}

2. Configure the pinyin analyzer

POST /<index>/_close

PUT /<index>/_settings
{
  "index": {
    "analysis": {
      "analyzer": {
        "pinyin_analyzer": {
          "tokenizer": "my_pinyin"
        }
      },
      "tokenizer": {
        "my_pinyin": {
          "type": "pinyin",
          "keep_first_letter": true,
          "keep_separate_first_letter": true,
          "keep_full_pinyin": true,
          "keep_original": false,
          "limit_first_letter_length": 16,
          "lowercase": true
        }
      }
    }
  }
}

POST /<index>/_open
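
Analysis settings are static, so they can only be changed while the index is closed, and a failed settings update would otherwise leave the index closed. The close, update, reopen sequence above can be sketched as follows. This is a minimal illustration, not a complete client: `send` is a hypothetical caller-supplied function that performs the HTTP request, and the demo only records the calls instead of contacting a cluster.

```python
def update_analysis(send, index, analysis):
    """Close the index, apply analysis settings, and always reopen it.

    send(method, path, body) is a caller-supplied function (hypothetical here)
    that issues the request to Elasticsearch and raises on failure.
    """
    send("POST", f"/{index}/_close", None)
    try:
        send("PUT", f"/{index}/_settings", {"index": {"analysis": analysis}})
    finally:
        # Reopen even if the settings update failed, so the index is not left closed.
        send("POST", f"/{index}/_open", None)


# Demo: record the requests instead of sending them.
recorded = []


def fake_send(method, path, body):
    recorded.append((method, path))


# "my_index" and the empty analysis dict are placeholders for the demo.
update_analysis(fake_send, "my_index", {})
for method, path in recorded:
    print(method, path)
```

In real use, `send` would wrap whichever HTTP client you already have, and `analysis` would be the `analysis` object from the settings request above.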

3. Update the mapping to add the pinyin sub-field

PUT /<index>/<type>/_mapping
{
  "<type>": {
    "properties": {
      "name": {
        "type": "keyword",
        "index": true,
        "fields": {
          "pinyin": {
            "type": "text",
            "analyzer": "pinyin_analyzer"
          }
        }
      },
      "link": {
        "type": "keyword",
        "index": false
      },
      "id": {
        "type": "long"
      },
      "update_time": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
      }
    }
  }
}


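GET /<index>/_mapping

# Update every document in place so that existing data is re-indexed and the new name.pinyin sub-field is populated
POST /<index>/_update_by_query?conflicts=proceed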

4. Test the search

GET /<index>/_search
{
  "query": {
    "query_string": {
      "fields": [
        "name",
        "name.pinyin"
      ],
      "query": "王苏川",
      "default_operator": "AND"
    }
  }
}
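
The search API expects the `query_string` clause under a top-level `query` key. A minimal sketch of building that body from user input and reading the matching names back out of a response; both helper names are hypothetical, and the sample response is made up to show the shape of `hits.hits`.

```python
def pinyin_search_body(text, fields=("name", "name.pinyin")):
    """Build a search body that matches the text against the raw and pinyin fields."""
    return {
        "query": {
            "query_string": {
                "fields": list(fields),
                "query": text,
                "default_operator": "AND",
            }
        }
    }


def hit_names(search_response):
    """Extract the "name" value of each hit from a _search response."""
    return [hit["_source"]["name"] for hit in search_response["hits"]["hits"]]


print(pinyin_search_body("王苏川"))

# Made-up response fragment, only to illustrate the structure being read.
sample_response = {"hits": {"hits": [{"_source": {"name": "王苏川"}}]}}
print(hit_names(sample_response))
```

Because `name.pinyin` is analyzed with the pinyin analyzer at both index and search time, the same body should also work when the text is typed as pinyin rather than Chinese characters.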

Reference: Elastic 搜索开发实战, 拼音处理 (pinyin handling)
