Installing the pinyin analysis plugin elasticsearch-analysis-pinyin

彭世瑜, published 2019-11-19 10:23:22

1. Open the releases page and find the version that matches your Elasticsearch version
https://github.com/medcl/elasticsearch-analysis-pinyin/releases

2. Copy the download link and install the plugin

For example, my Elasticsearch is 5.6.16:

./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-pinyin/releases/download/v5.6.16/elasticsearch-analysis-pinyin-5.6.16.zip
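
The version number appears twice in the URL, and both occurrences should match the running Elasticsearch version. Below is a minimal sketch that builds the URL and the install command for a given version. It assumes the release URL pattern from the command above holds for other versions, which is worth checking against the releases page. `pinyin_plugin_url` and `install_command` are hypothetical helper names.

```python
from typing import List

# Assumption: every release follows the URL pattern of the 5.6.16 example above.
RELEASE_URL = (
    "https://github.com/medcl/elasticsearch-analysis-pinyin/releases/"
    "download/v{version}/elasticsearch-analysis-pinyin-{version}.zip"
)


def pinyin_plugin_url(es_version: str) -> str:
    """Return the plugin download URL for the given Elasticsearch version."""
    return RELEASE_URL.format(version=es_version)


def install_command(es_version: str) -> List[str]:
    """Build the install command; it is meant to run from the ES home directory."""
    return ["./bin/elasticsearch-plugin", "install", pinyin_plugin_url(es_version)]


if __name__ == "__main__":
    # Prints the same command as the 5.6.16 example above.
    print(" ".join(install_command("5.6.16")))
```

The list returned by `install_command` can be passed to `subprocess.run` unchanged if the installation should be scripted.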

3. Restart ES!

4. Test the analyzer

GET _analyze
{
  "text": "学习",
  "analyzer": "pinyin"
}

Result:

{
  "tokens": [
    {
      "token": "xue",
      "start_offset": 0,
      "end_offset": 1,
      "type": "word",
      "position": 0
    },
    {
      "token": "xi",
      "start_offset": 1,
      "end_offset": 2,
      "type": "word",
      "position": 1
    },
    {
      "token": "xx",
      "start_offset": 0,
      "end_offset": 2,
      "type": "word",
      "position": 1
    }
  ]
}
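
The same check can be scripted. This is a sketch using only the Python standard library, assuming an unsecured node at `http://localhost:9200` (the host is an assumption, and `analyze` and `token_strings` are hypothetical helpers). The sample response is the one shown above, trimmed to the fields used here, so the parsing part runs without a live node.

```python
import json
import urllib.request

# The _analyze response shown above, trimmed to the fields used here.
SAMPLE_RESPONSE = """
{
  "tokens": [
    {"token": "xue", "start_offset": 0, "end_offset": 1, "position": 0},
    {"token": "xi",  "start_offset": 1, "end_offset": 2, "position": 1},
    {"token": "xx",  "start_offset": 0, "end_offset": 2, "position": 1}
  ]
}
"""


def analyze(text, analyzer="pinyin", host="http://localhost:9200"):
    """Send the _analyze request and return the raw response text.

    Assumption: an unsecured Elasticsearch node is reachable at `host`.
    """
    body = json.dumps({"text": text, "analyzer": analyzer}).encode("utf-8")
    request = urllib.request.Request(
        host + "/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


def token_strings(response_text):
    """Return the token strings from an _analyze response, in order."""
    return [t["token"] for t in json.loads(response_text)["tokens"]]


if __name__ == "__main__":
    # Offline check against the sample; use analyze("学习") on a live node.
    print(token_strings(SAMPLE_RESPONSE))  # ['xue', 'xi', 'xx']
```

Here `xue` and `xi` are the full pinyin of each character, and `xx` is the first-letter abbreviation of the whole word.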

Custom parameters

Parameter                                Default  Description
keep_first_letter                        true     刘德华 > ldh
keep_separate_first_letter               false    刘德华 > l,d,h
limit_first_letter_length                16       maximum length of the first-letter result
keep_full_pinyin                         true     刘德华 > [liu,de,hua]
keep_joined_full_pinyin                  false    刘德华 > [liudehua]
keep_none_chinese                        true     keep non-Chinese letters and digits in the result
keep_none_chinese_together               true     true: DJ音乐家 -> DJ,yin,yue,jia;
                                                  false: DJ音乐家 -> D,J,yin,yue,jia
keep_none_chinese_in_first_letter        true     刘德华AT2016 -> ldhat2016
keep_none_chinese_in_joined_full_pinyin  false    e.g. 刘德华2016 -> liudehua2016
none_chinese_pinyin_tokenize             true     e.g. liudehuaalibaba13zhuanghan -> liu,de,hua,a,li,ba,ba,13,zhuang,han
keep_original                            false    -
lowercase                                true     -
trim_whitespace                          true     -
remove_duplicated_term                   false    de的 > de
ignore_pinyin_offset                     true     -
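
These parameters are set on a custom tokenizer of type `pinyin` in the index settings. Below is a minimal sketch that builds such a settings body, assuming the usual custom-analyzer layout under `settings.analysis`. The names `my_pinyin`, `pinyin_analyzer` and `pinyin_index_settings` are illustrative and not taken from the post. Only overridden parameters are sent, so everything else keeps the default from the table.

```python
import json

# Parameter names from the table above; used only to catch typos.
PINYIN_PARAMETERS = {
    "keep_first_letter",
    "keep_separate_first_letter",
    "limit_first_letter_length",
    "keep_full_pinyin",
    "keep_joined_full_pinyin",
    "keep_none_chinese",
    "keep_none_chinese_together",
    "keep_none_chinese_in_first_letter",
    "keep_none_chinese_in_joined_full_pinyin",
    "none_chinese_pinyin_tokenize",
    "keep_original",
    "lowercase",
    "trim_whitespace",
    "remove_duplicated_term",
    "ignore_pinyin_offset",
}


def pinyin_index_settings(**overrides):
    """Build index settings with a custom pinyin tokenizer and analyzer.

    `my_pinyin` and `pinyin_analyzer` are hypothetical names.
    """
    unknown = set(overrides) - PINYIN_PARAMETERS
    if unknown:
        raise ValueError("unknown pinyin parameter(s): %s" % sorted(unknown))
    tokenizer = {"type": "pinyin"}
    tokenizer.update(overrides)
    return {
        "settings": {
            "analysis": {
                "analyzer": {"pinyin_analyzer": {"tokenizer": "my_pinyin"}},
                "tokenizer": {"my_pinyin": tokenizer},
            }
        }
    }


if __name__ == "__main__":
    body = pinyin_index_settings(keep_joined_full_pinyin=True, keep_original=True)
    print(json.dumps(body, indent=2))
```

The printed JSON would serve as the request body when creating an index, after which `pinyin_analyzer` can be referenced in field mappings or in an `_analyze` request against that index.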