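1. Open the releases page and find the release that matches your Elasticsearch version: https://github.com/medcl/elasticsearch-analysis-pinyin/releases
2. Copy the download link of that release and install the plugin from it.
For example, my Elasticsearch version is 5.6.16: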
./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-pinyin/releases/download/v5.6.16/elasticsearch-analysis-pinyin-5.6.16.zip
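The install command above can also be assembled from the version number. The following is a minimal sketch: the URL pattern is inferred only from the 5.6.16 example and may not hold for every release, and the names build_install_command and es_home are hypothetical, chosen for illustration.

```python
import shlex

# Assumption: every release follows the same URL pattern as the 5.6.16 example.
RELEASE_URL = (
    "https://github.com/medcl/elasticsearch-analysis-pinyin/releases/download/"
    "v{version}/elasticsearch-analysis-pinyin-{version}.zip"
)


def build_install_command(es_version, es_home="."):
    """Return the plugin install command as an argument list.

    es_home is the Elasticsearch installation directory (hypothetical
    parameter; "." matches the relative path used in the example above).
    """
    url = RELEASE_URL.format(version=es_version)
    return [f"{es_home}/bin/elasticsearch-plugin", "install", url]


# Print the command so it can be pasted into a shell.
print(shlex.join(build_install_command("5.6.16")))
```

The plugin version has to match the Elasticsearch version exactly, so it is worth checking on the releases page that the generated link really exists before running the command.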
3. Restart ES!
4. Test the analyzer
GET _analyze
{
  "text": "学习",
  "analyzer": "pinyin"
}
Result:
{
  "tokens": [
    {
      "token": "xue",
      "start_offset": 0,
      "end_offset": 1,
      "type": "word",
      "position": 0
    },
    {
      "token": "xi",
      "start_offset": 1,
      "end_offset": 2,
      "type": "word",
      "position": 1
    },
    {
      "token": "xx",
      "start_offset": 0,
      "end_offset": 2,
      "type": "word",
      "position": 1
    }
  ]
}
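The same test can be run from a script. This is a rough sketch rather than a definitive client: it assumes Elasticsearch is reachable at http://localhost:9200 without authentication, and the helper names analyze and token_strings are made up for illustration. The sample data is the response shown above, reduced to the fields used here.

```python
import json
import urllib.request


def analyze(text, analyzer="pinyin", base_url="http://localhost:9200"):
    """POST text to the _analyze API and return the decoded JSON response.

    base_url is an assumption; point it at your own cluster.
    """
    body = json.dumps({"text": text, "analyzer": analyzer}).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def token_strings(analyze_response):
    """Extract only the token text, in the order Elasticsearch returned it."""
    return [entry["token"] for entry in analyze_response["tokens"]]


# The response from the test above, abbreviated to two fields per token.
SAMPLE_RESPONSE = {
    "tokens": [
        {"token": "xue", "position": 0},
        {"token": "xi", "position": 1},
        {"token": "xx", "position": 1},
    ]
}

print(token_strings(SAMPLE_RESPONSE))  # ['xue', 'xi', 'xx']
```

Against a running node, token_strings(analyze("学习")) should give the same list if the plugin is installed and ES has been restarted.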
Custom parameters
Each entry below gives the parameter, its default value in parentheses, and a description or example.
keep_first_letter (true): 刘德华 > ldh
keep_separate_first_letter (false): 刘德华 > l,d,h
limit_first_letter_length (16): maximum length of the first_letter result
keep_full_pinyin (true): 刘德华 > [liu,de,hua]
keep_joined_full_pinyin (false): 刘德华 > [liudehua]
keep_none_chinese (true): keep non-Chinese letters or numbers in the result
keep_none_chinese_together (true): true: DJ音乐家 -> DJ,yin,yue,jia; false: DJ音乐家 -> D,J,yin,yue,jia
keep_none_chinese_in_first_letter (true): 刘德华AT2016 -> ldhat2016
keep_none_chinese_in_joined_full_pinyin (false): e.g. 刘德华2016 -> liudehua2016
none_chinese_pinyin_tokenize (true): e.g. liudehuaalibaba13zhuanghan -> liu,de,hua,a,li,ba,ba,13,zhuang,han
keep_original (false): -
lowercase (true): -
trim_whitespace (true): -
remove_duplicated_term (false): de的 > de
ignore_pinyin_offset (true): -
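These parameters are set on a custom tokenizer of type pinyin in the index settings. Below is a minimal sketch that builds such a settings body; the names pinyin_analyzer, my_pinyin and build_pinyin_settings are hypothetical, and whether a given parameter is available may depend on the plugin version. Only the overridden parameters are sent, so everything else keeps the default listed above.

```python
import json

# Parameter names taken from the list above; anything else is rejected early.
KNOWN_PARAMETERS = {
    "keep_first_letter",
    "keep_separate_first_letter",
    "limit_first_letter_length",
    "keep_full_pinyin",
    "keep_joined_full_pinyin",
    "keep_none_chinese",
    "keep_none_chinese_together",
    "keep_none_chinese_in_first_letter",
    "keep_none_chinese_in_joined_full_pinyin",
    "none_chinese_pinyin_tokenize",
    "keep_original",
    "lowercase",
    "trim_whitespace",
    "remove_duplicated_term",
    "ignore_pinyin_offset",
}


def build_pinyin_settings(**overrides):
    """Build index settings with a custom pinyin tokenizer and analyzer.

    "pinyin_analyzer" and "my_pinyin" are illustrative names, not required
    by the plugin.
    """
    unknown = sorted(set(overrides) - KNOWN_PARAMETERS)
    if unknown:
        raise ValueError(f"unknown pinyin parameters: {unknown}")
    return {
        "settings": {
            "analysis": {
                "analyzer": {"pinyin_analyzer": {"tokenizer": "my_pinyin"}},
                "tokenizer": {"my_pinyin": {"type": "pinyin", **overrides}},
            }
        }
    }


# Example: also emit the joined pinyin and keep the original Chinese text.
settings = build_pinyin_settings(keep_joined_full_pinyin=True, keep_original=True)
print(json.dumps(settings, indent=2))
```

The printed JSON can be used as the request body when creating an index, after which the analyzer can be tested with _analyze on that index using "analyzer": "pinyin_analyzer".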