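[Notice] This article is an original work by 蘑菇v5, who holds the copyright; infringement will be pursued. It was first published on Jianshu (简书). If you share it, credit the author and link to the source. Reposting without authorization is strictly prohibited.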
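A while ago, a company project needed a contacts feature. The contact data lives on the company server, so the mobile client uses the OkHttp framework to fetch the staff data as JSON over HTTPS.
A few problems came up during development. Search had to find people by fuzzy match on both the full pinyin and the pinyin abbreviation, and surnames written with polyphonic characters had to be read correctly. I had been using the pinyin conversion tool pinyin4j.jar. pinyin4j is a popular Java library for converting Chinese characters to pinyin, but it offered no handling that picks the right reading for a polyphonic character. Android's system Contacts app already implements the conversion from Chinese characters to pinyin in its HanziToPinyin utility class, which correctly handles cases such as the surname "单", where the reading could be dan or shan. The HanziToPinyin class is as follows: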
import android.text.TextUtils;
import android.util.Log;
import java.util.ArrayList;
import libcore.icu.Transliterator;
/**
*An object to convert Chinese character to its corresponding pinyin string.
*For characters with multiple possible pinyin string, only one is selected
*according to ICU Transliterator class. Polyphone is not supported in this
*implementation.
*/
public class HanziToPinyin {
private static final String TAG = "HanziToPinyin";
private static HanziToPinyin sInstance;
private Transliterator mPinyinTransliterator;
private Transliterator mAsciiTransliterator;
public static class Token {
/**
* Separator between target string for each source char
*/
public static final String SEPARATOR = " ";
public static final int LATIN = 1;
public static final int PINYIN = 2;
public static final int UNKNOWN = 3;
public Token() {
}
public Token(int type, String source, String target) {
this.type = type;
this.source = source;
this.target = target;
}
/**
* Type of this token: LATIN, PINYIN or UNKNOWN.
*/
public int type;
/**
* Original string before translation.
*/
public String source;
/**
* Translated string of source. For Han, target is corresponding Pinyin. Otherwise target is
* original string in source.
*/
public String target;
}
private HanziToPinyin() {
try {
mPinyinTransliterator = new Transliterator(
"Han-Latin/Names; Latin-Ascii; Any-Upper");
mAsciiTransliterator = new Transliterator("Latin-Ascii");
} catch (IllegalArgumentException e) {
Log.w(TAG, "Han-Latin/Names transliterator data is missing,"
+ " HanziToPinyin is disabled");
}
}
public boolean hasChineseTransliterator() {
return mPinyinTransliterator != null;
}
public static HanziToPinyin getInstance() {
synchronized (HanziToPinyin.class) {
if (sInstance == null) {
sInstance = new HanziToPinyin();
}
return sInstance;
}
}
private void tokenize(char character, Token token) {
token.source = Character.toString(character);
// ASCII
if (character < 128) {
token.type = Token.LATIN;
token.target = token.source;
return;
}
// Extended Latin. Transcode these to ASCII equivalents
if (character < 0x250 || (0x1e00 <= character && character < 0x1eff)) {
token.type = Token.LATIN;
token.target = mAsciiTransliterator == null ? token.source :
mAsciiTransliterator.transliterate(token.source);
return;
}
token.type = Token.PINYIN;
token.target = mPinyinTransliterator.transliterate(token.source);
if (TextUtils.isEmpty(token.target) ||
TextUtils.equals(token.source, token.target)) {
token.type = Token.UNKNOWN;
token.target = token.source;
}
}
public String transliterate(final String input) {
if (!hasChineseTransliterator() || TextUtils.isEmpty(input)) {
return null;
}
return mPinyinTransliterator.transliterate(input);
}
/**
* Convert the input to an array of tokens. A run of ASCII or unknown characters without
* spaces is put into one Token; each Hanzi character that has a pinyin is treated as its own
* Token. If there is no Chinese transliterator, an empty token list is returned.
*/
public ArrayList<Token> getTokens(final String input) {
ArrayList<Token> tokens = new ArrayList<Token>();
if (!hasChineseTransliterator() || TextUtils.isEmpty(input)) {
// return empty tokens.
return tokens;
}
final int inputLength = input.length();
final StringBuilder sb = new StringBuilder();
int tokenType = Token.LATIN;
Token token = new Token();
// Go through the input, create a new token when
// a. Token type changed
// b. Get the Pinyin of current character.
// c. current character is space.
for (int i = 0; i < inputLength; i++) {
final char character = input.charAt(i);
if (Character.isSpaceChar(character)) {
if (sb.length() > 0) {
addToken(sb, tokens, tokenType);
}
} else {
tokenize(character, token);
if (token.type == Token.PINYIN) {
if (sb.length() > 0) {
addToken(sb, tokens, tokenType);
}
tokens.add(token);
token = new Token();
} else {
if (tokenType != token.type && sb.length() > 0) {
addToken(sb, tokens, tokenType);
}
sb.append(token.target);
}
tokenType = token.type;
}
}
if (sb.length() > 0) {
addToken(sb, tokens, tokenType);
}
return tokens;
}
private void addToken(
final StringBuilder sb, final ArrayList<Token> tokens, final int tokenType) {
String str = sb.toString();
tokens.add(new Token(tokenType, str, str));
sb.setLength(0);
}
/**
* General-purpose helper: takes Chinese text and returns its pinyin in upper case.
*/
public static String getPinYin(String hanzi) {
ArrayList<Token> tokens = HanziToPinyin.getInstance().getTokens(hanzi);
StringBuilder sb = new StringBuilder();
if (tokens != null && tokens.size() > 0) {
for (Token token : tokens) {
if (Token.PINYIN == token.type) {
sb.append(token.target);
} else {
sb.append(token.source);
}
}
}
return sb.toString().toUpperCase();
}
}
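The Transliterator class used above must be placed in the libcore.icu package, so that it matches the import libcore.icu.Transliterator in HanziToPinyin.
The code of the Transliterator class is as follows:
package libcore.icu;
public final class Transliterator {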
private long peer;
/**
* Creates a new Transliterator for the given id.
*/
public Transliterator(String id) {
peer = create(id);
}
@Override protected synchronized void finalize() throws Throwable {
try {
destroy(peer);
peer = 0;
} finally {
super.finalize();
}
}
/**
* Returns the ids of all known transliterators.
*/
public static native String[] getAvailableIDs();
/**
* Transliterates the specified string.
*/
public String transliterate(String s) {
return transliterate(peer, s);
}
private static native long create(String id);
private static native void destroy(long peer);
private static native String transliterate(long peer, String s);
}
In the project, call it like this:
String pinyin=HanziToPinyin.getPinYin(name);
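To make the polyphonic surname problem concrete, here is a minimal sketch of one possible fallback for the case where hasChineseTransliterator() returns false. It is not what HanziToPinyin does (that class relies on the "Han-Latin/Names" transliterator data). The class SurnameReadings, its method and its four-entry table are hypothetical and purely illustrative. The entries are 单 (shan as a surname, usually dan), 曾 (zeng, usually ceng), 仇 (qiu, usually chou) and 解 (xie, usually jie), written as Unicode escapes so the file compiles regardless of source encoding.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper, not part of HanziToPinyin: overrides the reading of polyphonic surnames. */
final class SurnameReadings {
    // Illustrative table only; a real list would be much longer.
    private static final Map<Character, String> SURNAME_PINYIN = new HashMap<>();
    static {
        SURNAME_PINYIN.put('\u5355', "SHAN"); // U+5355, usually read "dan"
        SURNAME_PINYIN.put('\u66FE', "ZENG"); // U+66FE, usually read "ceng"
        SURNAME_PINYIN.put('\u4EC7', "QIU");  // U+4EC7, usually read "chou"
        SURNAME_PINYIN.put('\u89E3', "XIE");  // U+89E3, usually read "jie"
    }

    /**
     * Returns the surname reading for the first character of name if the table has one,
     * otherwise the fallback reading supplied by the caller.
     */
    static String surnamePinyin(String name, String fallback) {
        if (name == null || name.isEmpty()) {
            return fallback;
        }
        String override = SURNAME_PINYIN.get(name.charAt(0));
        return override != null ? override : fallback;
    }

    public static void main(String[] args) {
        // First character U+5355 is in the table, so the generic reading "DAN" is replaced.
        System.out.println(surnamePinyin("\u5355\u7530\u82B3", "DAN")); // prints SHAN
        // U+674E is not in the table, so the fallback is kept.
        System.out.println(surnamePinyin("\u674E\u767D", "LI"));       // prints LI
    }
}
```

The override applies only to the first character, on the assumption that the name starts with a single-character surname; two-character surnames would need a separate table.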
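That covers polyphonic characters in the contact list. Next comes fuzzy search. There are plenty of articles on this online, but most cover only search by full pinyin and by text; this article also covers search by pinyin abbreviation.
First, write an entity class such as SortModel, which makes it easy to read and write the fields of a contact object:
import java.io.Serializable;
public class SortModel extends Contact implements Serializable{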
public SortModel() {
super();
}
public SortModel(String id,String usercode,String name,String pinyin,String status,String serverTime) {
super(id,usercode,name,pinyin,status,serverTime);
}
public SortModel(String id,String usercode,String name,String pinyin,String status) {
super(id,usercode,name,pinyin,status);
}
public String sortLetters; // first letter of the pinyin, shown as the index letter
public SortToken sortToken = new SortToken(); // Chinese full name, full pinyin, abbreviated pinyin
}
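The sortLetters field holds either a letter from A to Z or "#". The article does not show how the list is ordered, so the following is only a sketch of one common choice: order by index letter, then by pinyin, and push "#" to the end, since '#' would otherwise sort before 'A' in plain character order. The Item class and the BY_INDEX_LETTER comparator are hypothetical stand-ins, not part of the original project.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

final class SortLetterOrder {
    /** Hypothetical stand-in for SortModel: only the two fields the ordering needs. */
    static final class Item {
        final String sortLetters;
        final String pinyin;

        Item(String sortLetters, String pinyin) {
            this.sortLetters = sortLetters;
            this.pinyin = pinyin;
        }
    }

    /** Orders by index letter, then by pinyin; the "#" bucket always goes last. */
    static final Comparator<Item> BY_INDEX_LETTER = (a, b) -> {
        boolean aHash = "#".equals(a.sortLetters);
        boolean bHash = "#".equals(b.sortLetters);
        if (aHash != bHash) {
            return aHash ? 1 : -1;
        }
        int byLetter = a.sortLetters.compareTo(b.sortLetters);
        return byLetter != 0 ? byLetter : a.pinyin.compareTo(b.pinyin);
    };

    public static void main(String[] args) {
        List<Item> items = new ArrayList<>(Arrays.asList(
                new Item("#", "123"),
                new Item("Z", "ZHANGSAN"),
                new Item("L", "LIBAI"),
                new Item("S", "SHANTIANFANG")));
        Collections.sort(items, BY_INDEX_LETTER);
        StringBuilder order = new StringBuilder();
        for (Item item : items) {
            order.append(item.sortLetters);
        }
        System.out.println(order); // prints LSZ#
    }
}
```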
The Contact class is as follows:
import java.io.Serializable;
public class Contact implements Serializable{
public String id;
public String usercode;
public String name;
public String pinyin;
// Status of the person: add, del or update
public String status;
public boolean isChecked;
// Server time
public String serverTime;
public Contact(){}
public Contact(String id,String usercode,String name,String pinyin){
this.id=id;
this.usercode=usercode;
this.name=name;
this.pinyin=pinyin;
}
public Contact(String id, String usercode, String name, String pinyin, String status) {
this.id = id;
this.usercode = usercode;
this.name = name;
this.pinyin = pinyin;
this.status = status;
}
public Contact(String id, String usercode, String name, String pinyin, String status, String serverTime) {
this.id = id;
this.usercode = usercode;
this.name = name;
this.pinyin = pinyin;
this.status = status;
this.serverTime = serverTime;
}
public String getId() {
return id;
}
public void setId(String id) {
this.id = id;
}
public String getUsercode() {
return usercode;
}
public void setUsercode(String usercode) {
this.usercode = usercode;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public String getPinyin() {
return pinyin;
}
public void setPinyin(String pinyin) {
this.pinyin = pinyin;
}
public String getStatus() {
return status;
}
public void setStatus(String status) {
this.status = status;
}
public void setIsChecked(boolean isChecked) {
this.isChecked = isChecked;
}
public boolean isChecked() {
return isChecked;
}
public String getServerTime() {
return serverTime;
}
public void setServerTime(String serverTime) {
this.serverTime = serverTime;
}
@Override
public String toString() {
return "Contact{" +
"id='" + id + '\'' +
", usercode='" + usercode + '\'' +
", name='" + name + '\'' +
", pinyin='" + pinyin + '\'' +
", status='" + status + '\'' +
", isChecked=" + isChecked +
'}';
}
}
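The SortToken class:
import java.io.Serializable;
/**
 * Pinyin forms of a name.
 */
public class SortToken implements Serializable {
    public String simpleSpell = ""; // abbreviated pinyin (initials)
    public String wholeSpell = "";  // full pinyin
    public String chName = "";      // Chinese full name
}
Building the abbreviated and full pinyin is plain string handling, so I won't go into detail here. I work from a sortKey in the format "SHI 世 JIE 界 NI 你 HAO 好", and wrote a utility class, PinyinUtils, to process the pinyin: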
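import java.util.Locale;

public class PinyinUtils {
    /**
     * Returns the index letter for a contact: the first letter of its pinyin,
     * or "#" when the pinyin is missing or does not start with A-Z.
     *
     * @param name   the contact's display name
     * @param pinyin the contact's pinyin
     * @return a single upper-case letter, or "#"
     */
    public static String getSortLetter(String name, String pinyin) {
        String letter = "#";
        // pinyin is the value dereferenced below, so guard it as well as name.
        if (name == null || pinyin == null || pinyin.isEmpty()) {
            return letter;
        }
        String sortString = pinyin.substring(0, 1).toUpperCase(Locale.CHINESE);
        // Use the first character only if it is an English letter.
        if (sortString.matches("[A-Z]")) {
            letter = sortString;
        }
        return letter;
    }

    // Matches a run of Chinese characters.
    private static final String chReg = "[\\u4E00-\\u9FA5]+";
    // String chReg = "[^\\u4E00-\\u9FA5]"; // matches any character other than Chinese

    /**
     * Parses a sort key and fills in the abbreviated and full pinyin.
     *
     * @param sortKey e.g. "SHI 世 JIE 界 NI 你 HAO 好"
     * @return a SortToken with simpleSpell and wholeSpell set
     */
    public static SortToken parseSortKey(String sortKey) {
        SortToken token = new SortToken();
        if (sortKey != null && sortKey.length() > 0) {
            // Drop the spaces, then split on the Chinese characters
            // so that only the pinyin syllables remain.
            String[] enStrs = sortKey.replace(" ", "").split(chReg);
            StringBuilder simple = new StringBuilder();
            StringBuilder whole = new StringBuilder();
            for (String enStr : enStrs) {
                if (enStr.length() > 0) {
                    simple.append(enStr.charAt(0)); // abbreviated pinyin: first letter of each syllable
                    whole.append(enStr);            // full pinyin
                }
            }
            token.simpleSpell = simple.toString();
            token.wholeSpell = whole.toString();
        }
        return token;
    }
}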
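The sortKey parsing can be traced end to end with a small standalone sketch. The article does not show how the sortKey itself is produced, so buildSortKey below is a hypothetical builder that simply interleaves each syllable with its character. The parse method mirrors the split-on-Chinese-characters step of parseSortKey, but returns a plain two-element array instead of a SortToken so that the example has no dependencies.

```java
/** Standalone illustration of the sortKey format; names here are hypothetical. */
final class SortKeyDemo {
    // Matches a run of Chinese characters (same pattern as in PinyinUtils).
    private static final String CH_REG = "[\\u4E00-\\u9FA5]+";

    /**
     * Hypothetical builder: interleaves each syllable with its character,
     * producing the "SHI (char) JIE (char) ..." layout described in the text.
     */
    static String buildSortKey(String[] syllables, String name) {
        if (syllables.length != name.length()) {
            throw new IllegalArgumentException("one syllable per character expected");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < syllables.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(syllables[i]).append(' ').append(name.charAt(i));
        }
        return sb.toString();
    }

    /** Returns {abbreviated pinyin, full pinyin} parsed from a sort key. */
    static String[] parse(String sortKey) {
        StringBuilder simple = new StringBuilder();
        StringBuilder whole = new StringBuilder();
        // Remove spaces, then split on Chinese characters so only syllables remain.
        for (String part : sortKey.replace(" ", "").split(CH_REG)) {
            if (!part.isEmpty()) {
                simple.append(part.charAt(0));
                whole.append(part);
            }
        }
        return new String[] {simple.toString(), whole.toString()};
    }

    public static void main(String[] args) {
        // The four characters of the article's example, written as Unicode escapes.
        String name = "\u4E16\u754C\u4F60\u597D";
        String key = buildSortKey(new String[] {"SHI", "JIE", "NI", "HAO"}, name);
        String[] spells = parse(key);
        System.out.println(spells[0]); // prints SJNH
        System.out.println(spells[1]); // prints SHIJIENIHAO
    }
}
```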
Now for the fuzzy search method itself:
/**
 * Searches contacts by name or pinyin.
 *
 * @param str              the search text
 * @param mAllContactsList all contacts
 * @return the contacts that match
 */
public List<SortModel> searchContact(final String str, List<SortModel> mAllContactsList) {
    List<SortModel> filterList = new ArrayList<SortModel>(); // filtered result
    // if (str.matches("^([0-9]|[/+])*$")) { // regex: matches a phone number
    // Regex: the text starts with a digit or a plus sign
    // (this also covers numbers separated by spaces or '-').
    if (str.matches("^[0-9+].*")) {
        for (SortModel contact : mAllContactsList) {
            if (contact.name != null && contact.name.contains(str)
                    && !filterList.contains(contact)) {
                filterList.add(contact);
            }
        }
    } else {
        // Lower-case the search text once instead of once per comparison.
        final String key = str.toLowerCase(Locale.CHINESE);
        for (SortModel contact : mAllContactsList) {
            if (contact.name != null) {
                // Match on the name itself, on the abbreviated pinyin (initials),
                // or on the full pinyin.
                if (contact.name.toLowerCase(Locale.CHINESE).contains(key)
                        || contact.sortToken.simpleSpell.toLowerCase(Locale.CHINESE).contains(key)
                        || contact.sortToken.wholeSpell.toLowerCase(Locale.CHINESE).contains(key)) {
                    if (!filterList.contains(contact)) {
                        filterList.add(contact);
                    }
                }
            }
        }
    }
    return filterList;
}
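The three-way match in searchContact (name, abbreviated pinyin, full pinyin) can be tried without Android in a short sketch. Entry is a hypothetical stand-in for SortModel plus SortToken, and the sample data is invented for illustration; the names are written as Unicode escapes so the file compiles regardless of source encoding.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class FuzzySearchDemo {
    /** Hypothetical stand-in for SortModel + SortToken. */
    static final class Entry {
        final String name;
        final String simpleSpell;
        final String wholeSpell;

        Entry(String name, String simpleSpell, String wholeSpell) {
            this.name = name;
            this.simpleSpell = simpleSpell;
            this.wholeSpell = wholeSpell;
        }
    }

    /** Case-insensitive substring match on name, abbreviated pinyin, or full pinyin. */
    static List<Entry> search(String query, List<Entry> all) {
        List<Entry> hits = new ArrayList<>();
        if (query == null || query.isEmpty()) {
            return hits;
        }
        String q = query.toLowerCase(Locale.ROOT);
        for (Entry e : all) {
            if (e.name.toLowerCase(Locale.ROOT).contains(q)
                    || e.simpleSpell.toLowerCase(Locale.ROOT).contains(q)
                    || e.wholeSpell.toLowerCase(Locale.ROOT).contains(q)) {
                hits.add(e);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Entry> all = Arrays.asList(
                new Entry("\u5355\u7530\u82B3", "STF", "SHANTIANFANG"),
                new Entry("\u674E\u767D", "LB", "LIBAI"),
                new Entry("\u5F20\u4E09", "ZS", "ZHANGSAN"));
        System.out.println(search("stf", all).size());    // prints 1 (abbreviation match)
        System.out.println(search("shan", all).size());   // prints 1 (full pinyin match)
        System.out.println(search("an", all).size());     // prints 2 (SHANTIANFANG, ZHANGSAN)
        System.out.println(search("\u767D", all).size()); // prints 1 (match on the name itself)
    }
}
```

Because the match is a plain substring test on the concatenated pinyin, a query can also hit across a syllable boundary, for example "nt" matches SHANTIANFANG. Whether that is acceptable depends on how loose the search is meant to be.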
This approach solved the problems I ran into in the project. Here is a screenshot of it running on my own phone, with the icon and title edited out: