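The semester is almost over, and Xuexitong (学习通) suddenly added a new course with a huge number of chapter exercises. Fortunately, every chapter comes with a reference answer key.
After checking a few questions against the answers I got irritated and lost patience: the answer keys are provided as images, so Ctrl+F search doesn't work, and the questions appear in shuffled order. With dozens of images across all the chapters, all I could do was flip through them night after night.
At that point I couldn't help admiring how clever the teacher was. Brilliant!
So...
It occurred to me that I could use text recognition (OCR) to convert the text in the images into plain text, then search by keyword for the matching answer. Perfect.
Let's get to it.
First, pick a platform with strong AI capabilities. Baidu is a good choice, and it offers a few hundred free API calls every month.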
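Search Baidu for "百度大脑" (Baidu Brain), register and log in, choose Text Recognition under the open capabilities, and create an application. This gives you the corresponding keys, as shown in the figure below.
Use the API Key and Secret Key to send a request to the authentication API and obtain an access token. For example: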
// Requires: using System; using System.Collections.Generic; using System.Net.Http;
// API Key
private static String clientId = "your Baidu Cloud app's AK";
// Secret Key
private static String clientSecret = "your Baidu Cloud app's SK";
// Requests an access token. Returns the raw JSON response, which contains the access_token field.
public static String getAccessToken() {
    String authHost = "https://aip.baidubce.com/oauth/2.0/token";
    HttpClient client = new HttpClient();
    List<KeyValuePair<String, String>> paraList = new List<KeyValuePair<string, string>>();
    paraList.Add(new KeyValuePair<string, string>("grant_type", "client_credentials"));
    paraList.Add(new KeyValuePair<string, string>("client_id", clientId));
    paraList.Add(new KeyValuePair<string, string>("client_secret", clientSecret));
    // Blocking on .Result keeps the sample short; prefer async/await in real code
    HttpResponseMessage response = client.PostAsync(authHost, new FormUrlEncodedContent(paraList)).Result;
    String result = response.Content.ReadAsStringAsync().Result;
    Console.WriteLine(result);
    return result;
}
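Note that getAccessToken returns the whole JSON response rather than the token itself, so the token still has to be pulled out before it can be appended to the request URL. Here is a minimal sketch of that step, assuming the response carries the token in an access_token field and that System.Text.Json is available (built into .NET Core 3.0+, a NuGet package on .NET Framework). TokenParser and ExtractAccessToken are names I made up for illustration.

```csharp
using System;
using System.Text.Json;

public static class TokenParser
{
    // Extracts the "access_token" string from the JSON body returned by the token endpoint.
    // Throws InvalidOperationException when the field is absent (for example, an error response).
    public static string ExtractAccessToken(string json)
    {
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            if (doc.RootElement.TryGetProperty("access_token", out JsonElement token)
                && token.ValueKind == JsonValueKind.String)
            {
                return token.GetString();
            }
        }
        throw new InvalidOperationException("No access_token in response: " + json);
    }
}
```

With this in place, something like TokenParser.ExtractAccessToken(getAccessToken()) would yield the bare token string.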
// Example of a returned token:
// "24.adda70c11b9786206253ddb70affdc46.2592000.1493524354.282335-1234567"
Send the token together with the Base64-encoded image to the recognition API; the response contains the recognized text.
// Requires: using System; using System.IO; using System.Net; using System.Text; using System.Web;
//string hostUrl= "https://aip.baidubce.com/rest/2.0/ocr/v1/idcard?access_token=" + token;
/// <summary>
/// Sends the request to the text recognition service
/// </summary>
/// <param name="imgpath">Path to the image</param>
/// <param name="hostUrl">Request URL plus the access_token parameter</param>
/// <returns>The raw response body</returns>
public static string ClientRequest(string imgpath, string hostUrl)
{
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(hostUrl);
    request.Method = "POST";
    // The body is form-encoded, so declare it as such
    request.ContentType = "application/x-www-form-urlencoded";
    request.KeepAlive = true;
    // Base64 encoding of the image
    string base64 = Common.getFileBase64(imgpath);
    String str = "language_type=" + "CHN_ENG" + "&result_type=" + "big" + "&image=" + HttpUtility.UrlEncode(base64);
    // Use UTF-8 explicitly; Encoding.Default depends on the machine's code page
    byte[] buffer = Encoding.UTF8.GetBytes(str);
    request.ContentLength = buffer.Length;
    using (Stream body = request.GetRequestStream())
    {
        body.Write(buffer, 0, buffer.Length);
    }
    using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
    using (StreamReader reader = new StreamReader(response.GetResponseStream(), Encoding.UTF8))
    {
        return reader.ReadToEnd();
    }
}
/// <summary>
/// Returns the Base64 encoding of a file
/// </summary>
/// <param name="fileName">Path to the file</param>
/// <returns>The file contents as a Base64 string</returns>
public static String getFileBase64(String fileName)
{
    // File.ReadAllBytes (System.IO) reads the whole file and closes it, even if an exception is thrown
    byte[] arr = File.ReadAllBytes(fileName);
    return Convert.ToBase64String(arr);
}
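ClientRequest hands back raw JSON, so one more step is needed to get searchable text out of it. The sketch below assumes the response has the shape I remember from Baidu's general text recognition API, a words_result array whose items each hold a words string; other endpoints or result_type values may return a different layout, so check the actual response first. OcrResultParser and ExtractLines are hypothetical names, and System.Text.Json is assumed to be available.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class OcrResultParser
{
    // Collects the "words" string of every entry in "words_result" (assumed response shape).
    // Returns an empty list when the array is missing, e.g. for an error response.
    public static List<string> ExtractLines(string json)
    {
        var lines = new List<string>();
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            if (!doc.RootElement.TryGetProperty("words_result", out JsonElement results)
                || results.ValueKind != JsonValueKind.Array)
            {
                return lines;
            }
            foreach (JsonElement item in results.EnumerateArray())
            {
                if (item.ValueKind == JsonValueKind.Object
                    && item.TryGetProperty("words", out JsonElement words)
                    && words.ValueKind == JsonValueKind.String)
                {
                    lines.Add(words.GetString());
                }
            }
        }
        return lines;
    }
}
```

Joining the result with string.Join(Environment.NewLine, lines) gives one block of text per image.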
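This gives us basic text recognition for a single image. To recognize images in bulk and save the results to a text file, you still need to write the corresponding batch logic. The complete code is on my GitHub.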
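For a rough idea of what that batch step could look like, here is a minimal sketch; it is not the code from my repository. It assumes the images sit in one folder as .png/.jpg/.jpeg files, and it takes the recognition step as a delegate (image path in, recognized text out) so that it does not depend on any particular API call. AnswerIndex, BuildTextFile and Search are names invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;

public static class AnswerIndex
{
    // Runs `recognize` over every image in imageDir (sorted by path) and writes the
    // text to outputPath, with a "## file name" header line before each image's text.
    public static void BuildTextFile(string imageDir, string outputPath, Func<string, string> recognize)
    {
        var extensions = new HashSet<string>(StringComparer.OrdinalIgnoreCase) { ".png", ".jpg", ".jpeg" };
        using (var writer = new StreamWriter(outputPath, false, Encoding.UTF8))
        {
            foreach (string path in Directory.EnumerateFiles(imageDir)
                         .Where(p => extensions.Contains(Path.GetExtension(p)))
                         .OrderBy(p => p, StringComparer.Ordinal))
            {
                writer.WriteLine("## " + Path.GetFileName(path));
                writer.WriteLine(recognize(path));
            }
        }
    }

    // Returns "lineNumber: text" for every line of the file that contains the keyword
    // (case-insensitive), which stands in for the Ctrl+F the images did not allow.
    public static List<string> Search(string textPath, string keyword)
    {
        var hits = new List<string>();
        int lineNumber = 0;
        foreach (string line in File.ReadLines(textPath))
        {
            lineNumber++;
            if (line.IndexOf(keyword, StringComparison.OrdinalIgnoreCase) >= 0)
            {
                hits.Add(lineNumber + ": " + line);
            }
        }
        return hits;
    }
}
```

In practice the delegate would wrap the single-image request shown earlier. Since the free quota is limited, it is probably worth running the batch once and searching the saved text file afterwards, rather than calling the API again for each lookup.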