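Recently, while working on a project at my company, I had to import data from CSV files. CSV fields are separated by an ASCII comma (,) by default, and the simplest way to read such a file in Java is to split each line on commas. That works as long as the data itself contains no commas, but the data I had to import did contain commas inside fields, and that is exactly where the simple approach breaks.
If you are reading this post, you have probably run into the same problem.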
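To make the problem concrete, here is a minimal sketch of the naive approach. The class name `NaiveSplitDemo` and the sample row are made up for illustration; they are not from my project.

```java
import java.util.Arrays;

public class NaiveSplitDemo {
    // Naive approach: split the raw line on every comma, ignoring quotes.
    static String[] naiveSplit(String line) {
        return line.split(",");
    }

    public static void main(String[] args) {
        // Hypothetical sample row: the second field contains a comma,
        // so it is wrapped in double quotes.
        String line = "1,\"Beijing, China\",2020";
        String[] cells = naiveSplit(line);
        // The row has 3 fields, but the quoted field is cut in two,
        // so the naive split returns 4 pieces.
        System.out.println(cells.length);
        System.out.println(Arrays.toString(cells));
    }
}
```

The comma inside the quotes is treated as a separator, so every column after it shifts by one position.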
1.1 Solution 1 (recommended)
pom.xml
<dependency>
<groupId>com.opencsv</groupId>
<artifactId>opencsv</artifactId>
<version>4.4</version>
</dependency>
1.2 Code example
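import com.opencsv.CSVReader;
import com.opencsv.CSVReaderBuilder;
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.util.Arrays;
import java.util.Iterator;

public void readCSV() {
    String srcPath = "D:\\data\\line.csv";
    String charset = "utf-8";
    // CSVReader parses quoted fields, so a comma inside double quotes stays within one cell
    try (CSVReader csvReader = new CSVReaderBuilder(
            new BufferedReader(new InputStreamReader(new FileInputStream(srcPath), charset))).build()) {
        Iterator<String[]> iterator = csvReader.iterator();
        while (iterator.hasNext()) {
            // Print the cells of the current row, then end the line
            Arrays.stream(iterator.next()).forEach(System.out::print);
            System.out.println();
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}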
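2.1 Solution 2
Among the articles I read, the approach that looked best was matching with regular expressions. Fields that contain commas are wrapped in double quotes in the CSV data, so if each line is split only on the commas that lie outside double quotes, you get exactly the data you want.
2.2 Code snippet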
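import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Reads a CSV file line by line and splits each data row on the commas outside double quotes.
 * logger, tableName and TableInfo belong to the surrounding class and are not shown in this snippet.
 *
 * @param srcPath path of the CSV file
 */
private void readCSVFileData(String srcPath) {
    String[] fieldsArr = null;
    int lineNum = 0;
    int insertResult = 0;
    TableInfo tableInfo = new TableInfo();
    tableInfo.setTableName(tableName);
    // One cell: optional quoted parts (which may contain commas), then everything up to the next comma.
    // Compiled once instead of once per line.
    Pattern pCells = Pattern.compile("(\"[^\"]*(\"{2})*[^\"]*\")*[^,]*,");
    // Data rows from line 2 onward, one list of cells per row
    List<List<String>> listField = new ArrayList<>();
    // try-with-resources closes the reader even if reading fails
    try (BufferedReader reader = new BufferedReader(new FileReader(srcPath))) {
        String line;
        while ((line = reader.readLine()) != null) {
            if (lineNum == 0) {
                // Header row
                fieldsArr = line.split(",");
            } else {
                // Data row: append a comma so that the last cell also ends with a separator
                line += ",";
                Matcher mCells = pCells.matcher(line);
                List<String> cells = new LinkedList<>();
                // Read each cell
                while (mCells.find()) {
                    String str = mCells.group();
                    // Strip the surrounding quotes and the trailing comma
                    str = str.replaceAll("(?sm)\"?([^\"]*(\"{2})*[^\"]*)\"?.*,", "$1");
                    // Turn an escaped quote ("") back into a single quote
                    str = str.replaceAll("(?sm)(\"(\"))", "$2");
                    cells.add(str);
                }
                listField.add(cells);
            }
            lineNum++;
        }
    } catch (FileNotFoundException e) {
        logger.error("[Failed to read the file while reading the CSV file and inserting data]");
        e.printStackTrace();
    } catch (Exception e) {
        e.printStackTrace();
    }
}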
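If you want to try the cell-splitting logic on its own, without the file reading and the project classes, it can be pulled out roughly like this. This is only a sketch: the class name `RegexCsvLineDemo`, the method `splitLine` and the sample row are made up for illustration, and the regular expressions are copied unchanged from the snippet above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexCsvLineDemo {
    // Same cell pattern as in the snippet above, compiled once.
    private static final Pattern CELL =
            Pattern.compile("(\"[^\"]*(\"{2})*[^\"]*\")*[^,]*,");

    // Splits one CSV line on the commas that lie outside double quotes.
    static List<String> splitLine(String line) {
        List<String> cells = new ArrayList<>();
        // Append a comma so that the last cell also ends with a separator.
        Matcher m = CELL.matcher(line + ",");
        while (m.find()) {
            String cell = m.group();
            // Strip the surrounding quotes and the trailing comma.
            cell = cell.replaceAll("(?sm)\"?([^\"]*(\"{2})*[^\"]*)\"?.*,", "$1");
            // Turn an escaped quote ("") back into a single quote.
            cell = cell.replaceAll("(?sm)(\"(\"))", "$2");
            cells.add(cell);
        }
        return cells;
    }

    public static void main(String[] args) {
        // Hypothetical sample row with a comma inside a quoted field.
        List<String> cells = splitLine("1,\"Beijing, China\",2020");
        System.out.println(cells.size());   // 3 cells
        for (String cell : cells) {
            System.out.println("[" + cell + "]");
        }
    }
}
```

I have only walked through the simple quoted-comma case here. The code reads the file line by line, so a quoted field that contains a line break cannot be handled this way, and fields with several escaped quotes may not come out as expected. That is one more reason to prefer Solution 1.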