In real-world projects, some requirements involve working with document files. I recently had to handle Word and PDF files at work, so this article briefly covers some common operations on both.

1. PDF preview

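Free preview solutions for Word, Excel, and similar documents are almost all built on OpenOffice. One reference: the open-source document preview service.
(If you are open to paid products, consider 永中office (Yozo Office), Office 365, and the like.)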

Here I use pdf.js to preview PDFs (pdf.js download link, extraction code: ew60). Put the pdfjs folder into your project; I placed it under webapp/resources/plugin, as shown in the figure below.

How to call it (assuming pdfjs is served at $prefix/pdfjs):

Set the src attribute of the preview page's iframe as follows:

    "$prefix/pdfjs/web/viewer.html?file=" + encodeURIComponent(url)

The url above is the address the file is served from (for example: http://localhost:8080/file/abc.pdf).
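If the iframe src is assembled on the server instead of in the browser, the same URL can be built in Java. The following is a minimal sketch, not part of the original project: the class name `PdfViewerUrl` and the method `build` are made up for illustration, and `URLEncoder` only approximates `encodeURIComponent` (it encodes a few more characters and writes spaces as `+`, which is corrected here).

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class PdfViewerUrl {

    /**
     * Builds the viewer.html URL for a given file address.
     *
     * @param prefix  path under which the pdfjs folder is served (assumption: no trailing slash)
     * @param fileUrl address of the PDF file to preview
     */
    public static String build(String prefix, String fileUrl) {
        try {
            // URLEncoder targets form encoding, so spaces become '+'; turn them into %20
            String encoded = URLEncoder.encode(fileUrl, "UTF-8").replace("+", "%20");
            return prefix + "/pdfjs/web/viewer.html?file=" + encoded;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a compliant JVM
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build("/static", "http://localhost:8080/file/abc.pdf"));
    }
}
```

Encoding the file address matters because it usually contains `:`, `/`, and possibly its own query string, all of which would otherwise be misread as part of the viewer URL.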

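PS: The pdf.js viewer shows download and print buttons by default. If access to these two buttons must depend on user permissions, handle that in JavaScript. (If you do not need the buttons at all, find id="download", id="print", id="secondaryPrint", and id="secondaryDownload" in pdfjs/web/viewer.html and add style="display: none;" to each.)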

Hiding the download and print buttons dynamically with JavaScript is shown in the figure below:

2. Document processing: replacing placeholders in a Word document

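Placeholder replacement in Word generally falls into two cases:
① replacing placeholders in paragraphs (headings, body paragraphs, and any other element outside tables)
② replacing placeholders in tables
The relevant methods are as follows: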

import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xwpf.usermodel.*;
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Replaces placeholders in the paragraphs and tables of a document.
 *
 * @author zhuLong
 * @since 2020/6/5 9:54
 */
public class WordReplaceUtil {

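    /**
     * Replaces placeholders in every paragraph of the document body.
     *
     * @param doc    the document to process
     * @param params replacement values: key = placeholder, value = actual value
     */
    public static void replaceInPara(XWPFDocument doc, Map<String, Object> params) {
        Iterator<XWPFParagraph> iterator = doc.getParagraphsIterator();
        XWPFParagraph para;
        while (iterator.hasNext()) {
            para = iterator.next();
            // Skip empty paragraphs (plain check, so no StringUtils dependency is needed)
            String text = para.getParagraphText();
            if (text != null && !text.isEmpty()) {
                replaceInPara(para, params);
            }
        }
    }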

    /**
     * Replaces placeholders in a single paragraph.
     *
     * @param para   the paragraph to process
     * @param params replacement values: key = placeholder, value = actual value
     */
    public static void replaceInPara(XWPFParagraph para, Map<String, Object> params) {
        // Full text of the paragraph
        String sourceText = para.getParagraphText();
        // Becomes true once at least one placeholder has been replaced
        boolean replace = false;
        for (Map.Entry<String, Object> entry : params.entrySet()) {
            String key = entry.getKey();
            if (sourceText.contains(key)) {
                Object value = entry.getValue();
                if (value instanceof String) {
                    // Replace the text placeholder (only String values are handled)
                    sourceText = sourceText.replace(key, value.toString());
                    replace = true;
                }
            }
        }
        if (replace) {
            Integer fontSize = null;
            boolean isBold = false;
            // Walk the runs backwards, remembering font size and bold before removing each run
            List<XWPFRun> runList = para.getRuns();
            for (int i = runList.size() - 1; i >= 0; i--) {
                XWPFRun oldRun = runList.get(i);
                if (oldRun.getFontSize() > 0) {
                    fontSize = oldRun.getFontSize();
                }
                if (oldRun.isBold()) {
                    isBold = true;
                }

                // Remove the old run
                para.removeRun(i);
            }
            // Create a single new run holding the replaced text. Apart from font size and bold,
            // the styling of the original runs is lost (known limitation, to be improved).
            XWPFRun run = para.createRun();
            run.setBold(isBold);
            run.setText(sourceText);
            if (fontSize != null) {
                run.setFontSize(fontSize);
            }
        }
    }

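    /**
     * Replaces placeholders inside tables.
     *
     * @param doc    the document to process
     * @param params replacement values: key = placeholder, value = actual value
     */
    public static void replaceTable(XWPFDocument doc, Map<String, Object> params) {
        // Iterate over all tables in the document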
        Iterator<XWPFTable> iterator = doc.getTablesIterator();
        XWPFTable table;
        List<XWPFTableRow> rows;
        List<XWPFTableCell> cells;
        List<XWPFParagraph> paras;
        while (iterator.hasNext()) {
            table = iterator.next();
            if (table.getRows().size() > 1) {
                // Only tables whose text contains a ${...} placeholder need replacement
                if (matcher(table.getText()).find()) {
                    rows = table.getRows();
                    for (XWPFTableRow row : rows) {
                        cells = row.getTableCells();
                        for (XWPFTableCell cell : cells) {
                            paras = cell.getParagraphs();
                            for (XWPFParagraph para : paras) {
                                replaceInPara(para, params);
                            }
                        }
                    }
                }
            }
        }
    }

    /**
     * Builds a matcher that finds ${...} placeholders in a string.
     *
     * @param str the text to search
     * @return a matcher over str
     */
    private static Matcher matcher(String str) {
        Pattern pattern = Pattern.compile("\\$\\{(.+?)\\}", Pattern.CASE_INSENSITIVE);
        return pattern.matcher(str);
    }

    /**
     * Sample replacement values.
     */
    private static Map<String, Object> createParamsMap() {
        Map<String, Object> map = new HashMap<String, Object>();
        map.put("${name}", "abc");
        map.put("${sex}", "男");
        return map;
    }

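    public static void main(String[] args) throws Exception {
        File mainFile = new File("C:\\Users\\zhulong\\Desktop\\abc.docx");
        // try-with-resources closes the input stream and the document
        try (InputStream in = new FileInputStream(mainFile);
             XWPFDocument doc = new XWPFDocument(OPCPackage.open(in))) {
            Map<String, Object> params = createParamsMap();
            WordReplaceUtil.replaceInPara(doc, params);
            WordReplaceUtil.replaceTable(doc, params);
            // Write the result to a separate file (example output path);
            // without this step the replacements exist only in memory
            try (java.io.OutputStream out =
                         new java.io.FileOutputStream("C:\\Users\\zhulong\\Desktop\\abc_out.docx")) {
                doc.write(out);
            }
        }
    }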
}
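
To see the placeholder logic in isolation from POI, here is a small sketch that works on plain strings. It uses the same `${...}` pattern and the same convention as `createParamsMap()` (map keys include the `${` and `}`). The class name `PlaceholderSketch` and the method `render` are invented for this illustration; unknown placeholders are simply left untouched, which is an assumption about the desired behavior.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PlaceholderSketch {

    // Same pattern as in WordReplaceUtil: a lazy match of ${...}
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(.+?)\\}");

    /** Replaces every ${...} placeholder that has a value in params. */
    public static String render(String text, Map<String, Object> params) {
        Matcher m = PLACEHOLDER.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // m.group() is the whole placeholder, e.g. "${name}", which is also the map key
            Object value = params.get(m.group());
            String replacement = value != null ? value.toString() : m.group();
            // quoteReplacement keeps '$' and '\' in the value from being read as group references
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> params = new HashMap<>();
        params.put("${name}", "abc");
        params.put("${sex}", "男");
        System.out.println(render("Name: ${name}, sex: ${sex}, age: ${age}", params));
    }
}
```

Unlike the loop over `params.entrySet()` in `replaceInPara`, this variant scans the text once and looks each placeholder up in the map, which may be preferable when the map is large.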

3. Document processing: adding row data to a Word table

import cn.hutool.core.util.ArrayUtil;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xwpf.usermodel.*;
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.List;

/**
 * Word document operations
 *
 * @author zhuLong
 * @since 2020/6/29 9:16
 */
public class WordUtils {
    /**
     * Inserts a row at the given position of a Word table, copying the style of an existing row.
     *
     * @param table        the table to modify
     * @param copyrowIndex index of the row whose style is copied
     * @param newrowIndex  index at which the new row is inserted
     * @param datas        cell texts for the new row, one per column
     */
    public static void insertRow(XWPFTable table, int copyrowIndex, int newrowIndex, String[] datas) {
        // Look up the row to copy first: inserting a row at or before it would shift its index
        XWPFTableRow copyRow = table.getRow(copyrowIndex);
        // Insert a new empty row at the requested position
        XWPFTableRow targetRow = table.insertNewTableRow(newrowIndex);
        // Copy the row properties
        targetRow.getCtRow().setTrPr(copyRow.getCtRow().getTrPr());
        // Cells of the row being copied
        List<XWPFTableCell> copyCells = copyRow.getTableCells();
        // Copy each cell
        for (int i = 0; i < copyCells.size(); i++) {
            XWPFTableCell copyCell = copyCells.get(i);
            XWPFTableCell targetCell = targetRow.addNewTableCell();
            targetCell.getCTTc().setTcPr(copyCell.getCTTc().getTcPr());
            List<XWPFParagraph> copyParas = copyCell.getParagraphs();
            if (copyParas != null && !copyParas.isEmpty()) {
                XWPFParagraph copyPara = copyParas.get(0);
                XWPFParagraph targetPara = targetCell.getParagraphs().get(0);
                targetPara.getCTP().setPPr(copyPara.getCTP().getPPr());
                if (copyPara.getRuns() != null && !copyPara.getRuns().isEmpty()) {
                    XWPFRun cellR = targetPara.createRun();
                    cellR.setBold(copyPara.getRuns().get(0).isBold());
                    // Guard against datas being shorter than the number of columns
                    if (ArrayUtil.isNotEmpty(datas) && i < datas.length) {
                        cellR.setText(datas[i]);
                        cellR.setFontSize(10);
                    }
                }
            }
        }

    }

    public static void main(String[] args) throws Exception {
        File mainFile = new File("C:\\Users\\zhulong\\Desktop\\abc.docx");
        // try-with-resources closes the input stream and the document
        try (InputStream in = new FileInputStream(mainFile);
             XWPFDocument doc = new XWPFDocument(OPCPackage.open(in))) {
            // Insert a row dynamically
            List<XWPFTable> tables = doc.getTables(); // all tables in the document
            XWPFTable table = tables.get(0);          // the first table
            String[] datas = {"a", "b", "c", "d", "e"};
            WordUtils.insertRow(table, 1, 2, datas);
            // Save the result (example output path); otherwise the new row exists only in memory
            try (java.io.OutputStream out =
                         new java.io.FileOutputStream("C:\\Users\\zhulong\\Desktop\\abc_out.docx")) {
                doc.write(out);
            }
        }
    }
}

4. Document processing: PDF operations

import com.itextpdf.text.Document;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.Image;
import com.itextpdf.text.pdf.*;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.PDFRenderer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.*;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Utility class for PDF operations
 *
 * @author zhuLong
 * @since 2020/6/18 17:03
 */
public class PdfUtil {

    private static final Logger logger = LoggerFactory.getLogger(PdfUtil.class);

    /**
     * Merges PDF files.
     *
     * @param files   absolute paths of the files to merge, e.g. { "e:\\1.pdf", "e:\\2.pdf", "e:\\3.pdf" };
     *                files are merged in array order, so 2.pdf follows 1.pdf
     * @param newfile absolute path of the merged file, e.g. e:\\temp\\tempNew.pdf
     * @return true if the merge succeeded, false otherwise
     */
    public static boolean mergePdfFiles(String[] files, String newfile) {
        boolean retValue = false;
        Document document = null;
        try {
            // Use the size of the first page of the first file as the initial page size
            PdfReader first = new PdfReader(files[0]);
            document = new Document(first.getPageSize(1));
            first.close();
            PdfCopy copy = new PdfCopy(document, new FileOutputStream(newfile));
            document.open();
            for (int i = 0; i < files.length; i++) {
                PdfReader reader = new PdfReader(files[i]);
                int n = reader.getNumberOfPages();
                for (int j = 1; j <= n; j++) {
                    PdfImportedPage page = copy.getImportedPage(reader, j);
                    copy.addPage(page);
                }
                // Release the reader once all of its pages have been copied
                copy.freeReader(reader);
                reader.close();
            }
            retValue = true;
        } catch (Exception e) {
            logger.error("Failed to merge pdf files", e);
        } finally {
            if (document != null) {
                document.close();
            }
        }
        return retValue;
    }

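    /**
     * Converts a PDF to PNG. Only the first page is rendered.
     *
     * @param inputStream input stream of the PDF file
     * @return a temporary PNG file, or null if the PDF has no pages or the conversion fails
     * @author zhuLong
     * @since 2020/6/19 14:17
     */
    public static File pdf2png(InputStream inputStream) {
        // try-with-resources closes the PDDocument (PDDocument.load is the PDFBox 2.x API)
        try (PDDocument doc = PDDocument.load(inputStream)) {
            if (doc.getNumberOfPages() == 0) {
                return null;
            }
            PDFRenderer renderer = new PDFRenderer(doc);
            // Render page 0 at 144 DPI, twice the 72 DPI default
            BufferedImage image = renderer.renderImageWithDPI(0, 144);
            // Write the image to a temporary file
            File temp = File.createTempFile("myTempFile", ".png");
            ImageIO.write(image, "png", temp);
            return temp;
        } catch (IOException e) {
            logger.error("Failed to convert pdf to png", e);
        }
        return null;
    }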

    /**
     * Converts a PDF to PNG.
     *
     * @param filePath path of the PDF file
     * @author zhuLong
     * @since 2020/6/18 23:00
     */
    public static File pdf2png(String filePath) throws IOException {
        return pdf2png(new File(filePath));
    }

    /**
     * Converts a PDF to PNG.
     *
     * @param file the PDF file
     * @author zhuLong
     * @since 2020/6/18 23:00
     */
    public static File pdf2png(File file) throws IOException {
        // Close the stream once rendering has finished
        try (InputStream inputStream = new FileInputStream(file)) {
            return pdf2png(inputStream);
        }
    }

    /**
     * Converts an image to a single-page PDF.
     *
     * @param imgPath path of the image
     * @param pdf     the PDF file to generate
     * @author zhuLong
     * @since 2020/6/18 23:00
     */
    public static void image2pdf(String imgPath, File pdf) throws DocumentException, IOException {
        Document document = new Document();
        try (OutputStream os = new FileOutputStream(pdf)) {
            PdfWriter.getInstance(document, os);
            document.open();
            createPdf(document, imgPath);
            // Closing the document flushes the PDF to the stream; close it exactly once, here
            document.close();
        }
    }

    private static void createPdf(Document document, String imgPath) throws DocumentException, IOException {
        Image image = Image.getInstance(imgPath);
        // Usable page width between the left and right margins
        float documentWidth = document.getPageSize().getWidth() - document.leftMargin() - document.rightMargin();
        // Height derived from a fixed 580:850 width-to-height ratio
        float documentHeight = documentWidth / 580 * 850;
        // Scale the image to exactly that size
        image.scaleAbsolute(documentWidth, documentHeight);
        document.add(image);
    }

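    /**
     * Adds watermark text to a PDF.
     *
     * @param contentList list of maps holding the x and y coordinates and the text (keys "x", "y", "contentText")
     * @param file        source PDF file
     * @param fontSize    font size
     * @author zhuLong
     * @since 2020/6/19 9:40
     */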
    public static File setWatermark(List<Map<String, Object>> contentList, File file, float fontSize) throws FileNotFoundException {
        InputStream in = new FileInputStream(file);
        return setWatermark(contentList, in, fontSize);
    }

    /**
     * Adds watermark text to a PDF.
     *
     * @param contentList list of maps holding the x and y coordinates and the text (keys "x", "y", "contentText")
     * @param inputStream input stream of the source PDF file
     * @param fontSize    font size
     * @author zhuLong
     * @since 2020/6/19 9:40
     */
    public static File setWatermark(List<Map<String, Object>> contentList, InputStream inputStream, float fontSize) {
        PdfReader reader = null;
        PdfStamper stamper = null;
        try {
            reader = new PdfReader(inputStream);

            // Create a temporary file for the result
            File dest = File.createTempFile("pdf", ".pdf");
            stamper = new PdfStamper(reader, new FileOutputStream(dest));
            // Draw on the layer above the page content; only the first page is stamped
            PdfContentByte content = stamper.getOverContent(1);
            // STSong-Light with UniGB-UCS2-H is a CJK font that requires the itext-asian jar on the classpath
            BaseFont base = BaseFont.createFont("STSong-Light", "UniGB-UCS2-H", BaseFont.NOT_EMBEDDED);
            // Begin writing text
            content.beginText();
            // Font and font size
            content.setFontAndSize(base, fontSize);
            if (!contentList.isEmpty()) {
                for (Map<String, Object> map : contentList) {
                    // Position of the text on the page
                    int x = Integer.parseInt(map.get("x").toString());
                    int y = Integer.parseInt(map.get("y").toString());
                    content.setTextMatrix(x, y);
                    content.showText(map.get("contentText").toString());
                }
            }
            content.endText();
            return dest;
        } catch (Exception e) {
            logger.error("Failed to process the pdf file", e);
        } finally {
            try {
                if (stamper != null) {
                    stamper.close();
                }
                if (reader != null) {
                    reader.close();
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        File file = new File("C:\\Users\\zhulong\\Desktop\\1.pdf");
        List<Map<String, Object>> list = new ArrayList<Map<String, Object>>();
        Map<String, Object> map1 = new HashMap<String, Object>();
        map1.put("x", 480);
        map1.put("y", 690);
        map1.put("contentText", "10001");
        list.add(map1);
        Map<String, Object> map2 = new HashMap<String, Object>();
        map2.put("x", 425);
        map2.put("y", 690);
        map2.put("contentText", LocalDateTime.now().getYear());
        list.add(map2);
        File file1 = setWatermark(list, file, 14);
        System.out.println(file1.getAbsolutePath());
    }

}

5. Cropping and stitching images

import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.File;

/**
 * Utility class for image operations
 *
 * @author zhuLong
 * @since 2020/6/18 23:15
 */
public class ImageUtil {

    /**
     * Crops an image.
     *
     * @param imgIn  the image to crop
     * @param imgOut the cropped image (written as PNG)
     */
    public static void cutImg(File imgIn, File imgOut, int x, int y, int width, int height) {
        try {
            BufferedImage bufferedImage = ImageIO.read(imgIn);
            BufferedImage back = bufferedImage.getSubimage(x, y, width, height);
            ImageIO.write(back, "png", imgOut);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

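    /**
     * Stitches several images together vertically.
     *
     * @param pics    the image files to stitch, from top to bottom
     * @param type    output image format, e.g. "png"
     * @param dstFile destination file
     */
    public static void merge(File[] pics, String type, File dstFile) {

        int len = pics.length;
        if (len < 1) {
            System.out.println("pics len < 1");
            // Nothing to merge; continuing would fail on images[0]
            return;
        }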
        BufferedImage[] images = new BufferedImage[len];
        int[][] ImageArrays = new int[len][];
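        for (int i = 0; i < len; i++) {
            try {
                images[i] = ImageIO.read(pics[i]);
            } catch (Exception e) {
                // Fail early instead of hitting a NullPointerException below
                throw new IllegalStateException("Cannot read image: " + pics[i], e);
            }
            if (images[i] == null) {
                // ImageIO.read returns null when no reader supports the file format
                throw new IllegalStateException("Unsupported image format: " + pics[i]);
            }
            int width = images[i].getWidth();
            int height = images[i].getHeight();
            // Read the RGB pixels of the image into an array
            ImageArrays[i] = new int[width * height];
            ImageArrays[i] = images[i].getRGB(0, 0, width, height,
                    ImageArrays[i], 0, width);
        }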

        int dst_height = 0;
        int dst_width = images[0].getWidth();
        for (int i = 0; i < images.length; i++) {
            dst_width = dst_width > images[i].getWidth() ? dst_width
                    : images[i].getWidth();

            dst_height += images[i].getHeight();
        }
        if (dst_height < 1) {
            System.out.println("dst_height < 1");
        }

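        // Build the new image
        try {
            BufferedImage ImageNew = new BufferedImage(dst_width, dst_height,
                    BufferedImage.TYPE_INT_RGB);
            int height_i = 0;
            for (int i = 0; i < images.length; i++) {
                // Use each image's own width as both the region width and the scanline stride;
                // using dst_width here breaks when the images differ in width
                int w = images[i].getWidth();
                ImageNew.setRGB(0, height_i, w, images[i].getHeight(),
                        ImageArrays[i], 0, w);
                height_i += images[i].getHeight();
            }
            // Write the image
            ImageIO.write(ImageNew, type, dstFile);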
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
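
As an alternative to copying pixel arrays by hand, the same vertical stitching can be sketched with `Graphics2D`, which takes care of differing image widths on its own. This is an illustrative variant, not the author's code: the class name `VerticalStack` and the method `stack` are made up, and it works on in-memory images so reading and writing files is left to the caller.

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

public class VerticalStack {

    /** Stacks images top to bottom on a canvas as wide as the widest input. */
    public static BufferedImage stack(BufferedImage... images) {
        if (images == null || images.length == 0) {
            throw new IllegalArgumentException("at least one image is required");
        }
        int width = 0;
        int height = 0;
        for (BufferedImage img : images) {
            width = Math.max(width, img.getWidth());
            height += img.getHeight();
        }
        BufferedImage out = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = out.createGraphics();
        try {
            int y = 0;
            for (BufferedImage img : images) {
                // Draw each image at the left edge, directly below the previous one
                g.drawImage(img, 0, y, null);
                y += img.getHeight();
            }
        } finally {
            g.dispose();
        }
        return out;
    }
}
```

Areas to the right of a narrower image stay black, the default fill of a `TYPE_INT_RGB` canvas.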

6. Converting Word documents

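Free Spire.Doc for Java is a free, professional Java component for Word. Developers can use it to add creating, reading, editing, converting, and printing Word documents to their own Java applications. As a fully standalone component, Free Spire.Doc for Java does not require Microsoft Office to be installed.

Free Spire.Doc for Java can perform many Word processing tasks, including generating, reading, converting, and printing Word documents; inserting images; adding headers and footers; creating tables; adding form fields and mail merge fields; adding bookmarks; adding text and image watermarks; setting background colors and background images; adding footnotes and endnotes; adding hyperlinks; encrypting and decrypting Word documents; adding comments; and adding shapes.

Note: the free version has size limits. When loading or saving, a Word document may contain at most 500 paragraphs and 25 tables. When converting a Word document to PDF, XPS, and similar formats, only the first three pages are converted.

Here I only cover using Free Spire.Doc for Java to convert Word to PDF.
1> Add the dependency to the pom file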

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <url>http://repo.e-iceblue.cn/repository/maven-public/</url>
    </repository>
</repositories>

<dependencies>
        ......
        <!-- Spire: Word document operations -->
        <dependency>
            <groupId>e-iceblue</groupId>
            <artifactId>spire.doc.free</artifactId>
            <version>2.7.3</version>
        </dependency>
</dependencies>

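2> Calling the method

import com.spire.doc.Document;
import com.spire.doc.FileFormat;

// The class name is chosen for illustration only
public class WordToPdfDemo {
    public static void main(String[] args) {
        // Load the sample Word document
        Document document = new Document();
        document.loadFromFile("C:\\Users\\zhulong\\Desktop\\test.docx");
        // Save the result as PDF
        document.saveToFile("C:\\Users\\zhulong\\Desktop\\test.pdf", FileFormat.PDF);
    }
}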

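Converting Excel to PDF is supported as well, using Free Spire.XLS for Java. The free version has some limits. Official site: https://www.e-iceblue.cn/Introduce/Free-Spire-XLS-JAVA.html

Alternatively, you can use the OpenOffice components to convert documents.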

7. Garbled characters after document conversion

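Document conversion often shows the following pattern: everything works locally (Windows), but on the server (Linux) Chinese text comes out garbled. This is almost always because the required fonts are missing on the Linux server.

Solution:

Step 1: install a Chinese font package on the Linux server.
Installation reference: installing Chinese fonts on Linux
After installing, remember to restart your application service and try again. If the problem persists, the fonts used in your source document cannot be found on the Linux server; continue with step 2.
Step 2: identify the name of the font used in the garbled part of your document, find the matching font file under C:\Windows\Fonts (on Windows), copy it into /usr/share/fonts on the Linux server, then run the following commands in order: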

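mkfontscale   # build the scalable-font index (fonts.scale)
mkfontdir     # build the font directory index (fonts.dir)
fc-cache      # refresh the font cache

Note: after running these commands you still need to restart the application service.

Garbled text caused by missing fonts can usually be fixed this way.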
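
To check from Java whether a font is visible on the server after installation, the font families known to the JVM can be listed. This is a rough sketch under assumptions: the class name `FontCheck` and the default family name "SimSun" are examples only, and a conversion library may resolve fonts through its own mechanism, so treat the result as a hint rather than proof.

```java
import java.awt.GraphicsEnvironment;
import java.util.Arrays;

public class FontCheck {

    /** Returns true if the wanted family name is in the list (case-insensitive). */
    public static boolean hasFamily(String[] families, String wanted) {
        return Arrays.stream(families).anyMatch(f -> f.equalsIgnoreCase(wanted));
    }

    public static void main(String[] args) {
        // Font family to look for; pass it as the first argument, "SimSun" is only an example
        String wanted = args.length > 0 ? args[0] : "SimSun";
        // Font families the JVM can see on this machine
        String[] families = GraphicsEnvironment.getLocalGraphicsEnvironment()
                .getAvailableFontFamilyNames();
        System.out.println(wanted + " available: " + hasFamily(families, wanted));
    }
}
```

Run it on the Linux server with the same JVM that runs the application, before and after copying the font files, to compare the results.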
