Vue.js - Converting Word Documents to HTML with Mammoth.js, Part 2 (Extracting Images and Uploading Them to a Server)

Author: hangge | 2025-12-29 08:37

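In the previous article I demonstrated convertImage: mammoth.images.inline(...), which turns the images inside a DOCX file into inline base64 data URIs for display on the page. Sometimes we would rather upload those images to a server or cloud storage and use the returned URL, or choose between inlining and uploading based on image size. Both are possible, as the sample below shows.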

II. Extracting Images and Uploading Them to a Server

1. Sample code

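(1) The sample below lets you choose whether images are uploaded automatically, and uses a size threshold to decide between inlining and uploading:

Small images (<= 20 KB): inlined as base64 (fast, fewer requests)
Large images (> 20 KB): uploaded to a server or cloud storage and referenced by the returned URL (saves memory and speeds up the first load)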
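The threshold rule above can be sketched on its own, outside the component. The helper names below (base64ByteLength, chooseImageStrategy) are hypothetical and are not part of Mammoth.js. The sketch assumes the base64 string is padded with "=" in the usual way, and it computes the decoded size arithmetically instead of building a Blob first:

```javascript
// Hypothetical helpers for illustration; the 20 KB value mirrors the sample's threshold.
const THRESHOLD_BYTES = 20 * 1024;

// Size in bytes of the data a base64 string decodes to, without decoding it.
// Assumes standard padded base64 (length is a multiple of 4).
function base64ByteLength(base64) {
  const clean = base64.replace(/\s/g, "");
  if (clean.length === 0) return 0;
  const padding = clean.endsWith("==") ? 2 : clean.endsWith("=") ? 1 : 0;
  return (clean.length / 4) * 3 - padding;
}

// Returns "upload" for large images when auto-upload is enabled, otherwise "inline".
function chooseImageStrategy(base64, autoUpload, thresholdBytes = THRESHOLD_BYTES) {
  const isLarge = base64ByteLength(base64) > thresholdBytes;
  return autoUpload && isLarge ? "upload" : "inline";
}
```

An image exactly at the threshold is inlined, because the comparison is strictly greater than, which matches the "<= 20 KB" and "> 20 KB" split described above.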

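(2) The sample code is as follows:

Note: Before the converted HTML is displayed on the page, it is passed through the DOMPurify library, a widely used HTML sanitizer for guarding against XSS on the front end. DOMPurify.sanitize filters the HTML and strips content that could be used for an XSS attack, such as script elements and inline event handlers.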

<template>
  <div class="docx-converter">
    <div>
      <label>
        <input type="checkbox" v-model="autoUploadLargeImages" />
         Upload large images automatically (> {{ thresholdKB }} KB)
      </label>
    </div>

    <input
      type="file"
      accept=".docx,application/vnd.openxmlformats-officedocument.wordprocessingml.document"
      @change="onFileChange"
    />

    <div v-if="loading">Converting...</div>

    <div v-if="messages.length">
      <h4>Conversion messages</h4>
      <ul>
        <li v-for="(m, idx) in messages" :key="idx">{{ m.type }}: {{ m.message }}</li>
      </ul>
    </div>

    <h4>Preview (HTML)</h4>
    <!-- In production, always sanitize the HTML against XSS (for example with DOMPurify) -->
    <div class="preview" v-html="sanitizedHtml"></div>
  </div>
</template>

<script>
// If this import fails in your build setup, try `import mammoth from 'mammoth'` instead
import * as mammoth from "mammoth";
import DOMPurify from "dompurify"; // Install it first: npm install dompurify

export default {
  name: "DocxConverter",
  data() {
    return {
      messages: [],
      loading: false,
      rawHtml: "",
      sanitizedHtml: "",
      autoUploadLargeImages: true,
      thresholdKB: 20, // Upload images larger than 20 KB (adjustable)
    };
  },
  computed: {
    thresholdBytes() {
      return this.thresholdKB * 1024;
    },
  },
  methods: {
    // Handle the file input's change event
    async onFileChange(event) {
      const file = event.target.files && event.target.files[0];
      if (!file) return;

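      // Compare case-insensitively so that names such as "REPORT.DOCX" are accepted
      if (!file.name.toLowerCase().endsWith(".docx")) {
        alert("Please choose a .docx file");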
        return;
      }

      try {
        this.loading = true;
        this.messages = []; // clear messages left over from a previous conversion
        this.rawHtml = "";
        this.sanitizedHtml = "";
        const arrayBuffer = await file.arrayBuffer();

        const options = {
          convertImage: mammoth.images.inline(async (element) => {
            // element.contentType like "image/png"
            // element.read("base64") returns Promise<string>
            const base64 = await element.read("base64");
            const contentType = element.contentType || "image/png";

            // Convert the base64 string to a Blob so its size can be checked
            const blob = this.base64ToBlob(base64, contentType);

            // Strategy: inline small images, upload large ones and use the returned URL
            if (this.autoUploadLargeImages && blob.size > this.thresholdBytes) {
              // Upload to the back end (example API: /api/upload), which returns { url: "https://..." }
              try {
                const url = await this.uploadImageBlobToServer(blob, contentType);
                return { src: url };
              } catch (uploadErr) {
                console.error("Image upload failed, falling back to inline:", uploadErr);
                // Fall back to an inline data URI
                return { src: "data:" + contentType + ";base64," + base64 };
              }
            } else {
              // Inline (small image, or uploading is disabled)
              return { src: "data:" + contentType + ";base64," + base64 };
            }
          }),
          styleMap: [
              // Map Word's "Title" style to h1
              "p[style-name='Title'] => h1:fresh",
              // Map italic text to <em>
              "i => em",
              // Add more mappings here if needed
          ],
        };

        const result = await mammoth.convertToHtml({ arrayBuffer }, options);
        this.rawHtml = result.value || "";
        this.messages = result.messages || [];
        // Security: sanitize with DOMPurify before rendering
        this.sanitizedHtml = DOMPurify.sanitize(this.rawHtml);
      } catch (err) {
        console.error("Conversion failed:", err);
        alert("Conversion failed. Check the console for details.");
      } finally {
        this.loading = false;
      }
    },
    // Convert a base64 string to a Blob
    base64ToBlob(base64, contentType = "") {
      const binary = atob(base64);
      const len = binary.length;
      const buffer = new Uint8Array(len);
      for (let i = 0; i < len; i++) {
        buffer[i] = binary.charCodeAt(i);
      }
      return new Blob([buffer], { type: contentType });
    },
    // Upload a Blob to the server (example: POST multipart/form-data, returns the image URL)
    async uploadImageBlobToServer(blob, contentType) {
      const formData = new FormData();
      // The server may rename the file as needed; this example uses a timestamp
      const filename = "img_" + Date.now();
      const file = new File([blob], filename, { type: contentType });
      formData.append("file", file);

      const resp = await fetch("/api/upload", {
        method: "POST",
        body: formData,
        // credentials: 'include' // enable if the endpoint requires authentication
      });

      if (!resp.ok) {
        throw new Error("Upload failed: " + resp.statusText);
      }
      const json = await resp.json();
      // By convention the response is { url: "https://..." }
      return json.url;
    }
  },
};
</script>

<style scoped>
.preview {
  border: 1px solid #ddd;
  padding: 12px;
  margin-top: 12px;
  max-height: 60vh;
  overflow: auto;
  background: #fff;
}

/* ====== Heading styles ====== */
::v-deep .preview h2,
::v-deep .preview h3,
::v-deep .preview h4 {
  margin-top: 24px;
  margin-bottom: 12px;
}

::v-deep .preview h2 {
  font-size: 1.5rem;
  font-weight: 600;
}

::v-deep .preview h3 {
  font-size: 1.3rem;
  font-weight: 600;
}

::v-deep .preview h4 {
  font-size: 1.2rem;
  font-weight: 600;
}

/* ====== Table styles ====== */
::v-deep .preview table {
  width: 100%;
  border-collapse: collapse;
  table-layout: auto;
  margin: 6px 0;
}

/* Cell borders and padding */
::v-deep .preview table th,
::v-deep .preview table td {
  border: 1px solid #d1d5db; /* gray border; adjust as needed, e.g. #e5e7eb or #cbd5e1 */
  padding: 8px 10px;
  vertical-align: middle;
}

/* Header cell style (optional) */
::v-deep .preview table th {
  background: #f9fafb;
  font-weight: 600;
}

/* Let long words break so cells do not overflow on narrow screens (optional) */
::v-deep .preview table td {
  word-break: break-word;
}

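/* Tailwind's prose class may apply its own default table styles; the border color is restated here to override them */
::v-deep .preview table,
::v-deep .preview table th,
::v-deep .preview table td {
  /* Keeps the border color from being overridden by a CSS reset */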
  border-color: #d1d5db;
}
</style>
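The upload method in the sample names each file "img_" + Date.now(), with no file extension. If your server or storage service relies on the extension to pick a content type, one option is to derive it from the image's MIME type. This is a minimal sketch: the mapping table and the buildImageFilename helper are hypothetical and cover only a few common image types.

```javascript
// Hypothetical mapping; extend it for the image types your documents contain.
const EXTENSION_BY_MIME = {
  "image/png": "png",
  "image/jpeg": "jpg",
  "image/gif": "gif",
  "image/bmp": "bmp",
  "image/webp": "webp",
  "image/svg+xml": "svg",
};

// Builds a name such as "img_1735432620000.png".
// Unknown or missing content types keep the sample's extension-less form.
function buildImageFilename(contentType, timestamp = Date.now()) {
  const ext = EXTENSION_BY_MIME[(contentType || "").toLowerCase()];
  return ext ? `img_${timestamp}.${ext}` : `img_${timestamp}`;
}
```

In uploadImageBlobToServer this would replace the filename line, for example const filename = buildImageFilename(contentType). Two images processed in the same millisecond would still get the same name, so the server should not treat the name as unique.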

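2. Running the test

(1) Open the page and choose a .docx file. Images larger than 20 KB are uploaded automatically through the back-end endpoint.

(2) Suppose the back-end image upload endpoint returns data in the following format: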

{
  "url": "https://www.hangge.com/blog/images/logo.png"
}
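The sample trusts this response and returns json.url directly. If the endpoint might return an error body or an unexpected value, a small check before using the URL can route the image to the inline fallback instead. The sketch below is one way to do that; extractUploadUrl is a hypothetical helper, and it assumes the server always returns an absolute http or https URL as in the example above.

```javascript
// Hypothetical helper: validates the upload response and returns its URL.
// Assumes the server returns an absolute http(s) URL in the "url" field.
function extractUploadUrl(json) {
  const url = json && json.url;
  if (typeof url !== "string" || url.length === 0) {
    throw new Error("Upload response has no url field");
  }
  let parsed;
  try {
    parsed = new URL(url); // throws on relative or malformed URLs
  } catch {
    throw new Error("Upload response url is not an absolute URL: " + url);
  }
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new Error("Upload response url has an unexpected scheme: " + parsed.protocol);
  }
  return url;
}
```

Because the sample's convertImage callback already catches upload errors and falls back to an inline data URI, throwing from this check would leave the image visible in the preview rather than producing a broken link.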

(3) The image addresses shown on the front end are now replaced with the url returned by the endpoint.
