🚀 Product release

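word2md-cli released: convert Docx to Markdown in your terminal

Word2MD now ships a command-line version built for batch processing, CI pipelines, and script automation. A single npx command runs it, and optional AI image OCR is built in.

April 2026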

Word2MD.net has always focused on making docx-to-markdown conversion fast and private in the browser. But over the last few months, we kept hearing the same ask: "Can I automate this?" Developers wanted to convert hundreds of files in CI, writers wanted to drop a script into their publishing pipeline, and AI teams wanted to preprocess documentation into markdown for their RAG systems. So we shipped word2md-cli — a tiny Node.js command-line tool that brings the same conversion engine to your terminal.

Install and run in one command

No setup, no config. npx fetches and runs it on demand:

npx word2md-cli input.docx

That's it — you get input.md in the same directory. Prefer a global install?

npm install -g word2md-cli
word2md input.docx

What you can do with it

Convert a single file

word2md input.docx                  # → input.md next to source
word2md input.docx -o custom.md     # custom output path
word2md input.docx --stdout         # pipe to another command

The --stdout flag is great for chaining:

word2md report.docx --stdout | pandoc -f markdown -t html -o report.html
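
If the Markdown is headed for a static site, the same --stdout pattern can wrap the output in front matter. This is only a rough sketch: docx_to_post is a hypothetical helper name, not something word2md-cli provides, and it assumes the title contains no double quotes.

```shell
#!/bin/sh
# Hypothetical helper (not part of word2md-cli): emit YAML front matter,
# then the Markdown that `word2md --stdout` prints.
docx_to_post() {
  src=$1
  title=$2
  # Convert first, so a failed conversion never yields a half-written post.
  body=$(word2md "$src" --stdout) || return 1
  # One argument per output line; assumes $title has no double quotes.
  printf '%s\n' '---' "title: \"$title\"" '---' '' "$body"
}
```

You would call it as docx_to_post report.docx "Quarterly report" > report.md.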

Batch convert a whole folder

word2md ./docs/*.docx -d ./markdown/

Ideal for migrating SharePoint exports, Confluence archives, or Google Docs downloads into a modern static site.
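
A glob like ./docs/*.docx only matches the top level of a folder, and exports are often nested. One possible way to mirror a nested tree is to walk it with find and convert each file with the -o flag shown earlier. Treat this as a sketch: convert_tree is a hypothetical helper of ours, and it assumes file names contain no newlines.

```shell
#!/bin/sh
# Hypothetical helper (not part of word2md-cli): mirror a nested folder of
# .docx files into an output tree, one `word2md SRC -o DEST` call per file.
convert_tree() {
  src_root=${1%/}   # drop a trailing slash, if any
  out_root=${2%/}
  # Skip the "~$name.docx" lock files Word leaves next to open documents.
  find "$src_root" -type f -name '*.docx' ! -name '~$*' |
  while IFS= read -r src; do
    rel=${src#"$src_root"/}            # path relative to the source root
    dest=$out_root/${rel%.docx}.md     # same relative path, .md extension
    mkdir -p "$(dirname "$dest")"
    # </dev/null keeps the converter from consuming find's output.
    word2md "$src" -o "$dest" </dev/null || echo "failed: $src" >&2
  done
}
```

Running convert_tree ./docs ./markdown would then write ./markdown/guides/setup.md for ./docs/guides/setup.docx. Failures are reported on stderr only and do not change the exit status.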

Extract text from embedded images (OCR)

Pass --ocr to enable image OCR via PaddleX. Screenshots, diagrams, and scanned pages get their text extracted and inlined into the markdown:

export PADDLEX_OCR_URL="https://..."
export PADDLEX_OCR_TOKEN="..."
word2md input.docx --ocr --ocr-concurrency 4

Or pass credentials as flags:

word2md input.docx --ocr \
  --paddlex-url "https://..." \
  --paddlex-token "xxx"
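
Command-line arguments tend to end up in shell history and process listings, so in scripts the environment variables are usually the safer of the two options. A small wrapper can check that both are set before any conversion starts. This is a sketch under assumptions: ocr_convert is a hypothetical name, and we have not checked here what word2md itself does when the credentials are missing.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of word2md-cli): refuse to start an OCR run
# unless both credentials from the example above are present.
ocr_convert() {
  if [ -z "${PADDLEX_OCR_URL:-}" ]; then
    echo "error: PADDLEX_OCR_URL is not set" >&2
    return 2
  fi
  if [ -z "${PADDLEX_OCR_TOKEN:-}" ]; then
    echo "error: PADDLEX_OCR_TOKEN is not set" >&2
    return 2
  fi
  # The variables must be exported for the child process to see them.
  word2md "$@" --ocr
}
```

For example: ocr_convert input.docx --ocr-concurrency 4.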

Plain text output

Strip markdown syntax for clean prose — useful when feeding docs into LLM pipelines:

word2md input.docx --format text -o plain.txt
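
For LLM pipelines, the plain-text file often needs to be cut into smaller pieces afterwards. Here is a deliberately naive sketch that splits by line count with the standard split utility. docx_to_chunks is a hypothetical helper, and the 200-line default is an arbitrary choice; a real pipeline would more likely chunk by tokens or paragraphs.

```shell
#!/bin/sh
# Hypothetical helper (not part of word2md-cli): convert to plain text,
# then cut the result into fixed-size chunks by line count.
docx_to_chunks() {
  src=$1
  out_dir=$2
  lines=${3:-200}   # lines per chunk; arbitrary default
  mkdir -p "$out_dir" || return 1
  txt=$out_dir/full.txt
  word2md "$src" --format text -o "$txt" || return 1
  # split names the pieces chunk-aa, chunk-ab, ... in document order.
  split -l "$lines" "$txt" "$out_dir/chunk-"
}
```

Calling docx_to_chunks input.docx ./chunks 100 leaves full.txt plus the chunk-* files in ./chunks.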

CI/CD integration

Drop it into a GitHub Action to auto-convert every docx committed to your repo:

- name: Convert Word docs to Markdown
  run: npx word2md-cli docs/*.docx -d site/content/

Combine with Astro, Hugo, or Next.js and you have a self-updating documentation site that accepts Word files as input. Non-technical contributors keep writing in Word. Engineers keep shipping markdown. Everyone wins.
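
In CI you usually want a broken document to fail the job and be named in the log. One way to get that, sketched below, is to convert files one at a time and count failures. convert_or_fail is a hypothetical helper, and the sketch assumes word2md exits with a non-zero status when a conversion fails, which we have not verified here.

```shell
#!/bin/sh
# Hypothetical CI helper (not part of word2md-cli): convert each file
# separately, report failures by name, and return non-zero if any failed.
# Assumes word2md exits non-zero on a failed conversion.
convert_or_fail() {
  out_dir=$1
  shift
  failed=0
  mkdir -p "$out_dir" || return 1
  for src in "$@"; do
    [ -e "$src" ] || continue   # an unmatched glob stays literal; skip it
    dest=$out_dir/$(basename "$src" .docx).md
    if ! word2md "$src" -o "$dest"; then
      echo "conversion failed: $src" >&2
      failed=$((failed + 1))
    fi
  done
  [ "$failed" -eq 0 ]
}
```

A workflow step could source a script containing this function and then run convert_or_fail site/content docs/*.docx.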

CLI vs. the web app

| Feature          | Web app               | CLI                     |
|------------------|-----------------------|-------------------------|
| Base conversion  | Browser (client-side) | Local Node.js           |
| Batch processing | Drag multiple files   | Glob patterns, scripts  |
| Image OCR        | Built-in API          | BYO PaddleX credentials |
| Automation       |                       | ✅ Pipes, cron, CI      |
| Live preview     |                       | ❌ (pipe to a viewer)   |

Same conversion engine (mammoth + custom post-processing), same output. The CLI is just the scriptable surface.

Open source

word2md-cli is MIT-licensed on GitHub. Issues, feature requests, and PRs welcome. The code is intentionally small — around 150 lines of TypeScript — so it's easy to audit, fork, or extend with your own rules.

What's next

  • --watch mode that auto-converts files on save
  • --api-key flag that uses your Word2MD.net account for OCR (no PaddleX setup needed)
  • More input formats: PDF, RTF, ODT

Opinions on priority? Drop a note on GitHub.

Meanwhile — go try it:

npx word2md-cli some.docx

Thirty seconds from now, you'll have Markdown.
