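word2md-cli released: convert Docx to Markdown from your terminal
Word2MD now has a command-line version, designed for batch processing, CI pipelines, and script automation. Run it with a single npx command; optional AI image OCR is included.
April 2026
Word2MD.net has focused on fast, private docx-to-markdown conversion in the browser. But over the past few months we kept getting the same request: "Can this be automated?" Developers want to convert hundreds of files in CI, writers want to script it into their publishing pipelines, and AI teams want to preprocess internal documents into Markdown in bulk to feed RAG systems. So we're releasing word2md-cli: a lightweight Node.js command-line tool that brings the same conversion engine to your terminal.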
Install and run in one command
No setup, no config. npx fetches and runs it on demand:
npx word2md-cli input.docx
That's it — you get input.md in the same directory. Prefer a global install?
npm install -g word2md-cli
word2md input.docx
What you can do with it
Convert a single file
word2md input.docx # → input.md next to source
word2md input.docx -o custom.md # custom output path
word2md input.docx --stdout # pipe to another command
The --stdout flag is great for chaining:
word2md report.docx --stdout | pandoc -f markdown -t html -o report.html
Batch convert a whole folder
word2md ./docs/*.docx -d ./markdown/
Ideal for migrating SharePoint exports, Confluence archives, or Google Docs downloads into a modern static site.
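The glob above only matches files sitting directly in ./docs. For exports with nested folders, something like the following sketch could walk the tree and mirror its layout in the output directory. It relies only on the -o flag shown earlier; out_path, convert_tree, and the WORD2MD override variable are hypothetical names introduced here, and the script assumes file names contain no newlines.

```shell
#!/bin/sh
# Sketch: convert every .docx under a source tree, mirroring its layout.
# WORD2MD is a hypothetical override so the converter can be swapped out
# (for example "npx word2md-cli"); it defaults to the global install.
WORD2MD="${WORD2MD:-word2md}"

# Map a source file to its output path: docs/a/b.docx -> markdown/a/b.md
out_path() {
  src_root=$1; file=$2; dest_root=$3
  rel=${file#"$src_root"/}            # strip the source root prefix
  printf '%s/%s.md\n' "$dest_root" "${rel%.docx}"
}

# Walk src_root and write one .md per .docx under dest_root.
convert_tree() {
  src_root=${1%/}                     # tolerate a trailing slash
  dest_root=${2%/}
  find "$src_root" -type f -name '*.docx' | while IFS= read -r file; do
    target=$(out_path "$src_root" "$file" "$dest_root")
    mkdir -p "$(dirname "$target")"
    # </dev/null keeps the converter from consuming the file list on stdin
    $WORD2MD "$file" -o "$target" </dev/null || echo "failed: $file" >&2
  done
}
```

After sourcing the file, a call such as convert_tree ./docs ./markdown would turn ./docs/guides/setup.docx into ./markdown/guides/setup.md. A failed conversion is reported on stderr and the loop carries on with the next file.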
Extract text from embedded images (OCR)
Pass --ocr to enable image OCR via PaddleX. Screenshots, diagrams, and scanned pages get their text extracted and inlined into the markdown:
export PADDLEX_OCR_URL="https://..."
export PADDLEX_OCR_TOKEN="..."
word2md input.docx --ocr --ocr-concurrency 4
Or pass credentials as flags:
word2md input.docx --ocr \
--paddlex-url "https://..." \
--paddlex-token "xxx"
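How the CLI reacts when OCR credentials are missing isn't covered here, so in scripts it may be worth checking the two environment variables before starting a run. The sketch below does that; require_env, ocr_convert, and the WORD2MD override variable are hypothetical helpers, not part of word2md-cli.

```shell
#!/bin/sh
# Sketch: refuse to start an OCR run unless both PaddleX variables are set.
# WORD2MD is a hypothetical override for the converter command.
WORD2MD="${WORD2MD:-word2md}"

# Return 0 if every named variable is set and non-empty; otherwise list
# the missing ones on stderr and return 1.
require_env() {
  missing=0
  for name in "$@"; do
    eval "value=\${$name:-}"          # indirect lookup by variable name
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Convert one file with OCR, but only when credentials are present.
ocr_convert() {
  require_env PADDLEX_OCR_URL PADDLEX_OCR_TOKEN || return 1
  $WORD2MD "$1" --ocr
}
```

Calling ocr_convert input.docx with either variable unset prints the missing names and returns a non-zero status, which makes a misconfigured CI secret easy to spot.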
Plain text output
Strip markdown syntax for clean prose — useful when feeding docs into LLM pipelines:
word2md input.docx --format text -o plain.txt
CI/CD integration
Drop it into a GitHub Action to auto-convert every docx committed to your repo:
- name: Convert Word docs to Markdown
run: npx word2md-cli docs/*.docx -d site/content/
Combine with Astro, Hugo, or Next.js and you have a self-updating documentation site that accepts Word files as input. Non-technical contributors keep writing in Word. Engineers keep shipping markdown. Everyone wins.
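Converting the whole docs/ folder on every push is the simplest setup. On larger repos, one option is to convert only the files the latest commit touched. The sketch below assumes git is available and that the checkout contains at least the previous commit (a shallow clone of depth 1 would not); docx_only, changed_docx, and convert_changed are hypothetical names, and the WORD2MD default mirrors the npx call in the workflow step above.

```shell
#!/bin/sh
# Sketch: convert only the .docx files added or modified by the last commit.
# WORD2MD is a hypothetical override for the converter command.
WORD2MD="${WORD2MD:-npx word2md-cli}"

# Keep only paths ending in .docx (case-insensitive) from stdin.
docx_only() {
  grep -i '\.docx$' || true           # no matches is not an error
}

# List added/modified .docx files between two revisions.
changed_docx() {
  git diff --name-only --diff-filter=AM "$1" "$2" | docx_only
}

# Convert each changed file into the given output directory.
convert_changed() {
  dest=$1
  changed_docx HEAD~1 HEAD | while IFS= read -r file; do
    # </dev/null keeps the converter from consuming the file list on stdin
    $WORD2MD "$file" -d "$dest" </dev/null || echo "failed: $file" >&2
  done
}
```

In a workflow, the run step would source this script and call convert_changed site/content/. Deleted and renamed files are ignored by the filter, so stale .md output would need separate cleanup.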
CLI vs. the web app
| Feature | Web app | CLI |
|---|---|---|
| Base conversion | Browser (client-side) | Local Node.js |
| Batch processing | Drag multiple files | Glob patterns, scripts |
| Image OCR | Built-in API | BYO PaddleX credentials |
| Automation | ❌ | ✅ Pipes, cron, CI |
| Live preview | ✅ | ❌ (pipe to a viewer) |
Same conversion engine (mammoth + custom post-processing), same output. The CLI is just the scriptable surface.
Open source
word2md-cli is MIT-licensed on GitHub. Issues, feature requests, and PRs welcome. The code is intentionally small — around 150 lines of TypeScript — so it's easy to audit, fork, or extend with your own rules.
What's next
- --watch mode that auto-converts files on save
- --api-key flag that uses your Word2MD.net account for OCR (no PaddleX setup needed)
- More input formats: PDF, RTF, ODT
Opinions on priority? Drop a note on GitHub.
Meanwhile — go try it:
npx word2md-cli some.docx
Thirty seconds from now you have markdown.