Commit Graph

7 Commits

Author SHA1 Message Date
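nite 1c1c68a214 refactor: Standardize on an OpenAI-compatible API with support for a custom base_url/key/model
- Remove the separate Gemini and Ollama adapters; use ChatOpenAI + base_url throughout
- Simplify config.ini to BASE_URL / API_KEY / MODEL / TEMPERATURE / MAX_RETRIES
- Add config.example.ini as a sample configuration
- Remove the langchain-google-genai / langchain-ollama / pymupdf dependencies
- main.py: add checkpoint resume, skipping work whose index.md / index_refined.md already exists
- LLM requests support automatic retries via max_retries (default: 3)
- Improve README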
2026-04-18 18:42:42 +10:00
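Note on 1c1c68a214: the simplified configuration and the resume behavior described in this commit could look roughly like the sketch below. It is not taken from the repository. The `[llm]` section name, the placeholder values, the helper names `load_llm_settings` and `pending_steps`, and the one-output-directory-per-PDF layout are all assumptions made for illustration. Only the five key names, the default of 3 retries, and the two output file names come from the commit message.

```python
import configparser
from pathlib import Path

# Hypothetical config.ini content; the [llm] section name and all values
# are placeholders, only the key names come from the commit message.
EXAMPLE_INI = """
[llm]
BASE_URL = https://example.invalid/v1
API_KEY = sk-placeholder
MODEL = example-model
TEMPERATURE = 0.2
"""


def load_llm_settings(text):
    """Parse the five settings, falling back to 3 retries when MAX_RETRIES is absent."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["llm"]
    return {
        "base_url": section["BASE_URL"],
        "api_key": section["API_KEY"],
        "model": section["MODEL"],
        "temperature": section.getfloat("TEMPERATURE", fallback=0.0),
        "max_retries": section.getint("MAX_RETRIES", fallback=3),
    }


def pending_steps(out_dir):
    """Return the stages still to run for one output directory (assumed layout)."""
    out_dir = Path(out_dir)
    steps = []
    if not (out_dir / "index.md").exists():
        steps.append("convert")  # no converted Markdown yet
    if not (out_dir / "index_refined.md").exists():
        steps.append("refine")  # no refined Markdown yet
    return steps
```

With a layout like this, rerunning after an interruption only repeats the stages whose output file is missing, which is one plausible reading of "skipping work whose index.md / index_refined.md already exists".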
nite 3b62c0f478 mod README 2025-11-12 03:22:50 +11:00
nite 40ff3756a5 update: README 2025-10-30 05:14:51 +11:00
nite 3eef042111 refactor(app): Extract PDF conversion logic into a separate module
The main.py script was becoming monolithic, containing all the logic for PDF conversion, image path simplification, and content refinement. This change extracts these core functionalities into a new `pdf_convertor` module.

This refactoring improves the project structure by:
- Enhancing modularity and separation of concerns.
- Making the main.py script a cleaner, high-level orchestrator.
- Improving code readability and maintainability.

The functions `convert_pdf_to_markdown`, `save_md_images`, and `refine_content` are now imported from the `pdf_convertor` module and called from the main execution block.
2025-10-27 20:02:02 +11:00
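Note on 3eef042111: a minimal sketch of what a "high-level orchestrator" might look like after this refactor. The commit names the three functions but not their signatures, so the parameter lists, return values, and the `process_all` helper below are hypothetical. The stages are passed in as callables so the sketch runs without the real `pdf_convertor` module.

```python
from pathlib import Path


def process_all(pdf_dir, convert, save_images, refine):
    """Run the three stages over every PDF in pdf_dir, in file-name order.

    Assumed (not confirmed) stage shapes:
      convert(pdf_path)                -> (markdown, images)
      save_images(markdown, images, d) -> markdown with rewritten image paths
      refine(markdown)                 -> refined markdown
    """
    results = {}
    for pdf in sorted(Path(pdf_dir).glob("*.pdf")):
        markdown, images = convert(pdf)
        # Images are stored next to the PDF in a directory named after it.
        markdown = save_images(markdown, images, pdf.with_suffix(""))
        results[pdf.name] = refine(markdown)
    return results
```

In the actual project, the callables would presumably be `convert_pdf_to_markdown`, `save_md_images`, and `refine_content` imported from `pdf_convertor`, adapted to whatever signatures that module really exposes.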
nite 4f29d5c814 feat(llm): Send images to model and enhance processing prompt 2025-10-25 22:51:54 +11:00
nite 37d4facee3 feat: Enable batch processing of PDF files and update README 2025-10-22 20:56:17 +11:00
nite ad212a35af init 2025-10-22 17:10:29 +11:00