slide-translate

nite f1214be148 feat(llm-integration): Enhance prompt clarity and unify PDF attachment

This commit improves the structure and clarity of the prompt sent to the LLM (Gemini/OpenAI) in the `refine_content` function.

Changes include:
*   Adding explicit introductory text for the Markdown, individual images, and PDF sections to guide the LLM on the purpose of each input.
*   Introducing clear "START OF IMAGE" and "END OF IMAGE" delimiters for each image to better define their boundaries.
*   Unifying the PDF attachment mechanism for both Gemini and OpenAI providers, simplifying the code and ensuring consistent handling of PDF input.

These changes aim to improve the LLM's understanding of the provided content, leading to more accurate and relevant refinements.

2025-11-12 19:14:19 +11:00

.gitignore

feat: Improve content refinement with SystemMessage and prompt updates

2025-11-11 23:39:47 +11:00

.python-version

refactor(app): Extract PDF conversion logic into a separate module

2025-10-27 20:02:02 +11:00

convert.py

feat: Introduce OpenAI LLM provider and update API key handling

2025-11-12 02:51:18 +11:00

llm.py

feat: Introduce OpenAI LLM provider and update API key handling

2025-11-12 02:51:18 +11:00

main.py

feat(llm): Add Ollama provider and PyMuPDF image extraction

2025-11-11 22:35:23 +11:00

pdf_convertor_prompt.md

docs: Clarify image processing rules in PDF conversion prompt

2025-11-12 18:42:59 +11:00

pdf_convertor.py

feat(llm-integration): Enhance prompt clarity and unify PDF attachment

2025-11-12 19:14:19 +11:00

pyproject.toml

feat: Introduce OpenAI LLM provider and update API key handling

2025-11-12 02:51:18 +11:00

README.md

mod README

2025-11-12 03:22:50 +11:00

refine.py

feat: Introduce OpenAI LLM provider and update API key handling

2025-11-12 02:51:18 +11:00

uv.lock

feat: Introduce OpenAI LLM provider and update API key handling

2025-11-12 02:51:18 +11:00

README.md

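Course Slide Organizing and Translation Tool for International Students

This tool gives students studying abroad an efficient way to process course materials, addressing both the language barrier and the tedious work of organizing slides.

Many international students working with course slides in English or another language must understand the subject matter while also overcoming the language gap. Organizing and translating slides by hand is slow and laborious, and key information is easily missed, especially in slides that contain many charts and figures.

Features

Automated content extraction and conversion: Converts course slides in PDF format into structured Markdown that is easy to edit and read afterwards.
Format optimization and enhancement: Uses a large language model (LLM) to refine the converted Markdown, clean up the layout, and add annotations to images to make them easier to understand.
Accurate technical translation: Translates the content into Simplified Chinese while keeping the original English for technical terms as annotations. This preserves terminology and avoids ambiguous translations, so students can read the material in Chinese and still learn the English terms.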

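Prerequisites

Nvidia GPU
An API key or endpoint for one of the supported LLM providers:
- Gemini
- OpenAI
- Ollama (configured through OLLAMA_BASE_URL rather than an API key)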

Installation

Install uv: If you have not installed uv yet, follow the official documentation. It can usually be installed with pip:
```
pip install uv
```
Install dependencies: In the project root, use uv to install all required dependencies:
```
uv venv
uv sync
```

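Configuration

This project uses a config.ini file to manage API keys and LLM settings. Before running the program, create config.ini in the project root with the following format: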

[llm]
# openai/gemini/ollama
PROVIDER = openai
GEMINI_MODEL_NAME = gemini-2.5-flash
OPENAI_MODEL_NAME = gpt-5-mini
OLLAMA_MODEL_NAME = gemma3:latest
OLLAMA_BASE_URL = http://localhost:11434
TEMPERATURE = 0.7
GOOGLE_API_KEY =
OPENAI_API_KEY =
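
To illustrate how a file in this format can be read, here is a minimal sketch using Python's standard `configparser`. It is not taken from the project source: the function name `load_llm_settings`, the returned dictionary layout, and the fallback values are assumptions made for this example. It only shows that the `PROVIDER` value selects which `*_MODEL_NAME` key is used.

```python
import configparser

# Sample text mirroring the config.ini format above (API keys left empty).
SAMPLE_CONFIG = """
[llm]
# openai/gemini/ollama
PROVIDER = openai
GEMINI_MODEL_NAME = gemini-2.5-flash
OPENAI_MODEL_NAME = gpt-5-mini
OLLAMA_MODEL_NAME = gemma3:latest
OLLAMA_BASE_URL = http://localhost:11434
TEMPERATURE = 0.7
GOOGLE_API_KEY =
OPENAI_API_KEY =
"""

SUPPORTED_PROVIDERS = {"openai", "gemini", "ollama"}


def load_llm_settings(config_text: str) -> dict:
    """Hypothetical helper: parse the [llm] section and pick the active model."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    llm = parser["llm"]

    # Option names are case-insensitive in configparser by default.
    provider = llm.get("PROVIDER", "openai").strip().lower()
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"Unsupported provider: {provider!r}")

    return {
        "provider": provider,
        # e.g. provider "openai" selects OPENAI_MODEL_NAME
        "model": llm.get(f"{provider.upper()}_MODEL_NAME"),
        "temperature": llm.getfloat("TEMPERATURE", fallback=0.7),
    }


if __name__ == "__main__":
    print(load_llm_settings(SAMPLE_CONFIG))
```

To read the real file instead of the sample string, `parser.read("config.ini")` can replace `parser.read_string(...)`.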

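Usage

Place the PDF files you want to process in the input directory.
Run the main.py script. The program automatically processes every PDF file in the input directory. Use uv run to execute the script so that it runs in the correct virtual environment: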
```
uv run python main.py
```
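
The step "process every PDF file in the input directory" can be sketched as follows. This is an assumption about the general approach, not the actual contents of main.py: the helper name `find_pdfs` is hypothetical, and the real script may discover files differently.

```python
from pathlib import Path


def find_pdfs(input_dir: str = "input") -> list[Path]:
    """Hypothetical helper: list the PDF files directly inside input_dir."""
    root = Path(input_dir)
    if not root.is_dir():
        # A missing input directory simply yields no work.
        return []
    # Match the extension case-insensitively so "slides.PDF" is also found.
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


if __name__ == "__main__":
    for pdf_path in find_pdfs("input"):
        print(f"Would process: {pdf_path}")
```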

References

FAQ

docling reports an error when converting a PDF

The PDF file may be malformed. Try normalizing it with ghostscript:

gs -o <output.pdf> -sDEVICE=pdfwrite -dPDFSETTINGS=/default <input.pdf>
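
If several files need normalizing, the same command can be driven from Python. The sketch below only wraps the ghostscript invocation shown above. The function names `build_gs_command` and `normalize_pdf` are hypothetical, and it assumes the `gs` executable is available on the PATH.

```python
import subprocess


def build_gs_command(input_pdf: str, output_pdf: str) -> list[str]:
    """Build the argument list for the ghostscript command shown above."""
    return [
        "gs",
        "-o", str(output_pdf),
        "-sDEVICE=pdfwrite",
        "-dPDFSETTINGS=/default",
        str(input_pdf),
    ]


def normalize_pdf(input_pdf: str, output_pdf: str) -> None:
    """Rewrite input_pdf through ghostscript; raises CalledProcessError on failure."""
    subprocess.run(build_gs_command(input_pdf, output_pdf), check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.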
