Refactor image data passing in `pdf_convertor.py` to use a direct base64 and mime_type format, aligning with updated API requirements for vision models.
Additionally, the `pdf_convertor_prompt.md` has been significantly refined to improve the clarity and specificity of instructions for the AI model, particularly concerning:
- **Image Content Explanation:** Added detailed rules to ensure the AI only processes existing image references, preserves paths, and focuses on descriptive text.
- **Mathematical Formulas:** Clarified conversion to LaTeX notation.
- **Heading Structure:** Enhanced rules and examples for adjusting heading levels and merging adjacent or duplicate headings to ensure logical document flow.
This commit refactors the content refinement process to leverage `SystemMessage` for the primary prompt, enhancing clarity and adherence to LLM best practices.
The `pdf_convertor.py` file was updated to:
- Import `SystemMessage` from `langchain_core.messages`.
- Modify the `refine_content` function to use `SystemMessage` for the main prompt, moving the prompt content from `human_message_parts`.
- Adjust `human_message_parts` to only contain the Markdown and image data for the `HumanMessage`.
The `pdf_convertor_prompt.md` file was updated to:
- Reformat the prompt with clearer headings and instructions for each task.
- Improve the clarity and conciseness of the instructions for cleaning up characters, explaining image content, and correcting list formatting.
Additionally, `.gitignore` was updated to include `.vscode/` to prevent IDE-specific files from being committed.
These changes improve the structure of the LLM interaction and make the prompt more readable and maintainable.