Commit

community: Include PDF ID in MathPix metadata (#15629)
- **Description:** Includes the PDF ID in the MathPix document metadata.
This is useful in case you need to re-request a processed PDF from the
MathPix API later.
Chad Norvell authored Jan 7, 2024
1 parent d2a686b commit f6226d4
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion libs/community/langchain_community/document_loaders/pdf.py
@@ -518,7 +518,7 @@ def load(self) -> List[Document]:
         contents = self.get_processed_pdf(pdf_id)
         if self.should_clean_pdf:
             contents = self.clean_pdf(contents)
-        metadata = {"source": self.source, "file_path": self.source}
+        metadata = {"source": self.source, "file_path": self.source, "pdf_id": pdf_id}
         return [Document(page_content=contents, metadata=metadata)]
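As a rough illustration of why the extra key is useful, the sketch below reads `pdf_id` back out of a document's metadata and builds a URL for re-requesting the processed file. It is not taken from the commit: the `Document` dataclass is a hypothetical stand-in for the LangChain class, `build_metadata` and `processed_pdf_url` are illustrative helpers, and the URL layout (`<base>/<pdf_id>.<format>`) and the `"mmd"` default are assumptions that should be checked against the Mathpix API documentation.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Document:
    """Hypothetical stand-in for the LangChain Document class."""

    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def build_metadata(source: str, pdf_id: str) -> Dict[str, str]:
    # Same keys as the dict built in load() after this change.
    return {"source": source, "file_path": source, "pdf_id": pdf_id}


def processed_pdf_url(
    pdf_id: str,
    fmt: str = "mmd",
    base_url: str = "https://api.mathpix.com/v3/pdf",
) -> str:
    # Assumed URL layout for fetching an already-processed PDF;
    # verify against the Mathpix documentation before relying on it.
    return f"{base_url}/{pdf_id}.{fmt}"


# "paper.pdf" and "abc123" are made-up values for illustration.
doc = Document(page_content="...", metadata=build_metadata("paper.pdf", "abc123"))

# Later, the ID can be recovered without uploading the PDF again.
pdf_id = doc.metadata["pdf_id"]
print(processed_pdf_url(pdf_id))
```

Keeping the ID alongside `source` and `file_path` means a caller who stores the returned documents does not need a separate lookup table from file to Mathpix job.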

