How To Write Code To Extract A Specific Text And Integer On The Same Line From A Pdf File Using Python?
I have data in a PDF file and would like to extract the integer 100 from the line 'US stock price 100', using 'US stock price' as the keyword, with Python.
Solution 1:
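You can try the tika package, which extracts the text of the PDF as a single string.
from tika import parser

# from_file returns a dict; the extracted text is under the 'content' key
raw = parser.from_file('test.pdf')
print(raw['content'])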
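Printing the extracted text does not yet isolate the number. A minimal sketch of that last step, assuming the keyword and the integer are separated only by spaces or tabs on the same line: the helper name `find_int_after_keyword` and the sample text are made up for illustration, and the string passed in would be the text returned by tika.

```python
import re

def find_int_after_keyword(text, keyword):
    """Return the first integer that follows `keyword` on the same line, or None."""
    # re.escape protects against regex metacharacters in the keyword;
    # [ \t]+ allows spaces or tabs but no newline between keyword and number.
    pattern = re.escape(keyword) + r"[ \t]+(\d+)"
    match = re.search(pattern, text or "")
    return int(match.group(1)) if match else None

# Stand-in for the extracted PDF text; these lines are illustrative only.
sample_text = "UK stock price 95\nUS stock price 100\n"
print(find_int_after_keyword(sample_text, "US stock price"))
```

If the PDF text comes out with different spacing or a separator such as a colon, the pattern would need to be adjusted accordingly.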
Solution 2:
The code below searches each page of the PDF file for the keyword.
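import PyPDF2
import re

# PdfReader, .pages and .extract_text() are the current PyPDF2 names;
# older releases used PdfFileReader, getNumPages(), getPage() and extractText()
reader = PyPDF2.PdfReader("test.pdf")
keyword = "US stock price"

for i, page in enumerate(reader.pages):
    print("this is page " + str(i))
    txt = page.extract_text()
    # re.search returns a match object, or None if the keyword is not on this page
    res_search = re.search(keyword, txt)
    print(res_search)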
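The search above only reports whether the keyword occurs on a page. To get the integer itself, a capture group can be added after the keyword. The sketch below is one possible approach, not part of the original answer: `find_values_by_page` is a hypothetical helper that takes the per-page text strings (the `txt` values from the loop above) and returns each page number with the integer found after the keyword. The sample pages are invented for illustration.

```python
import re

def find_values_by_page(page_texts, keyword):
    """Return (page_number, value) for every 'keyword <integer>' occurrence."""
    # The capture group (\d+) grabs the digits that follow the keyword
    # on the same line; page numbers are reported starting from 1.
    pattern = re.compile(re.escape(keyword) + r"[ \t]+(\d+)")
    hits = []
    for page_number, text in enumerate(page_texts, start=1):
        for match in pattern.finditer(text or ""):
            hits.append((page_number, int(match.group(1))))
    return hits

# Stand-in for text extracted page by page; illustrative data only.
pages = ["Report header\n", "US stock price 100\nUK stock price 95\n"]
print(find_values_by_page(pages, "US stock price"))
```

Keeping the regex step separate from the PDF reading makes it easy to check against plain strings before running it on a real file, since text extracted from PDFs does not always preserve the original spacing.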