
How To Write Code To Extract A Specific Text And Integer On The Same Line From A Pdf File Using Python?

Below is the data I have in a PDF file. I would like to extract the integer 100 from the line 'US stock price 100', using 'US stock price' as the keyword, using py

Solution 1:

You can try using the package tika.

from tika import parser

# parser.from_file returns a dict; the extracted text is stored under 'content'
raw = parser.from_file('test.pdf')
print(raw['content'])
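
Tika hands back the whole document as one string, so the keyword line still has to be located in it. A minimal sketch of that step, assuming the extracted text keeps 'US stock price 100' on a single line; the helper name `find_int_after_keyword` and the sample text are hypothetical and not part of the original answer.

```python
def find_int_after_keyword(text, keyword):
    """Return the first integer that follows `keyword` on the same line, or None."""
    for line in text.splitlines():
        pos = line.find(keyword)
        if pos == -1:
            continue
        # Look only at what comes after the keyword on this line
        for token in line[pos + len(keyword):].split():
            if token.isdigit():
                return int(token)
    return None


# Hypothetical sample standing in for the text extracted from the PDF
sample = "Report\nUS stock price 100\nUK stock price 80\n"
print(find_int_after_keyword(sample, "US stock price"))
```

To use it on a real file, pass the text extracted by tika in place of `sample`. It is worth checking that the text is not empty first, since a scanned PDF without a text layer may yield nothing to search.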

Solution 2:

The code below searches each page of the PDF file for the keyword.

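import re

import PyPDF2

# PdfReader is the current reader class; older PyPDF2 releases use
# PdfFileReader, getPage() and extractText() instead
reader = PyPDF2.PdfReader("test.pdf")
keyword = "US stock price"

for i, page in enumerate(reader.pages):
    print("this is page " + str(i))
    txt = page.extract_text()
    # re.search returns a match object, or None if the keyword is not on this page
    res_search = re.search(keyword, txt)
    print(res_search)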
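
The search above only reports whether the keyword occurs; it does not return the number next to it. One way to get the integer is a capture group after the keyword. This is a sketch rather than part of the original answer: the function name `extract_price` and the sample page text are hypothetical, and it assumes the number is a plain run of digits separated from the keyword by whitespace.

```python
import re


def extract_price(text, keyword="US stock price"):
    """Return the integer that follows `keyword`, or None if there is no match."""
    # re.escape guards against regex metacharacters in the keyword;
    # (\d+) captures the run of digits after the separating whitespace
    pattern = re.escape(keyword) + r"\s+(\d+)"
    match = re.search(pattern, text)
    return int(match.group(1)) if match else None


# Hypothetical page text standing in for the text extracted from one PDF page
page_text = "UK stock price 80\nUS stock price 100\n"
print(extract_price(page_text))
```

Calling `extract_price` on the text of each page inside the loop would return the value for the first page where the keyword appears. Because `\s+` also matches a newline, the pattern still matches if the PDF extraction happens to break the line between the keyword and the number.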
