Adobe PDF Text Grabber Excel

In this article, I will walk through how to grab text from Adobe PDFs with Excel VBA. It's a simple process of using the mouse clicks, clipboard, and copy and paste to pull information from the PDF into Microsoft Excel.

Code:

Private Declare PtrSafe Sub mouse_event Lib "user32" (ByVal dwflags As Long, ByVal dx As Long, ByVal dy As Long, ByVal cButtons As Long, ByVal dwextrainfo As LongPtr)
Private Declare PtrSafe Function GetCursorPos Lib "user32" (lpPoint As POINTAPI) As Long
Private Declare PtrSafe Function SetCursorPos Lib "user32" (ByVal X As Long, ByVal Y As Long) As Long

 Private Declare PtrSafe Function GetAsyncKeyState Lib "user32" _
            (ByVal vKey As Long) As Integer
            
Private Const VK_LBUTTON = &H1
Private Const VK_F9 = &H78
Private Const keyeventf_keyup = 2

Public Declare PtrSafe Sub keybd_event Lib "user32" (ByVal BYK As Byte, ByVal bscan As Byte, dwflags As Long, ByVal dwextrainfo As Long)

Const MOUSEEVENTF_LEFTDOWN = &H2
Const MOUSEEVENTF_LEFTUP = &H4
Const MOUSEEVENTF_MIDDLEDOWN = &H20
Const MOUSEEVENTF_MIDDLEUP = &H40
Const MOUSEEVENTF_MOVE = &H1
Const MOUSEEVENTF_ABSOLUTE = &H8000
Const MOUSEEVENTF_RIGHTDOWN = &H8
Const MOUSEEVENTF_RIGHTUP = &H10

Private Type POINTAPI
X As Long
Y As Long
End Type

Public Declare PtrSafe Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)


Public Declare PtrSafe Function OpenClipboard Lib "user32" (ByVal hwnd As Long) As Long
Public Declare PtrSafe Function EmptyClipboard Lib "user32" () As Long
Public Declare PtrSafe Function CloseClipboard Lib "user32" () As Long


Sub Form_KeyDown()
Dim Point As POINTAPI
Dim Text As String
Dim X As Long, Y As Long

Dim DataObj As MSForms.DataObject
 
Call ClearClipboard

Dim numberholder As Double
numberholder = Range("E1").Value

line9:
Application.Wait (Now + TimeValue("00:00:02"))
AppActivate Application.Caption
DoEvents
answer = MsgBox("Do you want to record another mouse position to left click?", vbYesNo)
If answer = 6 Then
numberholder = numberholder + 1

Do Until GetAsyncKeyState(VK_F9)
    'exit when F9 key is pressed
    DoEvents
    If (GetAsyncKeyState(VK_LBUTTON) < 0) Then
       GetCursorPos Point
    X = Point.X
    Y = Point.Y
    Range("F" & numberholder).Value = X
    Range("G" & numberholder).Value = Y
    Range("E1").Value = Range("E1").Value + 1
    keybd_event VK_LBUTTON, 1, keyeventf_keyup, 0
    Beep
    GoTo line9:
    End If
Loop
Else
GoTo line10:
End If





line10:
numberholder = 1
secondloop = 2
Do Until numberholder > Range("E1").Value
X = Range("F" & numberholder).Value
Y = Range("G" & numberholder).Value
SetCursorPos X, Y
mouse_event MOUSEEVENTF_LEFTDOWN, 0&, 0&, 0&, 0&
mouse_event MOUSEEVENTF_LEFTUP, 0&, 0&, 0&, 0&
mouse_event MOUSEEVENTF_LEFTDOWN, 0&, 0&, 0&, 0&
mouse_event MOUSEEVENTF_LEFTUP, 0&, 0&, 0&, 0&

Application.SendKeys ("^c"), True
 Sleep 100
Application.SendKeys ("{NUMLOCK}")
Set DataObj = New MSForms.DataObject
DataObj.GetFromClipboard
strpaste = DataObj.GetText(1)

Range("H" & numberholder).Value = strpaste
numberholder = numberholder + 1
 Sleep 100
line11:
Loop
End Sub

Public Function ClearClipboard()
  OpenClipboard (0&)
  EmptyClipboard
  CloseClipboard
End Function


Quick recap of the code:


This section runs when the code reaches the Call Clearclipboard line, which runs this function that clears the clipboard and closes it.

This section sends ^c (Copy), waits 100 milliseconds with Sleep 100, sends Num LOCK since send keys hits num lock and prevents another send keys unless you press num lock again, creates a MSForms dataobject, and sets a variable equal to the clipboard's text. This variable is then filled into Range("H" & numberholder).value which is Range("H1") or Range("H2") depending on the iteration as indicated by cell E1 which iterates up as you do more and more clicks while column F and column G iterate up also recording the respective X and Y pixel locations on the PC monitor.

Comments are closed.