
Python Ctypes How To Read A Byte From A Character Array Passed To Nasm

UPDATE: I solved this problem with the help of Mark Tolonen's answer below. Here is the solution (but I'm puzzled by one thing): I begin with the encoding string shown in Mark

Solution 1:

Your name-reading code would return a list of Unicode strings. The following would encode a list of Unicode strings into an array of strings to be passed to a function taking a POINTER(c_char_p):

>>> import ctypes
>>> names = ['Mark','John','Craig']
>>> ca = (ctypes.c_char_p * len(names))(*(name.encode() for name in names))
>>> ca
<__main__.c_char_p_Array_3 object at 0x000001DB7CF5F6C8>
>>> ca[0]
b'Mark'
>>> ca[1]
b'John'
>>> ca[2]
b'Craig'
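Before involving any assembly, the same array can be inspected from Python to see which byte values the native code should find. This is only a small sketch; the variable name first_bytes is made up for illustration:

```python
import ctypes

names = ['Mark', 'John', 'Craig']
ca = (ctypes.c_char_p * len(names))(*(name.encode() for name in names))

# Indexing the array yields a bytes object; indexing bytes yields an int in Python 3.
first_bytes = [ca[i][0] for i in range(len(names))]
print([hex(b) for b in first_bytes])  # ['0x4d', '0x4a', '0x43']
```

These are the values the called function should see as the first byte of each string.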

If ca is passed to your function as the first parameter, the address of that array will be in rcx, per the Microsoft x64 calling convention. The following C code and its disassembly show how the VS2017 Microsoft compiler reads it:

DLL code (test.c)

#define API __declspec(dllexport)

int API func(const char** instr)
{
    return (instr[0][0] << 16) + (instr[1][0] << 8) + instr[2][0];
}

Disassembly (compiled optimized to keep short, my comments added)

; Listing generated by Microsoft (R) Optimizing Compiler Version 19.00.24215.1

include listing.inc

INCLUDELIB LIBCMT
INCLUDELIB OLDNAMES

PUBLIC  func
; Function compile flags: /Ogtpy
; File c:\test.c
_TEXT   SEGMENT
instr$ = 8
func    PROC

; 5    :     return (instr[0][0] << 16) + (instr[1][0] << 8) + instr[2][0];

  00000 48 8b 51 08      mov     rdx, QWORD PTR [rcx+8]  ; address of 2nd string
  00004 48 8b 01         mov     rax, QWORD PTR [rcx]    ; address of 1st string
  00007 48 8b 49 10      mov     rcx, QWORD PTR [rcx+16] ; address of 3rd string
  0000b 44 0f be 02      movsx   r8d, BYTE PTR [rdx]     ; 1st char of 2nd string, r8d=4a
  0000f 0f be 00         movsx   eax, BYTE PTR [rax]     ; 1st char of 1st string, eax=4d
  00012 0f be 11         movsx   edx, BYTE PTR [rcx]     ; 1st char of 3rd string, edx=43
  00015 c1 e0 08         shl     eax, 8                  ; eax=4d00
  00018 41 03 c0         add     eax, r8d                ; eax=4d4a
  0001b c1 e0 08         shl     eax, 8                  ; eax=4d4a00
  0001e 03 c2            add     eax, edx                ; eax=4d4a43

; 6    : }

  00020 c3               ret     0
func    ENDP
_TEXT   ENDS
END
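The two-step load in the disassembly (first fetch a string's address from the array, then fetch a byte from that address) can be imitated from Python with ctypes alone, which may help when checking what hand-written assembly should do. This is a rough sketch, not the DLL's code: first_byte is a hypothetical helper, and base stands in for the value the callee would receive in rcx.

```python
import ctypes

names = ['Mark', 'John', 'Craig']
ca = (ctypes.c_char_p * len(names))(*(name.encode() for name in names))

base = ctypes.addressof(ca)                 # stand-in for the address passed in rcx
ptr_size = ctypes.sizeof(ctypes.c_void_p)   # 8 on a 64-bit Python build

def first_byte(index):
    """Hypothetical helper: follow array slot `index` and read one byte."""
    # Step 1: load the pointer stored in slot `index` of the array
    # (the role played by QWORD PTR [rcx + 8*index] in the listing).
    string_addr = ctypes.c_void_p.from_address(base + index * ptr_size).value
    # Step 2: load a single byte from the address just fetched
    # (the role played by BYTE PTR [rax] in the listing).
    return ctypes.c_ubyte.from_address(string_addr).value

firsts = [first_byte(i) for i in range(len(names))]
print([hex(b) for b in firsts])  # ['0x4d', '0x4a', '0x43']
```

The array ca keeps the encoded bytes objects alive, so the addresses stay valid for as long as ca itself is referenced.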

Python code (test.py)

from ctypes import *

dll = CDLL('test')
dll.func.argtypes = POINTER(c_char_p),
dll.func.restype = c_int  # restype belongs on the function, not on the DLL object

names = ['Mark','John','Craig']
ca = (c_char_p * len(names))(*(name.encode() for name in names))
print(hex(dll.func(ca)))

Output:

0x4d4a43

Those are the correct ASCII codes for 'M', 'J', and 'C'.
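The expected value can also be cross-checked without the DLL. The sketch below is a pure-Python mirror of the C expression in test.c; pack_first_chars is a made-up name, and it assumes exactly three strings whose first characters are plain ASCII:

```python
def pack_first_chars(strings):
    """Mirror of the C expression: (s[0][0] << 16) + (s[1][0] << 8) + s[2][0]."""
    encoded = [s.encode() for s in strings]
    return (encoded[0][0] << 16) + (encoded[1][0] << 8) + encoded[2][0]

result = pack_first_chars(['Mark', 'John', 'Craig'])
print(hex(result))  # 0x4d4a43
```

If the native function returns something else for the same input, the pointer arithmetic or the byte load is the likely place to look.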
