[ What does %d mean in struct.pack? ]
I was reading through a library of Python code, and I'm stumped by this statement:
struct.pack( "<ii%ds"%len(value), ParameterTypes.String, len(value), value.encode("UTF8") )
I understand everything but %d, and I'm not sure why the length of value is being packed in twice.
As I understand it, the structure will have little endian encoding (<) and will contain two integers (ii) followed by %d, followed by a string (s).
What is the significance of %d?
Answer 1
Aarrrgh the mind boggles ....
@S.Lott: """I don't think the number is particularly important, since Python will tend to pack correctly without it.""" -1. Don't think; investigate. Without a number means merely that the number defaults to 1. Tends to pack correctly??? Perhaps you think that struct.pack("s", foo) works the same way as "%s" % foo? It doesn't; the docs say """For the 's' format character, the count is interpreted as the size of the string, not a repeat count like for the other format characters; for example, '10s' means a single 10-byte string, while '10c' means 10 characters. For packing, the string is truncated or padded with null bytes as appropriate to make it fit."""
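The quoted rule is easy to check directly. The following is a minimal sketch, assuming Python 3 (where 's' takes bytes objects); the variable names are only illustrative.

```python
import struct

# With no count, 's' means '1s': only the first byte is kept.
one = struct.pack("s", b"abc")

# A count on 's' is a field width in bytes, not a repeat count.
padded = struct.pack("5s", b"abc")     # short input is padded with null bytes
truncated = struct.pack("2s", b"abc")  # long input is silently cut

# For other codes the count is a repeat: '3c' wants three 1-byte arguments.
repeated = struct.pack("3c", b"a", b"b", b"c")

print(one, padded, truncated, repeated)
```

Note that neither the padding nor the truncation raises an error, which is why a wrong count can go unnoticed.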
@Brendan: -1. value is not an array (whatever that is); it is patently obviously intended to be a unicode string ... lookee here: value.encode("UTF8")
@Matt Ellen: The line of code that you quote is severely broken. If there are any non-ASCII characters in value, data will be lost.
Let's break it down:
`struct.pack("<ii%ds"%len(value), ParameterTypes.String, len(value), value.encode("UTF8"))`
Reduce the problem space by removing the first item:
struct.pack("<i%ds"%len(value), len(value), value.encode("UTF8"))
Now let's suppose that value is u'\xff\xff', so len(value) is 2.
Let v8 = value.encode('UTF8'), i.e. '\xc3\xbf\xc3\xbf'.
Note that len(v8) is 4. Is the penny dropping yet?
So what we now have is
struct.pack("<i2s", 2, v8)
The number 2 is packed as 4 bytes, 02 00 00 00. The 4-byte string v8 is TRUNCATED (by the length 2 in "2s") to length two. DATA LOSS. FAIL.
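The truncation can be reproduced in a few lines. This sketch assumes Python 3, where every str is a unicode string, so the u prefix is dropped; the names broken and fixed are only illustrative.

```python
import struct

value = "\xff\xff"            # two characters
v8 = value.encode("UTF-8")    # four bytes: b'\xc3\xbf\xc3\xbf'

# Format built from the character count, as in the quoted line.
broken = struct.pack("<i%ds" % len(value), len(value), v8)

# Format built from the encoded byte count.
fixed = struct.pack("<i%ds" % len(v8), len(v8), v8)

print(broken)  # length field says 2, and only 2 of the 4 payload bytes survive
print(fixed)   # length field says 4, followed by all 4 payload bytes
```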
The correct way to do what is presumably wanted is:
v8 = value.encode('UTF8')
struct.pack("<ii%ds" % len(v8), ParameterTypes.String, len(v8), v8)
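As for why the length is packed as well as being used in the format: presumably it is there so that whoever reads the data knows how many bytes of string follow. A round-trip sketch under that assumption is shown here, in Python 3. PARAM_TYPE_STRING is a hypothetical stand-in for ParameterTypes.String, whose real value is not shown in the question, and both function names are invented for illustration.

```python
import struct

PARAM_TYPE_STRING = 1  # hypothetical stand-in for ParameterTypes.String


def pack_string_param(value):
    """Pack a type tag, the byte length, then the UTF-8 bytes (little-endian)."""
    v8 = value.encode("UTF-8")
    return struct.pack("<ii%ds" % len(v8), PARAM_TYPE_STRING, len(v8), v8)


def unpack_string_param(data):
    """Read the two-int header first, then use the stored length for the payload."""
    type_tag, size = struct.unpack_from("<ii", data, 0)
    (v8,) = struct.unpack_from("<%ds" % size, data, 8)
    return type_tag, v8.decode("UTF-8")


packed = pack_string_param("\xff\xff")
print(unpack_string_param(packed))
```

The reader cannot build the "%ds" part of its own format until it has read the stored length, which is why the header is unpacked separately.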
Answer 2
It is an ordinary string format which is being used to create the struct format.
Try reading it to begin with as an ordinary string (forget struct for the moment) ...
"<ii%ds" % len(value)
If, for example, the length of the value iterable is 4, then the string will be <ii4s. This is then passed to struct.pack, ready to pack two integers followed by a string of length four bytes from the value iterable.
Answer 3
The %d means this works in two steps.
Step 1.
"<ii%ds"%len(value)
Creates the struct formatting string of "<ii...some number...s".
Step 2.
The resulting formatting string is applied to three values:
ParameterTypes.String, len(value), value.encode("UTF8")
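The two steps can be written out separately, as in this Python 3 sketch. PARAM_TYPE_STRING is a hypothetical placeholder for ParameterTypes.String, and the length is taken from the encoded bytes rather than from value so that non-ASCII text is not truncated.

```python
import struct

PARAM_TYPE_STRING = 1  # hypothetical stand-in for ParameterTypes.String

value = "boo"
payload = value.encode("UTF-8")

# Step 1: ordinary %-formatting builds the struct format string.
fmt = "<ii%ds" % len(payload)

# Step 2: struct.pack applies that format to the three values.
packed = struct.pack(fmt, PARAM_TYPE_STRING, len(payload), payload)

print(fmt, struct.calcsize(fmt), packed)
```

struct.calcsize is a convenient way to confirm that the generated format describes the number of bytes you expect.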
Answer 4
It's used to specify that a string (value) of len(value) characters is to be packed after those two integers.
If, for instance, value contained "boo" then the actual format specifier for pack would be "<ii3s".
Answer 5
The significance of %d is that it's a formatting parameter for strings:
String Formatting Operations
When broken apart, "<ii%ds" % len(value) is a bit easier to understand. It replaces the %d conversion specifier in the string with the return value of len(value), formatted as a decimal integer.
>>> fmt = "<ii%ds"
>>> fmt % 5
'<ii5s'
>>> fmt % 3
'<ii3s'