Commit 54d296f

add support for msgpack timestamp format
1 parent 70f3daf commit 54d296f

4 files changed, with 158 additions and 41 deletions
README.md

Lines changed: 34 additions & 20 deletions
@@ -1,6 +1,6 @@
 # u-msgpack-python [![Build Status](https://travis-ci.org/vsergeev/u-msgpack-python.svg?branch=master)](https://travis-ci.org/vsergeev/u-msgpack-python) [![GitHub release](https://img.shields.io/github/release/vsergeev/u-msgpack-python.svg?maxAge=7200)](https://github.com/vsergeev/u-msgpack-python) [![License](https://img.shields.io/badge/license-MIT-blue.svg)](https://github.com/vsergeev/u-msgpack-python/blob/master/LICENSE)

-u-msgpack-python is a lightweight [MessagePack](http://msgpack.org/) serializer and deserializer module written in pure Python, compatible with both Python 2 and 3, as well CPython and PyPy implementations of Python. u-msgpack-python is fully compliant with the latest [MessagePack specification](https://github.com/msgpack/msgpack/blob/master/spec.md). In particular, it supports the new binary, UTF-8 string, and application-defined ext types.
+u-msgpack-python is a lightweight [MessagePack](http://msgpack.org/) serializer and deserializer module written in pure Python, compatible with both Python 2 and 3, as well as CPython and PyPy implementations of Python. u-msgpack-python is fully compliant with the latest [MessagePack specification](https://github.com/msgpack/msgpack/blob/master/spec.md). In particular, it supports the new binary, UTF-8 string, application-defined ext, and timestamp types.

 u-msgpack-python is currently distributed on [PyPI](https://pypi.python.org/pypi/u-msgpack-python) and as a single file: [umsgpack.py](https://raw.github.com/vsergeev/u-msgpack-python/master/umsgpack.py).

@@ -81,18 +81,18 @@ b'\x01\x02\x03'

 Serializing and deserializing application-defined types with Ext handlers:
 ``` python
->>> umsgpack.packb([complex(1,2), datetime.datetime.now()],
+>>> umsgpack.packb([complex(1,2), decimal.Decimal("0.31")],
 ... ext_handlers = {
 ... complex: lambda obj: umsgpack.Ext(0x30, struct.pack("ff", obj.real, obj.imag)),
-... datetime.datetime: lambda obj: umsgpack.Ext(0x40, obj.strftime("%Y%m%dT%H:%M:%S.%f").encode()),
-... })
-b'\x92\xd70\x00\x00\x80?\x00\x00\x00@\xc7\x18@20161017T00:12:53.719377'
+... decimal.Decimal: lambda obj: umsgpack.Ext(0x40, str(obj).encode()),
+... })
+b'\x92\xd70\x00\x00\x80?\x00\x00\x00@\xd6@0.31'
 >>> umsgpack.unpackb(_,
 ... ext_handlers = {
 ... 0x30: lambda ext: complex(*struct.unpack("ff", ext.data)),
-... 0x40: lambda ext: datetime.datetime.strptime(ext.data.decode(), "%Y%m%dT%H:%M:%S.%f"),
-... })
-[(1+2j), datetime.datetime(2016, 10, 17, 0, 12, 53, 719377)]
+... 0x40: lambda ext: decimal.Decimal(ext.data.decode()),
+... })
+[(1+2j), Decimal('0.31')]
 >>>
 ```

@@ -120,37 +120,35 @@ custom types to callables that pack the type into an Ext object. The callable
 should accept the custom type object as an argument and return a packed
 `umsgpack.Ext` object.

-Example for packing `set`, `complex`, and `datetime.datetime` types into Ext
+Example for packing `set`, `complex`, and `decimal.Decimal` types into Ext
 objects with type codes 0x20, 0x30, and 0x40, respectively:

 ``` python
->>> umsgpack.packb([1, True, {"foo", 2}, complex(3, 4), datetime.datetime.now()],
+>>> umsgpack.packb([1, True, {"foo", 2}, complex(3, 4), decimal.Decimal("0.31")],
 ... ext_handlers = {
 ... set: lambda obj: umsgpack.Ext(0x20, umsgpack.packb(list(obj))),
 ... complex: lambda obj: umsgpack.Ext(0x30, struct.pack("ff", obj.real, obj.imag)),
-... datetime.datetime: lambda obj: umsgpack.Ext(0x40, obj.strftime("%Y%m%dT%H:%M:%S.%f").encode()),
-... })
-b'\x95\x01\xc3\xc7\x06 \x92\xa3foo\x02\xd70\x00\x00@@\x00\x00\x80@\xc7\x18@20161015T02:28:35.666425'
+... decimal.Decimal: lambda obj: umsgpack.Ext(0x40, str(obj).encode()),
+... })
+b'\x95\x01\xc3\xc7\x06 \x92\xa3foo\x02\xd70\x00\x00@@\x00\x00\x80@\xd6@0.31'
 >>>
 ```
-
 Similarly, the unpacking functions accept an optional `ext_handlers` dictionary
 that maps Ext type codes to callables that unpack the Ext into a custom object.
 The callable should accept a `umsgpack.Ext` object as an argument and return an
 unpacked custom type object.

 Example for unpacking Ext objects with type codes 0x20, 0x30, and 0x40, into
-`set`, `complex`, and `datetime.datetime` typed objects, respectively:
+`set`, `complex`, and `decimal.Decimal` typed objects, respectively:

 ``` python
->>> umsgpack.unpackb(b'\x95\x01\xc3\xc7\x06 \x92\xa3foo\x02\xd70\x00\x00@@\x00\x00\x80@' \
-... b'\xc7\x18@20161015T02:28:35.666425',
+>>> umsgpack.unpackb(b'\x95\x01\xc3\xc7\x06 \x92\xa3foo\x02\xd70\x00\x00@@\x00\x00\x80@\xd6@0.31',
 ... ext_handlers = {
 ... 0x20: lambda ext: set(umsgpack.unpackb(ext.data)),
 ... 0x30: lambda ext: complex(*struct.unpack("ff", ext.data)),
-... 0x40: lambda ext: datetime.datetime.strptime(ext.data.decode(), "%Y%m%dT%H:%M:%S.%f"),
-... })
-[1, True, {'foo', 2}, (3+4j), datetime.datetime(2016, 10, 15, 2, 28, 35, 666425)]
+... 0x40: lambda ext: decimal.Decimal(ext.data.decode()),
+... })
+[1, True, {'foo', 2}, (3+4j), Decimal('0.31')]
 >>>
 ```

@@ -341,6 +339,20 @@ If a non-byte-string argument is passed to `umsgpack.unpackb()`, it will raise a
 >>>
 ```

+* `UnsupportedTimestampException`: Unsupported timestamp encountered during unpacking.
+
+The official timestamp extension type supports 32-bit, 64-bit and 96-bit
+formats. This exception is thrown if a timestamp extension type with an
+unsupported format is encountered.
+
+``` python
+# Attempt to unpack invalid timestamp
+>>> umsgpack.unpackb(b"\xd5\xff\x01\x02")
+...
+umsgpack.UnsupportedTimestampException: unsupported timestamp with data length 2
+>>>
+```
+
 * `ReservedCodeException`: Reserved code encountered during unpacking.

 ``` python
@@ -387,6 +399,8 @@ If a non-byte-string argument is passed to `umsgpack.unpackb()`, it will raise a
 * The msgpack array format is unpacked into a Python list, unless it is the key of a map, in which case it is unpacked into a Python tuple
 * Python tuples and lists are both packed into the msgpack array format
 * Python float types are packed into the msgpack float32 or float64 format depending on the system's `sys.float_info`
+* The Python `datetime.datetime` type is packed into, and unpacked from, the msgpack `timestamp` format
+    * Note that this Python type only supports microsecond resolution, while the msgpack `timestamp` format supports nanosecond resolution. Timestamps with finer than microsecond resolution will lose precision during unpacking.

 ## Testing

msgpack.org.md

Lines changed: 15 additions & 19 deletions
@@ -82,26 +82,22 @@ b'\x01\x02\x03'

 Serializing and deserializing application-defined types with Ext handlers:
 ``` python
->>> umsgpack.packb([complex(1,2), datetime.datetime.now()],
-... ext_handlers = {
-... complex: lambda obj: umsgpack.Ext(0x30,
-... struct.pack("ff", obj.real, obj.imag)),
-... datetime.datetime: lambda obj: umsgpack.Ext(0x40,
-... obj.strftime("%Y%m%dT%H:%M:%S.%f").encode()),
-... })
-b'\x92\xd70\x00\x00\x80?\x00\x00\x00@\xc7\x18@20161017T00:12:53.7'
-b'19377'
+>>> umsgpack.packb([complex(1,2), decimal.Decimal("0.31")],
+... ext_handlers = {
+... complex: lambda obj:
+... umsgpack.Ext(0x30, struct.pack("ff", obj.real, obj.imag)),
+... decimal.Decimal: lambda obj:
+... umsgpack.Ext(0x40, str(obj).encode()),
+... })
+b'\x92\xd70\x00\x00\x80?\x00\x00\x00@\xd6@0.31'
 >>> umsgpack.unpackb(_,
-... ext_handlers = {
-... 0x30: lambda ext:
-... complex(*struct.unpack("ff", ext.data)),
-... 0x40: lambda ext:
-... datetime.datetime.strptime(
-... ext.data.decode(),
-... "%Y%m%dT%H:%M:%S.%f"
-... ),
-... })
-[(1+2j), datetime.datetime(2016, 10, 17, 0, 12, 53, 719377)]
+... ext_handlers = {
+... 0x30: lambda ext:
+... complex(*struct.unpack("ff", ext.data)),
+... 0x40: lambda ext:
+... decimal.Decimal(ext.data.decode()),
+... })
+[(1+2j), Decimal('0.31')]
 >>>
 ```

test_umsgpack.py

Lines changed: 27 additions & 1 deletion
@@ -11,6 +11,7 @@
 import sys
 import struct
 import unittest
+import datetime
 import io
 from collections import OrderedDict, namedtuple

@@ -116,6 +117,27 @@
     ["empty array", [], b"\x90"],
     # Empty Map
     ["empty map", {}, b"\x80"],
+    # 32-bit Timestamp
+    ["32-bit timestamp", datetime.datetime(1970, 1, 1, 0, 0, 0, 0, umsgpack._utc_tzinfo),
+        b"\xd6\xff\x00\x00\x00\x00"],
+    ["32-bit timestamp", datetime.datetime(2000, 1, 1, 10, 5, 2, 0, umsgpack._utc_tzinfo),
+        b"\xd6\xff\x38\x6d\xd1\x4e"],
+    # 64-bit Timestamp
+    ["64-bit timestamp", datetime.datetime(2000, 1, 1, 10, 5, 2, 1234, umsgpack._utc_tzinfo),
+        b"\xd7\xff\x00\x4b\x51\x40\x38\x6d\xd1\x4e"],
+    ["64-bit timestamp", datetime.datetime(2200, 1, 1, 10, 5, 2, 0, umsgpack._utc_tzinfo),
+        b"\xd7\xff\x00\x00\x00\x01\xb0\x9e\xa6\xce"],
+    ["64-bit timestamp", datetime.datetime(2200, 1, 1, 10, 5, 2, 1234, umsgpack._utc_tzinfo),
+        b"\xd7\xff\x00\x4b\x51\x41\xb0\x9e\xa6\xce"],
+    # 96-bit Timestamp
+    ["96-bit timestamp", datetime.datetime(1900, 1, 1, 10, 5, 2, 0, umsgpack._utc_tzinfo),
+        b"\xc7\x0c\xff\x00\x00\x00\x00\xff\xff\xff\xff\x7c\x56\x0f\x4e"],
+    ["96-bit timestamp", datetime.datetime(1900, 1, 1, 10, 5, 2, 1234, umsgpack._utc_tzinfo),
+        b"\xc7\x0c\xff\x00\x12\xd4\x50\xff\xff\xff\xff\x7c\x56\x0f\x4e"],
+    ["96-bit timestamp", datetime.datetime(3000, 1, 1, 10, 5, 2, 0, umsgpack._utc_tzinfo),
+        b"\xc7\x0c\xff\x00\x00\x00\x00\x00\x00\x00\x07\x91\x5f\x59\xce"],
+    ["96-bit timestamp", datetime.datetime(3000, 1, 1, 10, 5, 2, 1234, umsgpack._utc_tzinfo),
+        b"\xc7\x0c\xff\x00\x12\xd4\x50\x00\x00\x00\x07\x91\x5f\x59\xce"],
 ]

 composite_test_vectors = [
@@ -262,6 +284,9 @@
     # Reserved code (0xc1)
     ["reserved code", b"\xc1",
         umsgpack.ReservedCodeException],
+    # Unsupported timestamp (unsupported data length)
+    ["unsupported timestamp", b"\xc7\x02\xff\xaa\xbb",
+        umsgpack.UnsupportedTimestampException],
     # Invalid string (non utf-8)
     ["invalid string", b"\xa1\x80",
         umsgpack.InvalidStringException],
@@ -318,6 +343,7 @@
     "UnsupportedTypeException",
     "InsufficientDataException",
     "InvalidStringException",
+    "UnsupportedTimestampException",
     "ReservedCodeException",
     "UnhashableKeyException",
     "DuplicateKeyException",
@@ -519,7 +545,7 @@ def test_namespacing(self):
         exported_vars = list(filter(lambda x: not x.startswith("_"),
                                     dir(umsgpack)))
         # Ignore imports
-        exported_vars = list(filter(lambda x: x != "struct" and x != "collections" and x !=
+        exported_vars = list(filter(lambda x: x != "struct" and x != "collections" and x != "datetime" and x !=
                                     "sys" and x != "io" and x != "xrange", exported_vars))

         self.assertTrue(len(exported_vars) == len(exported_vars_test_vector))

umsgpack.py

Lines changed: 82 additions & 1 deletion
@@ -45,6 +45,7 @@
 """
 import struct
 import collections
+import datetime
 import sys
 import io

@@ -168,6 +169,11 @@ class InvalidStringException(UnpackException):
     pass


+class UnsupportedTimestampException(UnpackException):
+    "Unsupported timestamp format encountered during unpacking."
+    pass
+
+
 class ReservedCodeException(UnpackException):
     "Reserved code encountered during unpacking."
     pass
@@ -341,6 +347,29 @@ def _pack_ext(obj, fp, options):
         raise UnsupportedTypeException("huge ext data")


+def _pack_ext_timestamp(obj, fp, options):
+    delta = obj - _epoch
+    seconds = delta.seconds + delta.days * 86400
+    microseconds = delta.microseconds
+
+    if microseconds == 0 and 0 <= seconds <= 2**32 - 1:
+        # 32-bit timestamp
+        fp.write(b"\xd6\xff" +
+                 struct.pack(">I", seconds))
+    elif 0 <= seconds <= 2**34 - 1:
+        # 64-bit timestamp
+        value = ((microseconds * 1000) << 34) | seconds
+        fp.write(b"\xd7\xff" +
+                 struct.pack(">Q", value))
+    elif -2**63 <= abs(seconds) <= 2**63 - 1:
+        # 96-bit timestamp
+        fp.write(b"\xc7\x0c\xff" +
+                 struct.pack(">I", microseconds * 1000) +
+                 struct.pack(">q", seconds))
+    else:
+        raise UnsupportedTypeException("huge timestamp")
+
+
 def _pack_array(obj, fp, options):
     if len(obj) <= 15:
         fp.write(struct.pack("B", 0x90 | len(obj)))
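
The format selection in `_pack_ext_timestamp` can be tried in isolation. The following is a minimal standalone sketch, not the library's code: `encode_timestamp` and `EPOCH` are hypothetical names, it returns bytes instead of writing to a stream, and it assumes Python 3 with timezone-aware UTC datetimes. The expected bytes in the comments are the test vectors added to `test_umsgpack.py` in this commit.

```python
import datetime
import struct

# Assumption: a timezone-aware UTC epoch, as the commit sets up on Python 3.
EPOCH = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)


def encode_timestamp(dt):
    """Hypothetical helper: serialize an aware datetime as a msgpack timestamp."""
    delta = dt - EPOCH
    seconds = delta.days * 86400 + delta.seconds
    nanoseconds = delta.microseconds * 1000

    if nanoseconds == 0 and 0 <= seconds <= 2**32 - 1:
        # timestamp 32: fixext 4 (0xd6), type -1 (0xff), uint32 seconds
        return b"\xd6\xff" + struct.pack(">I", seconds)
    if 0 <= seconds <= 2**34 - 1:
        # timestamp 64: fixext 8 (0xd7), 30-bit nanoseconds above 34-bit seconds
        return b"\xd7\xff" + struct.pack(">Q", (nanoseconds << 34) | seconds)
    # timestamp 96: ext 8 (0xc7), length 12, uint32 nanoseconds, int64 seconds
    return b"\xc7\x0c\xff" + struct.pack(">Iq", nanoseconds, seconds)


utc = datetime.timezone.utc
# Whole seconds that fit in 32 bits -> d6ff386dd14e
print(encode_timestamp(datetime.datetime(2000, 1, 1, 10, 5, 2, 0, utc)).hex())
# Non-zero microseconds force the 64-bit form -> d7ff004b5140386dd14e
print(encode_timestamp(datetime.datetime(2000, 1, 1, 10, 5, 2, 1234, utc)).hex())
# Dates before the epoch need signed seconds -> c70cff00000000ffffffff7c560f4e
print(encode_timestamp(datetime.datetime(1900, 1, 1, 10, 5, 2, 0, utc)).hex())
```

The sketch always picks the smallest format that can hold the value, which is why a pre-1970 date skips straight to the 96-bit form: it is the only one with a signed seconds field.
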
@@ -428,6 +457,8 @@ def _pack2(obj, fp, **options):
         _pack_array(obj, fp, options)
     elif isinstance(obj, dict):
         _pack_map(obj, fp, options)
+    elif isinstance(obj, datetime.datetime):
+        _pack_ext_timestamp(obj, fp, options)
     elif isinstance(obj, Ext):
         _pack_ext(obj, fp, options)
     elif ext_handlers:
@@ -498,6 +529,8 @@ def _pack3(obj, fp, **options):
         _pack_array(obj, fp, options)
     elif isinstance(obj, dict):
         _pack_map(obj, fp, options)
+    elif isinstance(obj, datetime.datetime):
+        _pack_ext_timestamp(obj, fp, options)
     elif isinstance(obj, Ext):
         _pack_ext(obj, fp, options)
     elif ext_handlers:
@@ -703,7 +736,15 @@ def _unpack_ext(code, fp, options):
     else:
         raise Exception("logic error, not ext: 0x%02x" % ord(code))

-    ext = Ext(ord(_read_except(fp, 1)), _read_except(fp, length))
+    ext_type = struct.unpack("b", _read_except(fp, 1))[0]
+    ext_data = _read_except(fp, length)
+
+    # Timestamp extension
+    if ext_type == -1:
+        return _unpack_ext_timestamp(code, ext_data, options)
+
+    # Application extension
+    ext = Ext(ext_type, ext_data)

     # Unpack with ext handler, if we have one
     ext_handlers = options.get("ext_handlers")
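
The switch from `ord(...)` to `struct.unpack("b", ...)` in this hunk changes how the ext type byte is interpreted: as a signed value, so the wire byte `0xff` reads as the timestamp type `-1`. A small sketch of the difference, using only the standard library (the variable names are illustrative, not from the commit):

```python
import struct

type_byte = b"\xff"  # ext type byte of a timestamp as it appears on the wire

unsigned_type = ord(type_byte)                  # how the removed line read it
signed_type = struct.unpack("b", type_byte)[0]  # how the added line reads it

print(unsigned_type, signed_type)  # 255 -1

# Type codes in the 0..127 range read the same either way.
app_byte = b"\x40"
print(ord(app_byte) == struct.unpack("b", app_byte)[0])  # True
```

Only bytes with the high bit set are affected, so the type codes used in the README examples (0x20, 0x30, 0x40) are read exactly as before.
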
@@ -713,6 +754,28 @@ def _unpack_ext(code, fp, options):
     return ext


+def _unpack_ext_timestamp(code, data, options):
+    if len(data) == 4:
+        # 32-bit timestamp
+        seconds = struct.unpack(">I", data)[0]
+        microseconds = 0
+    elif len(data) == 8:
+        # 64-bit timestamp
+        value = struct.unpack(">Q", data)[0]
+        seconds = value & 0x3ffffffff
+        microseconds = (value >> 34) // 1000
+    elif len(data) == 12:
+        # 96-bit timestamp
+        seconds = struct.unpack(">q", data[4:12])[0]
+        microseconds = struct.unpack(">I", data[0:4])[0] // 1000
+    else:
+        raise UnsupportedTimestampException(
+            "unsupported timestamp with data length %d" % len(data))
+
+    return _epoch + datetime.timedelta(seconds=seconds,
+                                       microseconds=microseconds)
+
+
 def _unpack_array(code, fp, options):
     if (ord(code) & 0xf0) == 0x90:
         length = (ord(code) & ~0xf0)
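
The decoding side dispatches purely on payload length. Below is a hedged standalone sketch of that dispatch: `decode_timestamp` and `EPOCH` are hypothetical names, `ValueError` stands in for the library's `UnsupportedTimestampException`, and Python 3 is assumed. It also shows the precision loss that comes from integer-dividing nanoseconds by 1000.

```python
import datetime
import struct

# Assumption: a timezone-aware UTC epoch, as the commit sets up on Python 3.
EPOCH = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)


def decode_timestamp(data):
    """Hypothetical helper: ext payload (header bytes removed) -> aware datetime."""
    if len(data) == 4:
        # timestamp 32: uint32 seconds only
        seconds, nanoseconds = struct.unpack(">I", data)[0], 0
    elif len(data) == 8:
        # timestamp 64: upper 30 bits nanoseconds, lower 34 bits seconds
        value = struct.unpack(">Q", data)[0]
        seconds, nanoseconds = value & 0x3ffffffff, value >> 34
    elif len(data) == 12:
        # timestamp 96: uint32 nanoseconds followed by int64 seconds
        nanoseconds, seconds = struct.unpack(">Iq", data)
    else:
        # ValueError stands in for the library's own exception in this sketch
        raise ValueError("unsupported timestamp with data length %d" % len(data))
    # datetime has microsecond resolution, so sub-microsecond digits are dropped
    return EPOCH + datetime.timedelta(seconds=seconds,
                                      microseconds=nanoseconds // 1000)


# A 96-bit payload carrying 1,234,567 ns: only 1234 microseconds survive.
payload = struct.pack(">Iq", 1234567, 946721102)
print(decode_timestamp(payload).isoformat())  # 2000-01-01T10:05:02.001234+00:00
```

Any other payload length, such as the two-byte case used in the README example and the new test vector, falls through to the error branch.
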
@@ -801,6 +864,8 @@ def _unpack2(fp, **options):
             Insufficient data to unpack the serialized object.
         InvalidStringException(UnpackException):
             Invalid UTF-8 string encountered during unpacking.
+        UnsupportedTimestampException(UnpackException):
+            Unsupported timestamp format encountered during unpacking.
         ReservedCodeException(UnpackException):
             Reserved code encountered during unpacking.
         UnhashableKeyException(UnpackException):
@@ -843,6 +908,8 @@ def _unpack3(fp, **options):
             Insufficient data to unpack the serialized object.
         InvalidStringException(UnpackException):
             Invalid UTF-8 string encountered during unpacking.
+        UnsupportedTimestampException(UnpackException):
+            Unsupported timestamp format encountered during unpacking.
         ReservedCodeException(UnpackException):
             Reserved code encountered during unpacking.
         UnhashableKeyException(UnpackException):
@@ -888,6 +955,8 @@ def _unpackb2(s, **options):
             Insufficient data to unpack the serialized object.
         InvalidStringException(UnpackException):
             Invalid UTF-8 string encountered during unpacking.
+        UnsupportedTimestampException(UnpackException):
+            Unsupported timestamp format encountered during unpacking.
         ReservedCodeException(UnpackException):
             Reserved code encountered during unpacking.
         UnhashableKeyException(UnpackException):
@@ -934,6 +1003,8 @@ def _unpackb3(s, **options):
             Insufficient data to unpack the serialized object.
         InvalidStringException(UnpackException):
             Invalid UTF-8 string encountered during unpacking.
+        UnsupportedTimestampException(UnpackException):
+            Unsupported timestamp format encountered during unpacking.
         ReservedCodeException(UnpackException):
             Reserved code encountered during unpacking.
         UnhashableKeyException(UnpackException):
@@ -966,13 +1037,23 @@ def __init():
     global load
     global loads
     global compatibility
+    global _epoch
+    global _utc_tzinfo
     global _float_precision
     global _unpack_dispatch_table
     global xrange

     # Compatibility mode for handling strings/bytes with the old specification
     compatibility = False

+    if sys.version_info[0] == 3:
+        _utc_tzinfo = datetime.timezone.utc
+    else:
+        _utc_tzinfo = None
+
+    # Calculate epoch datetime
+    _epoch = datetime.datetime(1970, 1, 1, tzinfo=_utc_tzinfo)
+
     # Auto-detect system float precision
     if sys.float_info.mant_dig == 53:
         _float_precision = "double"
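
One consequence of making `_epoch` timezone-aware on Python 3 is worth seeing concretely: Python refuses to subtract a naive datetime from an aware one, so `obj - _epoch` only works for aware inputs there. The sketch below uses plain `datetime` with illustrative variable names; treating a naive value as UTC is an assumption about caller intent, not something this commit does.

```python
import datetime

# Same construction as the Python 3 branch in the hunk above.
epoch = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)

aware = datetime.datetime(2000, 1, 1, 10, 5, 2, tzinfo=datetime.timezone.utc)
naive = datetime.datetime(2000, 1, 1, 10, 5, 2)

print((aware - epoch).total_seconds())  # 946721102.0

try:
    naive - epoch
except TypeError as exc:
    # Python does not allow mixing naive and aware datetimes in arithmetic
    print("naive datetime rejected:", exc)

# One option, assuming the naive value is already expressed in UTC:
as_utc = naive.replace(tzinfo=datetime.timezone.utc)
print(as_utc - epoch == aware - epoch)  # True
```

On Python 2 the commit sets `_utc_tzinfo` to `None`, so the epoch is naive there and the situation is reversed: naive inputs subtract cleanly and aware ones do not.
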
