Concurrent logging w/cl-syslog results in memory corruption #92

@appleby

While attempting to run the pyquil test suite in parallel via the pytest-xdist plugin, I noticed occasional "Unhandled memory fault" errors like the one in the traceback below.

Based on the error message, this looked similar to quil-lang/qvm#110, so I tried disabling logging in quilc and, sure enough, the errors disappeared.

pyquil/tests/test_operator_estimation.py:1303:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pyquil/operator_estimation.py:938: in measure_observables
    calibr_results, d_calibr_qub_idx = _exhaustive_symmetrization(qc, qubs_calibr, calibr_shots, calibr_prog)
pyquil/operator_estimation.py:1083: in _exhaustive_symmetrization
    total_prog_symm_native = qc.compiler.quil_to_native_quil(total_prog_symm)
pyquil/api/_error_reporting.py:238: in wrapper
    val = func(*args, **kwargs)
pyquil/api/_compiler.py:340: in quil_to_native_quil
    response = self.client.call('quil_to_native_quil', request, protoquil=protoquil).asdict()  # type: Dict
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <rpcq._client.Client object at 0x1235b57f0>, method_name = 'quil_to_native_quil', rpc_timeout = 10
args = (NativeQuilRequest(_type='NativeQuilRequest', quil='PRAGMA READOUT-POVM 0 "(0.95 0.18000000000000005 0.050000000000000...]\nMEASURE 0 ro[0]\n', target_device=TargetDevice(_type='TargetDevice', isa={'1Q': {'0': {}}, '2Q': {}}, specs=None)),)
kwargs = {'protoquil': None}
request = RPCRequest(_type='RPCRequest', id='bc69a5f1-57e2-47f5-96af-0eab678bebc4', jsonrpc='2.0', method='quil_to_native_quil',...nMEASURE 0 ro[0]\n', target_device=TargetDevice(_type='TargetDevice', isa={'1Q': {'0': {}}, '2Q': {}}, specs=None)),)})
start_time = 1567443910.04178, timeout = 9999.999046325684
raw_reply = b'\x85\xa7jsonrpc\xa32.0\xa5error\xbeUnhandled memory fault at #x0.\xa2id\xda\x00$bc69a5f1-57e2-47f5-96af-0eab678bebc4\xa8warnings\x90\xa5_type\xa8RPCError'
reply = RPCError(error='Unhandled memory fault at #x0.', id='bc69a5f1-57e2-47f5-96af-0eab678bebc4', jsonrpc='2.0', warnings=[])

The work-around in qvm-app was to add a WITH-LOCKED-LOG macro and use it to acquire a global lock around any logging calls.
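
For readers unfamiliar with that pattern, here is a rough sketch of what such a macro could look like. This is not copied from qvm-app: the name *LOG-LOCK*, the macro's lambda list, and the LOG-LINE stand-in are all assumptions made for illustration. LOG-LINE just pushes onto a list so the sketch runs without cl-syslog.

```lisp
;; Sketch only: *LOG-LOCK*, this WITH-LOCKED-LOG definition, and LOG-LINE
;; are illustrative assumptions, not the actual qvm-app code.
(ql:quickload :bordeaux-threads)

(defvar *log-lock* (bt:make-lock "log lock")
  "Global lock that serializes every logging call.")

(defmacro with-locked-log (() &body body)
  "Evaluate BODY while holding *LOG-LOCK*."
  `(bt:with-lock-held (*log-lock*)
     ,@body))

;; Stand-in for a real logging call: records each message in a list.
(defvar *lines* '())

(defun log-line (string)
  (with-locked-log ()
    (push string *lines*)))

;; Four threads log 100 lines each; the lock keeps the pushes serialized.
(let ((threads (loop :repeat 4
                     :collect (bt:make-thread
                               (lambda ()
                                 (loop :repeat 100
                                       :do (log-line "hello")))))))
  (mapc #'bt:join-thread threads))
```

In real code the body of WITH-LOCKED-LOG would be the cl-syslog logging call rather than a PUSH.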

In the case of RPCQ, it's not so simple since the logger instance is passed in by the caller of RPCQ:START-SERVER.

Ideally, this would be resolved in CL-SYSLOG, if possible, but we might want to implement a workaround in QUILC/RPCQ in case that turns out to be impossible / impractical / slow to get merged.
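
One possible shape for a caller-side workaround is to wrap the log writer in a lock before handing it to the logger. The sketch below is untested against cl-syslog. MAKE-LOCKED-WRITER is a hypothetical helper, and it assumes that a cl-syslog log writer is an ordinary function that can be called with FUNCALL/APPLY. A counting lambda stands in for the real writer so the example runs on its own.

```lisp
;; Sketch only: MAKE-LOCKED-WRITER is a hypothetical helper, and it assumes
;; log writers are plain functions. Not verified against cl-syslog itself.
(ql:quickload :bordeaux-threads)

(defun make-locked-writer (writer)
  "Return a function that forwards its arguments to WRITER while holding
a lock private to this wrapper."
  (let ((lock (bt:make-lock "log writer lock")))
    (lambda (&rest args)
      (bt:with-lock-held (lock)
        (apply writer args)))))

;; Stand-in writer that only counts how often it was called.
(defvar *write-count* 0)

(defvar *locked-writer*
  (make-locked-writer (lambda (&rest args)
                        (declare (ignore args))
                        (incf *write-count*))))

;; Four threads call the wrapped writer 100 times each.
(let ((threads (loop :repeat 4
                     :collect (bt:make-thread
                               (lambda ()
                                 (loop :repeat 100
                                       :do (funcall *locked-writer*
                                                    :info "message")))))))
  (mapc #'bt:join-thread threads))
```

If the assumption about log writers holds, the caller of RPCQ:START-SERVER could pass the wrapped writer as the :LOG-WRITER when constructing the logger, without any change to RPCQ. Whether serializing the writer alone is enough to avoid the memory fault would still need to be confirmed.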

Here is a minimal-ish testcase that reproduces the issue.

(ql:quickload :rpcq)

(defun test-method ()
  "hey")

(let* ((number-of-workers 4)
       (addr (format nil "inproc://~a" (uuid:make-v4-uuid)))
       (server-function
         (lambda ()
           (let ((dt (rpcq:make-dispatch-table)))
             (rpcq:dispatch-table-add-handler dt 'test-method)
             (rpcq:start-server :dispatch-table dt
                                :listen-addresses (list addr)
                                :logger (make-instance 'cl-syslog:rfc5424-logger
                                                       :app-name "logtest"
                                                       :facility ':local0
                                                       :maximum-priority ':debug
                                                       :log-writer
                                                       #-windows (cl-syslog:tee-to-stream
                                                                  (cl-syslog:syslog-log-writer "logtest" :local0)
                                                                  *error-output*))))))
       (server-thread (bt:make-thread server-function)))
  (sleep 1)
  (let ((threads '()))
    (unwind-protect
         (loop :repeat number-of-workers :do
           (push (bt:make-thread (lambda ()
                                   (loop :repeat 20 :do
                                     (rpcq:with-rpc-client (client addr)
                                       (rpcq:rpc-call client "test-method")))))
                 threads))
      (progn
        (dolist (thread threads)
          (bt:join-thread thread))
        (bt:destroy-thread server-thread)))))
