
Order book sync logic vulnerable to edge cases #1082

Open
@Joshuaalbert

I will use Bitstamp L2 as an example, but I think the same flow is used in others.

Currently, you

  1. subscribe,
  2. wait a few seconds,
  3. fetch a snapshot,
  4. and then drop all messages older than the snapshot.

There are a few problems with this:

  1. After step 1, you immediately start handling deltas, applying them to an empty L2 book and triggering callbacks. These deltas mean nothing unless they are applied correctly to a snapshot. The callbacks are triggered with garbage order books, and the user has no way of knowing when (or if) they stop being garbage.
  2. A delta that should be applied AFTER the snapshot can arrive while the snapshot is still being fetched. Your code then resets the order book and that delta is never applied, leaving the book in an inconsistent state.
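
To make problem 2 concrete, here is a toy replay of the flow as I described it above. It is not the library's actual code, and the delta format (a dict with `timestamp`, `price`, `size`) is just an assumption for illustration, not Bitstamp's real message layout.

```python
def apply_delta(book, delta):
    """Apply one price-level update; a size of 0 removes the level."""
    if delta["size"] == 0:
        book["levels"].pop(delta["price"], None)
    else:
        book["levels"][delta["price"]] = delta["size"]
    book["timestamp"] = delta["timestamp"]


def naive_sync(events):
    """Apply deltas as they arrive; a snapshot replaces the book when it lands."""
    book = {"timestamp": 0, "levels": {}}
    for kind, payload in events:
        if kind == "delta":
            apply_delta(book, payload)
        else:  # "snapshot": overwrite whatever was built so far
            book = {"timestamp": payload["timestamp"],
                    "levels": dict(payload["levels"])}
    return book


# The snapshot reflects exchange state at t=100, but its response only lands
# after the t=105 delta has already been received and applied.
events = [
    ("delta", {"timestamp": 105, "price": 101.0, "size": 2.0}),
    ("snapshot", {"timestamp": 100, "levels": {100.0: 1.0}}),
    ("delta", {"timestamp": 110, "price": 102.0, "size": 3.0}),
]
book = naive_sync(events)
print(book["levels"])  # the 101.0 level from the t=105 delta is gone
```

The t=105 delta is newer than the snapshot, so it is not "older than the snapshot", yet it is lost because it was applied before the reset.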

The only perfectly correct way is as follows:

  1. collect at least one delta.
  2. if there is no snapshot yet, then fetch in a loop until snapshot['timestamp'] > delta['timestamp']
  3. drop all delta['timestamp'] <= snapshot['timestamp']

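In pseudo-code:

book = None
subscribe()
while True:
  delta = recv()
  if book is None:
    # the first delta triggers the fetch; retry until the snapshot is newer than it
    book = fetch_book(minimum_timestamp=delta['timestamp'])
  if delta['timestamp'] <= book['timestamp']:
    continue  # already reflected in the snapshot
  book += delta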
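
A runnable sketch of the same loop, with the websocket and REST calls replaced by in-memory lists so it can be executed as-is. The dict-based delta format and the `fetch_book` polling helper are assumptions for illustration; a real `fetch_book` would re-request the snapshot until its timestamp passes `minimum_timestamp`.

```python
def apply_delta(book, delta):
    """Apply one price-level update; a size of 0 removes the level."""
    if delta["size"] == 0:
        book["levels"].pop(delta["price"], None)
    else:
        book["levels"][delta["price"]] = delta["size"]
    book["timestamp"] = delta["timestamp"]


def fetch_book(snapshots, minimum_timestamp):
    """Stand-in for polling the REST endpoint: keep taking snapshots until
    one is strictly newer than the delta that triggered the fetch."""
    for snap in snapshots:
        if snap["timestamp"] > minimum_timestamp:
            return {"timestamp": snap["timestamp"],
                    "levels": dict(snap["levels"])}
    raise RuntimeError("no snapshot newer than the first delta")


def sync(deltas, snapshots):
    book = None
    for delta in deltas:
        if book is None:
            book = fetch_book(snapshots, minimum_timestamp=delta["timestamp"])
        if delta["timestamp"] <= book["timestamp"]:
            continue  # already reflected in the snapshot
        apply_delta(book, delta)
    return book


# The first poll returns a stale snapshot (t=100); the second (t=107) already
# contains the effect of the t=105 delta.
snapshots = iter([
    {"timestamp": 100, "levels": {100.0: 1.0}},
    {"timestamp": 107, "levels": {100.0: 1.0, 101.0: 2.0}},
])
deltas = [
    {"timestamp": 105, "price": 101.0, "size": 2.0},
    {"timestamp": 110, "price": 102.0, "size": 3.0},
]
book = sync(deltas, snapshots)
print(book)
```

The t=105 delta is dropped because the accepted snapshot already includes it, and the t=110 delta is applied on top.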

This approach is 100% sure to be consistent regardless of straggler deltas, but it needs at least one delta to trigger fetching the book. Low-volume markets can go a long time without any order activity, in which case the book would never be fetched. Thus, a second way is the following:

  1. subscribe,
  2. put all deltas onto a queue,
  3. wait, fetch the snapshot, then wait again,
  4. start handling the queue.

In pseudo-code:

q = queue()
book = None  # set only once the snapshot and both safety waits are done

def task_consume():
  subscribe()
  while True:
    delta = recv()
    q.put(delta)  # only queue here, never apply

def task_apply():
  global book
  while book is None:
    sleep(1)
  while True:
    delta = q.get()
    if delta['timestamp'] <= book['timestamp']:
      continue  # already reflected in the snapshot
    book += delta

create_task(task_consume())
create_task(task_apply())

sleep(safety) # allow any deltas to arrive before fetch
temp_book = fetch_book()
sleep(safety) # allow any stragglers to arrive after fetch
book = temp_book

This approach queues up the deltas and waits for a safety period both before and after fetching the book, and only then starts applying the deltas. This differs from your current approach, which does not queue deltas initially. It's virtually perfect, because there are physical limits to straggler delay.
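
A minimal asyncio sketch of the queued variant, assuming a simulated feed and a fake `fetch_book` so it runs standalone. `SAFETY`, the dict-based delta format and the `None` end-of-feed sentinel are made up for illustration, and I use an `asyncio.Event` instead of the `sleep(1)` polling loop to signal that the book is ready.

```python
import asyncio

SAFETY = 0.05  # hypothetical safety window in seconds; tune per exchange


def apply_delta(book, delta):
    """Apply one price-level update; a size of 0 removes the level."""
    if delta["size"] == 0:
        book["levels"].pop(delta["price"], None)
    else:
        book["levels"][delta["price"]] = delta["size"]
    book["timestamp"] = delta["timestamp"]


async def task_consume(q, deltas):
    """Simulated websocket reader: only queues deltas, never applies them."""
    for delta in deltas:
        await asyncio.sleep(0.01)
        q.put_nowait(delta)
    q.put_nowait(None)  # sentinel so the demo terminates


async def fetch_book():
    """Simulated REST snapshot taken at exchange time t=102."""
    await asyncio.sleep(0.01)
    return {"timestamp": 102, "levels": {100.0: 1.0}}


async def task_apply(q, state):
    await state["ready"].wait()  # nothing is applied before the book is set
    book = state["book"]
    while True:
        delta = await q.get()
        if delta is None:
            break
        if delta["timestamp"] <= book["timestamp"]:
            continue  # already reflected in the snapshot
        apply_delta(book, delta)


async def main():
    q = asyncio.Queue()
    state = {"book": None, "ready": asyncio.Event()}
    deltas = [
        {"timestamp": 101, "price": 100.0, "size": 1.0},  # older than snapshot
        {"timestamp": 103, "price": 101.0, "size": 2.0},
        {"timestamp": 104, "price": 100.0, "size": 0},    # removes a level
    ]
    consumer = asyncio.create_task(task_consume(q, deltas))
    applier = asyncio.create_task(task_apply(q, state))

    await asyncio.sleep(SAFETY)   # allow any deltas to arrive before fetch
    temp_book = await fetch_book()
    await asyncio.sleep(SAFETY)   # allow any stragglers to arrive after fetch
    state["book"] = temp_book
    state["ready"].set()

    await asyncio.gather(consumer, applier)
    return state["book"]


book = asyncio.run(main())
print(book)
```

Because every delta sits in the queue until the book is published, the result does not depend on whether a delta arrived before, during or after the fetch.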
