Edit on GitHub

Move Q and N to eges

Idea

For all children nodes, have a copy of N and Q values in a parent node. Those values would be called edge N and edge Q as opposed to node N and node Q that are stored in the node.

Before:
Before

After:
After

Actually there’s no need to store node N and node Q as they can be computed from edge N and edge Q, but it may have performance impact (although likely not!), so I left them for now.

N-in-flight is not shown because it won’t be needed.

Why?

  • All data needed for visit routing is located within node data structure, improving data locality for better cache utilization. That should help with node performance. Currently, for a node to route a visit, algorithm has to access N and Q values of all child nodes, and they are located in remote places of memory.

  • It’s much easier to parallelize tree traversal and make it scalable when you only have to lock a single node (parallelization ideas are built on that).

  • It allows node eviction (e.g. killing arbitrary nodes from memory):

    • If particular node is deleted from memory, edge leading to that node still keeps valid Q and N. It means that it’s possible to detect evicted node and for example make it a temporary second “root” for it to catch up if it’s needed again.
    • If edge N is smaller than node N it points to, it means that child node had more visits than parent (e.g. because parent was evicted). In that case no eval is needed and “V” can be propagated to a parent out-of-order.
  • It allows transposition support:
    Similarly to “parent is evicted” case, when a node is hit through alternative transposition, child’s node N happens to be larger than parent’s edge N. In that case out-of-order V propagation can be done.

  • Finally, N and Q are properties of edge, not node. It’s just when our graph was tree, there was no difference where to store them.

Thoughts

  • As there is data duplication, more memory is needed per node. However, hopefully those changes will be offset by
    • Compressed node encoding (probably it will be even smaller than now).
    • Being able to evict cold nodes to free up RAM.
    • Being able to offload nodes to external storage (database etc.)
    • Support of transpositions will make Lc0 have more “value” per node, although it’s orthogonal.
Last Updated: 2020-02-02