
Cori KNL Processor Modes

The Xeon-Phi "Knights-Landing" 7250 processors in Cori have 68 CPU cores, which are organized into 34 "tiles" (each tile comprising two CPU cores and a shared 1MB L2 cache). The tiles are placed in a 2D mesh and connected via an on-chip interconnect, as shown in the following figure:

KNLOverview

As shown in the figure, the KNL processor has 6 DDR channels, with controllers to the right and left of the mesh, and 8 MCDRAM channels, with controllers spread across the 4 "corners" of the mesh.
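The counts above can be cross-checked with a few lines of arithmetic. The following is only an illustrative sketch: the constant names are made up for this example, and the values are the ones quoted in the text.

```python
# Figures quoted in the text for the Cori KNL 7250 processor.
# The constant names are illustrative only, not part of any tool or API.
TILES = 34
CORES_PER_TILE = 2
L2_MB_PER_TILE = 1
DDR_CHANNELS = 6
MCDRAM_CHANNELS = 8

total_cores = TILES * CORES_PER_TILE    # cores across the whole mesh
total_l2_mb = TILES * L2_MB_PER_TILE    # aggregate L2; each 1MB slice is shared by one tile only

print(f"cores: {total_cores}, aggregate L2: {total_l2_mb} MB")
print(f"memory channels: {DDR_CHANNELS} DDR + {MCDRAM_CHANNELS} MCDRAM")
```

Note that the aggregate L2 figure is a sum of per-tile caches, not a single shared cache.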

NUMA on KNL

A KNL processor maintains cache coherency with a set of tag directories distributed across the tiles such that any memory address corresponds to the tag directory cache on a particular tile. KNL supports several modes of memory access organization, which are well-described in this article.

The Cori KNL nodes are in "quadrant" mode, in which the chip is divided into four quadrants, and the tag directories in a quadrant map to memory accessed via a memory controller in that quadrant. In quadrant mode, the whole chip is presented as a single NUMA domain. The diagram below illustrates how a cache miss on one tile is resolved in quadrant mode.

cluster-mode-quadrant

MCDRAM Memory Options on KNL

There is no shared L3 cache on the KNL processor. However, the 16 GB of MCDRAM (spread over 8 channels) can be configured either as a direct-mapped cache or as addressable memory. On Cori KNL nodes, the MCDRAM is configured as a direct-mapped cache.

In this configuration, recently accessed data is automatically cached in MCDRAM, similarly to an L3 cache on a Xeon processor. However, there are some notable differences:

  • The cache (16GB) is significantly larger than a typical L3 cache on a Xeon processor (usually in the tens of MB).
  • The cache is direct-mapped, meaning it is non-associative: each cache line's worth of data in DRAM has exactly one location where it can be cached in MCDRAM. This can lead to conflicts for apps with working sets greater than 16GB.
  • Data is not prefetched into the MCDRAM cache.
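
The direct-mapped behaviour described in the list above can be sketched with a toy model. This is a simplified illustration, not the actual hardware mapping: it assumes a 64-byte cache line, treats 16GB as 16 GiB, and assumes the slot is chosen by a plain modulo of the line number. The function names `cache_slot` and `conflicts` are hypothetical.

```python
LINE_BYTES = 64                  # assumed cache-line size in bytes
CACHE_BYTES = 16 * 2**30         # MCDRAM cache size, taking 16GB as 16 GiB
NUM_SLOTS = CACHE_BYTES // LINE_BYTES


def cache_slot(phys_addr: int) -> int:
    """Return the one slot an address can occupy in this toy direct-mapped model."""
    # Assumption: simple modulo indexing; real hardware may map addresses differently.
    return (phys_addr // LINE_BYTES) % NUM_SLOTS


def conflicts(addr_a: int, addr_b: int) -> bool:
    """True when two addresses in different cache lines compete for the same slot."""
    different_lines = addr_a // LINE_BYTES != addr_b // LINE_BYTES
    return different_lines and cache_slot(addr_a) == cache_slot(addr_b)


a = 0x1000
far = a + CACHE_BYTES      # exactly one cache size away: maps to the same slot
near = a + LINE_BYTES      # the next cache line: maps to the neighbouring slot

print(conflicts(a, far))   # True
print(conflicts(a, near))  # False
```

In this model, two lines whose addresses differ by a multiple of the cache size evict each other, because there is no second "way" to hold the other line. That is why a working set larger than the cache can cause repeated evictions.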