Method to reduce register access latency in split-die SoC designs

A register-access technique for split-die SoCs, applied in concurrent instruction execution, instruments, computing, and related fields. It addresses three problems: increased effective per-die cost, difficult and costly SoC redesign, and design-process complexity.

Pending Publication Date: 2022-04-07
INTEL CORP

AI Technical Summary

Benefits of technology

The patent describes a method to reduce register access latency in split-die SoC designs, which can be used to offer variants within a processor family at a lower cost. Such designs use smaller dies called "dielets" that are integrated into the SoC using a fabrication method called silicon-interconnect fabric. The split introduces new challenges, chiefly longer transaction latencies when a transaction must cross one or more Embedded Multi-Die Interconnect Bridges (EMIBs) between dielets. The patent addresses this by forwarding register accesses along datapaths that avoid EMIB crossings, with the aim of improving performance while keeping the cost advantages of split-die SoCs.

Problems solved by technology

The dies for large monolithic SoCs require very expensive manufacturing equipment and incur high design costs, and the effective per-die cost is increased by reduced yields that statistically result as a function of transistor and core counts.
It is also difficult and costly to redesign SoCs, as changes also have to be made to the manufacturing processes and equipment associated with such redesigns.
While split-die SoCs provide advantages, working with them presents new challenges that are not present with single-die SoCs.
For example, split-die SoC designs suffer from longer transaction latencies when one or more Embedded Multi-Die Interconnect Bridge (EMIB) crossings are required to complete a transaction.
This is an industry-wide problem inherent to current split-die SoC designs, as well as to server CPUs that contain multiple dielets within a CPU package interconnected by EMIBs or an equivalent interface.
In particular, Non-Coherent (NC) transactions such as Configuration Space Register (CSR) reads and writes in such designs suffer larger latency penalties than coherent transaction types.
Hence, when executing code dominated by NC CSR transactions, the aggregate latency penalty increases significantly.
For example, for one recent SoC under development, the DDR5 training algorithms require more than 1 billion CSR accesses per socket, which increases memory training time by 300%-400% compared to prior processor generations and adversely affects overall platform cold boot time.
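
To make the scale of the problem concrete, the aggregate penalty can be sketched as the access count multiplied by the per-access latency. Only the figure of 1 billion CSR accesses comes from the text; the base latency, the per-crossing penalty, and the function name are hypothetical placeholders chosen for illustration, and the sketch assumes the accesses are fully serialized.

```python
def aggregate_latency_s(accesses: int, base_ns: float,
                        crossings: int, penalty_ns: float) -> float:
    """Total time in seconds for `accesses` serialized CSR transactions.

    Each transaction costs a base latency plus a fixed penalty for
    every EMIB crossing on its path (a simplifying assumption).
    """
    per_access_ns = base_ns + crossings * penalty_ns
    return accesses * per_access_ns / 1e9


CSR_ACCESSES = 1_000_000_000  # from the text: 1 billion+ accesses per socket
BASE_NS = 100.0               # hypothetical latency with no EMIB crossing
PENALTY_NS = 50.0             # hypothetical extra latency per EMIB crossing

no_crossing = aggregate_latency_s(CSR_ACCESSES, BASE_NS, 0, PENALTY_NS)
two_crossings = aggregate_latency_s(CSR_ACCESSES, BASE_NS, 2, PENALTY_NS)

print(f"no EMIB crossing:   {no_crossing:.0f} s")
print(f"two EMIB crossings: {two_crossings:.0f} s")
```

With these placeholder numbers, even tens of nanoseconds per crossing add up to minutes across a billion accesses, which is why avoiding crossings for CSR traffic matters for boot time.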

Embodiment Construction

[0014] Embodiments of methods to reduce register access latency in Split-Die SoC designs and associated apparatus are described herein. In the following description, numerous specific details are set forth to provide a thorough understanding of embodiments of the invention. One skilled in the relevant art will recognize, however, that the invention can be practiced without one or more of the specific details, or with other methods, components, materials, etc. In other instances, well-known structures, materials, or operations are not shown or described in detail to avoid obscuring aspects of the invention.

[0015] Reference throughout this specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, the appearances of the phrases “in one embodiment” or “in an embodiment” in various places throughout this specification ar...

Abstract

Methods and apparatus to reduce register access latency in split-die SoC designs. The method is implemented on a platform including a legacy socket and one or more non-legacy (NL) sockets comprising split-die Systems-on-Chip (SoCs) that include multiple dielets interconnected with a plurality of Embedded Multi-Die Interconnect Bridges (EMIBs). The dielets include core dielets having cores, cache controllers, and memory controllers. The method provides an affinity between a control and status register (CSR) memory range and the NL sockets, such that CSRs in the memory controllers for multiple core dielets are programmed using transactions forwarded along core-to-cache-controller datapaths that avoid crossing EMIBs. In one aspect, a transient map of address ranges is created that includes a respective Sub-NUMA Cluster (SNC) range allocated for each of the NL sockets, with a range of CSR addresses for accessing CSRs in the memory controllers for the NL sockets being stored in the respective SNC ranges.
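
The idea of a transient address-range map can be illustrated with a small lookup table. This is a minimal sketch under stated assumptions, not the patent's data structure: the `AddressRange` type, the function names, the contiguous layout, the base address, and the range size are all hypothetical, and socket 0 is assumed to be the legacy socket.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AddressRange:
    name: str
    socket: int   # NL socket that owns this range (hypothetical numbering)
    base: int     # inclusive start address
    limit: int    # exclusive end address

    def contains(self, addr: int) -> bool:
        return self.base <= addr < self.limit


def build_transient_map(num_nl_sockets: int, snc_base: int,
                        snc_size: int) -> List[AddressRange]:
    """Allocate one SNC range per NL socket, laid out back to back
    (the contiguous layout is an assumption for illustration)."""
    return [
        AddressRange(f"SNC_NL{i}", i,
                     snc_base + (i - 1) * snc_size,
                     snc_base + i * snc_size)
        for i in range(1, num_nl_sockets + 1)
    ]


def owning_socket(ranges: List[AddressRange], addr: int) -> Optional[int]:
    """Return the NL socket whose SNC range holds `addr`, or None."""
    for r in ranges:
        if r.contains(addr):
            return r.socket
    return None


# Hypothetical platform: two NL sockets, 256 MiB SNC range each.
snc_map = build_transient_map(2, 0x8000_0000, 0x1000_0000)
print(owning_socket(snc_map, 0x8000_1000))  # address inside the first range
print(owning_socket(snc_map, 0x9000_0000))  # first address of the second range
print(owning_socket(snc_map, 0x7000_0000))  # below every range
```

Because each CSR address resolves to exactly one socket's range, a CSR access can be steered to resources local to that socket, which is the affinity the abstract describes. How the real design forwards the transaction within a dielet is not covered by this excerpt.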

Description

BACKGROUND INFORMATION

[0001] Historically, central processing units (CPUs), also known as processors, have employed a monolithic die design. In early generations, operations such as memory access and Input/Output (IO) access were separated from the CPU using a chipset, such as a Northbridge-Southbridge chipset. As CPU designs evolved, more of this functionality was added to the CPU using a System-on-Chip (SoC) design.

[0002] As core counts continue to scale and integrated circuit technology advancements produce finer-grained features, the transistor counts on a single die have reached tens of billions. However, the dies for these SoCs require very expensive manufacturing equipment and design costs, and the effective per-die cost is increased by reduced yields that statistically result as a function of transistor and core counts. Also, chip yield drops roughly exponentially as the chip area grows. It is also difficult and costly to redesign SoCs, as changes also have to be made to manufacturing...
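
The remark that yield drops roughly exponentially with area can be illustrated with the simple Poisson yield model, Y = exp(-A * D0). The model choice, the defect density, and the die areas below are assumptions for illustration only; the patent excerpt gives no figures.

```python
import math


def poisson_yield(area_cm2: float, defects_per_cm2: float) -> float:
    """Fraction of defect-free dies under the Poisson yield model."""
    return math.exp(-area_cm2 * defects_per_cm2)


D0 = 0.1             # hypothetical defect density, defects per cm^2
MONO_AREA = 8.0      # hypothetical monolithic die area, cm^2
DIELET_AREA = 2.0    # hypothetical dielet area (four dielets per SoC)

mono = poisson_yield(MONO_AREA, D0)
dielet = poisson_yield(DIELET_AREA, D0)

# Silicon area consumed per good unit, assuming dielets are tested
# before assembly so that only known-good dielets are packaged.
mono_area_per_good = MONO_AREA / mono
split_area_per_good = 4 * DIELET_AREA / dielet

print(f"monolithic yield: {mono:.3f}, dielet yield: {dielet:.3f}")
print(f"area per good SoC: {mono_area_per_good:.1f} vs "
      f"{split_area_per_good:.1f} cm^2")
```

Under these assumptions, each small dielet yields much better than one large die, so less silicon is wasted per working SoC. Packaging and interconnect costs, which this sketch ignores, offset part of that saving.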

Application Information

Patent Type & Authority: Applications (United States)
IPC(8): G06F9/30, G06F9/38
CPC: G06F9/30101, G06F9/3863, G06F13/404, G06F13/1652, G06F13/161
Inventors: ENAMANDRAM, ANAND K.; NALLUSAMY, ESWARAMOORTHI; KRITHIVAS, RAMAMURTHY; LIN, CHENG-WEIN; JOHANSEN, IRENE
Owner: INTEL CORP