GPU resource scheduling method

A resource scheduling technology applied in the field of GPU resource scheduling. It addresses the problem that GPU resources cannot be fully utilized, and achieves improved utilization efficiency, improved performance, and reduced costs.

Pending Publication Date: 2020-10-20

贝式计算(天津)信息技术有限公司

AI Technical Summary

Benefits of technology

This technology allows a single GPU to be shared among multiple applications in a Kubernetes cluster according to GPU video memory and a percentage of GPU computing power, rather than dedicating a whole GPU to each application. For applications that need multiple GPUs, it allocates the GPU group with the highest communication efficiency. Together these greatly improve the utilization efficiency of each GPU and reduce the cost of running GPU applications.

Problems solved by technology

The default GPU resource manager cannot allocate only the GPU resources that an application requires. It locks an entire GPU and assigns it to a single application, so GPU resources cannot be fully utilized.


Examples

Embodiment 1

[0040] As shown in Figures 1 and 2, for an application that needs only one GPU, resources are allocated according to the required GPU memory and the required number of cores, instead of allocating a complete GPU to the application. The default GPU resource manager does not support allocating only the resources an application requires; it locks the entire GPU and allocates it to the requesting application.
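The difference between locking a whole GPU and reserving only the requested share can be illustrated with a minimal sketch. It is not the patent's implementation: the `Gpu` class, its field names, and the capacity figures are all hypothetical and chosen only to show the bookkeeping.

```python
from dataclasses import dataclass, field


@dataclass
class Gpu:
    """Hypothetical bookkeeping record for one physical GPU."""
    name: str
    memory_mib: int            # total video memory
    cores: int                 # total cores available for sharing
    used_memory_mib: int = 0
    used_cores: int = 0
    tenants: list = field(default_factory=list)

    def can_fit(self, memory_mib: int, cores: int) -> bool:
        # A request fits only if both memory and cores are still available.
        return (self.used_memory_mib + memory_mib <= self.memory_mib
                and self.used_cores + cores <= self.cores)

    def allocate(self, app: str, memory_mib: int, cores: int) -> bool:
        """Reserve only the requested share; return False if it does not fit."""
        if not self.can_fit(memory_mib, cores):
            return False
        self.used_memory_mib += memory_mib
        self.used_cores += cores
        self.tenants.append(app)
        return True


# Illustrative capacities, not taken from the patent.
gpu = Gpu(name="gpu-0", memory_mib=16384, cores=3584)
first = gpu.allocate("app-a", memory_mib=4096, cores=1024)
second = gpu.allocate("app-b", memory_mib=8192, cores=2048)
third = gpu.allocate("app-c", memory_mib=8192, cores=256)  # not enough memory left
```

In this sketch two applications share the same GPU, and the third request is rejected because the remaining memory is insufficient. Under whole-GPU locking, the second application would already have needed a separate device.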

[0041] A method for scheduling GPU resources, comprising steps:

[0042] S1. First, collect the basic GPU information from the cluster and expose it through the gpu-usages interface, then proceed to step S2. The basic information collected in step S1 includes the GPU model, the video memory, and the GPU cores, so that the scheduler can conveniently obtain the cluster's GPU resource information.

[0043] S2. Create a GPU application and send an application request to the Kubernetes scheduler, then proceed to step S3; in step S2, during

Embodiment 2

[0050] As shown in Figures 1, 3, 4, 5, and 6, applications that require multiple GPUs are allocated the GPU group with the highest communication efficiency. GPUs within a machine are connected in different ways, so the communication speed between GPU pairs also differs. As shown in Figure 3, a DGX-1 machine contains 8 GPUs. GPU0 connects directly to GPU1, GPU2, GPU3, and GPU4 through NVLink, with a communication bandwidth of up to 40 GB/s. The connection between GPU0 and GPU5, GPU6, or GPU7 must go through the PCIe switch and QPI, which greatly reduces communication efficiency compared with NVLink. When allocating multiple GPUs to an application, the connection structure between the allocated GPUs, also referred to as the GPU topology, should therefore be considered. The topology structure between the GPUs can be obtained through the GPU driver, and the communication efficiency between the GPUs can be connec
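One way to picture topology-aware allocation is to score every candidate group of free GPUs by its slowest pairwise link and pick the best-scoring group. The sketch below is an assumption-laden illustration, not the patent's algorithm: only the GPU0 NVLink links and the 40 GB/s figure come from the text above, while the other NVLink pairs, the `SLOW_SCORE` value, and the min-link scoring rule are hypothetical.

```python
from itertools import combinations

NVLINK_SCORE = 40  # GB/s, the NVLink figure quoted in the text
SLOW_SCORE = 10    # assumed placeholder for paths through the PCIe switch and QPI

# Assumed NVLink pairs. Only the GPU0 links come from the text;
# the remaining pairs are illustrative.
NVLINK_PAIRS = {
    frozenset(pair)
    for pair in [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (1, 3), (2, 3),
                 (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7)]
}


def link_score(a, b):
    """Return the score of the link between two GPUs."""
    return NVLINK_SCORE if frozenset((a, b)) in NVLINK_PAIRS else SLOW_SCORE


def group_score(group):
    """Score a group by its slowest pairwise link (a single GPU scores as NVLink)."""
    return min((link_score(a, b) for a, b in combinations(group, 2)),
               default=NVLINK_SCORE)


def best_group(free_gpus, count):
    """Pick `count` free GPUs whose slowest link is as fast as possible."""
    candidates = list(combinations(sorted(free_gpus), count))
    if not candidates:
        return None  # not enough free GPUs
    return max(candidates, key=group_score)


all_free = best_group(range(8), 4)
partly_busy = best_group({0, 5, 6, 7}, 3)
```

Scoring by the slowest link is only one possible design choice; summing or averaging pairwise scores would favor different groups when no fully NVLink-connected group is free.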

Embodiment 3

[0063] As shown in Figures 1, 2, 3, 4, 5, and 6, applications that require multiple GPUs are allocated the GPU group with the highest communication efficiency. GPUs within a machine are connected in different ways, so the communication speed between GPU pairs also differs. As shown in Figure 3, a DGX-1 machine contains 8 GPUs. GPU0 connects directly to GPU1, GPU2, GPU3, and GPU4 through NVLink, with a communication bandwidth of up to 40 GB/s. The connection between GPU0 and GPU5, GPU6, or GPU7 must go through the PCIe switch and QPI, which greatly reduces communication efficiency compared with NVLink. When allocating multiple GPUs to an application, the connection structure between the allocated GPUs, also referred to as the GPU topology, should therefore be considered. The topology structure between the GPUs can be obtained through the GPU driver, and the communication efficiency between the GPUs can be con

Abstract

The invention relates to the technical field of communication applications and discloses a GPU resource scheduling method comprising the following steps: S1, collecting basic GPU information from the cluster, providing a gpu-usages interface, and proceeding to step S2; S2, creating a GPU application, sending an application request to the Kubernetes scheduler, and proceeding to step S3; S3, after the Kubernetes scheduler receives the application request, traversing all GPU applications in the cluster, and proceeding to step S4; S4, calculating, through the gpu-usages interface, the GPU that meets the application's scheduling requirement, and proceeding to step S5; and S5, binding, by the GPU manager on the machine where the selected GPU is located, the specified GPU resources to the application. A single GPU can thus be shared among multiple applications according to GPU video memory and a percentage of GPU computing power, which greatly improves the utilization efficiency of a single GPU and reduces the cost of GPU applications.
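Steps S1 to S5 can be walked through with a small in-memory simulation. This is a sketch under stated assumptions rather than the disclosed implementation: the record layout standing in for the gpu-usages interface, the function names, the sample values, and the first-fit selection policy are all hypothetical, and no real Kubernetes API is called.

```python
# S1: hypothetical snapshot of what a gpu-usages style interface might return.
GPU_USAGES = [
    {"node": "node-a", "gpu": 0, "model": "P100", "memory_mib": 16384},
    {"node": "node-a", "gpu": 1, "model": "P100", "memory_mib": 16384},
    {"node": "node-b", "gpu": 0, "model": "P100", "memory_mib": 16384},
]

# Applications already running, as (node, gpu index, memory MiB, compute percent).
RUNNING_APPS = [
    ("node-a", 0, 12288, 80),
    ("node-a", 1, 4096, 50),
]


def free_capacity(gpu, running_apps):
    """S3: traverse the running applications to find what is left on one GPU."""
    key = (gpu["node"], gpu["gpu"])
    used_mem = sum(mem for node, idx, mem, _ in running_apps if (node, idx) == key)
    used_pct = sum(pct for node, idx, _, pct in running_apps if (node, idx) == key)
    return gpu["memory_mib"] - used_mem, 100 - used_pct


def schedule(request, usages, running_apps):
    """S4: return the first GPU that satisfies the request (assumed first-fit policy)."""
    for gpu in usages:
        free_mem, free_pct = free_capacity(gpu, running_apps)
        if (free_mem >= request["memory_mib"]
                and free_pct >= request["compute_percent"]):
            return gpu
    return None


def bind(request, gpu, running_apps):
    """S5: record the binding on the node that hosts the chosen GPU."""
    running_apps.append((gpu["node"], gpu["gpu"],
                         request["memory_mib"], request["compute_percent"]))
    return {"app": request["name"], "node": gpu["node"], "gpu": gpu["gpu"]}


# S2: a GPU application request asking for part of a GPU.
request = {"name": "train-job", "memory_mib": 8192, "compute_percent": 40}
chosen = schedule(request, GPU_USAGES, RUNNING_APPS)
binding = bind(request, chosen, RUNNING_APPS) if chosen else None
```

Here the request lands on the second GPU of `node-a`, which still has enough memory and compute headroom, instead of taking the idle GPU on `node-b`. A real scheduler would likely rank the feasible GPUs rather than take the first fit.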

Application Information

Owner 贝式计算(天津)信息技术有限公司
