Disaggregated computing for distributed confidential computing environment
Abstract
An apparatus to facilitate disaggregated computing for a distributed confidential computing environment is disclosed. The apparatus includes one or more processors to facilitate receiving a manifest corresponding to graph nodes representing regions of memory of a remote client machine, the graph nodes corresponding to a command buffer and to associated data structures and kernels of the command buffer used to initialize a hardware accelerator and execute the kernels, and the manifest indicating a destination memory location of each of the graph nodes and dependencies of each of the graph nodes; identifying, based on the manifest, the command buffer and the associated data structures to copy to the host memory; identifying, based on the manifest, the kernels to copy to local memory of the hardware accelerator; and patching addresses in the command buffer copied to the host memory with updated addresses of corresponding locations in the host memory.
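As a rough illustration of the manifest described above, the following Python sketch models each graph node with the fields the claims list (description, identifier, source address, size, destination, dependencies). It then splits the nodes into those bound for host memory and those bound for accelerator-local memory. The names GraphNode and plan_copies and the "host"/"accelerator" destination labels are hypothetical and do not come from the patent. The topological sort is only one plausible way to honor the declared dependencies, not the disclosed method.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass
class GraphNode:
    # Field names mirror the manifest entries named in the claims;
    # the concrete types are assumptions made for this sketch.
    node_id: str
    description: str
    source_address: int   # address in the remote client's memory
    size: int             # region size in bytes
    destination: str      # "host" or "accelerator" (hypothetical labels)
    dependencies: list = field(default_factory=list)


def plan_copies(manifest):
    """Return (host_ids, accelerator_ids), each listed in dependency order."""
    by_id = {node.node_id: node for node in manifest}
    # Map each node to the set of nodes it depends on.
    graph = {node.node_id: set(node.dependencies) for node in manifest}
    # static_order() yields every node after all of its dependencies.
    order = list(TopologicalSorter(graph).static_order())
    host = [i for i in order if by_id[i].destination == "host"]
    accel = [i for i in order if by_id[i].destination == "accelerator"]
    return host, accel


# Example manifest: a command buffer that depends on a descriptor heap
# (copied to host memory) and a kernel (copied to accelerator memory).
manifest = [
    GraphNode("cmd0", "command buffer", 0x1000, 256, "host", ["heap0", "k0"]),
    GraphNode("heap0", "descriptor heap", 0x2000, 128, "host"),
    GraphNode("k0", "kernel", 0x3000, 512, "accelerator"),
]
host_ids, accel_ids = plan_copies(manifest)
```

In this example the descriptor heap is scheduled before the command buffer that references it, and the kernel is routed to the accelerator's local memory.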
Claims
What is claimed is:
1. An apparatus comprising:
a host memory;
a hardware accelerator; and
one or more processors communicably coupled to the host memory and the hardware accelerator, the one or more processors to facilitate:
receiving a manifest corresponding to graph nodes representing regions of memory of a remote client machine, the graph nodes corresponding to at least one command buffer and to associated data structures and kernels of the at least one command buffer used to initialize the hardware accelerator and execute the kernels, and the manifest indicating a destination memory location of each of the graph nodes and dependencies of each of the graph nodes;
identifying, based on the manifest, the at least one command buffer and the associated data structures to copy to the host memory;
identifying, based on the manifest, the kernels to copy to local memory of the hardware accelerator; and
patching addresses in the at least one command buffer copied to the host memory with updated addresses of corresponding locations in the host memory.
2. The apparatus of claim 1, wherein the manifest comprises a data structure storing at least one of a description, an identifier, a source address, a size, a destination, or a dependency for each of the graph nodes.
3. The apparatus of claim 1, wherein the manifest is received from the remote client machine, and wherein the at least one command buffer referenced by the manifest comprises commands to initialize an environment inside the hardware accelerator and execute the kernels.
4. The apparatus of claim 1, wherein the hardware accelerator comprises a graphics processing unit (GPU).
5. The apparatus of claim 1, wherein the remote client machine comprises userspace components of an accelerator stack of the hardware accelerator, and wherein a remainder of the accelerator stack executes on the apparatus.
6. The apparatus of claim 5, wherein a middleware component is to expose an abstraction of the apparatus to the userspace components of the accelerator stack on the remote client machine, and is to mediate transfer of data between the remote client machine and the hardware accelerator.
7. The apparatus of claim 1, wherein the associated data structures comprise one or more descriptor heaps.
8. The apparatus of claim 1, wherein patching the addresses comprises:
identifying the addresses in the at least one command buffer; and
identifying the updated addresses of the corresponding locations in the host memory; and replacing the addresses with the updated addresses in the at least one command buffer copied to the host memory.
9. The apparatus of claim 1, wherein the one or more processors comprise one or more of a GPU, a central processing unit (CPU), or a hardware accelerator.
10. A method comprising:
receiving, by one or more processors communicably coupled to a host memory and a hardware accelerator, a manifest corresponding to graph nodes representing regions of memory of a remote client machine, the graph nodes corresponding to at least one command buffer and to associated data structures and kernels of the at least one command buffer used to initialize the hardware accelerator and execute the kernels, and the manifest indicating a destination memory location of each of the graph nodes and dependencies of each of the graph nodes;
identifying, by the one or more processors based on the manifest, the at least one command buffer and the associated data structures to copy to the host memory;
identifying, by the one or more processors based on the manifest, the kernels to copy to local memory of the hardware accelerator; and
patching, by the one or more processors, addresses in the at least one command buffer copied to the host memory with updated addresses of corresponding locations in the host memory.
11. The method of claim 10, wherein the manifest comprises a data structure storing at least one of a description, an identifier, a source address, a size, a destination, or a dependency for each of the graph nodes.
12. The method of claim 10, wherein the manifest is received from the remote client machine, and wherein the at least one command buffer referenced by the manifest comprises commands to initialize an environment inside the hardware accelerator and execute the kernels.
13. The method of claim 10, wherein the remote client machine comprises userspace components of an accelerator stack of the hardware accelerator, and wherein a remainder of the accelerator stack executes on the hardware accelerator.
14. The method of claim 13, wherein a middleware component is to expose an abstraction of the hardware accelerator to the userspace components of the accelerator stack on the remote client machine, and is to mediate transfer of data between the remote client machine and the hardware accelerator.
15. The method of claim 10, wherein the associated data structures comprise one or more descriptor heaps.
16. The method of claim 10, wherein patching the addresses comprises:
identifying the addresses in the at least one command buffer; and
identifying the updated addresses of the corresponding locations in the host memory; and replacing the addresses with the updated addresses in the at least one command buffer copied to the host memory.
17. A non-transitory machine readable storage medium comprising instructions that, when executed, cause at least one processor to at least:
receive, by the at least one processor communicably coupled to a host memory and a hardware accelerator, a manifest corresponding to graph nodes representing regions of memory of a remote client machine, the graph nodes corresponding to at least one command buffer and to associated data structures and kernels of the at least one command buffer used to initialize the hardware accelerator and execute the kernels, and the manifest indicating a destination memory location of each of the graph nodes and dependencies of each of the graph nodes;
identify, by the at least one processor based on the manifest, the at least one command buffer and the associated data structures to copy to the host memory;
identify, by the at least one processor based on the manifest, the kernels to copy to local memory of the hardware accelerator; and
patch, by the at least one processor, addresses in the at least one command buffer copied to the host memory with updated addresses of corresponding locations in the host memory.
18. The non-transitory machine readable storage medium of claim 17, wherein the manifest comprises a data structure storing at least one of a description, an identifier, a source address, a size, a destination, or a dependency for each of the graph nodes.
19. The non-transitory machine readable storage medium of claim 17, wherein the manifest is received from the remote client machine, and wherein the at least one command buffer referenced by the manifest comprises commands to initialize an environment inside the hardware accelerator and execute the kernels.
20. The non-transitory machine readable storage medium of claim 17, wherein patching the addresses comprises:
identifying the addresses in the at least one command buffer;
identifying the updated addresses of the corresponding locations in the host memory; and replacing the addresses with the updated addresses in the at least one command buffer copied to the host memory.
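The patching step recited in claims 8, 16, and 20 (identify the addresses, identify the updated addresses, replace them) might be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes addresses are stored as 64-bit little-endian values, that the byte offsets of those address fields within the command buffer are already known, and that each copied region is described by a hypothetical (old_base, size, new_base) relocation tuple. The function name patch_addresses is likewise invented for this example.

```python
import struct


def patch_addresses(cmd_buf, pointer_offsets, relocations):
    """Return a copy of cmd_buf with client addresses rewritten to host addresses.

    pointer_offsets: byte offsets of 64-bit address fields (assumed known).
    relocations: list of (old_base, size, new_base) tuples, one per copied region.
    """
    patched = bytearray(cmd_buf)
    for off in pointer_offsets:
        # Step 1: identify the address stored in the command buffer.
        (old_addr,) = struct.unpack_from("<Q", patched, off)
        for old_base, size, new_base in relocations:
            if old_base <= old_addr < old_base + size:
                # Step 2: identify the corresponding location in host memory.
                new_addr = new_base + (old_addr - old_base)
                # Step 3: replace the old address with the updated one.
                struct.pack_into("<Q", patched, off, new_addr)
                break
        else:
            # No copied region contains this address.
            raise ValueError(f"no relocation covers address {old_addr:#x}")
    return patched


# Example: a 16-byte command buffer whose second 8-byte word points
# 0x10 bytes into a region that originally started at 0x2000.
cmd = bytearray(16)
struct.pack_into("<Q", cmd, 8, 0x2010)
patched = patch_addresses(cmd, [8], [(0x2000, 128, 0x7F000000)])
new_addr = struct.unpack_from("<Q", patched, 8)[0]
```

Here the patched word keeps its offset within the region, so it resolves to 0x7F000010, while the original buffer is left unmodified.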