Backend Interoperability

Reinders, James; Ashbaugh, Ben; Brodman, James; Kinsner, Michael; Pennycook, John; Tian, Xinmin

doi:10.1007/978-1-4842-9691-2_20

Abstract

Chapter 20 describes backend interoperability, a SYCL feature that can be used to incrementally add SYCL to an application that is already using other data-parallel techniques or APIs, or to use other data-parallel APIs directly from our SYCL applications.

In this chapter we will learn about backend interoperability, a SYCL feature that can incrementally add SYCL to an application that is already using other data-parallel techniques or APIs.

We will also learn how expert programmers familiar with low-level APIs can use backend interoperability to “peek behind the curtain” and use underlying data-parallel APIs directly from SYCL programs. This provides direct access to API-specific features, when necessary, while otherwise retaining the portability and ease-of-use benefits of SYCL.

What Is Backend Interoperability?

So far in this book we have referred to SYCL programs running on SYCL devices, but in practice many SYCL implementations build upon lower-level APIs such as OpenCL, Level Zero, CUDA, or others to access the parallel hardware in a system. When a SYCL implementation is built upon a lower-level API, we refer to the target API as a SYCL backend. Figure 20-1 shows the relationship between SYCL backends, platforms, and devices. Most SYCL implementations can run SYCL programs on multiple SYCL backends simultaneously to utilize all the parallel hardware in a system.

Two blocks. The left block, labeled SYCL backend 1, contains SYCL platform A with SYCL devices X and Y and SYCL platform B with SYCL device Z. The right block, labeled SYCL backend 2, contains SYCL platform C with SYCL device W. — **Figure 20-1**

We can query the SYCL backends in a system by first querying the SYCL platforms and then querying the SYCL backend associated with each platform, as shown in Figure 20-2. The output from this program will depend on the number and type of SYCL devices in a system. If the same device is supported by different SYCL backends, it may enumerate as a SYCL device for each backend.

A program that prints each SYCL platform and its associated SYCL backend. The highlighted functions are main, get_platforms, and get_backend. An example output with seven lines of SYCL platforms is shown at the bottom. — **Figure 20-2**
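The hierarchy in Figure 20-1 can be modelled in a few lines of standard C++. This is a minimal sketch, not SYCL API code: `Backend`, `Platform`, and `backends_for` are hypothetical stand-ins invented for illustration. In a real program the same information would come from `sycl::platform::get_platforms()` and `get_backend()`. The sketch shows why one piece of hardware may be reported once per backend.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the set of SYCL backends in a system.
enum class Backend { opencl, level_zero };

// Hypothetical model of a SYCL platform: each platform belongs to exactly
// one backend and exposes one or more devices (named by their hardware).
struct Platform {
  std::string name;
  Backend backend;
  std::vector<std::string> devices;
};

// Collect every backend through which a given piece of hardware is visible.
std::set<Backend> backends_for(const std::vector<Platform>& platforms,
                               const std::string& hardware) {
  std::set<Backend> result;
  for (const auto& p : platforms) {
    for (const auto& d : p.devices) {
      if (d == hardware) result.insert(p.backend);
    }
  }
  return result;
}
```

If a GPU is exposed by both an OpenCL platform and a Level Zero platform, `backends_for` returns two backends for it, which corresponds to the device enumerating twice.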

The associated backend can be queried for most SYCL objects, not just for SYCL platforms. For example, we can also query the associated backend for a SYCL device, a SYCL context, or a SYCL queue.

Backend interoperability lets us use knowledge of the associated backend to interact with and manipulate underlying native backend objects that represent SYCL objects for the associated backend.

When Is Backend Interoperability Useful?

Many SYCL programmers will never need to use backend interoperability. In fact, using backend interoperability may be undesirable; backend interoperability will frequently either make a program more complex because it requires multiple code paths for multiple SYCL backends, or it will make a program less portable because it will restrict execution to devices with a single associated backend.

Still, backend interoperability is a useful tool to have in our toolbox to solve some specific problems. In this section we will explore several common use cases where backend interoperability is useful.

Backend Interoperability Is Like An Inline Assembler

A useful mental model for backend interoperability is that backend interoperability is to SYCL as inline assembler is to C++ host code: backend interoperability is not necessary for learning SYCL or being productive with SYCL, and backend interoperability is often undesirable because it increases complexity or decreases portability. Nevertheless, it is a useful tool to have in our toolbox to solve specific problems.

Adding SYCL to an Existing Codebase

The SYCL programs in this book are designed to teach specific SYCL concepts so they are intentionally straightforward and short. By contrast, most real-world software is large and complex, consisting of thousands or millions of lines of code, perhaps developed by many people over many years. Even if we wanted to do so, completely rewriting a large application to use SYCL may not be feasible.

One of the key benefits provided by backend interoperability is the ability to incrementally add SYCL to an existing codebase that is already using a low-level API, by creating SYCL objects from native backend objects for that API. For example, let’s say we have a large OpenCL application that creates an OpenCL context and OpenCL memory objects. Backend interoperability has templated functions like make_context and make_buffer which let us seamlessly create SYCL objects from these OpenCL objects. After creating SYCL objects from the OpenCL objects, they can be used by SYCL queues and SYCL kernels just like any other SYCL object, as shown in Figure 20-3.

A program that creates SYCL objects from native backend objects and uses them to create a queue and submit a kernel. The highlighted functions include make_context, make_device, make_buffer, submit, parallel_for, and wait. — **Figure 20-3**

The SYCL 2020 specification only defines interoperability with OpenCL backends, but SYCL implementations may provide interoperability with other backends via extensions. Figure 20-4 shows how SYCL objects may be created from Level Zero objects using the sycl_ext_oneapi_backend_level_zero extension.

Notice that the parameters that are passed to create the SYCL objects are slightly different for the Level Zero backend. This will generally be true for any supported backend interoperability because each backend may require different information to properly create the SYCL object. Otherwise, the same make_device, make_context, and make_buffer functions are used for both OpenCL and Level Zero backend interoperability.

Notice also that ownership is handled differently by each backend. For the OpenCL backend, the SYCL implementation uses the reference counting provided by OpenCL to manage the lifetimes of the native backend objects. For the Level Zero backend, the SYCL implementation must be explicitly told whether it should take ownership of the native backend object, or whether our application will keep ownership. If the SYCL implementation takes ownership of the native backend object, then the native backend object will be destroyed when the SYCL object is destroyed; otherwise, our application is responsible for freeing the native backend object directly.
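The two explicit ownership choices can be illustrated with a small model in standard C++. This is only a sketch under stated assumptions: `Ownership`, `NativeHandle`, and `Wrapper` are hypothetical stand-ins, not SYCL or Level Zero types, and the model ignores the fact that real SYCL objects can be copied and share state between copies.

```cpp
// Hypothetical stand-in for the choice we make when creating a SYCL object
// from a native backend object.
enum class Ownership { transfer, keep };

// A fake native backend object that records whether it has been destroyed.
struct NativeHandle {
  bool destroyed = false;
};

// Models a SYCL object that was created from a native backend object.
class Wrapper {
 public:
  Wrapper(NativeHandle* native, Ownership own) : native_(native), own_(own) {}

  // The native object is destroyed with the wrapper only if ownership
  // was transferred; otherwise the application must free it later.
  ~Wrapper() {
    if (own_ == Ownership::transfer) native_->destroyed = true;
  }

  Wrapper(const Wrapper&) = delete;
  Wrapper& operator=(const Wrapper&) = delete;

 private:
  NativeHandle* native_;
  Ownership own_;
};

// Returns true if the native object was destroyed along with the wrapper.
bool destroyed_with_wrapper(Ownership own) {
  NativeHandle handle;
  { Wrapper w(&handle, own); }  // the wrapper goes out of scope here
  return handle.destroyed;
}
```

With `Ownership::keep`, the handle outlives the wrapper, so the application remains responsible for releasing it through the backend API.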

Using Existing Libraries with SYCL

Backend interoperability can also be used to extract native backend objects from SYCL objects. This is useful when we want to use existing low-level libraries or other helper functions with our SYCL applications. There are two methods to do this. The first uses get_native free functions to get native backend objects from SYCL objects. The second uses a host_task and an interop_handle to get native backend objects from SYCL objects from within code that is scheduled by the SYCL runtime.

Getting Backend Objects with Free Functions

For example, let’s say we have an optimized OpenCL library that we would like to use with our SYCL application. We can call the backend interoperability get_native functions to get native OpenCL objects from our SYCL objects, which can then be used with the OpenCL library. For simplicity, the code in Figure 20-5 just performs a query and allocates some memory with the native OpenCL objects, but they could also be used to perform more complicated operations like creating command queues, compiling programs, and executing kernels.

A program that queries the device name from OpenCL, allocates some memory from OpenCL, and cleans up the OpenCL objects when done. The highlighted functions include get_native, clGetDeviceInfo, the OpenCL device name, clCreateBuffer, clReleaseDevice, clReleaseContext, and clReleaseMemObject. — **Figure 20-5**

The same get_native functions are also added for the Level Zero backend as part of the sycl_ext_oneapi_backend_level_zero extension, as shown in Figure 20-6.

A program that queries the device name from Level Zero, allocates some memory from Level Zero, and cleans up the Level Zero objects when done. The highlighted functions include get_native, zeDeviceGetProperties, zeMemAllocHost, and zeMemFree. — **Figure 20-6**

Getting Backend Objects via an Interop Handle

Using the get_native free functions is an effective way to get backend-specific objects for large sections of code that will use backend APIs directly. In many cases, though, we only want to perform a specific operation in the SYCL task graph using a backend API. In these cases, we can perform the backend-specific operation using a SYCL host_task with a special interop_handle parameter. The interop_handle represents the state of the SYCL runtime when the host task is invoked and provides access to native backend objects representing the SYCL queue, device, context, and any buffers that were captured for the host task.

Figure 20-7 shows how to use the interop_handle to get native OpenCL objects from a host_task that is scheduled by the SYCL runtime. For simplicity, this sample also only performs some queries using the native OpenCL objects, but real application code would commonly enqueue a kernel or call into a library using the native OpenCL objects. Because these operations are performed from a host task, they will be properly scheduled with any other operations in the SYCL queue.

A program that gets the OpenCL device from the interop handle, queries the device name from the OpenCL device, gets the OpenCL buffer from the interop handle, and queries the size of the OpenCL buffer. The functions include submit, host_task, clGetDeviceInfo, and clGetMemObjectInfo. — **Figure 20-7**

Notice that when getting native OpenCL objects for our accessor, the get_native_mem member function of the interop_handle returns a vector of cl_mem memory objects. This is a requirement in the SYCL 2020 specification, where the return type of member functions of the interop_handle must match the get_native free functions, but for the interop_handle usage we can simply use the first element of the vector.

As with the get_native free functions, similar functionality may also be provided for other SYCL backends via extensions. Figure 20-8 shows how to perform similar operations with the Level Zero backend using the sycl_ext_oneapi_backend_level_zero extension.

A program that gets the Level Zero device from the interop handle, queries the device name from Level Zero, gets the Level Zero context and memory allocation, and queries the size of the memory allocation. The functions include submit, host_task, zeDeviceGetProperties, and zeMemGetAddressRange. — **Figure 20-8**

Using Backend Interoperability for Kernels

This section describes how to use backend interoperability to compile kernels and manipulate kernel bundles. This is an area that was significantly redesigned in SYCL 2020 to increase robustness and to add the flexibility that is required to support different SYCL backends.

Earlier versions of SYCL supported two interoperability mechanisms for kernels. The first mechanism enabled creation of a kernel from an API-defined handle. The second enabled creation of a kernel from an API-defined source or intermediate representation, such as OpenCL C source or SPIR-V intermediate representation. These two mechanisms still exist in SYCL 2020, though the syntax for both mechanisms has been updated and now uses backend interoperability.

Interoperability with API-Defined Kernel Objects

With this form of interoperability, the kernel objects themselves are created using the low-level API and then imported into SYCL using backend interoperability. The code in Figure 20-9 shows how to get an OpenCL context from a SYCL context, how to create an OpenCL kernel using this OpenCL context, and then how to create and use a SYCL kernel from the OpenCL kernel object.

A program that gets the native OpenCL context from the SYCL context, creates an OpenCL kernel using this context, creates a SYCL kernel from the OpenCL kernel, uses the OpenCL kernel with a SYCL queue, and cleans up the OpenCL objects when done. — **Figure 20-9**

Because the SYCL compiler does not have visibility into a SYCL kernel that was created using the low-level API directly, any kernel arguments must explicitly be passed using the set_arg() or set_args() interface. Additionally, the SYCL runtime and the low-level API kernel must agree on a convention to pass objects as kernel arguments. This convention should be described as part of the backend interoperability specification. In this example, the accessor data_acc is passed as the global pointer kernel argument data.

The SYCL 2020 standard leaves the precise semantics of the set_arg() and set_args() interfaces to be defined by each SYCL backend specification. This allows flexibility, but it is another way in which the code we write using backend interoperability is likely to be specific to the backends we target.

Interoperability with Non-SYCL Source Languages

With this form of interoperability, the contents of the kernel are described as source code or as an intermediate representation that is not defined by SYCL. This form of interoperability allows reuse of kernel libraries written in other source languages or use of domain-specific languages (DSLs) that generate code in an intermediate representation.

Previous versions of SYCL included functions like build_with_source to directly create a SYCL program from an API-defined source language, but this functionality was removed in SYCL 2020. When a backend directly supports an API-defined source language, such as the OpenCL C kernel used by the OpenCL backend in Figure 20-9, this removal is not a problem, but what should we do if a backend does not directly support a specific source language?

Some SYCL implementations may provide an explicit online compiler to compile from a source language that cannot be used directly by a backend to a different format supported by a backend. Figure 20-10 shows how to use the experimental sycl_ext_intel_online_compiler extension to compile from OpenCL C source, which is not supported by the Level Zero backend, to SPIR-V intermediate representation, which is supported by the Level Zero backend. Using this method, a kernel can be used by any backend so long as it can be compiled by the online compiler into a format supported by the backend.

Caution, Experimental Extension!

The sycl_ext_intel_online_compiler extension is an experimental extension, so it is subject to change or removal! We have included it in this book because it provides functionality similar to the previous SYCL build_with_source function and because it is a convenient way to demonstrate how domain-specific languages may interface with SYCL backends to execute kernels.

A program that compiles OpenCL C kernel source to SPIR-V intermediate representation using the online compiler, gets the native Level Zero context and device, creates a Level Zero kernel using this context, creates a SYCL kernel from the Level Zero kernel, and uses it with a SYCL queue. — **Figure 20-10**

In this example, the kernel source string is represented as a C++ raw string literal in the same file as the SYCL host API calls, but there is no requirement that this is the case, and some applications may read the kernel source string from a file or even generate it just-in-time.

As before, because the SYCL compiler does not have visibility into a SYCL kernel written in an API-defined source language, any kernel arguments must explicitly be passed using the set_arg() or set_args() interface.

Backend Interoperability Hints and Tips

This section describes practical hints and tips to effectively use backend interoperability.

Choosing a Device for a Specific Backend

The first requirement to properly use backend interoperability is to choose a SYCL device associated with the required SYCL backend. There are several ways to accomplish this.

The first is to integrate the required SYCL backend into existing custom device selection logic, by querying the associated backend while scoring each device. If our application is already using custom device selection logic, this should be a straightforward addition. This mechanism is also portable because it uses only standard SYCL queries.
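The scoring idea can be sketched in standard C++ as follows. This is a model of the selection logic only, assuming hypothetical `Backend` and `DeviceDesc` types in place of real SYCL devices. In a real selector the backend would come from the device's `get_backend()` query. The sketch follows the SYCL 2020 convention that the highest score wins and a negative score means a device is never selected.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for a SYCL backend and a SYCL device.
enum class Backend { opencl, level_zero, other };

struct DeviceDesc {
  std::string name;
  Backend backend;
  bool is_gpu;
};

// Score one device: reject devices from the wrong backend with a negative
// score, then apply whatever preferences the application already had
// (here, an assumed preference for GPUs).
int score_device(const DeviceDesc& d, Backend required) {
  if (d.backend != required) return -1;
  return d.is_gpu ? 2 : 1;
}

// Return the index of the highest-scoring device, or -1 if every device
// was rejected.
int select_device(const std::vector<DeviceDesc>& devices, Backend required) {
  int best = -1;
  int best_score = -1;
  for (int i = 0; i < static_cast<int>(devices.size()); ++i) {
    const int s = score_device(devices[i], required);
    if (s > best_score) {
      best = i;
      best_score = s;
    }
  }
  return best;
}
```

Putting the backend check first keeps it independent of the rest of the scoring, so existing preferences only ever rank devices that already satisfy the backend requirement.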

For applications that do not already use custom device selection logic, we can write a short C++ lambda expression to iterate over all devices to find a device with the requested backend, as shown in Figure 20-11. Because this version of find_device does not request a specific device type, it is effectively a replacement for the standard default_selector_v.

A SYCL program with try-catch blocks. An example output reads as follows. Found an OpenCL SYCL device: pthread 12th Gen Intel(R) Core(TM) i9-12900K. Found a Level Zero SYCL device: Intel(R) UHD Graphics 770 0x4680. — **Figure 20-11**
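A first-match search of this kind might look like the following sketch in standard C++. It is not the code from Figure 20-11: `Backend`, `DeviceDesc`, and `describe` are hypothetical stand-ins, and throwing `std::runtime_error` when nothing matches is an assumption made so that the caller can use a try-catch block.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-ins for a SYCL backend and a SYCL device.
enum class Backend { opencl, level_zero };

struct DeviceDesc {
  std::string name;
  Backend backend;
};

// Lambda that returns the first device associated with the wanted backend
// and throws if no such device exists.
const auto find_device = [](const std::vector<DeviceDesc>& devices,
                            Backend wanted) -> DeviceDesc {
  for (const auto& d : devices) {
    if (d.backend == wanted) return d;
  }
  throw std::runtime_error("no device found for the requested backend");
};

// Caller-side handling: report the device that was found, or the error.
std::string describe(const std::vector<DeviceDesc>& devices, Backend wanted) {
  try {
    return "Found: " + find_device(devices, wanted).name;
  } catch (const std::runtime_error& e) {
    return std::string("Error: ") + e.what();
  }
}
```

Because the search ignores device type, it accepts whichever matching device is enumerated first, which is why it behaves like a backend-restricted default selector.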

Finally, for fast prototyping some SYCL implementations can use external mechanisms, such as environment variables, to influence the SYCL devices they enumerate. As an example, the DPC++ SYCL runtime can use the ONEAPI_DEVICE_SELECTOR environment variable to limit enumerated devices to specific device types or associated device backends (refer to Chapter 13). This is not an ideal solution for production code because it requires external configuration, but it is a useful mechanism for prototype code to ensure that an application is using a specific device from a specific backend.

Be Careful About Contexts!

Recall from Chapters 6 and 13 that many SYCL objects, such as kernels and USM allocations, are generally not accessible by a SYCL context if they were created in a different SYCL context. This is still true when using backend interoperability; therefore, a backend-specific context created using a backend API generally will not have access to objects created in a different SYCL context (and vice versa) even if the SYCL context is associated with the same backend.

To safely share objects between SYCL and a backend, we should always either create our SYCL context from a native backend context using make_context, or we should get a native backend context from a SYCL context using get_native.

Always create a SYCL context from a native backend context or get a native backend context from a SYCL context to safely share objects between SYCL and a backend!

Access Low-Level API-Specific Features

Occasionally a cutting-edge feature will be available in a low-level API before it is available in SYCL, even as a SYCL extension. Some features may even be so backend-specific or so device-specific that they will never be exposed through SYCL. For example, some native backend APIs may provide access to queues with specific properties or unique kernel instructions for specific accelerator hardware. Although we hope and expect these cases to be rare, when these types of features exist, we may still gain access to them using backend interoperability.

Support for Other Backends

The examples in this chapter demonstrated backend interoperability with OpenCL and Level Zero backends, but SYCL is a growing ecosystem and SYCL implementations are regularly adding support for additional backends and devices. For example, several SYCL implementations supporting CUDA and HIP backends already have some support for interoperability with these backends. Check the documentation for a SYCL implementation to determine which SYCL backends are supported and whether they support backend interoperability!

Summary

In this chapter, we discovered how each SYCL object is associated with an underlying SYCL backend and how to query the SYCL backends in a system. We described how backend interoperability provides a mechanism for our SYCL application to directly interact with an underlying backend API. We discussed how this enables us to incrementally add SYCL to an application that is directly using a backend API, or to reuse libraries or utility functions written specifically for a backend API. We also discussed how backend interoperability reduces application portability, by restricting which SYCL devices the application will run on.

We specifically explored how backend interoperability for kernels provides functionality in SYCL 2020 similar to what was present in earlier versions of SYCL. We examined how an online compiler extension can enable the use of some source languages for kernels, even if they are not directly understood by some SYCL backends.

Finally, we reviewed practical hints and tips to effectively use backend interoperability in our programs, such as how to choose a SYCL device for a specific SYCL backend, how to set up a SYCL context for backend interoperability, and how backend interoperability can provide access to features even if they have not been added to SYCL.

Author information

Authors and Affiliations

Beaverton, OR, USA
James Reinders
Folsom, CA, USA
Ben Ashbaugh
Marlborough, MA, USA
James Brodman
Halifax, NS, Canada
Michael Kinsner
San Jose, CA, USA
John Pennycook
Fremont, CA, USA
Xinmin Tian

Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

About this chapter

Cite this chapter

Reinders, J., Ashbaugh, B., Brodman, J., Kinsner, M., Pennycook, J., Tian, X. (2023). Backend Interoperability. In: Data Parallel C++. Apress, Berkeley, CA. https://doi.org/10.1007/978-1-4842-9691-2_20

DOI: https://doi.org/10.1007/978-1-4842-9691-2_20
Published: 04 October 2023
Publisher Name: Apress, Berkeley, CA
Print ISBN: 978-1-4842-9690-5
Online ISBN: 978-1-4842-9691-2
eBook Packages: Professional and Applied Computing, Apress Access Books, Professional and Applied Computing (R0)
