TaskGraph
As Vulkan and Daxa require manual synchronization, using them directly can become quite complex and error-prone.
A common way to abstract and improve synchronization with low-level APIs is using a RenderGraph. Daxa provides a render graph called TaskGraph.
With TaskGraph, you can create task resource handles and names for the resources you have in your program. You can then list a series of tasks. Each task contains a list of used resources and a callback to the operations the task should perform.
A core idea of TaskGraph (and other render graphs) is that you record a high-level description of a series of operations and execute these operations later. In TaskGraph, you record tasks, “complete” (compile) the graph, and later run it. The callbacks in each task are only called during execution.
This “two-phase” design allows render graphs to optimize the operations, much like a compiler optimizes a program before execution. It also allows render graphs to determine optimal synchronization automatically, based on the declared resource uses in each task. In addition, task graphs are reusable. You can, for example, record your main render loop as a task graph, let the task graph optimize the tasks only once, and then reuse the optimized execution plan every frame. All in all, this allows for automatically optimized, low CPU cost synchronization generation.
Overview of the workflow for the task graph (a sketch follows the list):
- Create tasks
- Create task resources
- Add tasks to graph
- Complete task graph
- Execute task graph
- (optional) Repeatedly reassign resources to task resources
- (optional) Repeatedly execute task graph
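A rough sketch of this workflow, assuming an existing daxa::Device named device and a running flag; the creation-info fields follow the usual Daxa style and may differ slightly between versions:
daxa::TaskGraph graph = daxa::TaskGraph({.device = device, .name = "main loop"});
// ... create task resources and add tasks (both shown in the following sections) ...
graph.complete({});            // "compile" the recorded tasks once
while (running)
{
    // (optional) reassign real resources to the task resources, see the next section
    graph.execute({});         // reuse the optimized execution plan every frame
}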
Task Resources
When constructing a task graph, it’s essential not to use the real resource IDs used in execution but virtual representatives at record time. This is because the task graph is reusable between executions, and that reusability is only viable when the resources can change between executions. The graph therefore takes virtual resources, TaskImage and TaskBuffer. ImageIds and BufferIds can be assigned to these TaskImages and TaskBuffers and changed between executions of the task graph.
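For example, a sketch where frame_0_image_id and frame_1_image_id stand in for real ImageIds and graph is the task graph from above:
daxa::TaskImage task_image = daxa::TaskImage({.name = "color image"});
graph.use_persistent_image(task_image);   // register the virtual resource with the graph
task_image.set_images({.images = std::array{frame_0_image_id}});
graph.execute({});
// The next execution can run on a completely different image behind the same TaskImage:
task_image.set_images({.images = std::array{frame_1_image_id}});
graph.execute({});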
Task Resource Views
It is often convenient to refer to only part of an image or buffer, for example to specify particular mip levels of an image in a mip map generator.
For this purpose, Daxa has TaskImageViews. A TaskImageView, similarly to an ImageView, contains a slice of the TaskImage, specifying the subresource.
All tasks take views instead of the resources themselves. Resources implicitly cast to views, but also have the explicit conversion function .view(). Views themselves also have a .view() function to create a new view from an existing one.
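For example, a sketch; the slice parameter of .view() is an assumption modeled on daxa::ImageMipArraySlice:
daxa::TaskImage task_image = ...;
daxa::TaskImageView whole_image = task_image.view();                 // explicit conversion, same result as the implicit cast
daxa::TaskImageView mip_2 = whole_image.view({.base_mip_level = 2}); // new view restricted to a sub-slice of the image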
Task Resource Data Dependencies
One of the main functions of the task graph is to automatically generate synchronization and to optimize execution by reordering task callbacks and their recorded commands.
To know how to generate synchronization and when it is safe to reorder commands, the task graph builds a literal graph of resource uses between the tasks. Based on this graph it can infer optimal ordering and synchronization.
Usage Implications
When a task is added, the task graph immediately forms new access dependencies for all resources assigned to attachments of that task.
Example:
TaskA(write Img0), TaskB(read Img0, write Img1), TaskC(read Img1)
Here, TaskB reads Img0. The task graph sees that TaskA, which writes Img0, was recorded before TaskB, so it forms a write -> read dependency when adding TaskB, forcing TaskB to execute AFTER TaskA. The same applies between TaskC and TaskB. This leaves only one possible execution order: TaskA -> TaskB -> TaskC
TaskA(write Img0), TaskB(read Img0, write Img1), TaskC(write Img2), TaskD(read Img2)
In this example there is no dependency between TaskC and TaskA/TaskB, and none between TaskD and TaskA/TaskB. This allows the task graph to move the execution of TaskC and TaskD earlier, which reduces the number of barriers: TaskA TaskC -> TaskB TaskD
Dependencies are always formed immediately when a task is added based on the given task resource views.
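Expressed with the task builder introduced later in this article, a sketch of the first example (img0 and img1 stand in for TaskImageViews, and the empty callbacks are placeholders):
graph.add_task(daxa::Task::Transfer("TaskA").writes(img0).executes([](daxa::TaskInterface){ /* ... */ }));
graph.add_task(daxa::Task::Transfer("TaskB").reads(img0).writes(img1).executes([](daxa::TaskInterface){ /* ... */ }));
graph.add_task(daxa::Task::Transfer("TaskC").reads(img1).executes([](daxa::TaskInterface){ /* ... */ }));
// Adding TaskB forms a write -> read dependency on Img0, adding TaskC forms one on Img1,
// leaving TaskA -> TaskB -> TaskC as the only valid execution order.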
Concurrent Access
Sometimes it is undesirable for two tasks that write the same resource to form a write -> write ordering dependency. For example, TaskA might write the left half of an image and TaskB the right half, or the accesses might be synchronized via atomics.
The task graph forms dependencies here by default. To avoid this, use a concurrent task access for all tasks that should be allowed to execute at the same time.
Example: TaskA(write ImgA), TaskB(concurrent write ImgA), TaskC(concurrent write ImgA), TaskD(write ImgA)
Likely execution order generated by the task graph: TaskA -> TaskB TaskC -> TaskD
Notice that dependencies are still formed towards writes not marked as concurrent. So while TaskB and TaskC can execute together, TaskA and TaskD form strict ordering dependencies with the concurrent writes.
NOTE: There is no extra concurrent read access, as all reads are implicitly concurrent already. Multiple reads will be scheduled independently of each other.
Tasks
The core part of any render graph is the nodes in the graph. In the case of Daxa, these nodes are called tasks.
A task is a unit of work. It might be a single compute dispatch or multiple dispatches/render passes/raytracing dispatches. What limits the size of a task is resource dependencies that require synchronization.
Synchronization is only inserted between tasks. If dispatch A writes an image and dispatch B needs to read the finished content, both dispatches must be in different tasks so that the task graph can insert synchronization between them.
A Task consists of four parts:
- A description of how graph resources are used, the so-called “attachments”.
- A task resource view for each attachment, telling the graph which resource belongs to which attachment.
- User data, such as a pointer to some context, pipeline pointer and general parameters for the task.
- The callback, describing how the work should be recorded for the task.
Notably, the graph works in two phases: the recording and the execution. The callbacks of tasks are only ever called in the execution of the graph, not the recording.
Example of a task:
daxa::TaskImageView src = ...;
daxa::TaskImageView dst = ...;
int blur_width = ...;
graph.add_task(daxa::Task::Transfer("example task")
    .reads(src)   // adds attachment for src to the task
    .writes(dst)  // adds attachment for dst to the task
    .executes([=](daxa::TaskInterface ti)
    {
        copy_image_to_image(ti.recorder, ti.id(src), ti.id(dst), blur_width);
    }));
Task Attachments
Attachments describe a list of used graph resources that might require synchronization between tasks.
Note: Only make attachments for resources that need sync. Textures that are uploaded and synchronized once after upload, for example, can be ignored by the graph.
Each attachment consists of:
- the resource's type (image/buffer/acceleration structure)
- the resource's access (stage + read/write/sampled)
- the resource's shader usage (id/index/ptr + image view type)
TaskGraph uses this information to automatically generate synchronization, reorder tasks, and fill push constants with your resources.
The automatic push constant/buffer fill is only available via task heads (described later).
TaskInterface
The resources assigned to a task's attachments may not be available, or even created yet, when the task is recorded. They might also change between graph executions!
Up-to-date and correct information about each task resource and attachment is therefore available ONLY when the task callback executes, and ONLY via the task interface.
The interface has functions to query all information on the resources behind the attachments, such as: id, image view, buffer device/host address, image layout, resource info, task view.
Aside from getting attachment information, the interface is used to get (a sketch follows this list):
- current device
- current command recorder
- current buffer suballocator (may be used to allocate small sections of a ring buffer in each task)
- current task metadata (name, index, queue)
- current attachment shader blob (described later)
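A sketch of a callback using the interface; ti.device, ti.recorder and ti.attachment_shader_blob also appear in later examples, while the ti.allocator call and MyConstants are assumptions about the exact suballocator API:
graph.add_task(daxa::Task::Transfer("interface demo")
    .reads(src)
    .executes([=](daxa::TaskInterface ti)
    {
        daxa::Device & device = ti.device;        // current device
        auto & recorder = ti.recorder;            // current command recorder
        // Suballocate a small piece of the ring buffer for per-task uploads (exact API assumed):
        auto allocation = ti.allocator->allocate_fill(MyConstants{...});
        auto blob = ti.attachment_shader_blob;    // attachment shader blob, described later
    }));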
TaskHead and Attachment Shader Blob
When using shader resources like buffers and images, one must transport the image ID or buffer pointer to the shader. In traditional APIs one would bind buffers and images to an index, but in Daxa these need to be in a struct that is either stored inside another buffer or placed directly in a push constant.
This means that, without task heads, you must list the attachments multiple times:
- once in shaders, either as indices/pointers in a push constant OR direct bindings
- once in the attachments of the task
- once when assigning the indices/bindings for the API
- once when assigning task buffer/task image views to the attachments
Daxa can help you a lot here by reducing this redundancy with task heads. Task heads allow you to declare the shader struct containing all indices/pointers to resources AS WELL AS the attachments for a task in one go! With task heads you only need to:
- list each resource as an attachment
- assign a view to each attachment
That's it. Daxa does all the other logic for you.
But how do task heads work?
Essentially, a task head declaration consists of a set of macros that are valid in shaders as well as C/C++. In each language the macros have different definitions: in the shader, the declaration describes a struct with indices/pointers, while in C++ it describes a namespace containing constexpr metadata about the attachments and their use in the shader. This metadata is enough for the task graph internals to properly fill the shader struct.
An example of a task head:
// within the shared file
DAXA_DECL_TASK_HEAD_BEGIN(MyTaskHead)
DAXA_TH_BUFFER_PTR(COMPUTE_SHADER_READ, daxa_BufferPtr(daxa_u32), src_buffer)
DAXA_TH_IMAGE_ID(COMPUTE_SHADER_WRITE, REGULAR_2D, dst_image)
DAXA_DECL_TASK_HEAD_END
NOTE: COMPUTE_SHADER_READ is from the enum daxa::TaskBufferAccess
NOTE: For each of these access enum values, there is also a shortened version, example: CS_READ
This task head declaration will translate to the following GLSL shader struct:
struct MyTaskHead
{
    daxa_BufferPtr(daxa_u32) src_buffer;
    daxa_ImageViewId dst_image;
};
Or the following Slang-HLSL:
struct MyTaskHead
{
    daxa::u32* src_buffer;
    daxa::ImageViewId dst_image;
};
Extended example using a task head:
// within shared file
DAXA_DECL_COMPUTE_TASK_HEAD_BEGIN(ExampleTaskHead)
DAXA_TH_BUFFER_PTR(READ, daxa_BufferPtr(daxa_u32), src_buffer)
DAXA_TH_IMAGE_ID(WRITE, REGULAR_2D, dst_image)
DAXA_DECL_TASK_HEAD_END
// This push constant is shared in shader and c++!
struct MyPushStruct
{
    daxa_u32vec2 size;
    daxa_u32 settings_bitfield;
    // The head field is an aligned byte array in c++ and the attachment struct in shader:
    DAXA_TH_BLOB(ExampleTaskHead, attachments);
};
daxa::TaskBufferView src = ...;
daxa::TaskImageView dst = ...;
graph.add_task(daxa::HeadTask<ExampleTaskHead::Info>("example task")
    .head_views({.src_buffer = src}) // assign the view to the attachment, access is defined in head
    .head_views({.dst_image = dst})  // assign the view to the attachment, access is defined in head
    .executes([=](daxa::TaskInterface ti)
    {
        ti.recorder.set_pipeline(...);
        ti.recorder.push_constant(MyPushStruct{
            .size = ...,
            .settings_bitfield = ...,
            // Here you assign the graph generated attachment shader blob into your push constant
            .attachments = ti.attachment_shader_blob,
        });
        ti.dispatch(...);
    }));
TaskInterface and Attachment Information
The ATTACHMENTS or AT constants declared within the task head contain all metadata about the attachments. But they also contain named indices for each attachment!
In the above code these named indices are used to refer to the attachments. You can refer to any attachment with HEAD_NAME::AT.attachment_name.
Note that all these functions also take views directly instead of attachment indices, in order to be compatible with inline tasks.
These indices can also be used to access information of attachments within the task callback:
void example_task_callback(daxa::TaskInterface ti)
{
    auto const & AI = ExampleTaskHead::ATTACHMENT_INDICES;
    // There are two ways to get the info for any attachment:
    {
        // daxa::TaskBufferAttachmentIndex index:
        [[maybe_unused]] daxa::TaskBufferAttachmentInfo const & buffer0_attachment0 = ti.get(AI.buffer0);
        // daxa::TaskBufferView assigned to the buffer attachment:
        [[maybe_unused]] daxa::TaskBufferAttachmentInfo const & buffer0_attachment1 = ti.get(buffer0_attachment0.view);
    }
    // The Buffer Attachment info contents:
    {
        [[maybe_unused]] daxa::BufferId id = ti.get(AI.buffer0).ids[0];
        [[maybe_unused]] char const * name = ti.get(AI.buffer0).name;
        [[maybe_unused]] daxa::TaskAccess access = ti.get(AI.buffer0).task_access;
        [[maybe_unused]] u8 shader_array_size = ti.get(AI.buffer0).shader_array_size;
        [[maybe_unused]] bool shader_as_address = ti.get(AI.buffer0).shader_as_address;
        [[maybe_unused]] daxa::TaskBufferView view = ti.get(AI.buffer0).view;
        [[maybe_unused]] std::span<daxa::BufferId const> ids = ti.get(AI.buffer0).ids;
    }
    // The Image Attachment info contents:
    {
        [[maybe_unused]] char const * name = ti.get(AI.image0).name;
        [[maybe_unused]] daxa::TaskAccess access = ti.get(AI.image0).task_access;
        [[maybe_unused]] daxa::ImageViewType view_type = ti.get(AI.image0).view_type;
        [[maybe_unused]] u8 shader_array_size = ti.get(AI.image0).shader_array_size;
        [[maybe_unused]] daxa::TaskHeadImageArrayType shader_array_type = ti.get(AI.image0).shader_array_type;
        [[maybe_unused]] daxa::ImageLayout layout = ti.get(AI.image0).layout;
        [[maybe_unused]] daxa::TaskImageView view = ti.get(AI.image0).view;
        [[maybe_unused]] std::span<daxa::ImageId const> ids = ti.get(AI.image0).ids;
        [[maybe_unused]] std::span<daxa::ImageViewId const> view_ids = ti.get(AI.image0).view_ids;
    }
    // The interface has multiple convenience functions for easier access to the underlying resources attributes:
    {
        // Overloaded for buffer, blas, tlas, image
        [[maybe_unused]] daxa::BufferInfo info = ti.info(AI.buffer0).value();
        // Overloaded for buffer, blas, tlas
        [[maybe_unused]] daxa::DeviceAddress address = ti.device_address(AI.buffer0).value();
        [[maybe_unused]] std::byte * host_address = ti.buffer_host_address(AI.buffer0).value();
        [[maybe_unused]] daxa::ImageViewInfo img_view_info = ti.image_view_info(AI.image0).value();
        // In case the task resource has an array of real resources, one can use the optional second parameter to access those:
        [[maybe_unused]] daxa::BufferInfo info2 = ti.info(AI.buffer0, 123 /*resource index*/).value();
    }
    // The attachment infos are also provided, directly via a span:
    for ([[maybe_unused]] daxa::TaskAttachmentInfo const & attach : ti.attachment_infos)
    {
    }
    // The tasks shader side struct of ids and addresses is automatically filled and serialized to a blob:
    [[maybe_unused]] auto generated_blob = ti.attachment_shader_blob;
    // The head also declared an aligned struct with the right size as a dummy on the c++ side.
    // This can be used to declare shader/c++ shared structs containing this blob:
    [[maybe_unused]] ExampleTaskHead::AttachmentShaderBlob blob = {};
    // The blob also declares a constructor and assignment operator to take in the byte span generated by the taskgraph:
    blob = generated_blob;
    [[maybe_unused]] ExampleTaskHead::AttachmentShaderBlob blob2{ti.attachment_shader_blob};
}
TaskHead Attachment Declarations
There are multiple ways to declare how a resource is used within the shader:
// CPU only attachments. These are not present in the attachment byte blob:
#define DAXA_TH_IMAGE(TASK_ACCESS, VIEW_TYPE, NAME)
#define DAXA_TH_BUFFER(TASK_ACCESS, NAME)
#define DAXA_TH_BLAS(TASK_ACCESS, NAME)
#define DAXA_TH_TLAS(TASK_ACCESS, NAME)

// _ID Attachments will be represented by the first id.
#define DAXA_TH_IMAGE_ID(TASK_ACCESS, VIEW_TYPE, NAME)
#define DAXA_TH_BUFFER_ID(TASK_ACCESS, NAME)
#define DAXA_TH_TLAS_ID(TASK_ACCESS, NAME)

// _INDEX Attachments will be represented by the index of the first id.
// This is useful for having lots of image attachments.
// Index attachments take only 4 bytes, id attachments need 8 bytes.
#define DAXA_TH_IMAGE_INDEX(TASK_ACCESS, VIEW_TYPE, NAME)

// _TYPED Attachments will be represented either as a (RW)TextureXId<T> or (RW)TextureXIndex<T>.
// These typed id/index handles are Slang only.
#define DAXA_TH_IMAGE_TYPED(TASK_ACCESS, TEX_TYPE, NAME)

// _MIP_ARRAY Attachments will be represented as an array of ids/indices where each array element
// views a mip level of the first image in the runtime array.
// This can be useful for mip map generation, as storage image views can only see one mip at a time.
// It is allowed to bind an image to the attachment that has fewer mips than the array size;
// the remaining array elements will be filled with 0s.
#define DAXA_TH_IMAGE_ID_MIP_ARRAY(TASK_ACCESS, VIEW_TYPE, NAME, SIZE)
#define DAXA_TH_IMAGE_INDEX_MIP_ARRAY(TASK_ACCESS, VIEW_TYPE, NAME, SIZE)
#define DAXA_TH_IMAGE_TYPED_MIP_ARRAY(TASK_ACCESS, TEX_TYPE, NAME, SIZE)

// Ptr Attachments are represented by a device address.
#define DAXA_TH_BUFFER_PTR(TASK_ACCESS, PTR_TYPE, NAME)
#define DAXA_TH_TLAS_PTR(TASK_ACCESS, NAME)
Note: Some permutations are missing here. BLAS, for example, has no _ID, _INDEX, or _PTR version. This is intentional, as some resources cannot be used in certain ways inside shaders.
There are some additional valid usage rules
- A task may use the same image multiple times, as long as the TaskImageViews' slices don’t overlap.
- A task may only ever have one use of a TaskBuffer.
- All task uses must have a valid TaskResource or TaskResourceView assigned to them when adding a task.
- All task resources must have valid image and buffer IDs assigned to them on execution.