The stream buffer before this commit once it was full (no more bytes to
write before looping) waiting for all previous operations to finish.
This was a temporary solution and had a noticeable performance penalty
in performance (from what a profiler showed).
To avoid this mark with fences usages of the stream buffer and once it
loops wait for them to be signaled. On average this will never wait.
Each fence knows where its usage finishes, resulting in a non-paged
stream buffer.
On the other side, the buffer cache is reimplemented using the generic
buffer cache. It makes use of the staging buffer pool and the new
stream buffer.
* Allocate memory in discrete exponentially increasing chunks until the
128 MiB threshold. Allocations larger thant that increase linearly by
256 MiB (depending on the required size). This allows to use small
allocations for small resources.
* Move memory maps to a RAII abstraction. To optimize for debugging
tools (like RenderDoc) users will map/unmap on usage. If this ever
becomes a noticeable overhead (from my profiling it doesn't) we can
transparently move to persistent memory maps without harming the API,
getting optimal performance for both gameplay and debugging.
* Improve messages on exceptional situations.
* Fix typos "requeriments" -> "requirements".
* Small style changes.
MakeCurrent is a costly (according to Nsight's profiler it takes a tenth
of a millisecond to complete), and we don't have a reason to call it
because:
- Qt no longer signals a warning if it's not called
- yuzu no longer supports macOS
Create a large descriptor pool where we allocate all our descriptors
from. It has to be wide enough to support any pipeline, hence its large
numbers.
If the descritor pool is filled, we allocate more memory at that moment.
This way we can take advantage of permissive drivers like Nvidia's that
allocate more descriptors than what the spec requires.
This commit introduces a mechanism by which shader IR code can be
amended and extended. This useful for track algorithms where certain
information can derived from before the track such as indexes to array
samplers.
This function is called rarely and blocks quite often for a long time.
So don't waste power and let the CPU sleep.
This might also increase the performance as the other cores might be allowed to clock higher.
The job of this abstraction is to provide staging buffers for temporary
operations. Think of image uploads or buffer uploads to device memory.
It automatically deletes unused buffers.
This object's job is to contain an image and manage its transitions.
Since Nvidia hardware doesn't know what a transition is but Vulkan
requires them anyway, we have to state track image subresources
individually.
To avoid the overhead of tracking each subresource in images with many
subresources (think of cubemap arrays with several mipmaps), this commit
tracks when subresources have diverged. As long as this doesn't happen
we can check the state of the first subresource (that will be shared
with all subresources) and update accordingly.
Image transitions are deferred to the scheduler command buffer.
The intention behind this hasheable structure is to describe the state
of fixed function pipeline state that gets compiled to a single graphics
pipeline state object. This is all dynamic state in OpenGL but Vulkan
wants it in an immutable state, even if hardware can edit it freely.
In this commit the structure is defined in an optimized state (it uses
booleans, has paddings and many data entries that can be packed to
single integers). This is intentional as an initial implementation that
is easier to debug, implement and review. It will be optimized in later
stages, or it might change if Vulkan gets more dynamic states.