Commit graph

6276 commits

Author SHA1 Message Date
Fernando Sahmkow
603c861532 Shader_IR: Implement initial code for tracking indexed samplers. 2020-01-24 16:43:30 -04:00
Fernando Sahmkow
64496f2456 Shader_IR: Address Feedback 2020-01-24 16:43:30 -04:00
Fernando Sahmkow
b97608ca64 Shader_IR: Allow constant access of guest driver. 2020-01-24 16:43:30 -04:00
Fernando Sahmkow
dc5cfa8d28 Shader_IR: Address Feedback 2020-01-24 16:43:29 -04:00
Fernando Sahmkow
74aa7de5e3 Guest_driver: Correct compiling errors in GCC. 2020-01-24 16:43:29 -04:00
Fernando Sahmkow
1e4b6bef6f Shader_IR: Store Bound buffer on Shader Usage 2020-01-24 16:43:29 -04:00
Fernando Sahmkow
c921e496eb GPU: Implement guest driver profile and deduce texture handler sizes. 2020-01-24 16:43:29 -04:00
bunnei
a104b985a8
Merge pull request #3273 from FernandoS27/txd-array
Shader_IR: Implement TXD Array.
2020-01-24 14:02:40 -05:00
ReinUsesLisp
1690f1adba vk_shader_decompiler: Disable default values on unwritten render targets
Some games like The Legend of Zelda: Breath of the Wild assign
render targets without writing them from the fragment shader. This
generates Vulkan validation errors, so silence these I previously
introduced a commit to set "vec4(0, 0, 0, 1)" for these attachments. The
problem is that this is not what games expect. This commit reverts that
change.
2020-01-24 01:16:21 -03:00
ReinUsesLisp
3ce28342a2 gl_shader_cache: Disable fastmath on Nvidia 2020-01-21 19:08:08 -03:00
Fernando Sahmkow
79e0991d9b
Merge pull request #3330 from ReinUsesLisp/vk-blit-screen
vk_blit_screen: Initial implementation
2020-01-20 22:32:16 -04:00
ReinUsesLisp
a665581684 vk_blit_screen: Address feedback 2020-01-20 18:43:11 -03:00
bunnei
69b44392a7
Merge pull request #3328 from ReinUsesLisp/vulkan-atoms
vk_shader_decompiler: Implement UAtomicAdd (ATOMS) on SPIR-V
2020-01-20 00:01:52 -05:00
bunnei
5a077c95ce
Merge pull request #3322 from ReinUsesLisp/vk-front-face
vk_graphics_pipeline: Set front facing properly
2020-01-19 23:22:34 -05:00
ReinUsesLisp
f5dfe68a94 vk_blit_screen: Initial implementation
This abstraction takes care of presenting accelerated and
non-accelerated or "framebuffer" images to the Vulkan swapchain.
2020-01-19 21:12:43 -03:00
bunnei
41373d212e
Merge pull request #3313 from ReinUsesLisp/vk-rasterizer
vk_rasterizer: Implement Vulkan's rasterizer
2020-01-19 18:09:01 -05:00
ReinUsesLisp
b2c976ad0e vk_shader_decompiler: Implement UAtomicAdd (ATOMS) on SPIR-V
Also updates sirit to include atomic instructions.
2020-01-19 16:40:31 -03:00
Fernando Sahmkow
51c8aea979
Merge pull request #3317 from ReinUsesLisp/gl-decomp-cc-decomp
gl_shader_decompiler: Fix decompilation of condition codes
2020-01-18 19:56:55 -04:00
ReinUsesLisp
d110a371bb gl_state: Use bool instead of GLboolean
This fixes template resolution considering GLboolean an integer instead
of a bool.
2020-01-18 19:10:34 -03:00
ReinUsesLisp
94915d4ea1 vk_graphics_pipeline: Set front facing properly
Front face was being forced to a certain value when cull face is
disabled. Set a default value on initialization and drop the forcefully
set front facing value with culling disabled.
2020-01-18 18:50:47 -03:00
bunnei
9bf4850f74
Merge pull request #3305 from ReinUsesLisp/point-size-program
gl_state: Implement PROGRAM_POINT_SIZE
2020-01-18 01:56:32 -05:00
bunnei
15163edaaa
Merge pull request #3312 from ReinUsesLisp/atoms-u32
shader/memory: Implement ATOMS.ADD.U32
2020-01-18 00:54:07 -05:00
ReinUsesLisp
09b1d762d7 vk_rasterizer: Address feedback 2020-01-17 21:40:01 -03:00
ReinUsesLisp
f34e519da3 gl_shader_decompiler: Fix decompilation of condition codes
Use Visit instead of reimplementing it. Fixes unimplemented negations
for condition codes.
2020-01-17 21:23:01 -03:00
bunnei
48863afb65
Merge pull request #3306 from ReinUsesLisp/gl-texture
gl_texture_cache: Minor fixes and style changes
2020-01-17 15:44:02 -05:00
bunnei
657b3a366e
Merge pull request #3311 from ReinUsesLisp/z32fx24s8
format_lookup_table: Fix ZF32_X24S8 component types
2020-01-17 08:22:32 -05:00
ReinUsesLisp
fe5356d223 vk_rasterizer: Implement Vulkan's rasterizer
This abstraction is Vulkan's equivalent to OpenGL's rasterizer. It takes
care of joining all parts of the backend and rendering accordingly on
demand.
2020-01-16 23:05:15 -03:00
ReinUsesLisp
38e789c761 renderer_vulkan: Add header as placeholder 2020-01-16 22:54:15 -03:00
bunnei
e041f33569
Merge pull request #3300 from ReinUsesLisp/vk-texture-cache
vk_texture_cache: Implement generic texture cache on Vulkan
2020-01-16 19:19:26 -05:00
ReinUsesLisp
f09cd52980 vk_texture_cache: Address feedback 2020-01-16 18:23:10 -03:00
ReinUsesLisp
63ba41a26d shader/memory: Implement ATOMS.ADD.U32 2020-01-16 17:30:55 -03:00
ReinUsesLisp
0caab54b5d format_lookup_table: Fix ZF32_X24S8 component types
Component types for ZF32_X24S8 were using UNORM. Drivers will set FLOAT,
UINT, UNORM, UNORM; causing a format mismatch. This commit addresses
that.
2020-01-16 17:29:13 -03:00
Rodrigo Locatti
82e1285c1e
vk_texture_cache: Fix typo in commentary
Co-Authored-By: MysticExile <30736337+MysticExile@users.noreply.github.com>
2020-01-16 16:59:46 -03:00
bunnei
30faf6a964
Merge pull request #3308 from lioncash/private
maxwell_3d: Make dirty_pointers private
2020-01-16 13:26:35 -05:00
bunnei
d23869811d
Merge pull request #3304 from lioncash/fwd-decl
renderer_opengl/utils: Forward declare private structs
2020-01-16 11:21:18 -05:00
Lioncash
9e874898f5 maxwell_3d: Make dirty_pointers private
This isn't used outside of the class itself, so we can make it private
for the time being.
2020-01-16 04:07:15 -05:00
ReinUsesLisp
c375d735e6 gl_state: Implement PROGRAM_POINT_SIZE
For gl_PointSize to have effect we have to activate
GL_PROGRAM_POINT_SIZE.
2020-01-15 16:14:17 -03:00
Lioncash
7af56dfa76 renderer_opengl/utils: Remove unused header inclusions
Nothing from these headers are used, so they can be removed.
2020-01-15 06:31:23 -05:00
Lioncash
06d30fbcca renderer_opengl/utils: Forward declare private structs
Keeps the definitions hidden and allows changes to the structs without
needing to recompile all users of classes containing said structs.
2020-01-15 06:30:01 -05:00
ReinUsesLisp
66a1c777c9 gl_texture_cache: Use local variables to simplify DownloadTexture 2020-01-14 17:39:48 -03:00
ReinUsesLisp
cdb00546f0 gl_texture_cache: Fix format for RGBX16F 2020-01-14 17:38:33 -03:00
ReinUsesLisp
2d09467f6f gl_texture_cache: Use Snorm internal format for RG8S 2020-01-14 17:37:58 -03:00
ReinUsesLisp
02624c35ec gl_texture_cache: Use Snorm internal format for ABGR8S 2020-01-14 17:37:23 -03:00
Rodrigo Locatti
64cd46579b
Merge pull request #3303 from lioncash/reorder
control_flow: Silence -Wreorder warning for CFGRebuildState
2020-01-14 16:15:18 -03:00
Lioncash
a1eee1749e control_flow: Silence -Wreorder warning for CFGRebuildState
Organizes the initializer list in the same order that the variables
would actually be initialized in.
2020-01-14 13:28:48 -05:00
Lioncash
f10ea944e0 gl_shader_cache: Remove unused STAGE_RESERVED_UBOS constant
Given this isn't used, this can be removed entirely.
2020-01-14 13:16:52 -05:00
Lioncash
4cd5ad90f3 gl_shader_cache: std::move entries in CachedShader constructor
Avoids several reallocations of std::vector instances where applicable.
2020-01-14 13:14:16 -05:00
Lioncash
15a6840e7a gl_shader_cache: Remove unused entries variable in BuildShader()
Eliminates a few unnecessary constructions of std::vectors.
2020-01-14 13:11:49 -05:00
bunnei
55f95e7f26
Merge pull request #3287 from ReinUsesLisp/ldg-stg-16
shader_ir/memory: Implement u16 and u8 for STG and LDG
2020-01-14 09:57:08 -05:00
bunnei
15788ffcde
Merge pull request #3288 from ReinUsesLisp/uncurse-aoffi
shader_ir/texture: Simplify AOFFI code
2020-01-13 23:52:12 -05:00
bunnei
6985eea519
Merge pull request #3290 from ReinUsesLisp/gl-clamp
maxwell_to_vk: Implement GL_CLAMP hacking Nvidia's driver
2020-01-13 19:16:06 -05:00
ReinUsesLisp
09e17fbb0f vk_texture_cache: Implement generic texture cache on Vulkan
It currently ignores PBO linearizations since these should be dropped as
soon as possible on OpenGL.
2020-01-13 20:37:50 -03:00
ReinUsesLisp
2b2712fa95 texture_cache/surface_params: Make GetNumLayers public 2020-01-13 20:35:43 -03:00
Rodrigo Locatti
b1138e5ea1
vk_compute_pass: Address feedback
Comment hardcoded SPIR-V modules.
2020-01-10 22:46:34 -03:00
ReinUsesLisp
3d46709b7f maxwell_to_vk: Implement GL_CLAMP hacking Nvidia's driver
Nvidia's driver defaults invalid enumerations to GL_CLAMP. Vulkan
doesn't expose GL_CLAMP through its API, but we can hack it on Nvidia's
driver using the internal driver defaults.
2020-01-10 17:12:50 -03:00
ReinUsesLisp
13021b534c shader_ir/texture: Simplify AOFFI code 2020-01-09 03:50:37 -03:00
ReinUsesLisp
e2a2a556b9 shader_ir/memory: Implement u16 and u8 for STG and LDG
Using the same technique we used for u8 on LDG, implement u16.

In the case of STG, load memory and insert the value we want to set
into it with bitfieldInsert. Then set that value.
2020-01-09 02:12:29 -03:00
ReinUsesLisp
908e085d02 vk_compute_pass: Add compute passes to emulate missing Vulkan features
This currently only supports quad arrays and u8 indices.

In the future we can remove quad arrays with a table written from the
CPU, but this was used to bootstrap the other passes helpers and it
was left in the code.

The blob code is generated from the "shaders/" directory. Read the
instructions there to know how to generate the SPIR-V.
2020-01-08 19:24:26 -03:00
ReinUsesLisp
82a64da077 vk_shader_util: Add helper to build SPIR-V shaders 2020-01-08 19:22:20 -03:00
ReinUsesLisp
6888d776ff vk_pipeline_cache: Initial implementation
Given a pipeline key, this cache returns a pipeline abstraction (for
graphics or compute).
2020-01-06 22:02:26 -03:00
ReinUsesLisp
2effdeb924 vk_graphics_pipeline: Initial implementation
This abstractio represents the state of the 3D engine at a given draw.
Instead of changing individual bits of the pipeline how it's done in
APIs like D3D11, OpenGL and NVN; on Vulkan we are forced to put
everything together into a single, immutable object.

It takes advantage of the few dynamic states Vulkan offers.
2020-01-06 22:02:26 -03:00
ReinUsesLisp
dc96a59fa0 vk_compute_pipeline: Initial implementation
This abstraction represents a Vulkan compute pipeline.
2020-01-06 22:02:26 -03:00
ReinUsesLisp
b392a5986e vk_pipeline_cache: Add file and define descriptor update template filler
This function allows us to share code between compute and graphics
pipelines compilation.
2020-01-06 22:02:26 -03:00
ReinUsesLisp
3142f1b597 fixed_pipeline_state: Add depth clamp 2020-01-06 22:02:26 -03:00
ReinUsesLisp
9c548146ca vk_rasterizer: Add placeholder 2020-01-06 22:02:26 -03:00
bunnei
5be00cba15
Merge pull request #3276 from ReinUsesLisp/pipeline-reqs
vk_update_descriptor/vk_renderpass_cache: Add pipeline cache dependencies
2020-01-06 17:03:34 -05:00
ReinUsesLisp
5aeff9aff5 vk_renderpass_cache: Initial implementation
The renderpass cache is used to avoid creating renderpasses on each
draw. The hashed structure is not currently optimized.
2020-01-06 18:28:32 -03:00
ReinUsesLisp
322d6a0311 vk_update_descriptor: Initial implementation
The update descriptor is used to store in flat memory a large chunk of
staging data used to update descriptor sets through templates. It
provides a push interface to easily insert descriptors following the
current pipeline. The order used in the descriptor update template has
to be implicitly followed. We can catch bugs here using validation
layers.
2020-01-06 18:28:32 -03:00
ReinUsesLisp
5b01f80a12 vk_stream_buffer/vk_buffer_cache: Avoid halting and use generic cache
The stream buffer before this commit once it was full (no more bytes to
write before looping) waiting for all previous operations to finish.
This was a temporary solution and had a noticeable performance penalty
in performance (from what a profiler showed).

To avoid this mark with fences usages of the stream buffer and once it
loops wait for them to be signaled. On average this will never wait.
Each fence knows where its usage finishes, resulting in a non-paged
stream buffer.

On the other side, the buffer cache is reimplemented using the generic
buffer cache. It makes use of the staging buffer pool and the new
stream buffer.
2020-01-06 18:13:41 -03:00
ReinUsesLisp
ceb851b590 vk_memory_manager: Misc changes
* Allocate memory in discrete exponentially increasing chunks until the
128 MiB threshold. Allocations larger thant that increase linearly by
256 MiB (depending on the required size). This allows to use small
allocations for small resources.

* Move memory maps to a RAII abstraction. To optimize for debugging
tools (like RenderDoc) users will map/unmap on usage. If this ever
becomes a noticeable overhead (from my profiling it doesn't) we can
transparently move to persistent memory maps without harming the API,
getting optimal performance for both gameplay and debugging.

* Improve messages on exceptional situations.

* Fix typos "requeriments" -> "requirements".

* Small style changes.
2020-01-06 18:13:41 -03:00
ReinUsesLisp
85bb6a6f08 vk_buffer_cache: Temporarily remove buffer cache
This is intended for a follow up commit to avoid circular dependencies.
2020-01-06 17:58:46 -03:00
bunnei
89fc75d769
Merge pull request #3257 from degasus/no_busy_loops
video_core: Block in WaitFence.
2020-01-06 00:09:57 -05:00
Fernando Sahmkow
56e450a3f7
Merge pull request #3264 from ReinUsesLisp/vk-descriptor-pool
vk_descriptor_pool: Initial implementation
2020-01-05 15:54:41 -04:00
bunnei
cd0a7dfdbc
Merge pull request #3258 from FernandoS27/shader-amend
Shader_IR: add the ability to amend code in the shader ir.
2020-01-04 14:05:17 -05:00
Fernando Sahmkow
3dd6b55851 Shader_IR: Address Feedback 2020-01-04 14:40:57 -04:00
Fernando Sahmkow
a1667a7b46 Shader_IR: Implement TXD Array.
This commit extends the compilation of TXD to support array samplers on
TXD.
2020-01-04 13:28:02 -04:00
Rodrigo Locatti
6e347d8d1b
Update src/video_core/renderer_vulkan/vk_descriptor_pool.cpp
Co-Authored-By: Mat M. <mathew1800@gmail.com>
2020-01-03 17:34:30 -03:00
ReinUsesLisp
0d6d8129c4 yuzu: Remove Maxwell debugger
This was carried from Citra and wasn't really used on yuzu. It also adds
some runtime overhead. This commit removes it from yuzu's codebase.
2020-01-02 23:09:44 -03:00
bunnei
ae0e481677
Merge pull request #3243 from ReinUsesLisp/topologies
maxwell_to_gl: Implement missing primitive topologies
2020-01-01 20:33:33 -05:00
ReinUsesLisp
1fe7df4517 vk_descriptor_pool: Initial implementation
Create a large descriptor pool where we allocate all our descriptors
from. It has to be wide enough to support any pipeline, hence its large
numbers.

If the descritor pool is filled, we allocate more memory at that moment.
This way we can take advantage of permissive drivers like Nvidia's that
allocate more descriptors than what the spec requires.
2020-01-01 16:44:06 -03:00
bunnei
028b2718ed
Merge pull request #3239 from ReinUsesLisp/p2r
shader/p2r: Implement P2R Pr
2019-12-31 20:37:16 -05:00
Fernando Sahmkow
b3371ed09e Shader_IR: add the ability to amend code in the shader ir.
This commit introduces a mechanism by which shader IR code can be
amended and extended. This useful for track algorithms where certain
information can derived from before the track such as indexes to array
samplers.
2019-12-30 15:31:48 -04:00
Fernando Sahmkow
7bd447355f
Merge pull request #3248 from ReinUsesLisp/vk-image
vk_image: Add an image object abstraction
2019-12-30 14:25:14 -04:00
Rodrigo Locatti
4cbb363d3f
vk_image: Avoid unnecesary equals 2019-12-30 13:28:23 -03:00
Fernando Sahmkow
287d5921cf
Merge pull request #3249 from ReinUsesLisp/vk-staging-buffer-pool
vk_staging_buffer_pool: Add a staging pool for temporary operations
2019-12-30 12:25:59 -04:00
Markus Wick
cb9dd01ffd video_core: Block in WaitFence.
This function is called rarely and blocks quite often for a long time.
So don't waste power and let the CPU sleep.

This might also increase the performance as the other cores might be allowed to clock higher.
2019-12-30 13:04:53 +01:00
Rodrigo Locatti
f2c61bbe13
vk_staging_buffer_pool: Initialize last epoch to zero 2019-12-29 19:19:43 -03:00
Fernando Sahmkow
f846e3d6d0
Merge pull request #3250 from ReinUsesLisp/empty-fragment
gl_rasterizer: Allow rendering without fragment shader
2019-12-28 14:33:53 -04:00
bunnei
8a76f816a4
Merge pull request #3228 from ReinUsesLisp/ptp
shader/texture: Implement AOFFI and PTP for TLD4 and TLD4S
2019-12-26 21:43:44 -05:00
ReinUsesLisp
5b989f189f
gl_rasterizer: Allow rendering without fragment shader
Rendering without a fragment shader is usually used in depth-only
passes.
2019-12-26 16:38:49 -03:00
ReinUsesLisp
3813af2f3c
vk_staging_buffer_pool: Add a staging pool for temporary operations
The job of this abstraction is to provide staging buffers for temporary
operations. Think of image uploads or buffer uploads to device memory.

It automatically deletes unused buffers.
2019-12-25 18:12:17 -03:00
ReinUsesLisp
c83bf7cd1e
vk_image: Add an image object abstraction
This object's job is to contain an image and manage its transitions.
Since Nvidia hardware doesn't know what a transition is but Vulkan
requires them anyway, we have to state track image subresources
individually.

To avoid the overhead of tracking each subresource in images with many
subresources (think of cubemap arrays with several mipmaps), this commit
tracks when subresources have diverged. As long as this doesn't happen
we can check the state of the first subresource (that will be shared
with all subresources) and update accordingly.

Image transitions are deferred to the scheduler command buffer.
2019-12-25 18:00:16 -03:00
Fernando Sahmkow
5619d24377
Merge pull request #3244 from ReinUsesLisp/vk-fps
fixed_pipeline_state: Define structure and loaders
2019-12-25 14:31:29 -04:00
bunnei
4af569ee47
Merge pull request #3236 from ReinUsesLisp/rasterize-enable
gl_rasterizer: Implement RASTERIZE_ENABLE
2019-12-24 22:54:10 -05:00
ReinUsesLisp
b9e3f5eb36
fixed_pipeline_state: Define symetric operator!= and mark as noexcept
Marks as noexcept Hash, operator== and operator!= for consistency.
2019-12-24 18:24:08 -03:00
ReinUsesLisp
4a3026b16b
fixed_pipeline_state: Define structure and loaders
The intention behind this hasheable structure is to describe the state
of fixed function pipeline state that gets compiled to a single graphics
pipeline state object. This is all dynamic state in OpenGL but Vulkan
wants it in an immutable state, even if hardware can edit it freely.

In this commit the structure is defined in an optimized state (it uses
booleans, has paddings and many data entries that can be packed to
single integers). This is intentional as an initial implementation that
is easier to debug, implement and review. It will be optimized in later
stages, or it might change if Vulkan gets more dynamic states.
2019-12-22 22:59:11 -03:00
ReinUsesLisp
5770418fb3
maxwell_3d: Add depth bounds registers 2019-12-22 22:55:06 -03:00
ReinUsesLisp
91d35559e5
maxwell_to_gl: Implement missing primitive topologies
Many of these topologies are exclusively available in OpenGL.
2019-12-22 22:33:01 -03:00
bunnei
e976d0e924
Merge pull request #3241 from ReinUsesLisp/gl-shader-cache
gl_shader_cache: Style changes
2019-12-22 16:23:46 -05:00
bunnei
1e76655f83
Merge pull request #3238 from ReinUsesLisp/vk-resource-manager
vk_resource_manager: Catch device losses and other changes
2019-12-22 15:57:16 -05:00
bunnei
0f3ac9cfeb
Merge pull request #3203 from FernandoS27/tex-cache-fixes
Texture Cache: Add HLE methods for building 3D textures
2019-12-22 14:25:13 -05:00
Fernando Sahmkow
3dc585d011
Merge pull request #3237 from ReinUsesLisp/vk-shader-decompiler
vk_shader_decompiler: Misc changes
2019-12-22 12:36:56 -04:00
Fernando Sahmkow
218ee18417 Texture Cache: Improve documentation 2019-12-22 12:29:23 -04:00
Fernando Sahmkow
a3916588b6 Texture Cache: Address Feedback 2019-12-22 12:24:34 -04:00
Fernando Sahmkow
51c9e98677 Texture Cache: Add HLE methods for building 3D textures within the GPU in certain scenarios.
This commit adds a series of HLE methods for handling 3D textures in
general. This helps games that generate 3D textures on every frame and
may reduce loading times for certain games.
2019-12-22 12:24:34 -04:00
Fernando Sahmkow
aea978e037
Merge pull request #3230 from ReinUsesLisp/vk-emu-shaders
renderer_vulkan/shader: Add helper GLSL shaders
2019-12-22 11:23:09 -04:00
Fernando Sahmkow
27efcc15e9
Merge pull request #3240 from ReinUsesLisp/decomp-cond-code
vk_shader_decompiler: Use Visit instead of reimplementing it
2019-12-22 11:20:55 -04:00
bunnei
16dcfacbfc
Merge pull request #3235 from ReinUsesLisp/ldg-u8
shader/memory: Implement LDG.U8 and unaligned U8 loads
2019-12-21 22:50:28 -05:00
ReinUsesLisp
1e16023d60
gl_shader_cache: Update commentary for shared memory
Remove false commentary. Not dividing by 4 the size of shared memory is
not a hack; it describes the number of integers, not bytes.

While we are at it sort the generated code to put preprocessor lines on
the top.
2019-12-20 22:51:21 -03:00
ReinUsesLisp
486c6a5316
gl_shader_cache: Remove unused entry in GetPrimitiveDescription 2019-12-20 22:49:30 -03:00
ReinUsesLisp
af93909c9c
vk_shader_decompiler: Use Visit instead of reimplementing it
ExprCondCode visit implements the generic Visit. Use this instead of
that one.

As an intended side effect this fixes unwritten memory usages in cases
when a negation of a condition code is used.
2019-12-20 21:36:25 -03:00
ReinUsesLisp
38d3a48873
shader/p2r: Implement P2R Pr
P2R dumps predicate or condition codes state to a register. This is
useful for unit testing.
2019-12-20 18:02:41 -03:00
ReinUsesLisp
cf27b59493
shader/r2p: Refactor P2R to support P2R 2019-12-20 17:55:42 -03:00
bunnei
7be65c6a68
Merge pull request #3234 from ReinUsesLisp/i2f-u8-selector
shader/conversion: Implement byte selector in I2F
2019-12-19 22:36:26 -05:00
bunnei
6d55b14cc0
Merge pull request #3233 from ReinUsesLisp/mismatch-sizes
shader/texture: Properly shrink unused entries in size mismatches
2019-12-19 20:40:27 -05:00
ReinUsesLisp
e41da22c8d
vk_resource_manager: Add entry to VKFence to test its usage 2019-12-19 16:31:34 -03:00
ReinUsesLisp
ec983a2451
vk_reosurce_manager: Add assert for releasing fences
Notify the programmer when a request to release a fence is invalid
because the fence is already free.
2019-12-19 16:31:34 -03:00
ReinUsesLisp
6ddffa010a
vk_resource_manager: Implement VKFenceWatch move constructor
This allows us to put VKFenceWatch inside a std::vector without storing
it in heap. On move we have to signal the fences where the new protected
resource is, adding some overhead.
2019-12-19 16:31:34 -03:00
ReinUsesLisp
54747d60bc
vk_device: Add entry to catch device losses
VK_NV_device_diagnostic_checkpoints allows us to push data to a Vulkan
queue and then query it even after a device loss. This allows us to push
the current pipeline object and see what was the call that killed the
device.
2019-12-19 16:31:33 -03:00
ReinUsesLisp
2a63b3bdb9
vk_shader_decompiler: Fix full decompilation
When full decompilation was enabled, labels were not being inserted and
instructions were misused. Fix these bugs.
2019-12-19 16:24:45 -03:00
ReinUsesLisp
de918ebeb0
vk_shader_decompiler: Skip NDC correction when it is native
Avoid changing gl_Position when the NDC used by the game is [0, 1]
(Vulkan's native).
2019-12-19 16:24:45 -03:00
ReinUsesLisp
485c21eac3
vk_shader_decompiler: Normalize output fragment attachments
Some games write from fragment shaders to an unexistant framebuffer
attachment or they don't write to one when it exists in the framebuffer.
Fix this by skipping writes or adding zeroes.
2019-12-19 16:24:45 -03:00
bunnei
1eb4a95d2b
Merge pull request #3232 from ReinUsesLisp/gl-decompiler-images
gl_shader_decompiler: Add missing DeclareImages
2019-12-19 11:32:47 -05:00
bunnei
253aa52351
Merge pull request #3231 from ReinUsesLisp/tld4s-encoding
shader_bytecode: Fix TLD4S encoding
2019-12-19 11:32:25 -05:00
ReinUsesLisp
f4a25f854c
vk_device: Add query for RGBA8Uint 2019-12-19 02:08:29 -03:00
ReinUsesLisp
abb33d4aec
vk_shader_decompiler: Update sirit and implement Texture AOFFI 2019-12-19 01:42:13 -03:00
bunnei
d53cf05513
Merge pull request #3221 from ReinUsesLisp/vk-scheduler
vk_scheduler: Delegate commands to a worker thread and state track
2019-12-18 22:04:08 -05:00
ReinUsesLisp
da0aa4da6b
gl_rasterizer: Implement RASTERIZE_ENABLE
RASTERIZE_ENABLE is the opposite of GL_RASTERIZER_DISCARD. Implement it
naturally using this.

NVN games expect rasterize to be enabled by default, reflect that in our
initial GPU state.
2019-12-18 19:28:23 -03:00
ReinUsesLisp
ae8d4b6c0c
shader/memory: Implement LDG.U8 and unaligned U8 loads
LDG can load single bytes instead of full integers or packs of integers.
These have the advantage of loading bytes that are not aligned to 4
bytes.

To emulate these this commit gets the byte being referenced (by doing
"address & 3" and then using that to extract the byte from the loaded
integer:

result = bitfieldExtract(loaded_integer, (address % 4) * 8, 8)
2019-12-18 01:21:46 -03:00
ReinUsesLisp
a7d6bd1ef1
shader/conversion: Implement byte selector in I2F
I2F's byte selector is used to choose what bytes to convert to float.
e.g. if the input is 0xaabbccdd and the selector is ".B3" it will
convert 0xaa. The default (when it's not shown in nvdisasm) is ".B0", in
that example the default would convert 0xdd to float.
2019-12-18 00:41:22 -03:00
ReinUsesLisp
15a753b9a5
shader/texture: Properly shrink unused entries in size mismatches
When a image format mismatches we were inserting zeroes to the texture
itself. This was not handling cases were the mismatch uses less
coordinates than the guest shader code. Address that by resizing the
vector.
2019-12-17 23:38:10 -03:00
ReinUsesLisp
e438079b50
gl_shader_decompiler: Add missing DeclareImages 2019-12-17 23:34:15 -03:00
ReinUsesLisp
8b26b4228b
shader_bytecode: Fix TLD4S encoding 2019-12-17 23:32:10 -03:00
ReinUsesLisp
b52297767e
renderer_vulkan/shader: Add helper GLSL shaders
These shaders are used to specify code that is not dynamically generated
in the Vulkan backend. Instead of packing it inside the build system,
it's manually built and copied to the C++ file to avoid adding
unnecessary build time dependencies.

quad_array should be dropped in the future since it can be emulated with
a memory pool generated from the CPU.
2019-12-16 17:59:08 -03:00
bunnei
65b1b05e05
Merge pull request #3182 from ReinUsesLisp/renderer-opengl
renderer_opengl: Miscellaneous clean ups
2019-12-16 13:01:04 -05:00
ReinUsesLisp
e09c1fbc1f
shader/texture: Implement TLD4.PTP 2019-12-16 04:09:24 -03:00
ReinUsesLisp
844e4a297b
shader/texture: Enable arrayed TLD4 2019-12-16 02:37:21 -03:00
ReinUsesLisp
a87c85eba2
gl_shader_decompiler: Rename "sepparate" to "separate" 2019-12-16 02:12:51 -03:00
ReinUsesLisp
3d2c44848b
shader/texture: Implement AOFFI for TLD4S 2019-12-16 02:06:42 -03:00
ReinUsesLisp
3d9fff82c0
shader/texture: Remove unnecesary parenthesis 2019-12-16 01:52:33 -03:00
Rodrigo Locatti
eac075692b
Merge pull request #3219 from FernandoS27/fix-bindless
Corrections and fixes to TLD4S & bindless samplers failing
2019-12-16 01:26:11 -03:00
bunnei
3d51153611
Merge pull request #3222 from ReinUsesLisp/maxwell-to-vk
maxwell_to_vk: Use VK_EXT_index_type_uint8 and misc changes
2019-12-14 22:30:12 -05:00
bunnei
035ec7d9de
Merge pull request #3213 from ReinUsesLisp/intel-mesa
gl_device: Enable compute shaders for Intel Mesa drivers
2019-12-14 16:04:31 -05:00
bunnei
2b650543c6
Merge pull request #3212 from ReinUsesLisp/fix-smem-lmem
gl_shader_cache: Add missing new-line on emitted GLSL
2019-12-13 21:35:29 -05:00
ReinUsesLisp
e3ea583893
maxwell_to_vk: Improve image format table and add more formats
A1B5G5R5 uses A1R5G5B5. This is flipped with image view swizzles;
flushing is still not properly implemented on Vulkan for this particular
format.
2019-12-13 03:12:29 -03:00
ReinUsesLisp
f27b21077d
maxwell_to_vk: Implement more vertex formats 2019-12-13 03:12:28 -03:00
ReinUsesLisp
8db8631d81
maxwell_to_vk: Implement more primitive topologies
Add an extra argument to query device capabilities in the future. The
intention behind this is to use native quads, quad strips, line loops
and polygons if these are released for Vulkan.
2019-12-13 03:12:28 -03:00
ReinUsesLisp
15513f0801
maxwell_to_vk: Approach GL_CLAMP closer to the GL spec
The OpenGL spec defines GL_CLAMP's formula similarly to CLAMP_TO_EDGE
and CLAMP_TO_BORDER depending on the filter mode used. It doesn't
exactly behave like this, but it's the closest we can get with what
Vulkan offers without emulating it by injecting shader code.
2019-12-13 03:12:28 -03:00
ReinUsesLisp
f845df8651
maxwell_to_vk: Use VK_EXT_index_type_uint8 when available 2019-12-13 02:37:23 -03:00
ReinUsesLisp
2df9a2dcaf
vk_scheduler: Delegate commands to a worker thread and state track
Introduce a worker thread approach for delegating Vulkan work derived
from dxvk's approach. https://github.com/doitsujin/dxvk

Now that the scheduler is what handles all Vulkan work related to
command streaming, store state tracking in itself. This way we can know
when to reupload Vulkan dynamic state to the queue (since this one is
invalidated between command buffers unlike NVN). We can also store the
renderpass state and graphics pipeline bound to avoid redundant binds
and renderpass begins/ends.
2019-12-13 02:24:48 -03:00
bunnei
8fc49a83b6
Merge pull request #3217 from jhol/fix-boost-include
Added missing include
2019-12-11 22:21:24 -05:00
Fernando Sahmkow
c0ee0aa1a8 Shader_IR: Correct TLD4S Depth Compare. 2019-12-11 19:53:17 -04:00
Fernando Sahmkow
af89723fa3 Shader_Ir: Correct TLD4S encoding and implement f16 flag. 2019-12-11 19:53:17 -04:00
Fernando Sahmkow
84a158c977 Gl_Shader_compiler: Correct Depth Compare for Texture Gather operations. 2019-12-11 19:53:16 -04:00
Fernando Sahmkow
271a3264f3 Shader_Ir: default failed tracks on bindless samplers to null values. 2019-12-11 19:53:16 -04:00
Fernando Sahmkow
1d2ba3cc97 Gl_Rasterizer: Skip Tesselation Control and Eval stages as they are un implemented.
This commit ensures the OGL backend does not execute tesselation shader 
stages as they are currently unimplemented.
2019-12-11 15:41:26 -04:00
bunnei
1a66cde175
Merge pull request #3210 from ReinUsesLisp/memory-barrier
shader: Implement MEMBAR.GL
2019-12-11 14:24:39 -05:00
Joel Holdsworth
e9faa1617c Added missing include 2019-12-11 18:11:49 +00:00
ReinUsesLisp
f564eaebed
gl_device: Enable compute shaders for Intel Mesa drivers
Previously we naively checked for "Intel" in GL_VENDOR, but this
includes both Intel's proprietary driver and the mesa driver. Re-enable
compute shaders for mesa.
2019-12-11 00:00:30 -03:00
ReinUsesLisp
48e16c4c49
gl_shader_cache: Add missing new-line on emitted GLSL
Add missing new-line. This caused shaders using local memory and shared
memory to inject a preprocessor GLSL line after an expression (resulting
in invalid code).

It looked like this:
shared uint smem[8];#define LOCAL_MEMORY_SIZE 16

It should look like this (addressed by this commit):
shared uint smem[8];
\#define LOCAL_MEMORY_SIZE 16
2019-12-10 23:52:51 -03:00
Fernando Sahmkow
7ffb672f61 Maxwell3D: Implement Depth Mode.
This commit finishes adding depth mode that was reverted before due to
other unresolved issues.
2019-12-10 19:51:46 -04:00
ReinUsesLisp
425a254fa2
shader: Implement MEMBAR.GL
Implement using memoryBarrier in GLSL and OpMemoryBarrier on SPIR-V.
2019-12-10 16:45:03 -03:00
ReinUsesLisp
233ed96a5c
vk_shader_decompiler: Fix build issues on old gcc versions 2019-12-10 01:55:38 -03:00
ReinUsesLisp
d30cf51d7d
vk_shader_decompiler: Reduce YNegate's severity 2019-12-09 23:52:28 -03:00
ReinUsesLisp
0b5b93053d
shader_ir/other: Implement S2R InvocationId 2019-12-09 23:52:28 -03:00
ReinUsesLisp
ecbfa416f0
vk_shader_decompiler: Misc changes
Update Sirit and its usage in vk_shader_decompiler. Highlights:
- Implement tessellation shaders
- Implement geometry shaders
- Implement some missing features
- Use native half float instructions when available.
2019-12-09 23:51:57 -03:00
ReinUsesLisp
9ad6327fbd
shader: Keep track of shaders using warp instructions 2019-12-09 23:40:41 -03:00
ReinUsesLisp
6233b1db08
shader_ir/memory: Implement patch stores 2019-12-09 23:25:21 -03:00
ReinUsesLisp
19ce0d4f1a
vk_device: Misc changes
- Setup more features and requirements.
- Improve logging for missing features.
- Collect telemetry parameters.
- Add queries for more image formats.
- Query push constants limits.
- Optionally enable some extensions.
2019-12-09 01:04:48 -03:00
bunnei
faf5ae6a50
Merge pull request #3198 from ReinUsesLisp/tessellation-maxwell
maxwell_3d: Add tessellation state entries
2019-12-08 22:28:25 -05:00
ReinUsesLisp
7ea362e134
externals: Update Vulkan-Headers 2019-12-08 22:08:19 -03:00
ReinUsesLisp
f632d00eb1
vk_swapchain: Add support for swapping sRGB
We don't know until the game is running if it's using an sRGB color
space or not. Add support for hot-swapping swapchain surface formats.
2019-12-06 22:42:08 -03:00
ReinUsesLisp
36651f215a
maxwell_3d: Add tessellation tess level registers 2019-12-06 22:08:22 -03:00
ReinUsesLisp
707bf41c6f
maxwell_3d: Add tessellation mode register 2019-12-06 22:07:31 -03:00
ReinUsesLisp
d2b50c5ebd
maxwell_3d: Add patch vertices register 2019-12-06 22:06:53 -03:00
ReinUsesLisp
74f515e8b6
shader_bytecode: Remove corrupted character 2019-12-06 20:31:56 -03:00
bunnei
e36814d6d5
Merge pull request #3109 from FernandoS27/new-instr
Implement FLO & TXD Instructions on GPU Shaders
2019-12-06 18:18:16 -05:00
bunnei
3c1b6b5723
Merge pull request #2987 from FernandoS27/texture-invalid
Texture_Cache: Redo invalid Surfaces handling.
2019-12-02 12:07:05 -05:00
bunnei
930b7c18a6
Merge pull request #3184 from ReinUsesLisp/framebuffer-cache
gl_framebuffer_cache: Optimize framebuffer cache management
2019-11-30 18:46:40 -05:00
ReinUsesLisp
ff64c3951a
texture_cache/surface_base: Fix out of bounds texture views
Some texture views were being created out of bounds (with more layers or
mipmaps than what the original texture has). This is because of a
miscalculation in mipmap bounding. end_layer and end_mipmap are out of
bounds (e.g. layer 6 in a cubemap), there's no need to add one more
there.

Fixes OpenGL errors and Vulkan crashes on Splatoon 2.
2019-11-29 16:51:14 -03:00
ReinUsesLisp
fb6cf12a17
gl_framebuffer_cache: Optimize framebuffer key
Pack color attachment enumerations into a single u32. To determine the
number of buffers, the highest color attachment with a shared pointer
that doesn't point to null is used.
2019-11-28 23:02:20 -03:00
ReinUsesLisp
c34da106ed
gl_rasterizer: Re-enable framebuffer cache for clear buffers 2019-11-28 23:02:20 -03:00
ReinUsesLisp
e6a0a30334
renderer_opengl: Make ScreenRectVertex's constructor constexpr 2019-11-28 20:36:02 -03:00
ReinUsesLisp
dee7844443
renderer_opengl: Remove C casts 2019-11-28 20:28:27 -03:00
ReinUsesLisp
3a44faff11
renderer_opengl: Use explicit binding for presentation shaders 2019-11-28 20:25:56 -03:00
ReinUsesLisp
75cc501d52
renderer_opengl: Drop macros for message decorations 2019-11-28 20:15:25 -03:00
ReinUsesLisp
056f049b26
renderer_opengl: Move static definitions to anonymous namespace 2019-11-28 20:14:40 -03:00
ReinUsesLisp
4589582eaf
renderer_opengl: Move commentaries to header file 2019-11-28 20:11:03 -03:00
bunnei
e3ee017e91
Merge pull request #3169 from lioncash/memory
core/memory: Deglobalize memory management code
2019-11-28 11:43:17 -05:00
Rodrigo Locatti
913d0bb269
Merge pull request #3174 from lioncash/optional
video_core/gpu_thread: Tidy up SwapBuffers()
2019-11-27 20:35:31 -03:00
Lioncash
aed6d8bef5 video_core/gpu_thread: Tidy up SwapBuffers()
We can just use std::nullopt and std::make_optional to make this a
little bit less noisy.
2019-11-27 17:46:11 -05:00
Lioncash
9403979c22 video_core/const_buffer_locker: Make use of std::tie in HasEqualKeys()
Tidies it up a little bit visually.
2019-11-27 05:53:43 -05:00
Lioncash
930e311526 video_core/const_buffer_locker: Remove unused includes 2019-11-27 05:51:13 -05:00
Lioncash
9341ca7979 video_core/const_buffer_locker: Remove #pragma once from cpp file
Silences a compiler warning.
2019-11-27 05:50:51 -05:00
Lioncash
849581075a core/memory: Migrate over RasterizerMarkRegionCached() to the Memory class
This is only used within the accelerated rasterizer in two places, so
this is also a very trivial migration.
2019-11-26 21:55:38 -05:00
Lioncash
3f08e8d8d4 core/memory: Migrate over GetPointer()
With all of the interfaces ready for migration, it's trivial to migrate
over GetPointer().
2019-11-26 21:55:38 -05:00
Lioncash
536fc7f0ea core: Prepare various classes for memory read/write migration
Amends a few interfaces to be able to handle the migration over to the
new Memory class by passing the class by reference as a function
parameter where necessary.

Notably, within the filesystem services, this eliminates two ReadBlock()
calls by using the helper functions of HLERequestContext to do that for
us.
2019-11-26 21:55:37 -05:00
bunnei
6df6caaf5f
Merge pull request #3143 from ReinUsesLisp/indexing-bug
gl_device: Deduce indexing bug from device instead of heuristic
2019-11-26 21:53:12 -05:00
ReinUsesLisp
ef4446cb11
gl_shader_decompiler: Fix casts from fp32 to f16
Casts from f32 to f16 zeroes the higher half of the target register.
2019-11-25 22:22:33 -03:00
ReinUsesLisp
410d44ce05
gl_device: Deduce indexing bug from device instead of heuristic
The heuristic to detect AMD's driver was not working properly since it
also included Intel. Instead of using heuristics to detect it, compare
the GL_VENDOR string.
2019-11-25 16:15:22 -03:00
bunnei
2899c93818
Merge pull request #3158 from ReinUsesLisp/srgb-blit
gl_texture_cache: Apply sRGB on blits
2019-11-24 20:47:13 -05:00
bunnei
33a6b45a6c
Merge pull request #3155 from bunnei/fix-asynch-gpu-wait
gpu_thread: Don't spin wait if there are no GPU commands.
2019-11-24 20:19:25 -05:00
bunnei
b03242067d
Merge pull request #3098 from ReinUsesLisp/shader-invalidations
gl_shader_cache: Miscellaneous changes to shaders
2019-11-24 19:36:30 -05:00
ReinUsesLisp
74fff717aa
gl_texture_cache: Apply sRGB on blits
glBlitFramebuffer keeps in mind GL_FRAMEBUFFER_SRGB's state. Enable this
depending on the target surface pixel format.
2019-11-24 18:13:33 -03:00
bunnei
b7031b2b9d
Merge pull request #3105 from ReinUsesLisp/fix-stencil-reg
maxwell_3d: Fix stencil_back_func_mask offset
2019-11-24 13:53:23 -05:00
bunnei
e81e0036b4
Merge pull request #3145 from ReinUsesLisp/buffer-cache-init
buffer_cache: Remove brace initialized for objects with default constructor
2019-11-24 02:55:02 -05:00
bunnei
9ec84fc592 gpu_thread: Don't spin wait if there are no GPU commands. 2019-11-23 15:17:28 -05:00
bunnei
4ed183ee42
Merge pull request #3141 from ReinUsesLisp/gl-position
gl_shader_gen: Apply default value to gl_Position
2019-11-23 13:23:46 -05:00
ReinUsesLisp
dc2e83fa31
gl_device: Reserve base bindings on limited devices
SSBOs and other resources are limited per pipeline on Intel and AMD.
Heuristically reserve resources per stage having in mind the reported
OpenGL limits.
2019-11-22 21:28:50 -03:00
ReinUsesLisp
e3d7334be9
gl_state: Skip null texture binds
glBindTextureUnit doesn't support null textures. Skip binding these.
2019-11-22 21:28:50 -03:00
ReinUsesLisp
919ac2c4d3
gl_rasterizer: Disable compute shaders on Intel
Intel's proprietary driver enters in a corrupt state when compute
shaders are executed. For now, disable these.
2019-11-22 21:28:50 -03:00
ReinUsesLisp
894ad74b87
gl_shader_cache: Hack shared memory size
The current shared memory size seems to be smaller than what the game
actually uses. This makes Nvidia's driver consistently blow up; in the
case of FE3H it made it explode on Qt's SwapBuffers while SDL2 worked
just fine. For now keep this hack since it's still progress over the
previous hardcoded shared memory size.
2019-11-22 21:28:49 -03:00
ReinUsesLisp
e35b9597ef
gl_shader_decompiler: Normalize image bindings 2019-11-22 21:28:49 -03:00
ReinUsesLisp
36d9b409fc
gl_shader_decompiler: Normalize cbuf bindings
Stage and compute shaders were using a different binding counter.
Normalize these.
2019-11-22 21:28:49 -03:00
ReinUsesLisp
f936b86c7c
gl_rasterizer: Add missing cbuf counter reset on compute 2019-11-22 21:28:49 -03:00
ReinUsesLisp
180417c514
gl_shader_cache: Remove dynamic BaseBinding specialization 2019-11-22 21:28:49 -03:00
ReinUsesLisp
c8a48aacc0
video_core: Unify ProgramType and ShaderStage into ShaderType 2019-11-22 21:28:48 -03:00
ReinUsesLisp
0f23359a44
gl_rasterizer: Bind graphics images to draw commands
Images were not being bound to draw invocations because these would
require a cache invalidation.
2019-11-22 21:28:48 -03:00
ReinUsesLisp
287ae2b9e8
gl_shader_cache: Specialize local memory size for compute shaders
Local memory size in compute shaders was stubbed with an arbitary size.
This commit specializes local memory size from guest GPU parameters.
2019-11-22 21:28:48 -03:00
ReinUsesLisp
dbeb523879
gl_shader_cache: Specialize shared memory size
Shared memory was being declared with an undefined size. Specialize from
guest GPU parameters the compute shader's shared memory size.
2019-11-22 21:28:47 -03:00
ReinUsesLisp
4f5d8e4342
gl_shader_cache: Specialize shader workgroup
Drop the usage of ARB_compute_variable_group_size and specialize compute
shaders instead. This permits compute to run on AMD and Intel
proprietary drivers.
2019-11-22 21:28:47 -03:00
ReinUsesLisp
dc9961f341
shader/texture: Handle TLDS texture type mismatches
Some games like "Fire Emblem: Three Houses" bind 2D textures to offsets
used by instructions of 1D textures. To handle the discrepancy this
commit uses the the texture type from the binding and modifies the
emitted code IR to build a valid backend expression.

E.g.: Bound texture is 2D and instruction is 1D, the emitted IR samples
a 2D texture in the coordinate ivec2(X, 0).
2019-11-22 21:28:47 -03:00
ReinUsesLisp
32c1bc6a67
shader/texture: Deduce texture buffers from locker
Instead of specializing shaders to separate texture buffers from 1D
textures, use the locker to deduce them while they are being decoded.
2019-11-22 21:28:47 -03:00
ReinUsesLisp
73aaf365e7
buffer_cache: Remove brace initialized for objects with default constructor 2019-11-20 16:00:40 -03:00
Fernando Sahmkow
cc81c0ce64 Texture_Cache: Redo invalid Surfaces handling.
This commit aims to redo the full setup of invalid textures and
guarantee correct behavior across backends in the case of finding one by
using black dummy textures that match the target of the expected
texture.
2019-11-20 14:59:35 -04:00
ReinUsesLisp
24f4198cee
shader/other: Reduce DEPBAR log severity
While DEPBAR is stubbed it doesn't change anything from our end. Shading
languages handle what this instruction does implicitly. We are not
getting anything out fo this log except noise.
2019-11-19 21:26:40 -03:00
ReinUsesLisp
bc10714dcf
gl_shader_gen: Apply default value to gl_Position
Nvidia has sane default output values for varyings, but the other
vendors don't apply these. To properly emulate this we would have to
analyze the shader header. For the time being, apply the same default
Nvidia applies so we get the same behaviour on non-Nvidia drivers.
2019-11-19 20:32:01 -03:00
bunnei
b0819e2ffb
Merge pull request #3086 from ReinUsesLisp/format-lookups
texture_cache: Use a flat table instead of switch for texture format lookups
2019-11-19 18:29:17 -05:00
Fernando Sahmkow
c8473f399e Shader_IR: Address Feedback 2019-11-18 07:34:34 -04:00
bunnei
a8295d2c53
Merge pull request #3047 from ReinUsesLisp/clip-control
gl_rasterizer: Emulate viewport flipping with ARB_clip_control
2019-11-15 12:09:19 -05:00
ReinUsesLisp
4681381a34
format_lookup_table: Address feedback
format_lookup_table: Drop bitfields

format_lookup_table: Use std::array for definition table

format_lookup_table: Include <limits> instead of <numeric>
2019-11-14 20:57:30 -03:00
ReinUsesLisp
80eacdf89b
texture_cache: Use a table instead of switch for texture formats
Use a large flat array to look up texture formats. This allows us to
properly implement formats with different component types. It should
also be faster.
2019-11-14 20:57:10 -03:00
ReinUsesLisp
48a1687f51
texture_cache: Drop abstracted ComponentType
Abstracted ComponentType was not being used in a meaningful way.
This commit drops its usage.

There is one place where it was being used to test compatibility between
two cached surfaces, but this one is implied in the pixel format.
Removing the component type test doesn't change the behaviour.
2019-11-14 18:21:42 -03:00
greggameplayer
c6bc13d0aa correct the implementation of RGBA16UI 2019-11-14 21:37:39 +01:00
Fernando Sahmkow
cd0f5dfc17 Shader_IR: Implement TXD instruction. 2019-11-14 11:15:27 -04:00
Fernando Sahmkow
f3d1b370aa Shader_IR: Implement FLO instruction. 2019-11-14 11:15:27 -04:00
Fernando Sahmkow
95137a04e1 Shader_Bytecode: Add encodings for FLO, SHF and TXD 2019-11-14 11:15:26 -04:00
Fernando Sahmkow
b6f6733131
Merge pull request #3081 from ReinUsesLisp/fswzadd-shuffles
shader: Implement FSWZADD and reimplement SHFL
2019-11-14 10:27:27 -04:00
ReinUsesLisp
7990220df7
maxwell_3d: Fix stencil_back_func_mask offset
stencil_back_func_mask and stencil_back_mask were misplaced. This commit
addresses that issue.
2019-11-13 16:35:17 -03:00
Rodrigo Locatti
cf770a68a5
Merge pull request #3084 from ReinUsesLisp/cast-warnings
video_core: Treat implicit conversions as errors
2019-11-13 02:16:22 -03:00
Rodrigo Locatti
fb9418798d
video_core: Enable sign conversion warnings
Enable sign conversion warnings but don't treat them as errors.
2019-11-11 18:00:37 -03:00
bunnei
0fc596de6e
Merge pull request #3082 from ReinUsesLisp/fix-lockers
gl_shader_cache: Fix locker constructors
2019-11-09 13:58:36 -05:00
ReinUsesLisp
18c1cb68fd video_core: Treat implicit conversions as errors 2019-11-08 22:49:39 +00:00
ReinUsesLisp
096f339a2a video_core: Silence implicit conversion warnings 2019-11-08 22:48:50 +00:00
bunnei
a056d8de16
Merge pull request #3080 from FernandoS27/glsl-fix
GLSLDecompiler: Correct Texture Gather Offset.
2019-11-08 15:56:29 -05:00
ReinUsesLisp
bfa973a62b
gl_shader_cache: Fix locker constructors
Properly pass engine when a shader is being constructed from memory.
2019-11-07 20:43:31 -03:00
ReinUsesLisp
3ab0514698
gl_shader_cache: Enable extensions only when available
Silence GLSL compilation warnings.
2019-11-07 20:08:42 -03:00
ReinUsesLisp
cd66395944
gl_shader_decompiler: Add safe fallbacks when ARB_shader_ballot is not available 2019-11-07 20:08:42 -03:00
ReinUsesLisp
56e237d1f9
shader_ir/warp: Implement FSWZADD 2019-11-07 20:08:41 -03:00
ReinUsesLisp
08b2b1080a
gl_shader_decompiler: Reimplement shuffles with platform agnostic intrinsics 2019-11-07 20:08:41 -03:00
Fernando Sahmkow
3d7c284e0f GLSLDecompiler: Correct Texture Gather Offset.
This commit corrects the argument ordering in textureGatherOffset.
2019-11-07 11:43:56 -04:00
bunnei
b6ae48966d
Merge pull request #3032 from ReinUsesLisp/simplify-control-flow-brx
shader/control_flow: Abstract repeated code chunks in BRX tracking
2019-11-07 01:30:01 -05:00
Morph
0e8a3bf3e5 buffer_cache: Add missing includes (#3079)
`boost::make_iterator_range` is available when `boost/range/iterator_range.hpp` is included.
Also include `boost/icl/interval_map.hpp` and `boost/icl/interval_set.hpp`.
2019-11-07 06:25:53 +00:00
bunnei
344d15f61e
Merge pull request #3070 from ReinUsesLisp/shader-warnings
shader_ir: Reduce severity of warnings
2019-11-07 00:47:24 -05:00
ReinUsesLisp
e9d2fad984
gl_rasterizer: Remove front facing hack 2019-11-07 01:52:18 -03:00
ReinUsesLisp
f1facaeaef
gl_shader_decompiler: Fix typo "y_negate"->"y_direction" 2019-11-07 01:52:18 -03:00
ReinUsesLisp
e2ea0c3e11
gl_shader_manager: Remove unused variable in SetFromRegs 2019-11-07 01:52:18 -03:00
ReinUsesLisp
f019817f8f
gl_rasterizer: Emulate viewport flipping with ARB_clip_control
Emulates negative y viewports with ARB_clip_control. This allows us to
more easily emulated pipelines with tessellation and/or geometry shader
stages. It also avoids corrupting games with transform feedbacks and
negative viewports (gl_Position.y was being modified).
2019-11-07 01:52:18 -03:00
Rodrigo Locatti
ff5a0f370c
shader/control_flow: Specify constness on caller lambdas
Update src/video_core/shader/control_flow.cpp

Co-Authored-By: Mat M. <mathew1800@gmail.com>

Update src/video_core/shader/control_flow.cpp

Co-Authored-By: Mat M. <mathew1800@gmail.com>

Update src/video_core/shader/control_flow.cpp

Co-Authored-By: Mat M. <mathew1800@gmail.com>

Update src/video_core/shader/control_flow.cpp

Co-Authored-By: Mat M. <mathew1800@gmail.com>

Update src/video_core/shader/control_flow.cpp

Co-Authored-By: Mat M. <mathew1800@gmail.com>

Update src/video_core/shader/control_flow.cpp

Co-Authored-By: Mat M. <mathew1800@gmail.com>
2019-11-07 01:44:09 -03:00
ReinUsesLisp
7b069252f8
shader/control_flow: Use callable template instead of std::function 2019-11-07 01:44:08 -03:00
ReinUsesLisp
46c3047283
shader/control_flow: Abstract repeated code chunks in BRX tracking
Remove copied and pasted for cycles into a common templated function.
2019-11-07 01:44:08 -03:00
ReinUsesLisp
ae7dfa93be
shader/control_flow: Silence Intellisense cast warnings 2019-11-07 01:44:08 -03:00
ReinUsesLisp
deb1b54eed
shader/control_flow: Remove brace initializer in std containers
These containers have a default constructor.
2019-11-07 01:44:08 -03:00
ReinUsesLisp
39c66abd91
shader/decode: Reduce severity of arithmetic rounding warnings 2019-11-07 01:43:38 -03:00
ReinUsesLisp
c4374d0d41
shader/arithmetic: Reduce RRO stub severity 2019-11-07 01:43:38 -03:00
ReinUsesLisp
35d40b74b3
shader/texture: Remove NODEP warnings
These warnings don't offer meaningful information while decoding
shaders. Remove them.
2019-11-07 01:43:38 -03:00
bunnei
468576284d
Merge pull request #3057 from ReinUsesLisp/buffer-sub-data
gl_rasterizer: Upload constant buffers with glNamedBufferSubData
2019-11-06 10:08:55 -05:00
Rodrigo Locatti
654b77d2ec
Merge pull request #3039 from ReinUsesLisp/cleanup-samplers
shader/node: Unpack bindless texture encoding
2019-11-06 04:54:11 +00:00
bunnei
21e07df7b7
Merge pull request #2914 from FernandoS27/fermi-fix
Fermi2D: limit blit area to only available area
2019-11-05 20:45:24 -05:00
bunnei
1bdae0fe29 common_func: Use std::array for INSERT_PADDING_* macros.
- Zero initialization here is useful for determinism.
2019-11-03 22:22:41 -05:00
ReinUsesLisp
442a1cc021
gl_rasterizer: Re-enable stream buffer memory due to global memory
Global memory is still using the stream buffer when it shouldn't. As a
temporary fix re-enable the stream buffer on compute.
2019-11-02 13:19:19 -03:00
ReinUsesLisp
76ca2a5f82
gl_rasterizer: Upload constant buffers with glNamedBufferSubData
Nvidia's OpenGL driver maps gl(Named)BufferSubData with some requirements
to a fast. This path has an extra memcpy but updates the buffer without
orphaning or waiting for previous calls. It can be seen as a better
model for "push constants" that can upload a whole UBO instead of 256
bytes.

This path has some requirements established here:
http://on-demand.gputechconf.com/gtc/2014/presentations/S4379-opengl-44-scene-rendering-techniques.pdf#page=24

Instead of using the stream buffer, this commits moves constant buffers
uploads to calls of glNamedBufferSubData and from my testing it brings a
performance improvement. This is disabled when the vendor is not Nvidia
since it brings performance regressions.
2019-11-02 05:05:34 -03:00
Fernando Sahmkow
23cabc98db Shader_IR: Fix regression on TLD4
Originally on the last commit I thought TLD4 acted the same as TLD4S and 
didn't have a mask. It actually does have a component mask. This commit 
corrects that.
2019-10-30 21:14:57 -04:00
Rodrigo Locatti
658489ebf7
Merge pull request #3050 from FernandoS27/fix-tld4
shader_ir: Fix TLD4 and add bindless variant
2019-10-30 18:37:17 +00:00
Fernando Sahmkow
9293c3a0f2 Shader_IR: Fix TLD4 and add Bindless Variant.
This commit fixes an issue where not all 4 results of tld4 were being
written, the color component was defaulted to red, among other things.
It also implements the bindless variant.
2019-10-30 12:02:03 -04:00
bunnei
2382bbe3ac
Merge pull request #3046 from ReinUsesLisp/clean-gl-state
gl_state: Miscellaneous clean up
2019-10-29 22:50:04 -04:00
bunnei
b5138f3c35
Merge pull request #3035 from ReinUsesLisp/rasterizer-accelerated
rasterizer_accelerated: Add intermediary for GPU rasterizers
2019-10-29 22:06:41 -04:00
Rodrigo Locatti
3d0cde6a75
gl_state: Use std::array::fill instead of std::fill
Co-Authored-By: Mat M. <mathew1800@gmail.com>
2019-10-30 01:30:31 +00:00
ReinUsesLisp
ce20ed8e4e
gl_state: Move dirty checks to individual apply calls instead of Apply
This requires removing constness from some methods, but for consistency
it's removed in all methods.
2019-10-29 21:27:25 -03:00
ReinUsesLisp
3c6557c235
gl_state: Remove ApplyDefaultState
OpenGL has defaults values we can trust. Remove these.
2019-10-29 21:27:25 -03:00
ReinUsesLisp
d3651b0b82
gl_state: Change SetDefaultViewports to use default constructor 2019-10-29 21:27:24 -03:00
ReinUsesLisp
c7698d0bc8
gl_state: Minor style changes 2019-10-29 21:27:24 -03:00
ReinUsesLisp
a14d202ac2
gl_state: Remove unused Citra TextureUnits 2019-10-29 21:27:24 -03:00
ReinUsesLisp
28fece8e9b
gl_state: Move initializers from constructor to class declaration 2019-10-29 21:27:23 -03:00
ReinUsesLisp
a993df1ee2
shader/node: Unpack bindless texture encoding
Bindless textures were using u64 to pack the buffer and offset from
where they come from. Drop this in favor of separated entries in the
struct.

Remove the usage of std::set in favor of std::list (it's not std::vector
to avoid reference invalidations) for samplers and images.
2019-10-29 20:53:48 -03:00
Rodrigo Locatti
2ec5b55ee3
Merge pull request #3004 from ReinUsesLisp/maxwell3d-cleanup
maxwell_3d: Remove unused entries
2019-10-29 23:46:33 +00:00
Rodrigo Locatti
c5d9589942
Merge pull request #3037 from FernandoS27/new-formats
video_core: Implement texture format E5B9G9R9_SHAREDEXP.
2019-10-28 01:36:58 -03:00
ReinUsesLisp
fa31e5b868
maxwell_3d/kepler_compute: Remove unused arguments in GetTexture 2019-10-28 00:23:42 -03:00
ReinUsesLisp
538ddd220e
video_core/textures: Remove unused index entry in FullTextureInfo 2019-10-28 00:14:38 -03:00
ReinUsesLisp
961fe4d19b
maxwell_3d: Remove unused method GetStageTextures 2019-10-28 00:14:29 -03:00
Fernando Sahmkow
3f9262195b Video_Core: Implement texture format E5B9G9R9_SHAREDEXP.
This commit implements the E5B9G9R9 Texture format into the general 
system and OpenGL backend.
2019-10-27 16:44:09 -04:00
bunnei
6909b2f0f9
Merge pull request #3034 from ReinUsesLisp/w4244-maxwell3d
maxwell_3d: Silence implicit conversion warnings
2019-10-27 15:08:59 -04:00
ReinUsesLisp
3e469cecc1
maxwell_3d: Silence implicit conversion warnings
While we are at it, unify types for dirty reg pointers.
2019-10-27 15:22:17 -03:00
ReinUsesLisp
bd2aff3e26
rasterizer_accelerated: Add intermediary for GPU rasterizers
Add an intermediary class that implements common functions across GPU
accelerated rasterizers. This avoids code repetition on different
backends.
2019-10-27 03:40:08 -03:00
ReinUsesLisp
a5aa1bb174
astc: Silence implicit conversion warnings 2019-10-27 03:04:50 -03:00
Rodrigo Locatti
26f3e18c5c
Merge pull request #2976 from FernandoS27/cache-fast-brx-rebased
Implement Fast BRX, fix TXQ and addapt the Shader Cache for it
2019-10-26 16:56:13 -03:00
Fernando Sahmkow
be856a38d6 Shader_IR: Address Feedback. 2019-10-26 15:38:30 -04:00
Rodrigo Locatti
a0d79085c4
Merge pull request #3027 from lioncash/lookup
shader_ir: Use std::array with std::pair instead of std::unordered_map
2019-10-26 05:49:15 -03:00
Rodrigo Locatti
d52598173d
Merge pull request #3013 from FernandoS27/tld4s-fix
Shader_Ir: Fix TLD4S from using a component mask.
2019-10-25 20:06:26 -03:00
Fernando Sahmkow
e3afd6595a Shader_IR: Clang format 2019-10-25 09:01:32 -04:00
ReinUsesLisp
78f3e8a757 gl_shader_cache: Implement locker variants invalidation 2019-10-25 09:01:32 -04:00
ReinUsesLisp
ec85648af3 gl_shader_disk_cache: Store and load fast BRX 2019-10-25 09:01:31 -04:00
ReinUsesLisp
fa2c297f3e const_buffer_locker: Minor style changes 2019-10-25 09:01:31 -04:00
ReinUsesLisp
7b81ba4d8a gl_shader_decompiler: Move entries to a separate function 2019-10-25 09:01:31 -04:00
Fernando Sahmkow
1244f2d368 Shader_IR: Implement Fast BRX and allow multi-branches in the CFG. 2019-10-25 09:01:31 -04:00
Fernando Sahmkow
a05120ec0b Shader_IR: Correct typo in Consistent method. 2019-10-25 09:01:30 -04:00
Fernando Sahmkow
33fcec3502 Shader_IR: allow lookup of texture samplers within the shader_ir for instructions that don't provide it 2019-10-25 09:01:30 -04:00
Fernando Sahmkow
8909f52166 Shader_IR: Implement Fast BRX and allow multi-branches in the CFG. 2019-10-25 09:01:30 -04:00
Fernando Sahmkow
acd6441134 Shader_Cache: setup connection of ConstBufferLocker 2019-10-25 09:01:29 -04:00
Fernando Sahmkow
1a58f45d76 VideoCore: Unify const buffer accessing along engines and provide ConstBufferLocker class to shaders. 2019-10-25 09:01:29 -04:00
Fernando Sahmkow
2ef696c85a Shader_IR: Implement BRX tracking. 2019-10-25 09:01:29 -04:00
Rodrigo Locatti
5062728669
Merge pull request #3028 from lioncash/constexpr
shader_bytecode: Make Matcher constexpr capable
2019-10-24 15:10:40 -03:00
Lioncash
7fdf991097 shader_bytecode: Make Matcher constexpr capable
Greatly shrinks the amount of generated code for GetDecodeTable().

Collapses an assembly output of 9000+ lines down to ~3621 with Clang,
and 6513 down to ~2616 with GCC, given it's now allowed to construct all
the entries as a sequence of constant data.
2019-10-24 01:10:10 -04:00
Lioncash
382717172e shader_ir: Use std::array with pair instead of unordered_map
Given the overall size of the maps are very small, we can use arrays of
pairs here instead of always heap allocating a new map every time the
functions are called. Given the small size of the maps, the difference
in container lookups are negligible, especially given the entries are
already sorted.
2019-10-24 00:25:38 -04:00
Lioncash
1f5401c89c video_core/shader: Resolve instances of variable shadowing
Silences a few -Wshadow warnings.
2019-10-23 23:00:31 -04:00
Fernando Sahmkow
c4a0aa9207
Merge pull request #2995 from ReinUsesLisp/ignore-gmem
shader_ir/memory: Ignore global memory when tracking fails
2019-10-22 13:22:43 -04:00
Fernando Sahmkow
7ecf9f7228
Merge pull request #2983 from lioncash/fallthrough
gl_shader_decompiler/vk_shader_decompiler: Resolve implicit fallthrough cases
2019-10-22 13:16:46 -04:00
Fernando Sahmkow
1509d2ffbd Shader_Ir: Fix TLD4S from using a component mask.
TLD4S always outputs 4 values, the previous code checked a component 
mask and omitted those values that weren't part of it. This commit 
corrects that and makes sure all 4 values are set.
2019-10-22 10:59:07 -04:00
ReinUsesLisp
1ea07954fb shader_ir/memory: Ignore global memory when tracking fails
Ignore global memory operations instead of invoking undefined behaviour
when constant buffer tracking fails and we are blasting through asserts,
ignore the operation.

In the case of LDG this means filling the destination registers with
zeroes; for STG this means ignore the instruction as a whole.

The default behaviour is still to abort execution on failure.
2019-10-22 02:49:17 -03:00
ReinUsesLisp
e3107788e6
maxwell_3d: Reduce FlushMMEInlineDraw logging to Trace 2019-10-20 03:43:17 -03:00
Rodrigo Locatti
dc5eedef71
Merge pull request #2994 from lioncash/fmt
video_core/shader/ast: Minor changes to ASTPrinter
2019-10-18 01:05:25 -03:00
Lioncash
074b38b7a9 video_core/shader/ast: Make ShowCurrentState() and SanityCheck() const member functions
These can also trivially be made const member functions, with the
addition of a few consts.
2019-10-17 20:59:48 -04:00
Lioncash
222f4b45eb video_core/shader/ast: Make ASTManager::Print a const member function
Given all visiting functions never modify the nodes, we can trivially
make this a const member function.
2019-10-17 20:56:39 -04:00
Rodrigo Locatti
fd922ddb01
Merge pull request #2993 from lioncash/vulkan-expr
vk_shader_decompiler: Mark operator() function parameters as const references
2019-10-17 21:46:49 -03:00
Lioncash
7831e86c34 video_core/shader/ast: Make ExprPrinter members private
This member already has an accessor, so there's no need for it to be
public.
2019-10-17 20:39:36 -04:00
Lioncash
a2eccbf075 video_core/shader/ast: Make Indent() return a string_view
The returned string is simply a substring of our constexpr tabs
string_view, so we can just use a string_view here as well, since the
original string_view is guaranteed to always exist.

Now the function is fully non-allocating.
2019-10-17 20:29:00 -04:00
Lioncash
15d177a6ac video_core/shader/ast: Make Indent() private
It's never used outside of this class, so we can narrow its scope down.
2019-10-17 20:26:13 -04:00
Lioncash
7f6a8a33d4 video_core/shader/ast: Rename Ident() to Indent()
This can be confusing, given "ident" is generally used as a shorthand
for "identifier".
2019-10-17 20:26:13 -04:00
Lioncash
081530686c video_core/shader/ast: Make use of fmt where applicable
Makes a few strings nicer to read and also eliminates a bit of string
churn with operator+.
2019-10-17 20:26:10 -04:00
Lioncash
c6bec9aa10 vk_shader_decompiler: Mark operator() function parameters as const references
These parameters aren't actually modified in any way, so they can be
made const references.
2019-10-17 19:44:00 -04:00
Rodrigo Locatti
219fdcb9d9
Merge pull request #2966 from FernandoS27/astc-formats
Implement a series of ASTC formats and R4G4B4A4 format
2019-10-17 19:24:11 -03:00
Rodrigo Locatti
a21b88ef8f
Merge pull request #2979 from lioncash/macro
video_core/macro_interpreter: Make definitions of most private enums/unions hidden
2019-10-17 19:21:09 -03:00
Fernando Sahmkow
c0eb1aecfd Fermi2D: Use a different formula for delimiting blit areas. 2019-10-17 18:21:01 -04:00
Lioncash
125caf5d6e video_core/macro_interpreter: Make definitions of most private enums/unions hidden
This allows the implementation of these types to change without
requiring a rebuild of everything that includes the macro interpreter
header.
2019-10-17 17:55:46 -04:00
bunnei
9fe8072c67
Merge pull request #2980 from lioncash/warn
maxwell_3d: Silence truncation warnings
2019-10-17 14:02:16 -04:00
Fernando Sahmkow
57a46c69f1 Fermi2D: limit blit area to only available area
Normaly OpenGL does not care if the areas exceed the texture regions but
other backends such as Vulkan do care about the limits of this areas.
This PR crops the areas of the blit in order that they don't surpass the
limits of the textures. This should help Vulkan and faulty OpenGL
drivers
2019-10-17 10:38:44 -04:00
Rodrigo Locatti
60c602e4e7
Merge pull request #2978 from lioncash/doxygen
video_core/texture_cache: Amend Doxygen references
2019-10-16 22:09:40 -03:00
Rodrigo Locatti
e00b529a89
Merge pull request #2982 from lioncash/surface
texture_cache: Avoid unnecessary surface copies within PickStrategy() and TryReconstructSurface()
2019-10-16 19:43:32 -03:00
bunnei
ef9b31783d
Merge pull request #2912 from FernandoS27/async-fixes
General fixes to Async GPU
2019-10-16 10:34:48 -04:00
Rodrigo Locatti
60315060b1
Merge pull request #2984 from lioncash/fallthrough2
video_core/surface: Add missing break in PixelFormatFromTextureFormat()
2019-10-15 23:08:34 -03:00
Lioncash
cf9e13c255 video_core/surface: Add missing break in PixelFormatFromTextureFormat()
Prevents fallthrough into the following case.
2019-10-15 21:53:15 -04:00
Rodrigo Locatti
14f3cebcd4
Merge pull request #2981 from lioncash/copy
gl_shader_decompiler: Minor cleanup-related changes
2019-10-15 21:07:25 -03:00
Lioncash
6947bf8e44 vk_shader_decompiler: Resolve fallthrough within ExprDecompiler's ExprCondCode operator()
This would previously result in NeverExecute and UnusedIndex being
treated as regular predicates.
2019-10-15 19:40:58 -04:00
Lioncash
b42a74ff2c gl_shader_decompiler: Resolve fallthrough within ExprDecompiler's ExprCondCode operator()
This would previously result in NeverExecute and UnusedIndex being
treated as regular predicates.
2019-10-15 19:38:55 -04:00
Lioncash
a24e8bf9cf texture_cache: Avoid unnecessary surface copies within PickStrategy() and TryReconstructSurface()
We can take these by const reference and avoid making unnecessary
copies, preventing some atomic reference count increments and
decrements.
2019-10-15 19:31:33 -04:00
Lioncash
77b4916b33 control_flow: Silence truncation warnings
This can be trivially fixed by making the input size a size_t.
CFGRebuildState's constructor parameter is already a std::size_t, so
this just makes the size type fully conform with it.
2019-10-15 19:10:28 -04:00
Lioncash
4f16ce9294 gl_shader_decompiler: Make ExprDecompiler's GetResult() a const member function
This is only ever used to read, but not write, the resulting string, so
we can enforce this by making it a const member function.
2019-10-15 19:02:59 -04:00
Lioncash
67df3f7742 gl_shader_decompiler: Use a std::string_view with GetDeclarationWithSuffix()
This allows the function to be completely non-allocating for inputs of
all sizes (i.e. there's no heap cost for an input to convert to a
std::string_view).
2019-10-15 19:00:48 -04:00
Lioncash
04a1161354 gl_shader_decompiler: Fold flow_var constant into GetFlowVariable()
This is only ever used within this function, so we can narrow it's scope
down.
2019-10-15 18:58:36 -04:00
Lioncash
2f2ab9b5bc gl_shader_decompiler: Mark ASTDecompiler/ExprDecompiler parameters as const references where applicable
These member functions don't actually modify the input parameter, so we
can make this explicit with the use of const.
2019-10-15 18:57:02 -04:00
Lioncash
b8a62adcf1 gl_shader_decompiler: Pass by reference to GenerateTextureArgument()
Avoids an unnecessary atomic reference count increment and decrement.
2019-10-15 18:29:37 -04:00
Lioncash
d1d7ce74d2 gl_shader_decompiler: Use std::holds_alternative within GenerateTexture()
This only ever queries if the type exists within the variant, but
doesn't actually do anything with the return value. We can just use
std::holds_alternative for this use case.
2019-10-15 18:25:48 -04:00
Lioncash
67658dd6e8 shader/node: std::move Meta instance within OperationNode constructor
Allows usages of the constructor to avoid an unnecessary copy.
2019-10-15 18:21:59 -04:00
Lioncash
9760795bfb gl_shader_decompiler: Avoid unnecessary copies of MetaImage
MetaImage contains a std::vector, so copying here could result in
unnecessary reallocations. Given the operation lives throughout the
entire scope, this is safe to do.
2019-10-15 18:14:55 -04:00
Lioncash
c9c75f9587 maxwell_3d: Silence truncation warnings
A trivial warning caused by not using size_t as the argument types
instead of u32.
2019-10-15 17:51:35 -04:00
bunnei
2299950de1
Merge pull request #2972 from lioncash/system
{bcat, gpu, nvflinger}: Remove trivial usages of the global system accessor
2019-10-15 17:49:12 -04:00
Lioncash
b25b94400e video_core/gpu: Remove use of the global system accessor
We can just make use of the reference member variable instead of
accessing the global system instance.
2019-10-15 16:39:30 -04:00
Lioncash
524eb15513 video_core/texture_cache: Amend Doxygen references
Amends the doxygen comments so that they properly resolve. While we're
at it, we can correct some typos and fix up some of the comments'
formatting in order to make them slightly nicer to read.
2019-10-15 15:40:00 -04:00
Lioncash
ac4dbd3b25 common: Rename binary_find.h to algorithm.h
Makes the header more general for other potential algorithms in the
future. While we're at it, include a missing <functional> include to
satisfy the use of std::less.
2019-10-15 15:24:50 -04:00
Fernando Sahmkow
cfc2f30dc4 AsyncGpu: Address Feedback 2019-10-11 13:41:15 -04:00
bunnei
2ba273e49e
Merge pull request #2928 from ReinUsesLisp/dirty-depth-bounds
maxwell_3d: Add dirty flags for depth bounds values
2019-10-09 15:44:30 -04:00
bunnei
6b5e50d20e
Merge pull request #2927 from ReinUsesLisp/polygon-offset-units
gl_rasterizer: Fix polygon offset units
2019-10-09 15:38:52 -04:00
Fernando Sahmkow
f32a49d3d8 Surfaces: Implement R4G4B4A4U format. 2019-10-09 12:57:02 -04:00
Fernando Sahmkow
b9ddb517b1 Surfaces: Implement ASTC 6x6 10x10 12x12 8x6 6x5 2019-10-09 12:44:31 -04:00
ReinUsesLisp
3d0f357307
shader/half_set_predicate: Fix HSETP2 for constant buffers
HSETP2 when used with a constant buffer parses the second operand type
as F32. This is not configurable.
2019-10-07 14:49:47 -03:00
ReinUsesLisp
632c9e4ee3
shader/half_set_predicate: Reduce DEBUG_ASSERT to LOG_DEBUG 2019-10-07 14:48:58 -03:00
ReinUsesLisp
58b597c5ec
gl_shader_disk_cache: Properly ignore existing cache
Previously old entries where appended to the file even if the shader
cache was ignored at boot. Address that issue.
2019-10-06 18:00:20 -03:00
Lioncash
f883cd4f0e video_core/control_flow: Eliminate variable shadowing warnings 2019-10-05 09:14:27 -04:00
Lioncash
25702b6256 video_core/control_flow: Eliminate pessimizing moves
These can inhibit the ability of a compiler to perform RVO.
2019-10-05 09:14:27 -04:00
Lioncash
d82b181d44 video_core/ast: Unindent most of IsFullyDecompiled() by one level 2019-10-05 09:14:27 -04:00
Lioncash
6c41d1cd7e video_core/ast: Make ShowCurrentState() take a string_view instead of std::string
Allows the function to be non-allocating in terms of the output string.
2019-10-05 09:14:27 -04:00
Lioncash
3c54edae24 video_core/ast: Eliminate variable shadowing warnings 2019-10-05 09:14:26 -04:00
Lioncash
5a0a9c7449 video_core/ast: Replace std::string with a constexpr std::string_view
Same behavior, but without the need to heap allocate
2019-10-05 09:14:26 -04:00
Lioncash
3a20d9734f video_core/ast: Default the move constructor and assignment operator
This is behaviorally equivalent and also fixes a bug where some members
weren't being moved over.
2019-10-05 09:14:26 -04:00
Lioncash
43503a69bf video_core/{ast, expr}: Organize forward declaration
Keeps them alphabetically sorted for readability.
2019-10-05 09:14:26 -04:00
Lioncash
50ad745585 video_core/expr: Supply operator!= along with operator==
Provides logical symmetry to the interface.
2019-10-05 09:14:26 -04:00
Lioncash
8eb1398f8d video_core/{ast, expr}: Use std::move where applicable
Avoids unnecessary atomic reference count increments and decrements.
2019-10-05 09:14:23 -04:00
Lioncash
8e0c80f269 video_core/ast: Supply const accessors for data where applicable
Provides const equivalents of data accessors for use within const
contexts.
2019-10-05 08:22:03 -04:00
David
3728bbc22a
Merge pull request #2888 from FernandoS27/decompiler2
Shader_IR: Implement a full control flow decompiler for the shader IR.
2019-10-05 21:52:20 +10:00
ReinUsesLisp
fe7f20e659 maxwell_3d: Add dirty flags for depth bounds values
This is useful in Vulkan where we want to update depth bounds without
caring if it's enabled or disabled through vkCmdSetDepthBounds.
2019-10-05 04:07:47 +00:00
Fernando Sahmkow
538f5880ff GL_Renderer: Remove lefting snippet. 2019-10-04 19:59:55 -04:00
Fernando Sahmkow
9f2719d1a4 Gl_Rasterizer: Protect CPU Memory mapping from multiple threads. 2019-10-04 19:59:53 -04:00
Fernando Sahmkow
3f104464de Core: Wait for GPU to be idle before shutting down. 2019-10-04 19:59:53 -04:00
Fernando Sahmkow
ffc2ce89a0 Nvdrv: Do framelimiting only in the CPU Thread 2019-10-04 19:59:50 -04:00
Fernando Sahmkow
5b5e60ffec GPU_Async: Correct fences, display events and more.
This commit uses guest fences on vSync event instead of an articial fake 
fence we had.
It also corrects to keep signaling display events while loading the game 
as the OS is suppose to send buffers to vSync during that time.
2019-10-04 19:59:48 -04:00
Fernando Sahmkow
ab47a660c8 Texture_Cache: Blit Deduction corrections and simplifications. 2019-10-04 18:53:47 -04:00
Fernando Sahmkow
2036504a82 TextureCache: Add the ability to deduce if two textures are depth on blit. 2019-10-04 18:53:46 -04:00
Fernando Sahmkow
e6eae4b815 Shader_ir: Address feedback 2019-10-04 18:52:57 -04:00
Fernando Sahmkow
3c09d9abe6 Shader_Ir: Address Feedback and clang format. 2019-10-04 18:52:57 -04:00
Fernando Sahmkow
507a9c6a40 vk_shader_decompiler: Correct Branches inside conditionals. 2019-10-04 18:52:56 -04:00
Fernando Sahmkow
000ad558dd vk_shader_decompiler: Clean code and be const correct. 2019-10-04 18:52:55 -04:00
Fernando Sahmkow
7c756baa77 Shader_IR: clean up AST handling and add documentation. 2019-10-04 18:52:55 -04:00
Fernando Sahmkow
5ea740beb5 Shader_IR: Correct OutwardMoves for Ifs 2019-10-04 18:52:54 -04:00
Fernando Sahmkow
100a4bd988 vk_shader_compiler: Don't enclose branches with if(true) to avoid crashing AMD 2019-10-04 18:52:54 -04:00
Fernando Sahmkow
189a50bc2a gl_shader_decompiler: Refactor and address feedback. 2019-10-04 18:52:53 -04:00
Fernando Sahmkow
b3c46d6948 Shader_IR: corrections and clang-format 2019-10-04 18:52:53 -04:00
Fernando Sahmkow
466cd52ad4 vk_shader_compiler: Correct SPIR-V AST Decompiling 2019-10-04 18:52:52 -04:00
Fernando Sahmkow
2e9a810423 Shader_IR: allow else derivation to be optional. 2019-10-04 18:52:52 -04:00
Fernando Sahmkow
ca9901867e vk_shader_compiler: Implement the decompiler in SPIR-V 2019-10-04 18:52:51 -04:00
Fernando Sahmkow
0366c18d87 Shader_IR: mark labels as unused for partial decompile. 2019-10-04 18:52:51 -04:00
Fernando Sahmkow
47e4f6a52c Shader_Ir: Refactor Decompilation process and allow multiple decompilation modes. 2019-10-04 18:52:50 -04:00
Fernando Sahmkow
38fc995f6c gl_shader_decompiler: Implement AST decompiling 2019-10-04 18:52:50 -04:00
Fernando Sahmkow
6fdd501113 shader_ir: Declare Manager and pass it to appropiate programs. 2019-10-04 18:52:49 -04:00
Fernando Sahmkow
8be6e1c522 shader_ir: Corrections to outward movements and misc stuffs 2019-10-04 18:52:48 -04:00
Fernando Sahmkow
4fde66e609 shader_ir: Add basic goto elimination 2019-10-04 18:52:48 -04:00
Fernando Sahmkow
c17953978b shader_ir: Initial Decompile Setup 2019-10-04 18:52:47 -04:00
ReinUsesLisp
69c806feb6
gl_rasterizer: Fix polygon offset units
For some reason hardware divides polygon offset units by two. This is
visible since drivers multiply the application requested polygon offset
by two.
2019-10-01 02:00:23 -03:00
ReinUsesLisp
f926230ab1
gl_shader_decompiler: Add tailing return for HUnpack2 2019-09-24 01:03:59 -03:00
ReinUsesLisp
25bfaffdff
gl_shader_decompiler: Fix clang build issues 2019-09-24 01:03:27 -03:00
bunnei
376f1a4432
Merge pull request #2869 from ReinUsesLisp/suld
shader/image: Implement SULD and fix SUATOM
2019-09-23 21:47:03 -04:00
David
9d69206cd0
Merge pull request #2870 from FernandoS27/multi-draw
Implement a MME Draw commands Inliner and correct host instance drawing
2019-09-22 23:13:02 +10:00
Fernando Sahmkow
822ca65d69
Merge pull request #2891 from FearlessTobi/rod-tex
video_core: Implement RGBX16F and lower Surface Copy log severity
2019-09-22 09:11:28 -04:00
David
3bfba23362
Merge pull request #2867 from ReinUsesLisp/configure-framebuffers-clean
gl_rasterizer: Remove unused code paths from ConfigureFramebuffers
2019-09-22 23:10:07 +10:00
Fernando Sahmkow
68f5aff64f Maxwell3D: Corrections and refactors to MME instance refactor 2019-09-22 07:23:13 -04:00
FearlessTobi
01fc969a5f Fix clang-format 2019-09-22 02:21:56 +02:00
FearlessTobi
366e900376 fermi_2d: Lower surface copy log severity to DEBUG 2019-09-22 02:18:57 +02:00
FearlessTobi
55d272efe6 video_core: Implement RGBX16F PixelFormat 2019-09-22 02:16:44 +02:00
Rodrigo Locatti
9286976948
Merge pull request #2878 from FernandoS27/icmp
shader_ir: Implement ICMP
2019-09-21 18:06:07 -03:00
ReinUsesLisp
44000971e2
gl_shader_decompiler: Use uint for images and fix SUATOM
In the process remove implementation of SUATOM.MIN and SUATOM.MAX as
these require a distinction between U32 and S32. These have to be
implemented with imageCompSwap loop.
2019-09-21 17:33:52 -03:00
ReinUsesLisp
675f23aedc
shader/image: Implement SULD and remove irrelevant code
* Implement SULD as float.
* Remove conditional declaration of GL_ARB_shader_viewport_layer_array.
2019-09-21 17:32:48 -03:00
ReinUsesLisp
4de0f1e1c8
shader_bytecode: Add SULD encoding 2019-09-21 17:31:46 -03:00
Fernando Sahmkow
527b841c15 Shader_IR: ICMP corrections and fixes 2019-09-21 14:28:03 -04:00
David
9ad42fb0cf
Merge pull request #2868 from ReinUsesLisp/fix-mipmaps
maxwell_to_gl: Fix mipmap filtering
2019-09-21 19:57:09 +10:00
David Marcec
01a4afee42 Mark DrawArrays as LOG_TRACE
There's no reason to clog logs with DrawArray.
2019-09-21 15:43:58 +10:00
bunnei
bbe82d62b0
Merge pull request #2846 from ReinUsesLisp/fixup-viewport-index
gl_shader_decompiler: Avoid writing output attribute when unimplemented
2019-09-20 17:11:20 -04:00
bunnei
88d857499b
Merge pull request #2855 from ReinUsesLisp/shfl
shader_ir/warp: Implement SHFL for Nvidia devices
2019-09-20 17:10:42 -04:00
Fernando Sahmkow
433e764bb0 Rasterizer: Correct introduced bug where a conditional render wouldn't stop a draw call from executing 2019-09-20 15:44:28 -04:00
Fernando Sahmkow
4b81d19a1a Shader_IR: Implement ICMP. 2019-09-19 20:56:29 -04:00
Fernando Sahmkow
7761e44d18 Rasterizer: Refactor and simplify DrawBatch Interface. 2019-09-19 11:41:33 -04:00
Fernando Sahmkow
d2ea592ddb Rasterizer: Address Feedback and conscerns. 2019-09-19 11:41:32 -04:00
Fernando Sahmkow
c17655ce74 Rasterizer: Refactor draw calls, remove deadcode and clean up. 2019-09-19 11:41:31 -04:00
Fernando Sahmkow
7606da5611 VideoCore: Corrections to the MME Inliner and removal of hacky instance management. 2019-09-19 11:41:29 -04:00
Fernando Sahmkow
ba02d564f8 Video Core: initial Implementation of InstanceDraw Packaging 2019-09-19 11:41:27 -04:00
bunnei
b31880dc5e
Merge pull request #2784 from ReinUsesLisp/smem
shader_ir: Implement shared memory
2019-09-18 16:26:05 -04:00
ReinUsesLisp
0526bf1895 shader_ir/warp: Implement SHFL 2019-09-17 17:44:07 -03:00
ReinUsesLisp
2dd6411753 maxwell_to_gl: Fix mipmap filtering
OpenGL texture filters follow GL_<texture_filter>_MIPMAP_<mipmap_filter>
but we were using them in the opposite way.
2019-09-17 03:32:24 -03:00
ReinUsesLisp
af809b491e gl_rasterizer: Remove unused code paths from ConfigureFramebuffers 2019-09-17 02:50:42 -03:00
Fernando Sahmkow
393cc3ef2f
Merge pull request #2851 from ReinUsesLisp/srgb
renderer_opengl: Fix sRGB blits
2019-09-15 10:38:10 -04:00
Fernando Sahmkow
b8b1747704
Merge pull request #2824 from ReinUsesLisp/mme
Revert "Revert #2466" and stub FirmwareCall 4
2019-09-15 06:17:04 -04:00
Rodrigo Locatti
193bfefce4
maxwell_3d: Update firmware 4 call stub commentary 2019-09-14 22:51:18 -03:00
Fernando Sahmkow
daae327e86
Merge pull request #2857 from ReinUsesLisp/surface-srgb
video_core/surface: Add function to detect sRGB surfaces
2019-09-14 03:53:21 -04:00
Fernando Sahmkow
18fac59050
Merge pull request #2858 from ReinUsesLisp/vk-device
vk_device: Add miscellaneous features and minor style changes
2019-09-14 03:52:06 -04:00
ReinUsesLisp
01d96e1136 vk_device: Add miscellaneous features and minor style changes
* Increase minimum Vulkan requirements
* Require VK_EXT_vertex_attribute_divisor
* Require depthClamp, samplerAnisotropy and largePoints features
* Search and expose VK_KHR_uniform_buffer_standard_layout
* Search and expose VK_EXT_index_type_uint8
* Search and expose native float16 arithmetics
* Track current driver with VK_KHR_driver_properties
* Query and expose SSBO alignment
* Query more image formats
* Improve logging overall
* Minor style changes
* Minor rephrasing of commentaries
2019-09-13 02:10:07 -03:00
ReinUsesLisp
99e23bd0fd video_core/surface: Add function to detect sRGB surfaces
This is required for proper conversion to RGBA8_UNORM or RGBA8_SRGB
surfaces when a backend can target both native and converted ASTC.
2019-09-13 00:27:04 -03:00
ReinUsesLisp
6b997c8f7f renderer_opengl: Fix rebase mistake 2019-09-11 00:09:37 -03:00
ReinUsesLisp
36abf67e79 shader/image: Implement SUATOM and fix SUST 2019-09-10 20:22:31 -03:00
Fernando Sahmkow
e60d281a01 gl_rasterizer: Correct sRGB Fix regression 2019-09-10 19:31:42 -03:00
ReinUsesLisp
78574746bd renderer_opengl: Fix sRGB blits
Removes the sRGB hack of tracking if a frame used an sRGB rendertarget
to apply at least once to blit the final texture as sRGB. Instead of
doing this apply sRGB if the presented image has sRGB.

Also enable sRGB by default on Maxwell3D registers as some games seem to
assume this.
2019-09-10 19:31:42 -03:00
bunnei
34b2c60f95
Merge pull request #2823 from ReinUsesLisp/shr-clamp
shader/shift: Implement SHR wrapped and clamped variants
2019-09-10 11:56:17 -04:00
bunnei
c7ec7bc1f5
Merge pull request #2810 from ReinUsesLisp/mme-opt
maxwell_3d: Avoid moving macro_params
2019-09-10 11:55:45 -04:00
ReinUsesLisp
17a9b0178d gl_shader_decompiler: Avoid writing output attribute when unimplemented 2019-09-06 15:02:12 -03:00
ReinUsesLisp
1f43e5296f gl_shader_decompiler: Keep track of written images and mark them as modified 2019-09-05 23:26:05 -03:00
ReinUsesLisp
7228e22098 texture_cache: Minor changes 2019-09-05 23:25:15 -03:00
ReinUsesLisp
322d0200c8 gl_rasterizer: Apply textures and images state 2019-09-05 20:35:51 -03:00
ReinUsesLisp
80ec2feee8 gl_rasterizer: Add samplers to compute dispatches 2019-09-05 20:35:51 -03:00
ReinUsesLisp
954fc02fdd gl_rasterizer: Minor code changes 2019-09-05 20:35:51 -03:00
ReinUsesLisp
04cdecb7a1 gl_state: Split textures and samplers into two arrays 2019-09-05 20:35:51 -03:00
ReinUsesLisp
6170337001 gl_rasterizer: Implement image bindings 2019-09-05 20:35:51 -03:00
ReinUsesLisp
5edf24b510 gl_state: Add support for glBindImageTextures 2019-09-05 20:35:51 -03:00
ReinUsesLisp
2424eefad2 texture_cache: Pass TIC to texture cache 2019-09-05 20:35:51 -03:00
ReinUsesLisp
3a450c1395 kepler_compute: Implement texture queries 2019-09-05 20:35:51 -03:00
ReinUsesLisp
2e5b5c2358 gl_rasterizer: Split SetupTextures 2019-09-05 20:35:51 -03:00
Fernando Sahmkow
4ee9949639
Merge pull request #2804 from ReinUsesLisp/remove-gs-special
gl_shader_cache: Remove special casing for geometry shaders
2019-09-05 16:03:46 -04:00
bunnei
03badbdd9b
Merge pull request #2833 from ReinUsesLisp/fix-stencil
gl_rasterizer: Fix stencil testing
2019-09-05 15:27:31 -04:00
ReinUsesLisp
0f7b813d65 gl_shader_decompiler: Implement shared memory 2019-09-05 01:40:24 -03:00
ReinUsesLisp
4de04eba39 shader_ir: Implement LD_S
Loads from shared memory.
2019-09-05 01:38:37 -03:00
ReinUsesLisp
f17415d431 shader_ir: Implement ST_S
This instruction writes to a memory buffer shared with threads within
the same work group. It is known as "shared" memory in GLSL.
2019-09-05 01:38:37 -03:00
David
d34fa7c4fa
Merge pull request #2802 from ReinUsesLisp/hsetp2-pred
half_set_predicate: Fix HSETP2 predicate assignments
2019-09-05 12:26:39 +10:00
ReinUsesLisp
6177cbdbe1 gl_shader_decompiler: Fixup slow path 2019-09-04 15:03:51 -03:00
ReinUsesLisp
7bbc98cfc3 gl_rasterizer: Fix stencil testing
* Fix stencil dirty flags tracking when stencil is disabled
* Attach stencil on clears (previously it only attached depth)
* Attach stencil on drawing regardless of stencil testing being enabled
2019-09-04 01:59:09 -03:00
ReinUsesLisp
5f309b88db Revert "Revert #2466" and stub FirmwareCall 4 2019-09-04 01:55:45 -03:00
ReinUsesLisp
77ef4fa907 shader/shift: Implement SHR wrapped and clamped variants
Nvidia defaults to wrapped shifts, but this is undefined behaviour on
OpenGL's spec. Explicitly mask/clamp according to what the guest shader
requires.
2019-09-04 01:55:24 -03:00
ReinUsesLisp
701dedcfad maxwell_3d: Avoid moving macro_params 2019-09-04 01:55:01 -03:00
ReinUsesLisp
42e1bb6d46 gl_shader_cache: Remove special casing for geometry shaders
Now that ProgramVariants holds the primitive topology we no longer need
to keep track of individual geometry shaders topologies.
2019-09-04 01:54:43 -03:00
ReinUsesLisp
dfae2d141a half_set_predicate: Fix predicate assignments 2019-09-04 01:54:23 -03:00
ReinUsesLisp
9cf52d027d gl_device: Disable precise in fragment shaders on bugged drivers 2019-09-04 01:54:00 -03:00
ReinUsesLisp
03276e7490 gl_shader_decompiler: Fixup AMD's slow path type 2019-09-04 01:54:00 -03:00
ReinUsesLisp
6c449793b8 gl_shader_decompiler: Rework GLSL decompiler type system
GLSL decompiler type system was broken. We converted all return values
to float except for some cases where returning we couldn't and
implicitly broke the rule of returning floats (e.g. for bools or bool
pairs).

Instead of doing this introduce class Expression that knows what type a
return value has and when a consumer wants to use the string it asks for
it with a required type, emitting a runtime error if types are
incompatible.

This has the disadvantage that there's more C++ code, but we can emit
better GLSL code that's easier to read.
2019-09-04 01:54:00 -03:00
bunnei
19af91434e
Merge pull request #2793 from ReinUsesLisp/bgr565
renderer_opengl: Implement RGB565 framebuffer format
2019-09-03 22:36:32 -04:00
bunnei
81fbc5370d
Merge pull request #2812 from ReinUsesLisp/f2i-selector
shader_ir/conversion: Implement F2I and F2F F16 selector
2019-09-03 22:35:33 -04:00
bunnei
d4f33b822b
Merge pull request #2811 from ReinUsesLisp/fsetp-fix
float_set_predicate: Add missing negation bit for the second operand
2019-09-03 22:34:34 -04:00
bunnei
137d165672
Merge pull request #2826 from ReinUsesLisp/macro-binding
maxwell_3d: Fix macro binding cursor
2019-09-03 22:32:42 -04:00
bunnei
50b5bb44a0
Merge pull request #2765 from FernandoS27/dma-fix
MaxwellDMA: Fixes, corrections and relaxations.
2019-09-01 13:13:05 -04:00
ReinUsesLisp
52a41f482f maxwell_3d: Fix macro binding cursor 2019-09-01 05:01:11 -03:00
Rodrigo Locatti
4d4f9cc104 video_core: Silent miscellaneous warnings (#2820)
* texture_cache/surface_params: Remove unused local variable

* rasterizer_interface: Add missing documentation commentary

* maxwell_dma: Remove unused rasterizer reference

* video_core/gpu: Sort member declaration order to silent -Wreorder warning

* fermi_2d: Remove unused MemoryManager reference

* video_core: Silent unused variable warnings

* buffer_cache: Silent -Wreorder warnings

* kepler_memory: Remove unused MemoryManager reference

* gl_texture_cache: Add missing override

* buffer_cache: Add missing include

* shader/decode: Remove unused variables
2019-08-30 14:08:00 -04:00
ReinUsesLisp
878adee0a3 gl_buffer_cache: Add missing include
RasterizerInterface was considered an incomplete object by clang.
2019-08-29 22:02:52 +00:00
bunnei
a67c4e6e02
Merge pull request #2742 from ReinUsesLisp/fix-texture-buffers
gl_texture_cache: Miscellaneous texture buffer fixes
2019-08-29 15:59:17 -04:00
bunnei
e424615839
Merge pull request #2783 from FernandoS27/new-buffer-cache
Implement a New LLE Buffer Cache
2019-08-29 13:07:01 -04:00
bunnei
f8cc5668f8
Merge pull request #2758 from ReinUsesLisp/packed-tid
shader/decode: Implement S2R Tic
2019-08-29 12:58:43 -04:00
ReinUsesLisp
e3534700d7 shader_ir/conversion: Split int and float selector and implement F2F H1 2019-08-28 16:09:33 -03:00
ReinUsesLisp
b13fbc25b8 shader_ir/conversion: Implement F2I F16 Ra.H1 2019-08-27 23:40:40 -03:00
ReinUsesLisp
6207751b00 float_set_predicate: Add missing negation bit for the second operand 2019-08-27 21:57:43 -03:00
ReinUsesLisp
4e35177e23 shader_ir: Implement VOTE
Implement VOTE using Nvidia's intrinsics. Documentation about these can
be found here
https://developer.nvidia.com/reading-between-threads-shader-intrinsics

Instead of using portable ARB instructions I opted to use Nvidia
intrinsics because these are the closest we have to how Tegra X1
hardware renders.

To stub VOTE on non-Nvidia drivers (including nouveau) this commit
simulates a GPU with a warp size of one, returning what is meaningful
for the instruction being emulated:

* anyThreadNV(value) -> value
* allThreadsNV(value) -> value
* allThreadsEqualNV(value) -> true

ballotARB, also known as "uint64_t(activeThreadsNV())", emits

VOTE.ANY Rd, PT, PT;

on nouveau's compiler. This doesn't match exactly to Nvidia's code

VOTE.ALL Rd, PT, PT;

Which is emulated with activeThreadsNV() by this commit. In theory this
shouldn't really matter since .ANY, .ALL and .EQ affect the predicates
(set to PT on those cases) and not the registers.
2019-08-21 14:50:38 -03:00
Fernando Sahmkow
83ec2091c1 Buffer Cache: Adress Feedback. 2019-08-21 12:14:27 -04:00
Fernando Sahmkow
6ce2c85047 Buffer_Cache: Implement flushing. 2019-08-21 12:14:26 -04:00
Fernando Sahmkow
de8ff8a1c6 Buffer_Cache: Implement barriers. 2019-08-21 12:14:25 -04:00
Fernando Sahmkow
286f4c446a Buffer_Cache: Optimize and track written areas. 2019-08-21 12:14:25 -04:00
Fernando Sahmkow
5f4b746a1e BufferCache: Rework mapping caching. 2019-08-21 12:14:24 -04:00
Fernando Sahmkow
86d8563314 Buffer_Cache: Fixes and optimizations. 2019-08-21 12:14:23 -04:00
Fernando Sahmkow
862bec001b Video_Core: Implement a new Buffer Cache 2019-08-21 12:14:22 -04:00
bunnei
d654b3d82e
Merge pull request #2769 from FernandoS27/commands-flush
GPU: Flush commands on every dma pusher step.
2019-08-21 10:29:56 -04:00
bunnei
dfdd20142e
Merge pull request #2777 from ReinUsesLisp/hsetp2-fe3h-fix
half_set_predicate: Fix HSETP2_C constant buffer offset
2019-08-21 10:29:17 -04:00
bunnei
cedc1aab4a
Merge pull request #2753 from FernandoS27/float-convert
Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.
2019-08-21 10:27:57 -04:00
ReinUsesLisp
80702aa88f renderer_opengl: Implement RGB565 framebuffer format 2019-08-21 02:28:31 -03:00
ReinUsesLisp
9cdf5c6c31 renderer_opengl: Use block linear swizzling for CPU framebuffers 2019-08-21 02:17:14 -03:00
ReinUsesLisp
8ad7268c75 renderer_opengl: Use VideoCore pixel format 2019-08-21 02:16:40 -03:00
ReinUsesLisp
9a76e94b3d gpu: Change optional<reference_wrapper<T>> to T* for FramebufferConfig 2019-08-21 01:55:25 -03:00
bunnei
ca61e298b3
Merge pull request #2778 from ReinUsesLisp/nop
shader_ir: Implement NOP
2019-08-18 08:51:34 -04:00
bunnei
87bbefe55f
Merge pull request #2768 from ReinUsesLisp/hsetp2-fix
decode/half_set_predicate: Fix predicates
2019-08-18 08:50:54 -04:00
ReinUsesLisp
2ff8044806 shader_ir: Implement NOP 2019-08-04 03:02:55 -03:00
ReinUsesLisp
ec0da3ef64 half_set_predicate: Fix HSETP2_C constant buffer offset 2019-08-04 02:50:55 -03:00
Fernando Sahmkow
e52c895559 GPU: Flush commands on every dma pusher step.
This commit ensures that the host gpu is constantly fed with commands to
work with, while the guest gpu keeps producing the rest of the commands.
This reduces syncing time between host and guest gpu.
2019-07-26 16:54:22 -04:00
bunnei
52f54c728d
Merge pull request #2592 from FernandoS27/sync1
Implement GPU Synchronization Mechanisms & Correct NVFlinger
2019-07-26 14:26:44 -04:00
ReinUsesLisp
77f1a676a1 decode/half_set_predicate: Fix predicates 2019-07-26 00:12:38 -03:00
Fernando Sahmkow
a452ff983d MaxwellDMA: Fixes, corrections and relaxations.
This commit fixes offsets on Linear -> Tiled copies, corrects z pos
fortiled->linear copies, corrects bytes_per_pixel calculation in tiled
-> linear copies and relaxes some limitations set by latest dma fixes
refactors.
2019-07-25 20:41:42 -04:00
bunnei
b0ff3179ef
Merge pull request #2739 from lioncash/cflow
video_core/control_flow: Minor changes/warning cleanup
2019-07-25 13:04:56 -04:00
bunnei
4d26550f5f
Merge pull request #2737 from FernandoS27/track-fix
Shader_Ir: Correct tracking to track from right to left
2019-07-25 12:41:52 -04:00
bunnei
31e8a61527
Merge pull request #2743 from FernandoS27/surpress-assert
Downgrade and suppress a series of GPU asserts and debug messages.
2019-07-25 12:34:36 -04:00
bunnei
9be9600bdc
Merge pull request #2704 from FernandoS27/conditional
maxwell3d: Implement Conditional Rendering
2019-07-24 17:07:57 -04:00
ReinUsesLisp
104641db07 shader/decode: Implement S2R Tic 2019-07-22 16:16:10 -03:00
bunnei
f601f25bcc
Merge pull request #2734 from ReinUsesLisp/compute-shaders
gl_rasterizer: Implement compute shaders
2019-07-22 11:12:55 -04:00
bunnei
27e10e0442
Merge pull request #2735 from FernandoS27/pipeline-rework
Rework Dirty Flags in GPU Pipeline, Optimize CBData and Redo Clearing mechanism
2019-07-21 00:59:52 -04:00
Fernando Sahmkow
11f4e739bd Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.
This commit takes care of implementing the F16 Variants of the 
conversion instructions and makes sure conversions are done.
2019-07-20 17:38:25 -04:00
Fernando Sahmkow
7a35178ee2 Maxwell3D: Reorganize and address feedback 2019-07-20 10:18:35 -04:00
Fernando Sahmkow
1158777737 Shader_Ir: Change Debug Asserts for Log Warnings 2019-07-19 22:15:34 -04:00
ReinUsesLisp
45c162444d shader/half_set_predicate: Fix HSETP2 implementation 2019-07-19 22:21:22 -03:00
ReinUsesLisp
6c4985edc9 shader/half_set_predicate: Implement missing HSETP2 variants 2019-07-19 22:20:47 -03:00
Lioncash
c1c89411da video_core/control_flow: Provide operator!= for types with operator==
Provides operational symmetry for the respective structures.
2019-07-18 21:03:31 -04:00
Lioncash
1780e0e3d0 video_core/control_flow: Prevent sign conversion in TryGetBlock()
The return value is a u32, not an s32, so this would result in an
implicit signedness conversion.
2019-07-18 21:03:31 -04:00
Lioncash
a162a844d2 video_core/control_flow: Remove unnecessary BlockStack copy constructor
This is the default behavior of the copy constructor, so it doesn't need
to be specified.

While we're at it we can make the other non-default constructor
explicit.
2019-07-18 21:03:30 -04:00
Lioncash
56bc11d952 video_core/control_flow: Use std::move where applicable
Results in less work being done where avoidable.
2019-07-18 21:03:30 -04:00
Lioncash
e7b39f47f8 video_core/control_flow: Use the prefix variant of operator++ for iterators
Same thing, but potentially allows a standard library implementation to
pick a more efficient codepath.
2019-07-18 21:03:30 -04:00
Lioncash
6885e7e7ec video_core/control_flow: Use empty() member function for checking emptiness
It's what it's there for.
2019-07-18 21:03:30 -04:00
Lioncash
45fa12a05c video_core: Resolve -Wreorder warnings
Ensures that the constructor members are always initialized in the order
that they're declared in.
2019-07-18 21:03:30 -04:00
Lioncash
47df844338 video_core/control_flow: Make program_size for ScanFlow() a std::size_t
Prevents a truncation warning from occurring with MSVC. Also the
internal data structures already treat it as a size_t, so this is just a
discrepancy in the interface.
2019-07-18 21:03:29 -04:00
Lioncash
3df9558593 video_core/control_flow: Place all internally linked types/functions within an anonymous namespace
Previously, quite a few functions were being linked with external
linkage.
2019-07-18 21:03:29 -04:00
Lioncash
1109db86b7 video_core/shader/decode: Prevent sign-conversion warnings
Makes it explicit that the conversions here are intentional.
2019-07-18 21:03:29 -04:00
bunnei
63bda67a34
Merge pull request #2738 from lioncash/shader-ir
shader-ir: Minor cleanup-related changes
2019-07-18 13:52:01 -04:00
Fernando Sahmkow
5a06e33859 Shader_Ir: correct clang format 2019-07-18 10:09:26 -04:00
Fernando Sahmkow
43f57d668c GPU: Add missing puller methods.
This adds some missing puller methods. We don't assert them as these are 
nop operations for us.
2019-07-18 08:54:42 -04:00
Fernando Sahmkow
3a3fee5abf MaxwellDMA/KeplerCopy: Downgrade DMA log message to Trace.
This log was just to know which games used DMA. It's no longer 
important.
2019-07-18 08:31:38 -04:00
Fernando Sahmkow
d3b71ff80d Gl_Texture_Cache: Remove assert on component type in GetFormatTuple
Textures can have different components types in different orders. This 
assert was completely inprecise and the effectiveness of such is better 
handled by case and within the texture cache.
2019-07-18 08:20:31 -04:00
Fernando Sahmkow
0b65e9335e Shader_Ir: Downgrade precision and rounding asserts to debug asserts.
This commit reduces the sevirity of asserts for FP precision and 
rounding as this are well known and have little to no consequences in 
gpu's accuracy.
2019-07-18 08:17:19 -04:00
ReinUsesLisp
74632c76ce gl_shader_decompiler: Rename bufferImage to imageBuffer
The online OpenGL documentation is wrong. The type definition is
imageBuffer.
2019-07-18 01:16:44 -03:00
ReinUsesLisp
87909d327f gl_shader_cache: Fix newline on buffer preprocessor definitions 2019-07-18 01:16:15 -03:00
ReinUsesLisp
e7bdf8b22a textures: Fix texture buffer size calculation 2019-07-18 01:07:08 -03:00
ReinUsesLisp
84027f4808 gl_texture_cache: Do not set texture parameters to buffers 2019-07-18 01:06:26 -03:00
ReinUsesLisp
73b2dc6d4f gl_texture_cache: Add missing break in CreateTexture 2019-07-18 01:04:18 -03:00
Fernando Sahmkow
4be61013a1 GL_State: Feedback and fixes 2019-07-17 17:29:56 -04:00
Fernando Sahmkow
5ad889f6fd Maxwell3D: Address Feedback 2019-07-17 17:29:55 -04:00
Fernando Sahmkow
7826f0afd9 Texture_Cache: Rebase Fixes 2019-07-17 17:29:54 -04:00
Fernando Sahmkow
8cdbfe69b1 GL_Rasterizer: Corrections to Clearing. 2019-07-17 17:29:54 -04:00
Fernando Sahmkow
0ff4a5fa39 Maxwell3D: Correct marking dirtiness on CB upload 2019-07-17 17:29:53 -04:00
Fernando Sahmkow
fec32fed18 GL_Rasterizer: Rework RenderTarget/DepthBuffer clearing 2019-07-17 17:29:52 -04:00
Fernando Sahmkow
a081dea8ab Maxwell3D: Implement State Dirty Flags. 2019-07-17 17:29:51 -04:00
Fernando Sahmkow
0d3db58657 Maxwell3D: Rework CBData Upload 2019-07-17 17:29:50 -04:00
Fernando Sahmkow
f2e7b29c14 Maxwell3D: Rework the dirty system to be more consistant and scaleable 2019-07-17 17:29:49 -04:00
Fernando Sahmkow
e42bcf2314 maxwell3d: Implement Conditional Rendering
Conditional Rendering takes care of conditionaly clearing or drawing
depending on a set of queries. This PR implements the query checks to
stablish if things can be rendered or not.
2019-07-17 17:13:19 -04:00
Fernando Sahmkow
223a535f3f
Merge pull request #2740 from lioncash/bra
shader/decode/other: Correct branch indirect argument within BRA handling
2019-07-17 14:25:08 -04:00
Lioncash
bebbdc2067 shader_ir: std::move Node instance where applicable
These are std::shared_ptr instances underneath the hood, which means
copying them isn't as cheap as a regular pointer. Particularly so on
weakly-ordered systems.

This avoids atomic reference count increments and decrements where they
aren't necessary for the core set of operations.
2019-07-16 19:49:23 -04:00
Lioncash
60926ac16b shader_ir: Rename Get/SetTemporal to Get/SetTemporary
This is more accurate in terms of describing what the functions are
actually doing. Temporal relates to time, not the setting of a temporary
itself.
2019-07-16 19:47:43 -04:00
Lioncash
44d87ff641 shader_ir: Remove unused includes
Removes unnecessary header dependencies.
2019-07-16 19:47:42 -04:00
Fernando Sahmkow
d614193e49 Shader_Ir: Correct tracking to track from right to left 2019-07-16 15:06:59 -04:00
Fernando Sahmkow
b56e7f870a
Merge pull request #2565 from ReinUsesLisp/track-indirect
shader/track: Track indirect buffers
2019-07-16 14:58:35 -04:00
Lioncash
e2d7dda166 shader/decode/other: Correct branch indirect argument within BRA handling
This appears to have been a copy/paste error introduced within
8a6fc529a9
2019-07-16 12:20:45 -04:00
ReinUsesLisp
2a4044a858 gl_shader_cache: Fix clang-format issues 2019-07-15 20:33:51 -03:00
ReinUsesLisp
6b0d017675 gl_shader_decompiler: Stub local memory size 2019-07-15 17:38:25 -03:00
ReinUsesLisp
56bca83bde gl_shader_cache: Address review commentaries 2019-07-15 17:38:25 -03:00
ReinUsesLisp
bbecd13697 gl_shader_cache: Address CI issues 2019-07-15 17:38:25 -03:00
ReinUsesLisp
725ba6cf63 gl_rasterizer: Implement compute shaders 2019-07-15 17:38:25 -03:00
Fernando Sahmkow
1bdb59fc6e
Merge pull request #2695 from ReinUsesLisp/layer-viewport
gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders
2019-07-15 16:28:07 -04:00
bunnei
b77a1ed67a
Merge pull request #2705 from FernandoS27/tex-cache-fixes
GPU: Fixes to Texture Cache and Include Microprofiles for GL State/BufferCopy/Macro Interpreter
2019-07-14 22:44:36 -04:00
ReinUsesLisp
afa8096df5 shader: Allow tracking of indirect buffers without variable offset
While changing this code, simplify tracking code to allow returning
the base address node, this way callers don't have to manually rebuild
it on each invocation.
2019-07-14 22:36:44 -03:00
bunnei
3477b92289
Merge pull request #2675 from ReinUsesLisp/opengl-buffer-cache
buffer_cache: Implement a generic buffer cache and its OpenGL backend
2019-07-14 19:03:43 -04:00
Fernando Sahmkow
2ac7472d3f Texture_Cache: Address Feedback 2019-07-14 17:42:39 -04:00
Fernando Sahmkow
0f54b541f4 Texture_Cache: Remove some unprecise fallback case and clang format 2019-07-14 12:00:32 -04:00
Fernando Sahmkow
5818959e54 Texture_Cache: Force Framebuffer reset if an active render target is unregistered. 2019-07-14 12:00:31 -04:00
Fernando Sahmkow
913b7a6872 GPU: Add a microprofile for macro interpreter 2019-07-14 12:00:30 -04:00
Fernando Sahmkow
a9943222f2 GL_State: Add a microprofile timer to OpenGL state. 2019-07-14 12:00:30 -04:00
Fernando Sahmkow
5c1e1a148e Gl_Texture_Cache: Measure Buffer Copy Times 2019-07-14 12:00:29 -04:00
Fernando Sahmkow
5d31bab69a Texture_Cache: Correct Linear Structural Match. 2019-07-14 12:00:28 -04:00
Fernando Sahmkow
4882c058fd
Merge pull request #2690 from SciresM/physmem_fixes
Implement MapPhysicalMemory/UnmapPhysicalMemory
2019-07-14 09:16:46 -04:00
Fernando Sahmkow
0ec9da2f9f
Merge pull request #2692 from ReinUsesLisp/tlds-f16
shader/texture: Add F16 support for TLDS
2019-07-14 08:44:38 -04:00
bunnei
bb67091c77
Merge pull request #2609 from FernandoS27/new-scan
Implement a New Shader Scanner, Decompile Flow Stack and implement BRX BRA.CC
2019-07-11 17:36:23 -04:00
ReinUsesLisp
0eb0c24269 gl_shader_decompiler: Fix gl_PointSize redeclaration 2019-07-11 16:10:59 -03:00
ReinUsesLisp
aca40de224 gl_shader_decompiler: Fix conditional usage of GL_ARB_shader_viewport_layer_array 2019-07-11 04:27:00 -03:00
bunnei
fd066ffbce
Merge pull request #2697 from lioncash/doc
gl_rasterizer: Amend documentation comment for ConfigureFramebuffers()
2019-07-10 16:38:09 -04:00
bunnei
7fb7054bc8
Merge pull request #2686 from ReinUsesLisp/vk-scheduler
vk_scheduler: Drop execution context in favor of views
2019-07-10 16:35:48 -04:00
bunnei
206ec29f17
Merge pull request #2691 from lioncash/override
video_core: Add missing override specifiers
2019-07-10 16:25:43 -04:00
Fernando Sahmkow
f2549739d1 shader_ir: Add comments on missing instruction.
Also shows Nvidia's address space on comments.
2019-07-09 17:15:45 -04:00
Michael Scire
a1845d1dd3 prefer system reference over global accessor 2019-07-09 08:11:35 -07:00
Fernando Sahmkow
2de7649311 shader_ir: limit explorastion to best known program size. 2019-07-09 08:14:43 -04:00
Fernando Sahmkow
e7c6045a03 control_flow: Correct block breaking algorithm. 2019-07-09 08:14:43 -04:00
Fernando Sahmkow
dc4a93594c control_flow: Assert shaders bigger than limit. 2019-07-09 08:14:42 -04:00
Fernando Sahmkow
e7a88f0ab3 control_flow: Address feedback. 2019-07-09 08:14:42 -04:00
Fernando Sahmkow
34357b110c shader_ir: Correct parsing of scheduling instructions and correct sizing 2019-07-09 08:14:41 -04:00
Fernando Sahmkow
cfb3db1a32 shader_ir: Correct max sizing 2019-07-09 08:14:40 -04:00
Fernando Sahmkow
d45fed3030 shader_ir: Remove unnecessary constructors and use optional for ScanFlow result 2019-07-09 08:14:40 -04:00
Fernando Sahmkow
01b21ee1e8 shader_ir: Corrections, documenting and asserting control_flow 2019-07-09 08:14:39 -04:00
Fernando Sahmkow
d5533b440c shader_ir: Unify blocks in decompiled shaders. 2019-07-09 08:14:39 -04:00
Fernando Sahmkow
926b80102f shader_ir: Decompile Flow Stack 2019-07-09 08:14:38 -04:00
Fernando Sahmkow
459fce3a8f shader_ir: propagate shader size to the IR 2019-07-09 08:14:37 -04:00
Fernando Sahmkow
8a6fc529a9 shader_ir: Implement BRX & BRA.CC 2019-07-09 08:14:37 -04:00
Fernando Sahmkow
c218ae4b02 shader_ir: Remove the old scanner. 2019-07-09 08:14:36 -04:00
Fernando Sahmkow
8af6e6a052 shader_ir: Implement a new shader scanner 2019-07-09 08:14:36 -04:00
Lioncash
c04785c928 gl_rasterizer: Amend documentation comment for ConfigureFramebuffers()
must_reconfigure isn't a parameter for this function any more, so it can
be replaced with current_state.

While we're at it, we can make the parameters of the declaration match
the same name as the ones in the definition.
2019-07-09 02:08:15 -04:00
Michael Scire
697206092e Prevent merging of device mapped memory blocks.
This sets the DeviceMapped attribute for GPU-mapped memory blocks,
and prevents merging device mapped blocks. This prevents memory
mapped from the gpu from having its backing address changed by
block coalesce.
2019-07-08 22:52:05 -07:00
ReinUsesLisp
c9d886c84e gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders
This commit implements gl_ViewportIndex and gl_Layer in vertex and
geometry shaders. In the case it's used in a vertex shader, it requires
ARB_shader_viewport_layer_array. This extension is available on AMD and
Nvidia devices (mesa and proprietary drivers), but not available on
Intel on any platform. At the moment of writing this description I don't
know if this is a hardware limitation or a driver limitation.

In the case that ARB_shader_viewport_layer_array is not available,
writes to these registers on a vertex shader are ignored, with the
appropriate logging.
2019-07-07 20:42:55 -03:00
Tobias
be020f7621
Delete decode_integer_set.cpp 2019-07-07 21:40:33 +02:00
ReinUsesLisp
d0966b9f7c shader/texture: Add F16 support for TLDS 2019-07-07 16:05:56 -03:00
Lioncash
cbdd6cd1c0 vk_sampler_cache: Remove unused includes
These are no longer used within this header, so they can be removed.
2019-07-07 13:40:36 -04:00
Lioncash
4b27680639 video_core: Add missing override specifiers 2019-07-07 13:38:39 -04:00
ReinUsesLisp
86a874a2fc vk_scheduler: Drop execution context in favor of views
Instead of passing by copy an execution context through out the whole
Vulkan call hierarchy, use a command buffer view and fence view
approach.

This internally dereferences the command buffer or fence forcing the
user to be unable to use an outdated version of it on normal usage.
It is still possible to keep store an outdated if it is casted to
VKFence& or vk::CommandBuffer.

While changing this file, add an extra parameter for Flush and Finish to
allow releasing the fence from this calls.
2019-07-07 03:30:22 -03:00
ReinUsesLisp
79a23ca5f0 buffer_cache: Avoid [[nodiscard]] to make clang-format happy 2019-07-06 01:17:05 -03:00
ReinUsesLisp
83050c9495 buffer_cache: Try to fix MinGW build 2019-07-06 01:14:05 -03:00
ReinUsesLisp
f7691ebe57 gl_rasterizer: Fix nullptr dereference on disabled buffers 2019-07-06 00:37:56 -03:00
ReinUsesLisp
7ecf64257a gl_rasterizer: Minor style changes 2019-07-06 00:37:55 -03:00
ReinUsesLisp
9cdc576f60 gl_rasterizer: Fix vertex and index data invalidations 2019-07-06 00:37:55 -03:00
ReinUsesLisp
1fa21fa192 gl_buffer_cache: Implement with generic buffer cache 2019-07-06 00:37:55 -03:00
ReinUsesLisp
32c0212b24 buffer_cache: Implement a generic buffer cache
Implements a templated class with a similar approach to our current
generic texture cache. It is designed to be compatible with Vulkan and
OpenGL,
2019-07-06 00:37:55 -03:00
ReinUsesLisp
2bcae41a73 gl_buffer_cache: Remove global system getters 2019-07-06 00:37:55 -03:00
ReinUsesLisp
02ab844934 gl_device: Query SSBO alignment 2019-07-06 00:37:55 -03:00
ReinUsesLisp
d14fbfb9b5 gl_buffer_cache: Implement flushing 2019-07-06 00:37:55 -03:00
ReinUsesLisp
345f852bdb gl_rasterizer: Drop gl_global_cache in favor of gl_buffer_cache 2019-07-06 00:37:55 -03:00
ReinUsesLisp
8155b12d3d gl_buffer_cache: Rework to support internalized buffers 2019-07-06 00:37:55 -03:00
ReinUsesLisp
f8ba72d491 gl_buffer_cache: Store in CachedBufferEntry the used buffer handle 2019-07-06 00:37:55 -03:00
ReinUsesLisp
b54fb8fc4c gl_buffer_cache: Return used buffer from Upload function 2019-07-06 00:37:55 -03:00
ReinUsesLisp
a6d2f52fc3 gl_rasterizer: Add some commentaries 2019-07-06 00:37:55 -03:00
ReinUsesLisp
2b9d4088ec gl_rasterizer: Make DrawParameters rasterizer instance const 2019-07-06 00:37:55 -03:00
ReinUsesLisp
2e39c20da5 gl_rasterizer: Move index buffer uploading to its own method 2019-07-06 00:37:55 -03:00
Fernando Sahmkow
d20ede40b1 NVServices: Styling, define constructors as explicit and corrections 2019-07-05 15:49:32 -04:00
Fernando Sahmkow
b391e5f638 NVFlinger: Correct GCC compile error 2019-07-05 15:49:31 -04:00
Fernando Sahmkow
0335a25d1f NVServices: Make NVEvents Automatic according to documentation. 2019-07-05 15:49:29 -04:00
Fernando Sahmkow
7d1b974bca GPU: Correct Interrupts to interrupt on syncpt/value instead of event, mirroring hardware 2019-07-05 15:49:26 -04:00
Fernando Sahmkow
f2e026a1d8 gpu_asynch: Simplify synchronization to a simpler consumer->producer scheme. 2019-07-05 15:49:20 -04:00
Fernando Sahmkow
0706d633bf nv_host_ctrl: Make Sync GPU variant always return synced result. 2019-07-05 15:49:20 -04:00
Fernando Sahmkow
600dddf88d Async GPU: do invalidate as synced operation
Async GPU: Always invalidate synced.
2019-07-05 15:49:19 -04:00
Fernando Sahmkow
c13433aee4 Gpu: use an std mutex instead of a spin_lock to guard syncpoints 2019-07-05 15:49:18 -04:00
Fernando Sahmkow
eef55f493b Gpu: Mark areas as protected. 2019-07-05 15:49:16 -04:00
Fernando Sahmkow
a45643cb3b nv_services: Stub CtrlEventSignal 2019-07-05 15:49:15 -04:00
Fernando Sahmkow
8942047d41 Gpu: Implement Hardware Interrupt Manager and manage GPU interrupts 2019-07-05 15:49:14 -04:00
Fernando Sahmkow
82b829625b video_core: Implement GPU side Syncpoints 2019-07-05 15:49:11 -04:00
Zach Hilman
772c86a260
Merge pull request #2601 from FernandoS27/texture_cache
Implement a new Texture Cache
2019-07-05 13:39:13 -04:00
Fernando Sahmkow
3b9d89839d texture_cache: Address Feedback 2019-07-05 09:46:53 -04:00
Fernando Sahmkow
30b176f92b texture_cache: Correct Texture Buffer Uploading 2019-07-04 19:38:19 -04:00
Zach Hilman
ad50cd7df9 gl_shader_cache: Make CachedShader constructor private
Fixes missing review comments introduced.
2019-07-03 20:39:46 -04:00
Zach Hilman
da5a537029
Merge pull request #2563 from ReinUsesLisp/shader-initializers
gl_shader_cache: Use static constructors for CachedShader initialization
2019-07-03 20:20:05 -04:00
Fernando Sahmkow
4705d1b523 rasterizer_cache: Protect inherited caches from submission level 2019-07-01 04:32:01 -04:00
ReinUsesLisp
6e1db6b703 texture_cache: Pack sibling queries inside a method 2019-06-29 20:47:46 -03:00
ReinUsesLisp
8eae66907e texture_cache: Use std::vector reservation for sampled_textures 2019-06-29 20:10:31 -03:00
ReinUsesLisp
f6f1a8f26a texture_cache: Style changes 2019-06-29 19:52:37 -03:00
ReinUsesLisp
dd9ace502b texture_cache: Use std::array for siblings_table 2019-06-29 18:54:13 -03:00
ReinUsesLisp
3f3c3ca5f9 texture_cache: Address feedback 2019-06-29 17:29:39 -03:00
Fernando Sahmkow
223ca80753 texture_cache: Correct variable naming. 2019-06-25 19:35:08 -04:00
Fernando Sahmkow
5aeabd9a17 gl_texture_cache: Correct asserts 2019-06-25 19:26:59 -04:00
Fernando Sahmkow
88bc39374f texture_cache: Corrections, documentation and asserts 2019-06-25 18:36:19 -04:00
Fernando Sahmkow
c0abc7124d surface_params: Corrections, asserts and documentation. 2019-06-25 18:03:25 -04:00
Fernando Sahmkow
fb234560b0 copy_params: use constexpr for constructor 2019-06-25 17:42:50 -04:00
Fernando Sahmkow
18d24fbdd0 gl_texture_cache: Corrections and fixes 2019-06-25 17:40:08 -04:00
Fernando Sahmkow
36665ce0b2 gl_resource_manager: Correct MakeStreamCopy 2019-06-25 17:32:04 -04:00
Fernando Sahmkow
58c8a44e7a texture_cache: Query MemoryManager from the system 2019-06-25 17:26:00 -04:00
ReinUsesLisp
7565389700 texture_cache: Include "core/core.h" 2019-06-24 02:15:57 -03:00
ReinUsesLisp
e723441e37 gl_texture_cache: Explicitly add indirect include 2019-06-24 02:13:55 -03:00
ReinUsesLisp
34841a41c3 texture_cache/surface_view: Address feedback 2019-06-24 02:09:56 -03:00
ReinUsesLisp
0837290992 texture_cache/surface_base: Address feedback 2019-06-24 02:08:52 -03:00
ReinUsesLisp
75de730e28 video_core/surface: Address feedback 2019-06-24 02:07:11 -03:00
ReinUsesLisp
10a83653ee decode/texture: Address feedback 2019-06-24 02:05:05 -03:00
ReinUsesLisp
4504302abc renderer_opengl/utils: Remove unused includes and unused forward declaration 2019-06-24 02:03:37 -03:00
ReinUsesLisp
4b2ff1e00e gl_texture_cache: Address some feedback 2019-06-24 02:01:44 -03:00
ReinUsesLisp
0b6df52109 gl_shader_disk_cache: Address feedback 2019-06-24 01:59:32 -03:00
ReinUsesLisp
b8b05a484a gl_shader_decompiler: Address feedback 2019-06-24 01:56:38 -03:00
ReinUsesLisp
4d63f97945 shader_bytecode: Include missing <array> 2019-06-24 01:51:02 -03:00
bunnei
a9f3c54871
Merge pull request #2579 from ReinUsesLisp/fix-aoffi-test
gl_device: Fix TestVariableAoffi test
2019-06-21 15:28:55 -04:00
Fernando Sahmkow
d1812316e1 texture_cache: Style and Corrections 2019-06-20 21:24:47 -04:00
Fernando Sahmkow
51ba60b27e shader_cache: Correct versioning and size calculation. 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
97c8c9f49a texture_cache: Eliminate linear textures fallthrough 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
6acdae0e4c texture_cache: Correct format R16U as sibling 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
d7587842eb texture_cache: Implement texception detection and texture barriers. 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
198a0395bb texture_cache: Corrections to buffers and shadow formats use. 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
fed773a86c texture_cache: Implement Irregular Views in surfaces 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
082740d34d surface: Correct format S8Z24 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
03d489dcf5 texture_cache: Initialize all siblings to invalid pixel format. 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
9422cf7c10 gl_texture_cache: Use Stream Buffers instead of Persistant for Buffer Copies. 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
fac3706253 gl_texture_cache: Correct Image Blit 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
7232a1ed16 decoders: correct block calculation 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
3dd7643214 texture_cache: Use siblings textures on Rebuild and fix possible error on blitting 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
4db28f72f6 texture_cache: Remove old rasterizer cache 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
2d83553ea7 texture_cache: Implement siblings texture formats. 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
cb728797b0 fermi2d: Correct Origin Mode 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
a56f687793 texture_cache: correct texture buffer on surface params 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
b01f9c8a70 texture_cache: eliminate accelerated depth->color/color->depth copies due to driver instability. 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
561ce29c98 texture_cache: correct mutex locks 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
b7de31ac97 shader_ir: Fix image copy rebase issues 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
6f69f06873 texture_cache: Don't Image Copy if component types differ 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
9f755218a1 texture_cache: move some large methods to cpp files 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
3809041c24 texture_cache: Optimize GetSurface and use references on functions that don't change a surface. 2019-06-20 21:38:33 -03:00
Fernando Sahmkow
60bf761afb texture_cache: Implement Buffer Copy and detect Turing GPUs Image Copies 2019-06-20 21:38:33 -03:00
Fernando Sahmkow
228f516bb4 texture_cache uncompress-compress is untopological.
This makes conflicts between non compress and compress textures to be 
auto recycled. It also limits the amount of mipmaps a texture can have 
if it goes above it's limit.
2019-06-20 21:38:33 -03:00
Fernando Sahmkow
9251354152 texture_cache: Correct copying between compressed and uncompressed formats 2019-06-20 21:38:33 -03:00
Fernando Sahmkow
0966665fc2 texture_cache: Only load on recycle with accurate GPU.
Testing so far has proven this to be quite safe as texture memory read 
added a 2-5ms load to the current cache.
2019-06-20 21:38:33 -03:00
Fernando Sahmkow
ea1525dab1 Fix rebase errors 2019-06-20 21:38:33 -03:00
Fernando Sahmkow
bdf9faab33 texture_cache: Handle uncontinuous surfaces. 2019-06-20 21:38:33 -03:00
Fernando Sahmkow
e60ed2bb3e texture_cache: return null surface on invalid address 2019-06-20 21:38:33 -03:00
Fernando Sahmkow
fcac55d5bf texture_cache: Add checks for texture buffers. 2019-06-20 21:38:33 -03:00
Fernando Sahmkow
175aa343ff texture_cache: Fermi2D reform and implement View Mirage
This also does some fixes on compressed textures reinterpret and on the
Fermi2D engine in general.
2019-06-20 21:38:33 -03:00
ReinUsesLisp
1bf4154e7d gl_shader_decompiler: Implement image binding settings 2019-06-20 21:38:33 -03:00
ReinUsesLisp
9097301d92 shader: Implement bindless images 2019-06-20 21:38:33 -03:00
ReinUsesLisp
06c4ce8645 shader: Decode SUST and implement backing image functionality 2019-06-20 21:38:33 -03:00
ReinUsesLisp
007ffbef1c gl_rasterizer: Track texture buffer usage 2019-06-20 21:38:33 -03:00
ReinUsesLisp
58c0d37422 video_core: Make ARB_buffer_storage a required extension 2019-06-20 21:36:12 -03:00
ReinUsesLisp
07f7ce1da2 gl_rasterizer_cache: Use texture buffers to emulate texture buffers 2019-06-20 21:36:12 -03:00
ReinUsesLisp
b8c75a845b maxwell_3d: Partially implement texture buffers as 1D textures 2019-06-20 21:36:12 -03:00
ReinUsesLisp
6c81c8f5b7 gl_shader_decompiler: Allow 1D textures to be texture buffers 2019-06-20 21:36:12 -03:00
ReinUsesLisp
4e81fc8296 shader: Implement texture buffers 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
d267948a73 texture_cache: loose TryReconstructSurface when accurate GPU is not on.
Also corrects some asserts.
2019-06-20 21:36:12 -03:00
Fernando Sahmkow
6162cb922e texture_cache: Document the most important methods. 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
4530511ee4 texture_cache: Try to Reconstruct Surface on bigger than overlap.
This fixes clouds in SMO Cap Kingdom and lens on Cloud Kingdom.
Also moved accurate_gpu setting check to Pick Strategy
2019-06-20 21:36:12 -03:00
Fernando Sahmkow
a79831d9d0 texture_cache: Implement Guard mechanism 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
7731a0e2d1 texture_cache: General Fixes
Fixed ASTC mipmaps loading
Fixed alignment on openGL upload/download
Fixed Block Height Calculation
Removed unalign_height
2019-06-20 21:36:12 -03:00
ReinUsesLisp
c2ed348bdd surface_params: Ensure pitch is always written to avoid surface leaks 2019-06-20 21:36:12 -03:00
ReinUsesLisp
9098905dd1 gl_framebuffer_cache: Use a hashed struct to cache framebuffers 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
d65a4af895 texture_cache return invalid buffer on deactivated color_mask 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
6bd034eae9 engine_upload: Addapt to new Texture Cache 2019-06-20 21:36:12 -03:00
ReinUsesLisp
2131f71573 surface_params: Optimize CreateForTexture
Instead of using Common::AlignUp, use Common::AlignBits to align the
texture compression factor.
2019-06-20 21:36:12 -03:00
Fernando Sahmkow
41b4674458 gl_texture_cache: Make main views be proxy textures instead of a full view. 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
07cc7e0c12 texture_cache: Add ASync Protections 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
1bbc9debfb Remove Framebuffer reconfiguration and restrict rendertarget protection 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
5192521dc3 texture_cache: Implement GPU Dirty Flags 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
94f2be5473 texture_cache: Optimize GetMipBlockHeight and GetMipBlockDepth 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
a4a58be2d4 texture_cache: Implement L1_Inner_cache 2019-06-20 21:36:12 -03:00
ReinUsesLisp
345e73f2fe video_core: Use un-shifted block sizes to avoid integer divisions
Instead of storing all block width, height and depths in their shifted
form:

block_width = 1U << block_shift;

Store them like they are provided by the emulated hardware (their
block_shift form). This way we can avoid doing the costly
Common::AlignUp operation to align texture sizes and drop CPU integer
divisions with bitwise logic (defined in Common::AlignBits).
2019-06-20 21:36:12 -03:00
ReinUsesLisp
28d7c2f5a5 texture_cache: Change internal cache from lists to vectors 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
b347543e83 Reduce amount of size calculations. 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
4e2071b6d9 texture_cache: Correct premature texceptions
Due to our current infrastructure, it is possible for a mipmap to be set 
on as a render target before a texception of that mipmap's superset be 
set afterwards. This is problematic as we rely on texture views to set 
up texceptions and protecting render targets targets for 3D texture 
rendering.

One simple solution is to configure framebuffers after texture setup but 
this brings other problems. This solution, forces a reconfiguration of 
the framebuffers after such event happens.
2019-06-20 21:36:12 -03:00
Fernando Sahmkow
ba677ccb5a texture_cache: Implement guest flushing 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
de0b1cb2b2 Fixes to mipmap's process and reconstruct process 2019-06-20 21:36:12 -03:00
ReinUsesLisp
e0002599ac surface_base: Add parenthesis to EmplaceOverview's predicate 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
324e470879 Texture Cache: Implement Blitting and Fermi Copies 2019-06-20 21:36:12 -03:00
ReinUsesLisp
549fd18ac4 surface_view: Add constructor for ViewParams 2019-06-20 21:36:12 -03:00
ReinUsesLisp
16e8625a30 surface_base: Split BreakDown into layered and non-layered variants 2019-06-20 21:36:12 -03:00
ReinUsesLisp
2b30000a1e surface_base: Silence truncation warnings and minor renames and reordering 2019-06-20 21:36:12 -03:00
ReinUsesLisp
03d10ea3b4 copy_params: Use constructor instead of C-like initialization 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
1af4414861 Correct Mipmaps View method in Texture Cache 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
d86f9cd709 Change texture_cache chaching from GPUAddr to CacheAddr
This also reverses the changes to make invalidation and flushing through
the GPU address.
2019-06-20 21:36:12 -03:00
Fernando Sahmkow
b711cdce78 Corrections to Structural Matching
The texture will now be reconstructed if the width only matches on GoB 
alignment.
2019-06-20 21:36:12 -03:00
Fernando Sahmkow
bc930754cc Implement Texture Cache V2 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
3d471e732d Correct Surface Base and Views for new Texture Cache 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
3b26206dbd Add OGLTextureView 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
6b0695b3cd Deglobalize Memory Manager on texture cahe and Implement Invalidation and Flushing using GPUVAddr 2019-06-20 21:36:11 -03:00
ReinUsesLisp
6c410104f4 texture_cache: Remove execution context copies from the texture cache
This is done to simplify the OpenGL implementation, it is needed for
Vulkan.
2019-06-20 21:36:11 -03:00
ReinUsesLisp
fa59a7b4d8 gl_texture_cache: Implement fermi copies 2019-06-20 21:36:11 -03:00
ReinUsesLisp
1b4503c571 texture_cache: Split texture cache into different files 2019-06-20 21:36:11 -03:00
ReinUsesLisp
5f3aacdc37 texture_cache: Move staging buffer into a generic implementation 2019-06-20 21:36:11 -03:00
ReinUsesLisp
2787a0c287 texture_cache: Flush 3D textures in the order they are drawn 2019-06-20 21:36:11 -03:00
ReinUsesLisp
4b396f375c gl_texture_cache: Minor changes 2019-06-20 21:36:11 -03:00
ReinUsesLisp
0cefb7bcb4 gl_texture_cache: Add copy from multiple overlaps into a single surface 2019-06-20 21:36:11 -03:00
ReinUsesLisp
84139586c9 gl_texture_cache: Attach surface textures instead of views 2019-06-20 21:36:11 -03:00
ReinUsesLisp
fb94871791 gl_texture_cache: Add fast copy path 2019-06-20 21:36:11 -03:00
ReinUsesLisp
bab21e8cb3 gl_texture_cache: Initial implementation 2019-06-20 21:36:11 -03:00
bunnei
c28694d907
Merge pull request #2591 from lioncash/record
core: Remove unused CiTrace source files
2019-06-19 22:28:26 -04:00
Lioncash
61d2498f00 core: Remove unused CiTrace source files
These source files have been unused for the entire lifecycle of the
project. They're a hold-over from Citra and only add to the build time
of the project, so they can be removed.

There's also likely no way this would ever work in yuzu in its current
form without revamping quite a bit of it, given how different the GPU on
the Switch is compared to the 3DS.
2019-06-18 16:57:59 -04:00
bunnei
c7b5c245e1
Merge pull request #2562 from ReinUsesLisp/split-cbuf-upload
video_core/engines: Move ConstBufferInfo out of Maxwell3D
2019-06-17 22:35:04 -04:00
Zach Hilman
c0e7b91145
Merge pull request #2538 from ReinUsesLisp/ssy-pbk
shader: Split SSY and PBK stack
2019-06-15 20:30:13 -04:00
ReinUsesLisp
ee81fb94cd gl_device: Fix TestVariableAoffi test
This test is intended to be invalid GLSL, but it was being invalid in
two points instead of one. The intention is to use a non-immediate
parameter in a textureOffset like function.

The problem is that this shader was being compiled as a separable
shader object and the text was writting to gl_Position without a
redeclaration, being invalid GLSL.

Address that issue by using a user-defined output attribute.
2019-06-11 23:02:50 -03:00
bunnei
f981efdf8d
Merge pull request #2572 from FernandoS27/gpu-mem
GPUVM: Correct GPU VM virtual address space
2019-06-11 21:09:57 -04:00
Fernando Sahmkow
f79823fda7 GPUVM: Correct GPU VM virtual address space 2019-06-09 17:47:15 -04:00
ReinUsesLisp
528c15051c kepler_compute: Use std::array for cbuf info 2019-06-07 20:36:22 -03:00
ReinUsesLisp
17d5fb6d06 kepler_compute: Fix block_dim_x encoding 2019-06-07 20:35:46 -03:00
ReinUsesLisp
4ec8a3df08 gl_shader_cache: Use static constructors for CachedShader initialization 2019-06-07 20:20:22 -03:00
ReinUsesLisp
5669ff3cbd gl_rasterizer: Remove unused parameters in descriptor uploads 2019-06-07 19:52:16 -03:00
ReinUsesLisp
2f2a61887a video_core/engines: Move ConstBufferInfo out of Maxwell3D 2019-06-07 19:47:15 -03:00
Zach Hilman
de33ad25f5
Merge pull request #2514 from ReinUsesLisp/opengl-compat
video_core: Drop OpenGL core in favor of OpenGL compatibility
2019-06-07 17:23:25 -04:00
ReinUsesLisp
fe8e6618f2 shader: Split SSY and PBK stack
Hardware testing revealed that SSY and PBK push to a different stack,
allowing code like this:

        SSY label1;
        PBK label2;
        SYNC;
label1: PBK;
label2: EXIT;
2019-06-07 02:18:27 -03:00
ReinUsesLisp
769a50661a shader/node: Minor changes
Reflect std::shared_ptr nature of Node on initializers and remove
constant members in nodes.

Add some commentaries.
2019-06-06 20:03:33 -03:00
ReinUsesLisp
e1b3be7ced shader: Move Node declarations out of the shader IR header
Analysis passes do not have a good reason to depend on shader_ir.h to
work on top of nodes. This splits node-related declarations to their own
file and leaves the IR in shader_ir.h
2019-06-06 20:02:37 -03:00
ReinUsesLisp
bf4dfb3ad4 shader: Use shared_ptr to store nodes and move initialization to file
Instead of having a vector of unique_ptr stored in a vector and
returning star pointers to this, use shared_ptr. While changing
initialization code, move it to a separate file when possible.

This is a first step to allow code analysis and node generation beyond
the ShaderIR class.
2019-06-05 20:41:52 -03:00
bunnei
a20ba09bfd
Merge pull request #2520 from ReinUsesLisp/vulkan-refresh
vk_device,vk_shader_decompiler: Miscellaneous changes
2019-06-05 18:10:00 -04:00
bunnei
55c5029171
Merge pull request #2540 from ReinUsesLisp/remove-guest-position
gl_shader_decompiler: Remove guest "position" varying
2019-06-05 18:07:23 -04:00
bunnei
0bcc305797
Merge pull request #2512 from ReinUsesLisp/comp-indexing
gl_shader_decompiler: Pessimize uniform buffer access on AMD's prorpietary driver
2019-06-05 18:02:30 -04:00
Zach Hilman
81e09bb121
Merge pull request #2545 from lioncash/timing
core/core_timing_util: Use std::chrono types for specifying time units
2019-06-05 15:52:37 -04:00
Zach Hilman
dd4fe0dab1
Merge pull request #2534 from ReinUsesLisp/shader-cleanup
gl_shader_cache: Minor style changes
2019-06-05 15:28:34 -04:00
Lioncash
42f5fd0ab3 core/core_timing_util: Use std::chrono types for specifying time units
Makes the interface more type-safe and consistent in terms of return
values.
2019-06-04 20:31:24 -04:00
Fernando Sahmkow
a32c52b1d8 shader_bytecode: Mark EXIT as flow instruction 2019-06-04 12:18:35 -04:00
ReinUsesLisp
0935c2d97b gl_shader_decompiler: Remove guest "position" varying
"position" was being written but not read anywhere besides geometry
shaders, where it had the same value as gl_Position.

This commit replaces "position" with gl_Position, reducing the
complexity of our code and the emitted GLSL code.
2019-06-03 01:01:34 -03:00
ReinUsesLisp
e72b9044a0 gl_shader_cache: Store a system class and drop global accessors 2019-05-30 14:01:40 -03:00
ReinUsesLisp
ad321564ed gl_shader_cache: Add commentaries explaining the intention in shaders creation 2019-05-30 13:58:38 -03:00
ReinUsesLisp
838b6d2ff8 gl_shader_cache: Flip if condition in GetStageProgram to reduce indentation 2019-05-30 13:56:03 -03:00
ReinUsesLisp
6ac4490751 gl_buffer_cache: Remove unused ReserveMemory method 2019-05-30 13:21:01 -03:00
ReinUsesLisp
a89cc0bafc maxwell_to_gl: Use GL_CLAMP to emulate Clamp wrap mode 2019-05-30 13:21:01 -03:00
ReinUsesLisp
b76df62c00 gl_rasterizer: Move alpha testing to the OpenGL pipeline
Removes the alpha testing code from each fragment shader invocation.
2019-05-30 13:21:01 -03:00
ReinUsesLisp
df509486c4 gl_rasterizer: Use GL_QUADS to emulate quads rendering 2019-05-30 13:21:01 -03:00
bunnei
e3608578e4
Merge pull request #2446 from ReinUsesLisp/tid
shader: Implement S2R Tid{XYZ} and CtaId{XYZ}
2019-05-29 12:21:17 -04:00
ReinUsesLisp
21c0b4dec8 gl_device: Add commentary to AOFFI unit test source code
The intention behind this commit is to hint someone inspecting an
apitrace dump to ignore this ill-formed GLSL code.
2019-05-27 00:55:57 -03:00
ReinUsesLisp
84928e6d67 gl_shader_gen: Always declare extensions after the version declaration
This addresses a bug on geometry shaders where code was being written
before all #extension declarations were done. Ref to #2523
2019-05-27 00:51:35 -03:00
ReinUsesLisp
f424b46036 vk_device: Let formats array type be deduced 2019-05-26 03:09:06 -03:00
ReinUsesLisp
a4c5e3e339 vk_shader_decompiler: Misc fixes
Fix missing OpSelectionMerge instruction. This caused devices loses on
most hardware, Intel didn't care.

Fix [-1;1] -> [0;1] depth conversions.

Conditionally use VK_EXT_scalar_block_layout. This allows us to use
non-std140 layouts on UBOs.

Update external Vulkan headers.
2019-05-26 01:48:04 -03:00
ReinUsesLisp
dec3c981d0 vk_device: Enable features when available and misc changes
Keeps track of native ASTC support, VK_EXT_scalar_block_layout
availability and SSBO range.

Check for independentBlend and vertexPipelineStorageAndAtomics as a
required feature. Always enable it.

Use vk::to_string format to log Vulkan enums.

Style changes.
2019-05-26 01:41:34 -03:00
Lioncash
5a4564bd8e renderer_opengl/utils: Use a std::string_view with LabelGLObject()
Uses a std::string_view instead of a std::string, given the pointed to
string isn't modified and is only used in a formatting operation.

This is nice because a few usages directly supply a string literal to
the function, allowing these usages to otherwise not heap allocate,
unlike the std::string overloads.

While we're at it, we can combine the address formatting into a single
formatting call.
2019-05-24 23:50:10 -04:00
bunnei
68c9c9222d
Merge pull request #2358 from ReinUsesLisp/parallel-shader
gl_shader_cache: Use shared contexts to build shaders in parallel at boot
2019-05-24 22:42:08 -04:00
bunnei
1a2d90ab09
Merge pull request #2485 from ReinUsesLisp/generic-memory
shader/memory: Implement generic memory stores and loads (ST and LD)
2019-05-24 18:24:26 -04:00
ReinUsesLisp
d8827b07b5 gl_shader_decompiler: Use an if based cbuf indexing for broken drivers
The following code is broken on AMD's proprietary GLSL compiler:
```glsl
uint idx = ...;
vec4 values = ...;
float some_value = values[idx & 3];
```

It index the wrong components, to fix this the following pessimized code
is emitted when that bug is present:
```glsl
uint idx = ...;
vec4 values = ...;
float some_value;
if ((idx & 3) == 0) some_value = values.x;
if ((idx & 3) == 1) some_value = values.y;
if ((idx & 3) == 2) some_value = values.z;
if ((idx & 3) == 3) some_value = values.w;
```
2019-05-24 02:47:56 -03:00
ReinUsesLisp
46177901b8 gl_device: Add test to detect broken component indexing
Component indexing on AMD's proprietary driver is broken. This commit adds
a test to detect when we are on a driver that can't successfully manage
component indexing.

It dispatches a dummy draw with just one vertex shader that writes to an
indexed SSBO from the GPU with data sent through uniforms, it then reads
that data from the CPU and compares the expected output.
2019-05-24 02:47:56 -03:00
Lioncash
b6dcb1ae4d shader/shader_ir: Make Comment() take a std::string by value
This allows for forming comment nodes without making unnecessary copies
of the std::string instance.

e.g. previously:

Comment(fmt::format("Base address is c[0x{:x}][0x{:x}]",
        cbuf->GetIndex(), cbuf_offset));

Would result in a copy of the string being created, as CommentNode()
takes a std::string by value (a const ref passed to a value parameter
results in a copy).

Now, only one instance of the string is ever moved around. (fmt::format
returns a std::string, and since it's returned from a function by value,
this is a prvalue (which can be treated like an rvalue), so it's moved
into Comment's string parameter), we then move it into the CommentNode
constructor, which then moves the string into its member variable).
2019-05-23 03:01:55 -03:00
Lioncash
228e58d0a5 shader/decode/*: Add missing newline to files lacking them
Keeps the shader code file endings consistent.
2019-05-23 02:55:52 -03:00
Lioncash
87b4c1ac5e shader/decode/*: Eliminate indirect inclusions
Amends cases where we were using things that were indirectly being
satisfied through other headers. This way, if those headers change and
eliminate dependencies on other headers in the future, we don't have
cascading compilation errors.
2019-05-23 02:55:52 -03:00
Lioncash
195b54602f shader/decode/memory: Remove left in debug pragma 2019-05-22 17:08:50 -04:00
Lioncash
de23847184 renderer_opengl/gl_shader_decompiler: Remove redundant name specification in format string
This accidentally slipped through a rebase.
2019-05-21 09:47:21 -04:00
ReinUsesLisp
69215b5a55 gl_shader_cache: Fix clang strict standard build issues 2019-05-20 22:46:05 -03:00
ReinUsesLisp
c03b8c4c19 gl_shader_cache: Use shared contexts to build shaders in parallel 2019-05-20 22:45:55 -03:00
ReinUsesLisp
75e7b45d69 shader/memory: Implement ST (generic memory) 2019-05-20 22:41:53 -03:00
ReinUsesLisp
f78ef617b6 shader/memory: Implement LD (generic memory) 2019-05-20 22:38:59 -03:00
bunnei
9a17b20896
Merge pull request #2494 from lioncash/shader-text
gl_shader_decompiler: Add AddLine() overloads with single function that forwards to libfmt
2019-05-20 20:42:40 -04:00
ReinUsesLisp
9c3461604c shader: Implement S2R Tid{XYZ} and CtaId{XYZ} 2019-05-20 16:36:49 -03:00
ReinUsesLisp
ada79fa8ad gl_shader_decompiler: Make GetSwizzle constexpr 2019-05-20 16:36:48 -03:00
Lioncash
58a0c13e34 gl_shader_decompiler: Tidy up minor remaining cases of unnecessary std::string concatenation 2019-05-20 14:14:48 -04:00
Lioncash
6fb29764d6 gl_shader_decompiler: Replace individual overloads with the fmt-based one
Gets rid of the need to special-case brace handling depending on the
overload used, and makes it consistent across the board with how fmt
handles them.

Strings with compile-time deducible strings are directly forwarded to
std::string's constructor, so we don't need to worry about the
performance difference here, as it'll be identical.
2019-05-20 14:14:48 -04:00
Lioncash
784d2b6c3d gl_shader_decompiler: Utilize fmt overload of AddLine() where applicable 2019-05-20 14:14:44 -04:00
Fernando Sahmkow
911fafb967 Revert #2466
This reverts a tested behavior on delay slots not exiting if the exit 
flag is set. Currently new tests are required in order to ensure this 
behavior.
2019-05-19 16:04:44 -04:00
Lioncash
91ec251c4a gl_shader_decompiler: Add AddLine() overload that forwards to fmt
In a lot of places throughout the decompiler, string concatenation via
operator+ is used quite heavily. This is usually fine, when not heavily
used, but when used extensively, can be a problem. operator+ creates an
entirely new heap allocated temporary string and given we perform
expressions like:

std::string thing = a + b + c + d;

this ends up with a lot of unnecessary temporary strings being created
and discarded, which kind of thrashes the heap more than we need to.
Given we utilize fmt in some AddLine calls, we can make this a part of
the ShaderWriter's API. We can make an overload that simply acts as a
passthrough to fmt.

This way, whenever things need to be appended to a string, the operation
can be done via a single string formatting operation instead of
discarding numerous temporary strings. This also has the benefit of
making the strings themselves look nicer and makes it easier to spot
errors in them.
2019-05-19 14:12:20 -04:00
bunnei
d49efbfb4a
Merge pull request #2441 from ReinUsesLisp/al2p
shader: Implement AL2P and ALD.PHYS
2019-05-19 14:02:58 -04:00
Hexagon12
b94b08fa6f
Merge pull request #2491 from FernandoS27/dma-fix
Dma_pusher: ASSERT on empty command_list
2019-05-19 16:27:15 +01:00
Hexagon12
f8b1e53369
Merge pull request #2452 from FernandoS27/raster-cache-fix
Correct possible error on Rasterizer Caches
2019-05-19 16:00:44 +01:00
Hexagon12
2aebbe9bf9
Merge pull request #2497 from lioncash/shader-ir
shader/shader_ir: Minor changes
2019-05-19 15:51:06 +01:00
Hexagon12
fadf66993c
Merge pull request #2495 from lioncash/cache
gl_shader_disk_cache: Minor cleanup
2019-05-19 15:50:23 +01:00
Fernando Sahmkow
9e98100c94 Dma_pusher: ASSERT on empty command_list
This is a measure to avoid crashes on command list reading as an empty
command_list is considered a NOP.
2019-05-19 10:48:31 -04:00
Hexagon12
18cdbdafa2
Merge pull request #2467 from lioncash/move
video_core/gpu_thread: Remove redundant copy constructor for CommandDataContainer
2019-05-19 15:20:37 +01:00
Hexagon12
9175bffbdb
Merge pull request #2466 from yuzu-emu/mme-exit-delay-slot
GPU/MMEInterpreter: Ignore the 'exit' flag when it's executed inside a delay slot.
2019-05-19 15:14:41 +01:00
Hexagon12
ac3775e6ae
Merge pull request #2468 from lioncash/deduction
yuzu: Remove explicit types from locks where applicable
2019-05-19 15:05:56 +01:00
Hexagon12
b54bd3f018
Merge pull request #2472 from FernandoS27/tic
maxwell_3d: reduce severity of different component formats assert.
2019-05-19 15:04:47 +01:00
Hexagon12
3bd5f01240
Merge pull request #2469 from lioncash/copyable
video_core/engines/maxwell_3d: Add is_trivially_copyable_v check for Regs
2019-05-19 15:02:17 +01:00
Sebastian Valle
a6ed792ac4
Merge pull request #2470 from lioncash/ranged-for
video_core/engines/maxwell_3d: Simplify for loops into ranged for loops within InitializeRegisterDefaults()
2019-05-19 09:01:19 -05:00
Hexagon12
4452195d41
Merge pull request #2480 from ReinUsesLisp/fix-quads
gl_rasterizer: Pass the right number of array quad vertices count
2019-05-19 14:58:49 +01:00
Hexagon12
8e9a1e4249
Merge pull request #2483 from ReinUsesLisp/fix-point-size
gl_rasterizer: Limit OpenGL point size to a minimum of 1
2019-05-19 14:57:05 +01:00
Sebastian Valle
dfddb12255
Merge pull request #2471 from lioncash/engine-upload
video_core/engines/engine_upload: Minor tidying
2019-05-19 08:54:42 -05:00
Sebastian Valle
f9ad88f9d7
Merge pull request #2484 from ReinUsesLisp/triangle-fan
maxwell_to_gl: Add TriangleFan primitive topology
2019-05-19 08:53:29 -05:00
Lioncash
e310d943b8 shader/shader_ir: Remove unnecessary inline specifiers
constexpr internally links by default, so the inline specifier is
unnecessary.
2019-05-19 08:23:15 -04:00
Lioncash
212b148923 shader/shader_ir: Simplify constructors for OperationNode
Many of these constructors don't even need to be templated. The only
ones that need to be templated are the ones that actually make use of
the parameter pack.

Even then, since std::vector accepts an initializer list, we can supply
the parameter pack directly to it instead of creating our own copy of
the list, then copying it again into the std::vector.
2019-05-19 08:23:14 -04:00
Lioncash
81e7e63080 shader/shader_ir: Remove unnecessary template parameter packs from Operation() overloads where applicable
These overloads don't actually make use of the parameter pack, so they
can be turned into regular non-template function overloads.
2019-05-19 08:23:14 -04:00
Lioncash
e09ee0ff23 shader/shader_ir: Mark tracking functions as const member functions
These don't actually modify instance state, so they can be marked as
const member functions
2019-05-19 08:23:09 -04:00
Lioncash
ce04ab38bb shader/shader_ir: Place implementations of constructor and destructor in cpp file
Given the class contains quite a lot of non-trivial types, place the
constructor and destructor within the cpp file to avoid inlining
construction and destruction code everywhere the class is used.
2019-05-19 04:02:02 -04:00
Lioncash
3356ea5bc2 gl_shader_gen: std::move objects where applicable
Avoids performing copies into the pair being returned. Instead, we can
just move the resources into the pair, avoiding the need to make copies
of both the std::string and ShaderEntries struct.
2019-05-19 03:46:54 -04:00
Lioncash
0a7f09a99b gl_shader_disk_cache: in-class initialize virtual file offset of ShaderDiskCacheOpenGL
Given the offset is assigned a fixed value in the constructor, we can
just assign it directly and get rid of the need to write the name of the
variable again in the constructor initializer list.
2019-05-19 02:55:18 -04:00
Lioncash
634b78a4c6 gl_shader_disk_cache: Default ShaderDiskCacheOpenGL's destructor in the cpp file
Given the disk shader cache contains non-trivial types, we should
default it in the cpp file in order to prevent inlining of the
complex destruction logic.
2019-05-19 02:50:50 -04:00
Lioncash
7fdc644c44 gl_shader_disk_cache: Make hash specializations noexcept
The standard library expects hash specializations that don't throw
exceptions. Make this explicit in the type to allow selection of better
code paths if possible in implementations.
2019-05-19 02:46:45 -04:00
Lioncash
683c4e523f gl_shader_disk_cache: Remove redundant code string construction in LoadDecompiledEntry()
We don't need to load the code into a vector and then construct a string
over the data. We can just create a string with the necessary size ahead
of time, and read the data directly into it, getting rid of an
unnecessary heap allocation.
2019-05-19 02:46:44 -04:00
Lioncash
5e4c227608 gl_shader_disk_cache: Make variable non-const in decompiled entry case
std::move does nothing when applied to a const variable. Resources can't
be moved if the object is immutable. With this change, we don't end up
making several unnecessary heap allocations and copies.
2019-05-19 02:46:44 -04:00
Lioncash
f417be9d3b gl_shader_disk_cache: Special-case boolean handling
Booleans don't have a guaranteed size, but we still want to have them
integrate into the disk cache system without needing to actually use a
different type. We can do this by supplying non-template overloads for
the bool type.

Non-template overloads always have precedence during function
resolution, so this is safe to provide.

This gets rid of the need to smatter ternary conditionals, as well as
the need to use u8 types to store the value in.
2019-05-19 02:46:38 -04:00
ReinUsesLisp
21ea8b2fcb gl_rasterizer: Limit OpenGL point size to a minimum of 1 2019-05-18 03:07:29 -03:00
ReinUsesLisp
52340c3294 maxwell_to_gl: Add TriangleFan primitive topology 2019-05-17 19:58:02 -03:00
ReinUsesLisp
a652e58c54 gl_rasterizer: Pass the right number of array quad vertices count 2019-05-17 17:08:34 -03:00
Fernando Sahmkow
fc975e9021 maxwell_3d: reduce sevirity of different component formats assert.
This was reduced due to happening on most games and at such constant
rate that it affected performance heavily for the end user. In general,
we are well aware of the assert and an implementation is already
planned.
2019-05-14 17:12:54 -04:00
Lioncash
b01cce716e video_core/engines/engine_upload: Amend constructor initializer list order
Silences a -Wreorder warning.
2019-05-14 13:43:28 -04:00
Lioncash
9b6d993e52 video_core/engines/engine_upload: Default destructor in the cpp file
Avoids inlining destruction logic where applicable, and also makes
forward declarations not cause unexpected compilation errors depending
on where the State class is used.
2019-05-14 13:41:41 -04:00
Lioncash
ec1c69258a video_core/engines/engine_upload: Remove unnecessary const on parameters in function declarations
These only apply in the definition of the function. They can be omitted
from the declaration.
2019-05-14 13:40:09 -04:00
Lioncash
0f83c8dffa video_core/engines/engine_upload: Remove unnecessary includes 2019-05-14 13:39:04 -04:00
Lioncash
5db1b54b58 video_core/engines/maxwell3d: Get rid of three magic values in CallMethod()
We can use the named constant instead of using 32 directly.
2019-05-14 09:02:47 -04:00
Lioncash
48ce5880a0 video_core/engines/maxwell_3d: Simplify for loops into ranged for loops within InitializeRegisterDefaults()
Lessens the amount of code that needs to be read, and gets rid of the
need to introduce an indexing variable. Instead, we just operate on the
objects directly.
2019-05-14 08:53:19 -04:00
Lioncash
c212fc9b2c video_core/engines/maxwell_3d: Add is_trivially_copyable_v check for Regs
std::memset is used to clear the entire register structure, which
requires that the Regs struct be trivially copyable (otherwise undefined
behavior is invoked). This prevents the case where a non-trivial type is
potentially added to the struct.
2019-05-14 08:47:56 -04:00
Lioncash
d6d809db87 yuzu: Remove explicit types from locks where applicable
With C++17's deduction guides, the type doesn't need to be explicitly
specified within locking primitives anymore.
2019-05-14 08:18:48 -04:00
Lioncash
c5129a3a58 video_core/gpu_thread: Remove redundant copy constructor for CommandDataContainer
std::move within a copy constructor (on a data member that isn't
mutable) will always result in a copy. Because of that, the behavior of
this copy constructor is identical to the one that would be generated
automatically by the compiler, so we can remove it.
2019-05-14 08:09:17 -04:00
Mat M
c4d549919f
Merge pull request #2462 from lioncash/video-mm
video_core/memory_manager: Minor tidying
2019-05-14 06:40:33 -04:00
Mat M
dadcf317dc
Merge pull request #2461 from lioncash/unused-var
video_core: Remove a few unused variables and functions
2019-05-14 06:36:26 -04:00
Rodrigo Locatti
940a71089d
Merge pull request #2413 from FernandoS27/opt-gpu
Rasterizer Cache: refactor flushing & optimize memory usage of surfaces
2019-05-13 23:01:59 -03:00
Sebastian Valle
9ef45f00bf
GPU/MMEInterpreter: Ignore the 'exit' flag when it's executed inside a delay slot.
It seems instructions marked with the 'exit' flag will not cause an exit when executed within a delay slot.

This was hwtested by fincs.
2019-05-12 16:38:51 -05:00
Lioncash
716fbaef74 video_core/memory_manager: Mark IsBlockContinuous() as a const member function
Corrects the typo in its name and marks the function as a const member
function, given it doesn't actually modify memory manager state.
2019-05-09 19:14:36 -04:00
Lioncash
d4bcd006b2 video_core/memory_manager: Mark the constructor as explicit
Prevents implicit converting constructions of the memory manager.
2019-05-09 19:10:26 -04:00
Lioncash
fd12788967 video_core/memory_manager: Default the destructor within the cpp file
Makes the class less surprising when it comes to forward declaring the
type, and also prevents inlining the destruction code of the class,
given it contains non-trivial types.
2019-05-09 19:10:13 -04:00
Lioncash
53afe47cec video_core/memory_manager: Amend doxygen comments
Corrects references to non-existent parameters and corrects typos.
2019-05-09 19:09:19 -04:00
Lioncash
5235b053b4 video_core/memory_manager: Remove superfluous const from function declarations
These are able to be omitted from the declaration of functions, since
they don't do anything at the type system level. The definitions of the
functions can retain the use of const though, since they make the
variables immutable in the implementation of the function where they're
used.
2019-05-09 18:59:49 -04:00
Lioncash
b6408e9671 video_core/renderer_opengl/gl_shader_cache: Correct member initialization order
Silences a -Wreorder warning.
2019-05-09 18:55:47 -04:00
Lioncash
e43ba3acd4 video_core/shader/decode/texture: Remove unused variable from GetTld4Code() 2019-05-09 18:49:56 -04:00
Lioncash
e3c45b4338 renderer_vulkan/vk_shader_decompiler: Remove unused variable from DeclareInternalFlags() 2019-05-09 18:47:48 -04:00
Lioncash
175fe8aaeb video_core/renderer_opengl/gl_shader_decompiler: Remove unused Composite() function
This isn't used at all, so it can be removed.
2019-05-09 18:45:26 -04:00
Lioncash
6d28d288a3 video_core/renderer_opengl/gl_rasterizer_cache: Remove unused variable in UploadGLMipmapTexture()
This variable is unused entirely, so it can be removed.
2019-05-09 18:42:48 -04:00
Lioncash
ba165b1092 video_core/gpu_thread: Remove unused local variable
Instead of retrieving the data from the std::variant instance, we can
just check if the variant contains that type of data.

This is essentially the same behavior, only it returns a bool indicating
whether or not the type in the variant is currently active, instead of
actually retrieving the data.
2019-05-09 18:39:21 -04:00
Lioncash
c56d893e77 video_core/textures/astc: Remove unused variables
Silences a few compilation warnings.
2019-05-09 18:33:36 -04:00
bunnei
daca045fcd
Merge pull request #2442 from FernandoS27/astc-fix
Fix Layered ASTC Textures
2019-05-09 13:23:14 -04:00
bunnei
f69d3a6351
Merge pull request #2443 from ReinUsesLisp/skip-repeated-variants
gl_shader_disk_cache: Skip stored shader variants instead of asserting
2019-05-09 13:22:42 -04:00
bunnei
c27b81cb85
Merge pull request #2429 from FernandoS27/compute
Corrections and Implementation on GPU Engines
2019-05-09 13:19:22 -04:00
Fernando Sahmkow
3a08c3207b Correct possible error on Rasterizer Caches
There was a weird bug that could happen if the object died directly and
the cache address wasn't stored.
2019-05-07 12:33:10 -04:00
Lioncash
9e15193ef8
shader/decode/texture: Remove unused variable
This isn't used anywhere, so we can get rid of it.
2019-05-04 02:10:38 -04:00
Lioncash
08b270676b
gl_rasterizer: Silence unused variable warning
Makes use of src, so it's not considered unused.
2019-05-04 02:00:17 -04:00
ReinUsesLisp
d4df803b2b shader_ir/other: Implement IPA.IDX 2019-05-02 21:46:37 -03:00
ReinUsesLisp
5321cdd276 gl_shader_decompiler: Skip physical unused attributes 2019-05-02 21:46:37 -03:00
ReinUsesLisp
28bffb1ffa shader_ir/memory: Assert on non-32 bits ALD.PHYS 2019-05-02 21:46:25 -03:00
ReinUsesLisp
fe700e1856 shader: Add physical attributes commentaries 2019-05-02 21:46:25 -03:00
ReinUsesLisp
c6f9e651b2 gl_shader_decompiler: Implement GLSL physical attributes 2019-05-02 21:46:25 -03:00
ReinUsesLisp
71aa9d0877 shader_ir/memory: Implement physical input attributes 2019-05-02 21:46:25 -03:00
ReinUsesLisp
b7d412c99b gl_shader_decompiler: Abstract generic attribute operations 2019-05-02 21:46:25 -03:00
ReinUsesLisp
bd81a03d9d gl_shader_decompiler: Declare all possible varyings on physical attribute usage 2019-05-02 21:46:25 -03:00
ReinUsesLisp
06b363c9b5 shader: Remove unused AbufNode Ipa mode 2019-05-02 21:46:25 -03:00
ReinUsesLisp
002ecbea19 shader_ir/memory: Emit AL2P IR 2019-05-02 21:46:25 -03:00
ReinUsesLisp
7632a7d6d2 shader_bytecode: Add AL2P decoding 2019-05-02 21:46:25 -03:00
Fernando Sahmkow
e64c41efe8 Refactors and name corrections. 2019-05-01 15:31:39 -04:00
ReinUsesLisp
4aa081b4e7 gl_shader_disk_cache: Skip stored shader variants instead of asserting
Instead of asserting on already stored shader variants, silently skip them.
This shouldn't be happening but when a shader is invalidated and it is
not stored in the shader cache, this assert would hit and save that
shader anyways when the asserts are disabled.
2019-05-01 00:36:11 -03:00
Fernando Sahmkow
95261639fb Fix Layered ASTC Textures
By adding the missing layer offset in ASTC compression.
2019-04-30 23:02:31 -04:00
bunnei
79e54abe19
Merge pull request #2100 from FreddyFunk/disk-cache-precompiled-file
gl_shader_disk_cache: Improve precompiled shader cache generation speed and size
2019-04-30 19:24:01 -04:00
bunnei
91e239d66f
Merge pull request #2435 from ReinUsesLisp/misc-vc
shader_ir: Miscellaneous fixes
2019-04-28 22:29:43 -04:00
bunnei
c52233ec8b
Merge pull request #2322 from ReinUsesLisp/wswitch
video_core: Silent -Wswitch warnings
2019-04-28 22:24:58 -04:00
bunnei
9a3737120d
Merge pull request #2423 from FernandoS27/half-correct
Corrections on Half Float operations: HADD2 HMUL2 and HFMA2
2019-04-28 22:24:22 -04:00
ReinUsesLisp
2156e52014 shader_ir: Move Sampler index entry in operand< to sort declarations 2019-04-26 01:13:05 -03:00
ReinUsesLisp
b77b4b76bb shader_ir: Add missing entry to Sampler operand< comparison 2019-04-26 01:11:24 -03:00
ReinUsesLisp
0b91087a1e shader_ir/texture: Fix sampler const buffer key shift 2019-04-26 01:09:29 -03:00
FreddyFunk
1a3ff252a4 Re added new lines at the end of files 2019-04-23 23:19:28 +02:00
unknown
3091b40691 gl_shader_disk_cache: Compress precompiled shader cache file with Zstandard 2019-04-23 22:24:31 +02:00
unknown
9db2c734c9 gl_shader_disk_cache: Use VectorVfsFile for the virtual precompiled shader cache file 2019-04-23 22:24:23 +02:00
unknown
3fe542cf60 gl_shader_disk_cache: Remove per shader compression 2019-04-23 21:40:01 +02:00
Fernando Sahmkow
b3118ee316 Fixes and Corrections to DMA Engine 2019-04-23 15:28:18 -04:00
Hexagon12
8df9449bb8
Merge pull request #2422 from ReinUsesLisp/fixup-samplers
gl_state: Fix samplers memory corruption
2019-04-23 18:30:35 +03:00
Hexagon12
b2fbcaae30
Merge pull request #2425 from FernandoS27/y-direction
Fix flipping on some games by applying Y direction register
2019-04-23 18:29:29 +03:00
Fernando Sahmkow
f1e5314f1a Add Swizzle Parameters to the DMA engine 2019-04-23 11:21:00 -04:00
Fernando Sahmkow
e140e2ebc6 Add Documentation Headers to all the GPU Engines 2019-04-23 08:44:52 -04:00
Fernando Sahmkow
021d28c9b8 Corrections and styling 2019-04-23 08:02:24 -04:00
bunnei
4fad91ca45
Merge pull request #2383 from ReinUsesLisp/aoffi-test
gl_shader_decompiler: Disable variable AOFFI on unsupported devices
2019-04-22 22:14:02 -04:00
Fernando Sahmkow
701ce1c9d0 Implement Maxwell3D Data Upload 2019-04-22 19:27:36 -04:00
Fernando Sahmkow
e4ff140b99 Introduce skeleton of the GPU Compute Engine. 2019-04-22 19:05:43 -04:00
Fernando Sahmkow
a91d3fc639 Revamp Kepler Memory to use a subegine to manage uploads 2019-04-22 18:50:56 -04:00
bunnei
b5889cbd6f
Merge pull request #2403 from FernandoS27/compressed-linear
Support compressed formats on linear textures.
2019-04-22 17:09:42 -04:00
bunnei
68b707711a
Merge pull request #2411 from FernandoS27/unsafe-gpu
GPU Manager: Implement ReadBlockUnsafe and WriteBlockUnsafe
2019-04-22 17:09:00 -04:00
bunnei
01100f8afd
Merge pull request #2400 from FernandoS27/corret-kepler-mem
Implement Kepler Memory on both Linear and BlockLinear.
2019-04-22 16:47:05 -04:00
Fernando Sahmkow
4c36b78567 Rasterizer Cache: Use a temporal storage for Surfaces loading/flushing.
This PR should heavily reduce memory usage since temporal buffers are no
longer stored per Surface but instead managed by the Rasterizer Cache.
2019-04-21 11:42:07 -04:00
Fernando Sahmkow
623b2e4b8f Corrections Half Float operations on const buffers and implement saturation. 2019-04-20 21:11:33 -04:00
bunnei
da0c3bc658
Merge pull request #2407 from FernandoS27/f2f
Do some corrections in conversion shader instructions.
2019-04-20 00:42:34 -04:00
bunnei
650d9b1044
Merge pull request #2409 from ReinUsesLisp/half-floats
shader_ir/decode: Miscellaneous fixes to half-float decompilation
2019-04-19 21:31:52 -04:00
Fernando Sahmkow
08cdcc2871 Apply Position Y Direction 2019-04-19 20:49:00 -04:00
Fernando Sahmkow
a3eb91ed8c RasterizerCache Redesign: Flush
flushing is now responsability of children caches instead of the cache 
object. This change will allow the specific cache to pass extra 
parameters on flushing and will allow more flexibility.
2019-04-19 20:44:56 -04:00
Fernando Sahmkow
db4b2bc798 make ReadBlockunsafe and WriteBlockunsafe, ignore invalid pages. 2019-04-19 20:35:54 -04:00
bunnei
40dc893c37
Merge pull request #2374 from lioncash/pagetable
core: Reorganize boot order
2019-04-19 19:09:20 -04:00
ReinUsesLisp
d74cb16535 gl_state: Fix samplers memory corruption
It was possible for "samplers" to be read without being written. This
addresses that.
2019-04-19 17:07:56 -03:00
ReinUsesLisp
fbe8d1ceaa video_core: Silent -Wswitch warnings 2019-04-18 15:54:39 -03:00
bunnei
4294062516
Merge pull request #2318 from ReinUsesLisp/sampler-cache
gl_sampler_cache: Port sampler cache to OpenGL
2019-04-17 21:45:56 -04:00
bunnei
5bd5140bde
Merge pull request #2348 from FernandoS27/guest-bindless
Implement Bindless Textures on Shader Decompiler and GL backend
2019-04-17 20:59:49 -04:00
bunnei
0cfbd3325b
Merge pull request #2315 from ReinUsesLisp/severity-decompiler
shader_ir/decode: Reduce the severity of common assertions
2019-04-16 22:21:19 -04:00
bunnei
21d498bc06
Merge pull request #2384 from ReinUsesLisp/gl-state-clear
gl_rasterizer: Apply just the needed state on Clear
2019-04-16 22:19:03 -04:00
bunnei
1b83f255c2
Merge pull request #2092 from ReinUsesLisp/stg
shader/memory: Implement STG and global memory flushing
2019-04-16 22:15:17 -04:00
Fernando Sahmkow
d0082de82a Implement IsBlockContinous
This detects when a GPU Memory Block is not continous within host cpu
memory.
2019-04-16 18:49:35 -04:00
Fernando Sahmkow
da91e6e4b6 Apply Const correctness to SwizzleKepler and replace u32 for size_t on iterators. 2019-04-16 12:00:46 -04:00
Fernando Sahmkow
13d626fc21 Use ReadBlockUnsafe for fetyching DMA CommandLists 2019-04-16 11:22:34 -04:00
Fernando Sahmkow
06d1c5a991 Document unsafe versions and add BlockCopyUnsafe 2019-04-16 10:11:35 -04:00
Fernando Sahmkow
6fc562a9aa Use ReadBlockUnsafe for Shader Cache 2019-04-15 23:34:03 -04:00
Fernando Sahmkow
ef381e6924 Use ReadBlockUnsafe on TIC and TSC reading
Use ReadBlockUnsafe on TIC and TSC reading as memory is never flushed
from host GPU there.
2019-04-15 23:10:24 -04:00
Fernando Sahmkow
367704aa82 GPU MemoryManager: Implement ReadBlockUnsafe and WriteBlockUnsafe 2019-04-15 23:01:35 -04:00
Fernando Sahmkow
3e96c367bd Use WriteBlock and ReadBlock. 2019-04-15 22:42:34 -04:00
Fernando Sahmkow
bec28d692d Implement Block Linear copies in Kepler Memory. 2019-04-15 21:22:16 -04:00
ReinUsesLisp
ef8245bed2 vk_shader_decompiler: Add missing operations 2019-04-15 21:32:57 -03:00
ReinUsesLisp
f43995ec53 shader_ir/decode: Fix half float pre-operations and remove MetaHalfArithmetic
Operations done before the main half float operation (like HAdd) were
managing a packed value instead of the unpacked one. Adding an unpacked
operation allows us to drop the per-operand MetaHalfArithmetic entry,
simplifying the code overall.
2019-04-15 21:16:10 -03:00
ReinUsesLisp
abcbcb1b2a gl_shader_decompiler: Fix MrgH0 decompilation
GLSL decompilation for HMergeH0 was wrong. This addresses that issue.
2019-04-15 21:16:10 -03:00
ReinUsesLisp
64613db605 shader_ir/decode: Implement half float saturation 2019-04-15 21:16:10 -03:00
ReinUsesLisp
90cbf89303 shader_ir/decode: Reduce severity of unimplemented half-float FTZ 2019-04-15 21:16:09 -03:00
ReinUsesLisp
acf618afbc renderer_opengl: Implement half float NaN comparisons 2019-04-15 21:13:26 -03:00
ReinUsesLisp
ae46ad48ed shader_ir: Avoid using static on heap-allocated objects
Using static here might be faster at runtime, but it adds a heap
allocation called before main.
2019-04-15 21:12:43 -03:00
Fernando Sahmkow
aa471274d9 Do some corrections in conversion shader instructions.
Corrects encodings for I2F, F2F, I2I and F2I
Implements Immediate variants of all four conversion types.
Add assertions to unimplemented stuffs.
2019-04-15 19:16:27 -04:00
Fernando Sahmkow
8a099ac99f Correct Kepler Memory on Linear Pushes. 2019-04-15 14:51:36 -04:00
Fernando Sahmkow
773d955dfa Support compressed formats on linear textures. 2019-04-15 13:56:09 -04:00
Fernando Sahmkow
bf561e4340 Correct Pitch in Fermi2D 2019-04-15 12:24:29 -04:00
ReinUsesLisp
f15c59a164 gl_shader_decompiler: Use variable AOFFI on supported hardware 2019-04-14 05:13:19 -03:00
ReinUsesLisp
5c280e6ff0 shader_ir: Implement STG, keep track of global memory usage and flush 2019-04-14 00:25:32 -03:00
bunnei
c9454c8422
Merge pull request #2373 from FernandoS27/z32
Set Pixel Format to Z32 if its R32F and depth compare enabled, and Implement format ZF32_X24S8
2019-04-13 22:14:51 -04:00
bunnei
ee2206a1b7
Merge pull request #2386 from ReinUsesLisp/shader-manager
gl_shader_manager: Move code to source file and minor clean up
2019-04-13 22:09:27 -04:00
Lioncash
6d0551196d video_core/gpu: Create threads separately from initialization
Like with CPU emulation, we generally don't want to fire off the threads
immediately after the relevant classes are initialized, we want to do
this after all necessary data is done loading first.

This splits the thread creation into its own interface member function
to allow controlling when these threads in particular get created.
2019-04-11 22:11:40 -04:00
bunnei
ea80e2bc57
Merge pull request #2235 from ReinUsesLisp/spirv-decompiler
vk_shader_decompiler: Implement a SPIR-V decompiler
2019-04-11 21:54:23 -04:00
Fernando Sahmkow
c9305959d3 gl_rasterizer_cache: Relax restrictions on FastCopySurface and FastLayeredCopySurface 2019-04-11 13:14:28 -04:00
bunnei
6951741a94
Merge pull request #2278 from ReinUsesLisp/vc-texture-cache
video_core: Implement API agnostic view based texture cache
2019-04-10 21:17:35 -04:00
bunnei
0371650bd7
Merge pull request #2372 from FernandoS27/fermi-fix
Correct Fermi Copy on Linear Textures.
2019-04-10 21:17:03 -04:00
ReinUsesLisp
93af663683 gl_shader_manager: Move code to source file and minor clean up 2019-04-10 19:29:15 -03:00
ReinUsesLisp
6df25e9c7b gl_rasterizer: Apply just the needed state on Clear 2019-04-10 18:13:15 -03:00
ReinUsesLisp
0032821864 gl_device: Implement interface and add uniform offset alignment 2019-04-10 15:56:12 -03:00
ReinUsesLisp
75d23a3679 vk_shader_decompiler: Implement flow primitives 2019-04-10 14:20:25 -03:00
ReinUsesLisp
58ad8dfac6 vk_shader_decompiler: Implement most common texture primitives 2019-04-10 14:20:25 -03:00
ReinUsesLisp
4667ed8e22 vk_shader_decompiler: Implement texture decompilation helper functions 2019-04-10 14:20:25 -03:00
ReinUsesLisp
676172e20d vk_shader_decompiler: Implement Assign and LogicalAssign 2019-04-10 14:20:25 -03:00
ReinUsesLisp
d316d248ab vk_shader_decompiler: Implement non-OperationCode visits 2019-04-10 14:20:25 -03:00
ReinUsesLisp
b758c861b0 vk_shader_decompiler: Implement OperationCode decompilation interface 2019-04-10 14:20:25 -03:00
ReinUsesLisp
fec4eb9776 vk_shader_decompiler: Implement Visit 2019-04-10 14:20:25 -03:00
ReinUsesLisp
ca51f99840 vk_shader_decompiler: Implement labels tree and flow 2019-04-10 14:20:25 -03:00
ReinUsesLisp
13aa664f3f vk_shader_decompiler: Implement declarations 2019-04-10 14:20:25 -03:00
ReinUsesLisp
ad53b233c5 vk_shader_decompiler: Declare and stub interface for a SPIR-V decompiler 2019-04-10 14:20:25 -03:00
ReinUsesLisp
970d9e57c8 video_core: Add sirit as optional dependency with Vulkan
sirit is a runtime assembler for SPIR-V
2019-04-10 14:20:25 -03:00
bunnei
97648f4841
Merge pull request #2345 from ReinUsesLisp/multibind
gl_rasterizer: Use ARB_multi_bind to update buffers with a single call per drawcall
2019-04-10 11:23:19 -04:00
bunnei
ed9dba89d3
Merge pull request #2375 from FernandoS27/fix-ldc
Remove unnecessary bounding in LD_C
2019-04-09 21:23:24 -04:00
Fernando Sahmkow
c9f35d96be Remove bounding in LD_C 2019-04-09 20:02:11 -04:00
bunnei
2598433f9c
Merge pull request #2354 from lioncash/header
video_core/texures/texture: Remove unnecessary includes
2019-04-09 19:19:41 -04:00
bunnei
353a099481
Merge pull request #2366 from FernandoS27/xmad-fix
Correct XMAD mode, psl and high_b on different encodings.
2019-04-09 19:15:01 -04:00
bunnei
bc7e149835
Merge pull request #2369 from FernandoS27/mip-align
gl_backend: Align Pixel Storage
2019-04-09 17:20:43 -04:00
Fernando Sahmkow
cd91e98dab Correct Fermi Copy on Linear Textures. 2019-04-09 14:13:58 -04:00
Fernando Sahmkow
7c458311d3 Implement Texture Format ZF32_X24S8. 2019-04-09 12:33:46 -04:00
Fernando Sahmkow
b0aa8ad736 Correct depth compare with color formats for R32F 2019-04-09 12:06:59 -04:00
Fernando Sahmkow
9f16833097 gl_backend: Align Pixel Storage
This commit makes sure GL reads on the correct pack size for the
respective texture buffer.
2019-04-08 17:16:02 -04:00
Fernando Sahmkow
5c55ae4e18 Correct LOP_IMN encoding 2019-04-08 13:39:12 -04:00
Fernando Sahmkow
16adc735a5 Correct XMAD mode, psl and high_b on different encodings. 2019-04-08 13:01:17 -04:00
Fernando Sahmkow
ef8be408d3 Adapt Bindless to work with AOFFI 2019-04-08 12:07:56 -04:00
Fernando Sahmkow
492040bd9c Move ConstBufferAccessor to Maxwell3d, correct mistakes and clang format. 2019-04-08 11:36:11 -04:00
Fernando Sahmkow
797e351bf8 Fix bad rebase 2019-04-08 11:35:22 -04:00
Fernando Sahmkow
c60b0b8432 Fix TMML 2019-04-08 11:35:22 -04:00
Fernando Sahmkow
a77e9a27b0 Simplify ConstBufferAccessor 2019-04-08 11:35:19 -04:00
Fernando Sahmkow
fd4e994de3 Refactor GetTextureCode and GetTexCode to use an optional instead of optional parameters 2019-04-08 11:35:18 -04:00
Fernando Sahmkow
4841440382 Implement TXQ_B 2019-04-08 11:29:52 -04:00
Fernando Sahmkow
189bd1980c Implement TMML_B 2019-04-08 11:29:49 -04:00
Fernando Sahmkow
ac3ba9a33e Corrections to TEX_B 2019-04-08 11:28:44 -04:00
Fernando Sahmkow
90d06acfed Fixes to Const Buffer Accessor and Formatting 2019-04-08 11:23:47 -04:00
Fernando Sahmkow
7af82ca022 Implement Bindless Handling on SetupTexture 2019-04-08 11:23:46 -04:00
Fernando Sahmkow
fe392fff24 Unify both sampler types. 2019-04-08 11:23:45 -04:00
Fernando Sahmkow
e28fd3d0a5 Implement Bindless Samplers and TEX_B in the IR. 2019-04-08 11:23:42 -04:00
Fernando Sahmkow
c4ac05c82c Implement Const Buffer Accessor 2019-04-08 11:19:34 -04:00
bunnei
f14328bf0a
Merge pull request #2300 from FernandoS27/null-shader
shader_cache: Permit a Null Shader in case of a bad host_ptr.
2019-04-07 17:58:27 -04:00
bunnei
c2fee0e519
Merge pull request #2355 from ReinUsesLisp/sync-point
maxwell_3d: Reduce severity of ProcessSyncPoint
2019-04-07 17:56:11 -04:00
bunnei
8aaf418bd6
Merge pull request #2306 from ReinUsesLisp/aoffi
shader_ir: Implement AOFFI for TEX and TLD4
2019-04-07 17:52:30 -04:00
bunnei
6b18a1592f
Merge pull request #2321 from ReinUsesLisp/gl-state-rework
gl_state: Rework to enable individual applies
2019-04-07 17:50:07 -04:00
bunnei
21a4e7deea
Merge pull request #2098 from FreddyFunk/disk-cache-zstd
gl_shader_disk_cache: Use Zstandard for compression
2019-04-07 17:48:33 -04:00
bunnei
80162888e6
Merge pull request #2352 from bunnei/mem-manager-fixes
memory_manager: Improved implementation of read/write/copy block.
2019-04-07 17:44:59 -04:00
Fernando Sahmkow
021cd56bc9 Permit a Null Shader in case of a bad host_ptr. 2019-04-07 07:52:01 -04:00
ReinUsesLisp
ddcb711ee8 maxwell_3d: Reduce severity of ProcessSyncPoint 2019-04-06 02:18:20 -03:00
Lioncash
89c106e31b video_core/textures/convert: Replace include with a forward declaration
Avoids dragging in a direct dependency in a header.
2019-04-06 00:14:36 -04:00
Lioncash
fbf452ab0e video_core/texures/texture: Remove unnecessary includes
Nothing in this header relies on common_funcs or the memory manager.

This gets rid of reliance on indirect inclusions in the OpenGL caches.
2019-04-06 00:03:35 -04:00
bunnei
864280fabc
Merge pull request #2317 from FernandoS27/sync
Implement SyncPoint Register in the GPU.
2019-04-05 23:50:54 -04:00
bunnei
e3402d976d
Merge pull request #2346 from lioncash/header
video_core/engines: Remove unnecessary inclusions where applicable
2019-04-05 23:44:27 -04:00
bunnei
20be92d5e6 memory_manager: Improved implementation of read/write/copy block.
- Fixes graphical issues with Chocobo's Mystery Dungeon EVERY BUDDY!
- Fixes a crash with Mario Tennis Aces
2019-04-05 23:43:34 -04:00
bunnei
89b8801a97
Merge pull request #2350 from lioncash/vmem
video_core/memory_manager: Mark a few member functions with the const qualifier
2019-04-05 23:40:54 -04:00
bunnei
41890a84be
Merge pull request #2347 from lioncash/trunc
video_core/gpu_thread: Silence truncation warning in ThreadManager's constructor
2019-04-05 23:39:31 -04:00
bunnei
520e4e5d4b
Merge pull request #2327 from ReinUsesLisp/crash-safe-visit
gl_shader_decompiler: Return early when an operation is invalid
2019-04-05 23:36:18 -04:00
bunnei
cb2209d06a
Merge pull request #2337 from lioncash/temporary
gl_shader_decompiler: Rename GenerateTemporal() to GenerateTemporary()
2019-04-05 23:35:31 -04:00
Lioncash
00e7190e29 video_core/macro_interpreter: Remove assertion within FetchParameter()
We can just use .at(), which essentially does the same thing, but with
less code.
2019-04-05 22:56:58 -04:00
Lioncash
1efdb4897e video_core/macro_interpreter: Simplify GetRegister()
Given we already ensure nothing can set the zeroth register in
SetRegister(), we don't need to check if the index is zero and special
case it. We can just access the register normally, since it's already
going to be zero.

We can also replace the assertion with .at() to perform the equivalent
behavior inline as part of the API.
2019-04-05 22:55:13 -04:00
Lioncash
c13fbe6a41 video_core/memory_manager: Make Read() a const qualified member function
Given this doesn't actually alter internal state, this can be made a
const member function.
2019-04-05 20:30:48 -04:00
Lioncash
76ef6e5c2b video_core/memory_manager: Make ReadBlock() a const qualifier member function
Now, since we have a const qualified variant of GetPointer(), we can put
it to use in ReadBlock() to retrieve the source pointer that is passed
into memcpy.

Now block reading may be done from a const context.
2019-04-05 20:28:44 -04:00
Lioncash
34510bcda8 video_core/memory_manager: Add a const qualified variant of GetPointer()
Allows retrieving read-only pointers from a const context externally.
2019-04-05 20:25:28 -04:00
Lioncash
085b388a7a video_core/memory_manager: Make FindFreeRegion() a const member function
This doesn't modify internal state, so it can be made a const member
function.
2019-04-05 20:22:55 -04:00
Lioncash
9dec087fca video_core/memory_manager: Make GpuToCpuAddress() a const member function
This doesn't modify any internal state, so it can be made a const member
function to allow its use in const contexts.
2019-04-05 20:18:29 -04:00
Fernando Sahmkow
fc91e21206 Implement SyncPoint Register in the GPU. 2019-04-05 19:19:30 -04:00
Lioncash
30ce9b2b5c video_core/gpu_thread: Silence truncation warning in ThreadManager's constructor
Since c5d41fd812 callback parameters were
changed to use an s64 to represent late cycles instead of an int, so
this was causing a truncation warning to occur here. Changing it to s64
is sufficient to silence the warning.
2019-04-05 18:37:37 -04:00
Lioncash
22f02076c6 video_core/engines: Make memory manager members private
These aren't used externally by anything, so they can be made private
data members.
2019-04-05 18:26:43 -04:00
Lioncash
26223f8124 video_core/engines: Remove unnecessary inclusions where applicable
Replaces header inclusions with forward declarations where applicable
and also removes unused headers within the cpp file. This reduces a few
more dependencies on core/memory.h
2019-04-05 18:26:32 -04:00
ReinUsesLisp
34c3e2c786 renderer_opengl/utils: Skip empty binds 2019-04-05 19:19:49 -03:00
ReinUsesLisp
b631c09e72 gl_rasterizer: Use ARB_multi_bind to update SSBOs 2019-04-05 19:18:43 -03:00
ReinUsesLisp
2d1f054c61 gl_rasterizer: Use ARB_multi_bind to update UBOs across stages 2019-04-05 19:10:46 -03:00
bunnei
66be5150d6
Merge pull request #2282 from bunnei/gpu-asynch-v2
gpu_thread: Improve synchronization by using CoreTiming.
2019-04-04 22:38:04 -04:00
bunnei
f7d6e08688
Merge pull request #2336 from ReinUsesLisp/txq
gl_shader_decompiler: Fix TXQ types
2019-04-04 22:36:19 -04:00
Lioncash
52746ed8dc gl_shader_decompiler: Rename GenerateTemporal() to GenerateTemporary()
Temporal generally indicates a relation to time, but this is just
creating a temporary, so this isn't really an accurate name for what the
function is actually doing.
2019-04-04 19:35:04 -04:00
ReinUsesLisp
88a3c05b7b gl_shader_decompiler: Fix TXQ types
TXQ returns integer types. Shaders usually do:

R0 = TXQ(); // => int
R0 = static_cast<float>(R0);

If we don't treat it as an integer, it will cast a binary float value as
float - resulting in a corrupted number.
2019-04-04 20:07:11 -03:00
Lioncash
3fd5998d84 video_core/renderer_opengl: Remove unnecessary includes
Quite a few unused includes have built up over time, particularly on
core/memory.h. Removing these includes means the source files including
those files will no longer need to be rebuilt if they're changed, making
compilation slightly faster in this scenario.
2019-04-04 12:00:46 -04:00
bunnei
d6374b2522
Merge pull request #2093 from FreddyFunk/disk-cache-better-compression
Better LZ4 compression utilization for the disk based shader cache and the yuzu build system
2019-04-03 21:50:29 -04:00
bunnei
d7438d067f
Merge pull request #2299 from lioncash/maxwell
gl_shader_manager: Remove reliance on a global accessor within MaxwellUniformData::SetFromRegs()
2019-04-03 21:47:48 -04:00
ReinUsesLisp
78bd66d037 gl_state: Rework to enable individual applies 2019-04-03 20:26:27 -03:00
ReinUsesLisp
04979560fb shader_ir/memory: Reduce severity of LD_L cache management and log it 2019-04-03 17:12:44 -03:00
ReinUsesLisp
24abeb9a67 shader_ir/memory: Reduce severity of ST_L cache management and log it 2019-04-03 17:12:44 -03:00
ReinUsesLisp
06c1f75f21 gl_shader_decompiler: Return early when an operation is invalid 2019-04-03 16:02:09 -03:00
bunnei
7931a68d4e
Merge pull request #2302 from ReinUsesLisp/vk-swapchain
vk_swapchain: Implement a swapchain manager
2019-04-03 11:50:05 -04:00
ReinUsesLisp
576ad9a012 gl_sampler_cache: Port sampler cache to OpenGL 2019-04-02 16:58:08 -03:00
ReinUsesLisp
c5047540c9 video_core: Abstract vk_sampler_cache into a templated class 2019-04-02 15:54:11 -03:00
bunnei
4555b63750 gpu_thread: Improve synchronization by using CoreTiming. 2019-04-01 21:32:39 -04:00
Lioncash
781ab8407b general: Use deducation guides for std::lock_guard and std::unique_lock
Since C++17, the introduction of deduction guides for locking facilities
means that we no longer need to hardcode the mutex type into the locks
themselves, making it easier to switch mutex types, should it ever be
necessary in the future.
2019-04-01 12:53:47 -04:00
ReinUsesLisp
38658b38b4 gl_shader_decompiler: Hide local definitions inside an anonymous namespace 2019-03-31 00:26:34 -03:00
Mat M
da02946f4f
shader_ir/decode: Silent implicit sign conversion warning
Co-Authored-By: ReinUsesLisp <reinuseslisp@airmail.cc>
2019-03-31 00:12:54 -03:00
bunnei
1960164055
Merge pull request #2297 from lioncash/reorder
video_core: Amend constructor initializer list order where applicable
2019-03-30 20:00:26 -04:00
bunnei
e199d1e14f
Merge pull request #2298 from lioncash/variable
video_core/{gl_rasterizer, gpu_thread}: Remove unused class variables where applicable
2019-03-30 19:59:45 -04:00
ReinUsesLisp
e8abe4b77c gl_shader_decompiler: Add AOFFI backing implementation 2019-03-30 02:55:18 -03:00
ReinUsesLisp
cb68ce7c2f shader_ir/decode: Implement AOFFI for TEX and TLD4 2019-03-30 02:53:29 -03:00
ReinUsesLisp
cf4ecc1945 shader_ir: Implement immediate register tracking 2019-03-30 02:53:16 -03:00
unknown
b4857e326f common/zstd_compression: simplify decompression interface 2019-03-29 18:22:08 +01:00
unknown
aa92da205e gl_shader_disk_cache: Fixup clang format 2019-03-29 18:22:08 +01:00
unknown
35ebbbc167 gl_shader_disk_cache: Use Zstandard for compression 2019-03-29 18:22:08 +01:00
unknown
4fad477aeb gl_shader_disk_cache: Use LZ4HC with compression level 9 instead of compression level 12 for less compression time 2019-03-29 18:13:00 +01:00
unknown
c791192d64 Addressed feedback 2019-03-29 18:12:42 +01:00
unknown
74cee1b65d gl_shader_disk_cache: Use better compression for transferable and precompiled shader disk chache files 2019-03-29 16:42:19 +01:00
unknown
798d76f4c7 data_compression: Move LZ4 compression from video_core/gl_shader_disk_cache to common/data_compression 2019-03-29 16:42:19 +01:00
ReinUsesLisp
746dab407e vk_swapchain: Implement a swapchain manager 2019-03-29 00:00:51 -03:00
bunnei
76f024865d
Merge pull request #2296 from lioncash/override
video_core: Add missing override specifiers
2019-03-28 17:54:51 -04:00
Lioncash
c1ba3e3d4a gl_shader_manager: Remove unnecessary gl_shader_manager inclusion
This isn't used at all in the OpenGL shader cache, so we can remove it's
include here, meaning one less file needs to be recompiled if any
changes ever occur within that header.

core/memory.h is also not used within this file at all, so we can remove
it as well.
2019-03-28 11:16:25 -04:00
Lioncash
1650593927 gl_shader_manager: Move using statement into the cpp file
Avoids introducing Maxwell3D into the namespace for everything that
includes the header.
2019-03-28 11:16:21 -04:00
Lioncash
7d88fc83bf gl_shader_manager: Remove reliance on global accessor within MaxwellUniformData::SetFromRegs()
We can just pass in the Maxwell3D instance instead of going through the
system class to get at it.

This also lets us simplify the interface a little bit. Since we pass in
the Maxwell3D context now, we only really need to pass the shader stage
index value in.
2019-03-28 11:14:24 -04:00
Lioncash
d68716efdc gl_shader_manager: Amend Doxygen string for MaxwellUniformData
Previously only one line of the whole comment was in proper Doxygen
formatting.
2019-03-27 13:10:43 -04:00
Lioncash
947d364dba gpu_thread: Remove unused dma_pusher class member variable from ThreadManager
The pusher instance is only ever used in the constructor of the
ThreadManager for creating the thread that the ThreadManager instance
contains. Aside from that, the member is unused, so it can be removed.
2019-03-27 12:51:21 -04:00
Lioncash
e2131f7310 gl_rasterizer: Remove unused reference member variable from RasterizerOpenGL
This member variable is no longer being used, so it can be removed,
removing a dependency on EmuWindow from the rasterizer's interface"
2019-03-27 12:45:59 -04:00
Lioncash
a5fa4b311e video_core: Amend constructor initializer list order where applicable
Specifies the members in the same order that initialization would take
place in.

This also silences -Wreorder warnings.
2019-03-27 12:37:53 -04:00
Lioncash
bbe700359d video_core: Add missing override specifiers
Ensures that the signatures will always match with the base class.

Also silences a few compilation warnings.
2019-03-27 12:24:52 -04:00
Lioncash
e36f1a5ba9 video_core/gpu: Amend typo in GPU member variable name
smaphore -> semaphore
2019-03-27 12:12:57 -04:00
bunnei
e5893db3e6
Merge pull request #2256 from bunnei/gpu-vmm
gpu: Rewrite MemoryManager based on the VMManager implementation.
2019-03-22 18:41:12 -04:00
ReinUsesLisp
d708d03d20 video_core: Implement API agnostic view based texture cache
Implements an API agnostic texture view based texture cache. Classes
defined here are intended to be inherited by the API implementation and
used in API-specific code.

This implementation exposes protected virtual functions to be called
from the implementer.

Before executing any surface copies methods (defined in API-specific code)
it tries to detect if the overlapping surface is a superset and if it
is, it creates a view. Views are references of a subset of a surface, it
can be a superset view (the same as referencing the whole texture).
Current code manages 1D, 1D array, 2D, 2D array, cube maps and cube map
arrays with layer and mipmap level views. Texture 3D slices views are
not implemented.

If the view attempt fails, the fast path is invoked with the overlapping
textures (defined in the implementer). If that one fails (returning
nullptr) it will flush and reload the texture.
2019-03-22 13:34:04 -03:00
bunnei
d0dddb3e9d Revert "Devirtualize Register/Unregister and use a wrapper instead."
- Fixes graphical issues from transitions in Super Mario Odyssey.
2019-03-21 21:56:56 -04:00
bunnei
2117edd0f8 memory_manager: Cleanup FindFreeRegion. 2019-03-20 23:12:28 -04:00
bunnei
5a5fccaa23 memory_manager: Use Common::AlignUp in public interface as needed. 2019-03-20 22:58:49 -04:00
bunnei
72837e4b3d memory_manager: Bug fixes and further cleanup. 2019-03-20 22:36:03 -04:00
bunnei
19330f45d3 maxwell_dma: Check for valid source in destination before copy.
- Avoid a crash in Octopath Traveler.
2019-03-20 22:36:03 -04:00
bunnei
197dcf0b5e memory_manager: Add protections for invalid GPU addresses.
- Avoid a crash in Xenoblade Chronicles 2.
2019-03-20 22:36:03 -04:00
bunnei
21eb4cfa7f gl_rasterizer_cache: Check that backing memory is valid before creating a surface.
- Fixes a crash in Puyo Puyo Tetris.
2019-03-20 22:36:02 -04:00
bunnei
22d3dfbcd4 gpu: Rewrite virtual memory manager using PageTable. 2019-03-20 22:36:02 -04:00
bunnei
241563d15c gpu: Move GPUVAddr definition to common_types. 2019-03-20 22:36:02 -04:00
bunnei
032e4c4ca3 gl_rasterizer: Skip zero addr/sized regions on flush/invalidate. 2019-03-16 22:03:19 -04:00
bunnei
2392e146b0
Merge pull request #2244 from bunnei/gpu-mem-refactor
video_core: Refactor to use MemoryManager interface for all memory access.
2019-03-16 21:59:45 -04:00
bunnei
10118c71e0 memory: Simplify rasterizer cache operations. 2019-03-16 00:41:08 -04:00
bunnei
574e89d924 video_core: Refactor to use MemoryManager interface for all memory access.
# Conflicts:
#	src/video_core/engines/kepler_memory.cpp
#	src/video_core/engines/maxwell_3d.cpp
#	src/video_core/morton.cpp
#	src/video_core/morton.h
#	src/video_core/renderer_opengl/gl_global_cache.cpp
#	src/video_core/renderer_opengl/gl_global_cache.h
#	src/video_core/renderer_opengl/gl_rasterizer_cache.cpp
2019-03-16 00:38:48 -04:00
bunnei
2eaf6c41a4 gpu: Use host address for caching instead of guest address. 2019-03-14 22:34:42 -04:00
bunnei
84d3cdf7d7
Merge pull request #2233 from ReinUsesLisp/morton-cleanup
video_core/morton: Miscellaneous changes
2019-03-14 21:23:12 -04:00
bunnei
6788ebffc8
Merge pull request #2229 from ReinUsesLisp/vk-sampler-cache
vk_sampler_cache: Implement a sampler cache
2019-03-14 21:22:34 -04:00
bunnei
8bd17aa044
Merge pull request #2216 from ReinUsesLisp/rasterizer-system
gl_rasterizer: Use system instance passed from argument
2019-03-14 16:37:05 -04:00
bunnei
4e6c667586
Merge pull request #2227 from lioncash/override
renderer_opengl/gl_global_cache: Add missing override specifiers
2019-03-13 17:05:49 -04:00
ReinUsesLisp
ffe2e50458 video_core/morton: Use enum to describe MortonCopyPixels128 mode 2019-03-13 16:35:21 -03:00
ReinUsesLisp
6ed6129b4f video_core/morton: Remove unused parameter in MortonSwizzle 2019-03-13 16:35:10 -03:00
ReinUsesLisp
9030a8259f video_core/morton: Remove clang-format off when it's not needed 2019-03-13 16:16:45 -03:00
ReinUsesLisp
fdf76a25ab video_core/morton: Remove unused functions 2019-03-13 16:15:54 -03:00
ReinUsesLisp
a63295a872 video_core/texture: Fix up sampler lod bias 2019-03-13 00:45:54 -03:00
Mat M
a3734d7e31
vk_sampler_cache: Use operator== instead of memcmp
Co-Authored-By: ReinUsesLisp <reinuseslisp@airmail.cc>
2019-03-12 21:05:36 -03:00
ReinUsesLisp
aa59d77c3b vk_sampler_cache: Implement a sampler cache 2019-03-12 20:20:57 -03:00
ReinUsesLisp
8ebeb9ade2 video_core/texture: Add a raw representation of TSCEntry 2019-03-12 16:56:29 -03:00
bunnei
2ad44a453f
Merge pull request #2215 from ReinUsesLisp/samplers
gl_rasterizer: Encapsulate sampler queries into methods
2019-03-12 13:10:53 -04:00
Lioncash
3350c0a779 renderer_opengl/gl_global_cache: Replace indexing for assignment with insert_or_assign
The previous code had some minor issues with it, really not a big deal,
but amending it is basically 'free', so I figured, "why not?".

With the standard container maps, when:

map[key] = thing;

is done, this can cause potentially undesirable behavior in certain
scenarios. In particular, if there's no value associated with the key,
then the map constructs a default initialized instance of the value
type.

In this case, since it's a std::shared_ptr (as a type alias) that is
the value type, this will construct a std::shared_pointer, and then
assign over it (with objects that are quite large, or actively heap
allocate this can be extremely undesirable).

We also make the function take the region by value, as we can avoid a
copy (and by extension with std::shared_ptr, a copy causes an atomic
reference count increment), in certain scenarios when ownership isn't a
concern (i.e. when ReserveGlobalRegion is called with an rvalue
reference, then no copy at all occurs). So, it's more-or-less a "free"
gain without many downsides.
2019-03-11 12:20:35 -04:00
Lioncash
1070c020db renderer_opengl/gl_global_cache: Append missing override specifiers
Two of the functions here are overridden functions, so we can append
these specifiers to make it explicit.
2019-03-11 12:02:30 -04:00
ReinUsesLisp
a6c048920e gl_rasterizer: Use system instance passed from argument 2019-03-11 03:17:21 -03:00
bunnei
633ce92908
Merge pull request #2147 from ReinUsesLisp/texture-clean
shader_ir: Remove "extras" from the MetaTexture
2019-03-10 17:28:36 -04:00
bunnei
4a84921b31
Merge pull request #2143 from ReinUsesLisp/texview
gl_rasterizer_cache: Create texture views for array discrepancies
2019-03-10 17:27:49 -04:00
ReinUsesLisp
a0be7b3b92 gl_rasterizer: Encapsulate sampler queries into methods 2019-03-09 04:35:57 -03:00
ReinUsesLisp
6ee0ba64c8 gl_rasterizer: Minor logger changes 2019-03-09 03:34:49 -03:00
bunnei
160fc63c72
Merge pull request #2209 from lioncash/reorder
video_core/gpu_thread: Silence a -Wreorder warning
2019-03-08 12:04:26 -05:00
bunnei
78c803b4f3
Merge pull request #2208 from lioncash/gpu
video_core/gpu: Make GPU's destructor virtual
2019-03-08 12:03:58 -05:00
bunnei
1143923cdd
Merge pull request #2191 from ReinUsesLisp/maxwell-to-vk
maxwell_to_vk: Initial implementation
2019-03-08 11:51:08 -05:00
ReinUsesLisp
e7ac5a6adf dma_pusher: Store command_list_header by copy
Instead of holding a reference that will get invalidated by
dma_pushbuffer.pop(), hold it as a copy. This doesn't have any
performance cost since CommandListHeader is 8 bytes long.
2019-03-08 04:06:54 -03:00
Lioncash
c2d4c8b95e video_core/gpu_thread: Remove unimplemented WaitForIdle function prototype
This function didn't have a definition, so we can remove it to prevent
accidentally attempting to use it.
2019-03-07 16:08:52 -05:00
Lioncash
48a461a629 video_core/gpu_thread: Amend constructor initializer list order
Moves the data members to satisfy the order they're declared as in the
constructor initializer list.

Silences a -Wreorder warning.
2019-03-07 16:05:49 -05:00
Lioncash
24e2e601d5 video_core/gpu: Make GPU's destructor virtual
Because of the recent separation of GPU functionality into sync/async
variants, we need to mark the destructor virtual to provide proper
destruction behavior, given we use the base class within the System
class.

Prior to this, it was undefined behavior whether or not the destructor
in the derived classes would ever execute.
2019-03-07 15:59:45 -05:00
bunnei
4f352833a5
Merge pull request #2055 from bunnei/gpu-thread
Asynchronous GPU command processing
2019-03-07 10:41:53 -05:00
bunnei
076c76f4e4
Merge pull request #2149 from ReinUsesLisp/decoders-style
gl_rasterizer_cache: Move format conversion functions to their own file
2019-03-06 21:56:20 -05:00
bunnei
84ad81ee67 gpu_thread: Fix deadlock with threading idle state check. 2019-03-06 21:48:57 -05:00
bunnei
63aa08acbe gpu_thread: (HACK) Ignore flush on FlushAndInvalidateRegion. 2019-03-06 21:48:57 -05:00
bunnei
3f1b4fb23a gpu: Always flush. 2019-03-06 21:48:57 -05:00
bunnei
aaa373585c gpu: Refactor a/synchronous implementations into their own classes. 2019-03-06 21:48:57 -05:00
bunnei
7b574f406b gpu: Move command processing to another thread. 2019-03-06 21:48:57 -05:00
bunnei
d2ff93c319
Merge pull request #2190 from lioncash/ogl-global
core: Remove the global telemetry accessor function
2019-03-06 21:41:53 -05:00
bunnei
ac51d048a9 gpu: Refactor command and swap buffers interface for asynch. 2019-03-06 21:09:09 -05:00
bunnei
4483089d70 gpu: Refactor to take RendererBase instead of RasterizerInterface. 2019-03-06 21:09:09 -05:00
bunnei
22f105c06d
Merge pull request #2203 from lioncash/engines-include
video_core/engines: Remove unnecessary includes
2019-03-06 10:51:27 -05:00
Lioncash
f9ee0dc7ee video_core/engines: Remove unnecessary includes
Removes a few unnecessary dependencies on core-related machinery, such
as the core.h and memory.h, which reduces the amount of rebuilding
necessary if those files change.

This also uncovered some indirect dependencies within other source
files. This also fixes those.
2019-03-05 20:35:32 -05:00
Lioncash
42085ff110 video_core/surface: Remove obsolete TODO in PixelFormatFromRenderTargetFormat()
This isn't needed anymore, according to Hexagon
2019-03-05 10:15:06 -05:00
bunnei
07e13d6728
Merge pull request #2165 from ReinUsesLisp/unbind-tex
gl_rasterizer: Unbind textures but don't apply the gl_state
2019-03-04 13:51:59 -05:00
Lioncash
90febaf717 video_core/renderer_opengl: Replace direct usage of global system object accessors
We already pass a reference to the system object to the constructor of the renderer,
so we can just use that instead of using the global accessor functions.
2019-03-04 10:24:09 -05:00
ReinUsesLisp
1f6571b3de maxwell_to_vk: Initial implementation 2019-03-04 04:06:05 -03:00
Mat M
a461e266ea
Merge pull request #2183 from ReinUsesLisp/vk-buffer-cache-clang
vk_buffer_cache: Fix clang-format
2019-03-02 14:43:15 -05:00
bunnei
3c39b39bbc
Merge pull request #2182 from bunnei/my-wasted-friday
fuck git for ruining my day, I will learn but I will not forgive
2019-03-02 00:57:15 -05:00
ReinUsesLisp
8e84e81e74 vk_buffer_cache: Fix clang-format 2019-03-02 02:16:45 -03:00
bunnei
ab70c2583d fuck git for ruining my day, I will learn but I will not forgive 2019-03-02 00:01:34 -05:00
ReinUsesLisp
35c105a108 vk_buffer_cache: Implement a buffer cache
This buffer cache is just like OpenGL's buffer cache with some minor
style changes. It uses VKStreamBuffer.
2019-03-01 17:33:36 -03:00
ReinUsesLisp
e85066dac7 gl_rasterizer: Remove texture unbinding after dispatching a draw call
Unbinding was required when OpenGL delete operations didn't unbind a
resource if it was bound. This is no longer needed and can be removed.
2019-02-28 00:17:50 -03:00
ReinUsesLisp
bb3ab7d66c gl_state: Fixup multibind bug 2019-02-28 00:17:03 -03:00
bunnei
1b13859af8
Merge pull request #2152 from ReinUsesLisp/vk-stream-buffer
vk_stream_buffer: Implement a stream buffer
2019-02-27 21:19:15 -05:00
bunnei
1f5d6a8fed
Merge pull request #2121 from FernandoS27/texception2
Improve the Accuracy of the Rasterizer Cache through a Texception Pass
2019-02-27 21:17:55 -05:00
bunnei
66f4fd4c81
Merge pull request #2172 from lioncash/reorder
gl_rasterizer/vk_memory_manager: Silence -Wreorder warnings
2019-02-27 21:14:20 -05:00
Fernando Sahmkow
7ea097e5c2 Devirtualize Register/Unregister and use a wrapper instead. 2019-02-27 21:58:50 -04:00
Fernando Sahmkow
5a9204dbd7 Corrections and redesign. 2019-02-27 21:58:49 -04:00
Fernando Sahmkow
d6b9b51606 Fix linux compile error. 2019-02-27 21:58:48 -04:00
Fernando Sahmkow
e64fa4d2ea Remove NotifyFrameBuffer as we are doing a texception pass every drawcall. 2019-02-27 21:58:47 -04:00
Fernando Sahmkow
3558c88442 Remove certain optimizations that caused texception to fail in certain scenarios. 2019-02-27 21:58:45 -04:00
Fernando Sahmkow
e9d84ef22c Bug fixes and formatting 2019-02-27 21:58:44 -04:00
Fernando Sahmkow
5bc82d124c rasterizer_cache_gl: Implement Texception Pass 2019-02-27 21:58:43 -04:00
Fernando Sahmkow
8932001610 rasterizer_cache_gl: Implement Partial Reinterpretation of Surfaces. 2019-02-27 21:58:40 -04:00
Fernando Sahmkow
44ea2810e4 rasterizer_cache: mark reinterpreted surfaces and add ability to reload marked surfaces on next use. 2019-02-27 21:58:39 -04:00
Fernando Sahmkow
d583fc1e97 rasterizer_cache_gl: Notify on framebuffer change 2019-02-27 21:58:37 -04:00
Fernando Sahmkow
45b6d2d349 rasterizer_cache: Expose FlushObject to Child classes and allow redefining of Register and Unregister 2019-02-27 21:57:33 -04:00
bunnei
f15e2dd881
Merge pull request #2163 from ReinUsesLisp/bitset-dirty
maxwell_3d: Use std::bitset to manage dirty flags
2019-02-27 20:50:08 -05:00
ReinUsesLisp
27ddbeb01c gl_rasterizer_cache: Create texture views for array discrepancies
When a texture is sampled in a shader with a different array mode than
the cached state, create a texture view and bind that to the shader
instead.
2019-02-27 14:41:06 -03:00
bunnei
66e023fba2
Merge pull request #2167 from lioncash/namespace
common: Move Quaternion, Rectangle, Vec2, Vec3, and Vec4 into the Common namespace
2019-02-27 11:19:53 -05:00
Lioncash
16ea93c11e vk_memory_manager: Reorder constructor initializer list in terms of member declaration order
Reorders members in the order that they would actually be initialized
in. Silences a -Wreorder warning.
2019-02-27 11:08:19 -05:00
Lioncash
a6a783b3dc gl_rasterizer: Reorder constructor initializer list in terms of member declaration order
Orders the members in the order they would actually be initialized in.
Silences a -Wreorder warning.
2019-02-27 11:08:19 -05:00
Lioncash
e7eff72e83 gl_shader_disk_cache: Remove #pragma once from cpp file
This is only necessary in headers. Silences a warning with clang.
2019-02-27 11:02:49 -05:00
Lioncash
b9238edd0d common/math_util: Move contents into the Common namespace
These types are within the common library, so they should be within the
Common namespace.
2019-02-27 03:38:39 -05:00
ReinUsesLisp
0ad3c031f4 gl_rasterizer_cache: Move format conversion to its own file 2019-02-26 20:08:27 -03:00
ReinUsesLisp
0ccd490fcd decoders: Minor style changes 2019-02-26 20:08:27 -03:00
ReinUsesLisp
d91e35a50a renderer_opengl: Update pixel format tracking 2019-02-26 03:47:16 -03:00
ReinUsesLisp
5219edd715 maxwell_3d: Use std::bitset to manage dirty flags 2019-02-26 03:01:48 -03:00
ReinUsesLisp
730eb1dad7 vk_stream_buffer: Remove copy code path 2019-02-26 02:09:43 -03:00
ReinUsesLisp
5ca63d0675 shader/decode: Remove extras from MetaTexture 2019-02-26 00:11:30 -03:00
ReinUsesLisp
48e6f77c03 shader/decode: Split memory and texture instructions decoding 2019-02-26 00:11:30 -03:00
Lioncash
c1b2e35625 shader/track: Resolve variable shadowing warnings 2019-02-25 09:10:59 -05:00
bunnei
c07987dfab
Merge pull request #2118 from FernandoS27/ipa-improve
shader_decompiler: Improve Accuracy of Attribute Interpolation.
2019-02-24 23:04:22 -05:00
bunnei
c4243c07cc
Merge pull request #2119 from FernandoS27/fix-copy
rasterizer_cache_gl: Only do fast layered copy on the same format.
2019-02-24 23:03:52 -05:00
bunnei
90c780e6f3
Merge pull request #2139 from degasus/dma_pusher
video_core/dma_pusher: The full list of headers at once.
2019-02-24 04:15:49 -05:00
ReinUsesLisp
33a0597603 vk_stream_buffer: Implement a stream buffer
This manages two kinds of streaming buffers: one for unified memory
models and one for dedicated GPUs. The first one skips the copy from the
staging buffer to the real buffer, since it creates an unified buffer.

This implementation waits for all fences to finish their operation
before "invalidating". This is suboptimal since it should allocate
another buffer or start searching from the beginning. There is room for
improvement here.

This could also handle AMD's "pinned" memory (a heap with 256 MiB) that
seems to be designed for buffer streaming.
2019-02-24 04:27:51 -03:00
ReinUsesLisp
281a8bf259 vk_resource_manager: Minor VKFenceWatch changes 2019-02-24 04:19:04 -03:00
bunnei
f7090bacc5
Merge pull request #2146 from ReinUsesLisp/vulkan-scheduler
vk_scheduler: Implement a scheduler
2019-02-23 23:32:43 -05:00
bunnei
d062991643
Merge pull request #2150 from ReinUsesLisp/fixup-layer-swizzle
gl_rasterizer_cache: Fixup parameter order in layered swizzle
2019-02-23 23:31:34 -05:00
ReinUsesLisp
92050c4d86 vk_memory_manager: Fixup commit interval allocation
VKMemoryCommitImpl was using as the end of its interval "begin + end".
That ended up wasting memory.
2019-02-24 01:04:41 -03:00
ReinUsesLisp
abef11a540 gl_rasterizer_cache: Fixup parameter order in layered swizzle 2019-02-23 23:27:30 -03:00
ReinUsesLisp
f546fb35ed vk_scheduler: Implement a scheduler
The scheduler abstracts command buffer and fence management with an
interface that's able to do OpenGL-like operations on Vulkan command
buffers.

It returns by value a command buffer and fence that have to be used for
subsequent operations until Flush or Finish is executed, after that the
current execution context (the pair of command buffers and fences) gets
invalidated a new one must be fetched. Thankfully validation layers will
quickly detect if this is skipped throwing an error due to modifications
to a sent command buffer.
2019-02-22 01:33:32 -03:00
bunnei
94b27bb8a5
Merge pull request #2138 from ReinUsesLisp/vulkan-memory-manager
vk_memory_manager: Implement memory manager
2019-02-21 22:26:54 -05:00
bunnei
9539c4203b
Merge pull request #2125 from ReinUsesLisp/fixup-glstate
gl_state: Synchronize gl_state even when state is disabled
2019-02-20 21:47:46 -05:00
bunnei
ae437320c8
Merge pull request #2130 from lioncash/system_engine
video_core: Remove usages of System::GetInstance() within the engines
2019-02-20 21:24:56 -05:00
Markus Wick
6dd40976d0 video_core/dma_pusher: Simplyfy Step() logic.
As fetching command list headers and and the list of command headers is a fixed 1:1 relation now, they can be implemented within a single call.
This cleans up the Step() logic quite a bit.
2019-02-19 10:28:42 +01:00
Markus Wick
717394c980 video_core/dma_pusher: The full list of headers at once.
Fetching every u32 from memory leads to a big overhead. So let's fetch all of them as a block if possible.
This reduces the Memory::* calls by the dma_pusher by a factor of 10.
2019-02-19 09:58:38 +01:00
ReinUsesLisp
b675c97cdd vk_memory_manager: Implement memory manager
A memory manager object handles the memory allocations for a device. It
allocates chunks of Vulkan memory objects and then suballocates.
2019-02-19 03:42:28 -03:00
bunnei
4bce08d497
Merge pull request #2122 from ReinUsesLisp/vulkan-resource-manager
vk_resource_manager: Implement fence and command buffer allocator
2019-02-18 21:05:28 -05:00
bunnei
4699fdca8f
Merge pull request #2127 from FearlessTobi/fix-screenshot-srgb
renderer_opengl: respect the sRGB colorspace for the screenshot feature
2019-02-16 15:36:00 -05:00
Lioncash
a8fa5019b5 video_core: Remove usages of System::GetInstance() within the engines
Avoids the use of the global accessor in favor of explicitly making the
system a dependency within the interface.
2019-02-15 22:06:23 -05:00
James Rowe
99da6362c4
Merge pull request #2123 from lioncash/coretiming-global
core_timing: De-globalize core_timing facilities
2019-02-15 19:52:11 -07:00
Lioncash
bd983414f6 core_timing: Convert core timing into a class
Gets rid of the largest set of mutable global state within the core.
This also paves a way for eliminating usages of GetInstance() on the
System class as a follow-up.

Note that no behavioral changes have been made, and this simply extracts
the functionality into a class. This also has the benefit of making
dependencies on the core timing functionality explicit within the
relevant interfaces.
2019-02-15 21:50:25 -05:00
fearlessTobi
9a56b99fa4 renderer_opengl: respect the sRGB colorspace for the screenshot feature
Previously, we were completely ignoring for screenshots whether the game uses RGB or sRGB.
This resulted in screenshot colors that looked off for some titles.
2019-02-15 21:27:29 +01:00
ReinUsesLisp
8dfc81239f gl_state: Synchronize gl_state even when state is disabled
There are some potential edge cases where gl_state may fail to track the
state if a related state changes while the toggle is disabled or it
didn't change. This addresses that.
2019-02-15 01:30:14 -03:00
bunnei
4327f430f1
Merge pull request #2112 from lioncash/shadowing
gl_rasterizer_cache: Get rid of variable shadowing
2019-02-14 21:45:20 -05:00
bunnei
a8fc5d6edd
Merge pull request #2111 from ReinUsesLisp/fetch-fix
gl_shader_decompiler: Re-implement TLDS lod
2019-02-14 21:42:34 -05:00
ReinUsesLisp
ae6c052ed9 vk_resource_manager: Implement a command buffer pool with VKFencedPool 2019-02-14 18:44:26 -03:00
ReinUsesLisp
a2b6de7e9f vk_resource_manager: Add VKFencedPool interface
Handles a pool of resources protected by fences. Manages resource
overflow allocating more resources.

This class is intended to be used through inheritance.
2019-02-14 18:44:26 -03:00
ReinUsesLisp
0ffdd0a683 vk_resource_manager: Implement VKResourceManager and fence allocator
CommitFence iterates a pool of fences until one is found. If all fences
are being used at the same time, allocate more.
2019-02-14 18:44:26 -03:00
ReinUsesLisp
aa0b6babda vk_resource_manager: Implement VKFenceWatch
A fence watch is used to keep track of the usage of a fence and protect
a resource or set of resources without having to inherit from their
handlers.
2019-02-14 18:44:26 -03:00
ReinUsesLisp
25c2fe1c6b vk_resource_manager: Implement VKFence
Fences take ownership of objects, protecting them from GPU-side or
driver-side concurrent access. They must be commited from the resource
manager. Their usage flow is: commit the fence from the resource
manager, protect resources with it and use them, send the fence to an
execution queue and Wait for it if needed and then call Release. Used
resources will automatically be signaled when they are free to be
reused.
2019-02-14 18:44:26 -03:00
ReinUsesLisp
33a4cebc22 vk_resource_manager: Add VKResource interface
VKResource is an interface that gets signaled by a fence when it is free
to be reused.
2019-02-14 18:36:15 -03:00
bunnei
fcc3aa0bbf
Merge pull request #2113 from ReinUsesLisp/vulkan-base
vulkan: Add dependencies and device abstraction
2019-02-14 10:06:48 -05:00
Fernando Sahmkow
10682ad7e0 shader_decompiler: Improve Accuracy of Attribute Interpolation. 2019-02-14 03:25:07 -04:00
Fernando Sahmkow
bb41683394 rasterizer_cache_gl: Only do fast layered copy on the same format. As
glCopyImageSubData does not support different formats.
2019-02-13 16:55:00 -04:00
bunnei
cd542d5aac
Merge pull request #2099 from greggameplayer/BGRA8-Framebuffer-Real
Implement BGRA8 framebuffer format
2019-02-12 21:44:20 -05:00
ReinUsesLisp
8beca060d1 vk_device: Abstract device handling into a class
VKDevice contains all the data required to manage and initialize a
physical device. Its intention is to be passed across Vulkan objects to
query device-specific data (for example the logical device and the
dispatch loader).
2019-02-12 21:43:02 -03:00
Lioncash
86b55cb6df renderer_opengl: Remove reference to global system instance
We already store a reference to the system instance that the renderer is
created with, so we don't need to refer to the system instance via
Core::System::GetInstance()
2019-02-12 19:33:22 -05:00
bunnei
8135f4bfce
Merge pull request #2110 from lioncash/namespace
core_timing: Rename CoreTiming namespace to Core::Timing
2019-02-12 19:26:37 -05:00
bunnei
c440ecfafe
Merge pull request #2104 from ReinUsesLisp/compute-assert
kepler_compute: Fixup assert and rename the engine
2019-02-12 19:24:34 -05:00
Lioncash
054e39647c gl_rasterizer_cache: Remove unnecessary newline 2019-02-12 16:56:19 -05:00
Lioncash
e25c464c02 gl_rasterizer_cache: Get rid of variable shadowing
Avoids shadowing the members of the struct itself, which results in a
-Wshadow warning.
2019-02-12 16:46:39 -05:00
ReinUsesLisp
18fe910957 renderer_vulkan: Add declarations file
This file is intended to be included instead of vulkan/vulkan.hpp. It
includes declarations of unique handlers using a dynamic dispatcher
instead of a static one (which would require linking to a Vulkan
library).
2019-02-12 18:33:02 -03:00
ReinUsesLisp
e60d4d70bc gl_shader_decompiler: Re-implement TLDS lod 2019-02-12 17:03:07 -03:00
Lioncash
48d9d66dc5 core_timing: Rename CoreTiming namespace to Core::Timing
Places all of the timing-related functionality under the existing Core
namespace to keep things consistent, rather than having the timing
utilities sitting in its own completely separate namespace.
2019-02-12 12:42:17 -05:00
bunnei
444231a83d
Merge pull request #2108 from FernandoS27/fix-cc
Fix incorrect value for CC bit in IADD
2019-02-12 10:39:03 -05:00
bunnei
c1accfefde
Merge pull request #2109 from FernandoS27/fix-f2i
Corrected F2I None mode to RoundEven.
2019-02-12 10:20:29 -05:00
bunnei
27e5efd265
Merge pull request #2068 from ReinUsesLisp/shader-cleanup-textures
shader_ir: Clean texture management code
2019-02-12 10:20:15 -05:00
Fernando Sahmkow
f5ec165e8c Corrected F2I None mode to RoundEven. 2019-02-11 18:46:45 -04:00
Fernando Sahmkow
edd668047c Fix incorrect value for CC bit in IADD 2019-02-11 16:44:43 -04:00
ReinUsesLisp
1ddcd0e6f0 kepler_compute: Fixup assert and rename engines
When I originally added the compute assert I used the wrong
documentation. This addresses that.

The dispatch register was tested with homebrew against hardware and is
triggered by some games (e.g. Super Mario Odyssey). What exactly is
missing to get a valid program bound by this engine requires more
investigation.
2019-02-10 19:29:33 -03:00
greggameplayer
a6a73d8892 Implement BGRA8 framebuffer format 2019-02-09 23:44:01 +01:00
bunnei
1d98027a0e
Merge pull request #1904 from bunnei/better-fermi-copy
gl_rasterizer: Implement a more accurate fermi 2D copy.
2019-02-08 23:32:24 -05:00
Fernando Sahmkow
e543320129 Implement linear textures (#2089) 2019-02-08 18:28:01 -05:00
ReinUsesLisp
e36e7ae74e gl_rasterizer_cache: Fixup texture view parameters
These parameters were declared as constants and passed to glTextureView
but then they were removed on a rabase. This addresses that mistake.
2019-02-08 18:32:58 -03:00
ReinUsesLisp
889c646ac0 shader_ir: Remove F4 prefix to texture operations
This was originally included because texture operations returned a vec4.
These operations now return a single float and the F4 prefix doesn't
mean anything.
2019-02-07 17:36:46 -03:00
ReinUsesLisp
d62b0a9e29 shader_ir: Clean texture management code
Previous code relied on GLSL parameter order (something that's always
ill-formed on an IR design). This approach passes spatial coordiantes
through operation nodes and array and depth compare values in the the
texture metadata. It still contains an "extra" vector containing generic
nodes for bias and component index (for example) which is still a bit
ill-formed but it should be better than the previous approach.
2019-02-07 00:46:13 -03:00
bunnei
f09d1dffd1
Merge pull request #2083 from ReinUsesLisp/shader-ir-cbuf-tracking
shader/track: Add a more permissive global memory tracking
2019-02-06 21:56:14 -05:00
bunnei
35e1118766 gl_rasterizer_cache: Mark surface copy destinations as modified. 2019-02-06 21:54:25 -05:00
bunnei
dd1aab5446 gl_rasterizer: Implement a more accurate fermi 2D copy.
- This is a blit, use the blit registers.
2019-02-06 21:54:21 -05:00
Frederic L
d0ac624403 gl_shader_disk_cache: Check LZ4 size limit
Co-Authored-By: ReinUsesLisp <reinuseslisp@airmail.cc>
2019-02-06 22:23:41 -03:00
Frederic L
9f0b247cf6 gl_shader_disk_cache: Consider compressed size zero as an error
Co-Authored-By: ReinUsesLisp <reinuseslisp@airmail.cc>
2019-02-06 22:23:41 -03:00
ReinUsesLisp
e6a2245304 gl_shader_disk_cache: Use unordered containers 2019-02-06 22:23:41 -03:00
ReinUsesLisp
e147ed4fc0 gl_shader_cache: Fixup GLSL unique identifiers 2019-02-06 22:23:40 -03:00
ReinUsesLisp
eb73247433 gl_shader_cache: Link loading screen with disk shader cache load 2019-02-06 22:23:40 -03:00
ReinUsesLisp
df0f31f44e gl_shader_cache: Set GL_PROGRAM_SEPARABLE to dumped shaders
i965 (and probably all mesa drivers) require GL_PROGRAM_SEPARABLE when using
glProgramBinary. This is probably required by the standard but it's ignored by
permisive proprietary drivers.
2019-02-06 22:23:40 -03:00
ReinUsesLisp
7fefec585c gl_shader_disk_cache: Pass core system as argument and guard against games without title ids 2019-02-06 22:23:40 -03:00
ReinUsesLisp
2bc6a699dc gl_shader_disk_cache: Guard reads and writes against failure 2019-02-06 22:23:40 -03:00
ReinUsesLisp
750abcc23d gl_shader_disk_cache: Address miscellaneous feedback 2019-02-06 22:23:40 -03:00
ReinUsesLisp
8ee3666a3c gl_shader_disk_cache: Pass return values returning instead of by parameters 2019-02-06 22:23:40 -03:00
ReinUsesLisp
ed956569a4 gl_shader_disk_cache: Compress program binaries using LZ4 2019-02-06 22:23:39 -03:00
ReinUsesLisp
f087639e4a gl_shader_disk_cache: Compress GLSL code using LZ4 2019-02-06 22:23:39 -03:00
ReinUsesLisp
cfb20c4c9d gl_shader_disk_cache: Save GLSL and entries into the precompiled file 2019-02-06 22:23:39 -03:00
ReinUsesLisp
e78da8dc1f settings: Hide shader cache behind a setting 2019-02-06 22:20:57 -03:00
ReinUsesLisp
be4641c43f gl_shader_disk_cache: Invalidate shader cache changes with CMake hash 2019-02-06 22:20:57 -03:00
ReinUsesLisp
a3703f5767 gl_shader_cache: Refactor to support disk shader cache 2019-02-06 22:20:57 -03:00
ReinUsesLisp
4039086226 gl_shader_disk_cache: Add transferable cache invalidation 2019-02-06 22:20:57 -03:00
ReinUsesLisp
a1faed9950 gl_shader_disk_cache: Add precompiled load 2019-02-06 22:20:57 -03:00
ReinUsesLisp
57fb15d2a3 gl_shader_disk_cache: Add precompiled save 2019-02-06 22:20:57 -03:00
ReinUsesLisp
3435cd8d5e gl_shader_disk_cache: Add transferable load 2019-02-06 22:20:57 -03:00
ReinUsesLisp
b1efceec89 gl_shader_disk_cache: Add transferable stores 2019-02-06 22:20:57 -03:00
ReinUsesLisp
98be5a4928 gl_shader_disk_cache: Add ShaderDiskCacheOpenGL class and helpers 2019-02-06 22:20:57 -03:00
ReinUsesLisp
145c3ac89e gl_shader_disk_cache: Add file and move BaseBindings declaration 2019-02-06 22:20:57 -03:00
ReinUsesLisp
c2c5260fd7 gl_shader_decompiler: Remove name entries 2019-02-06 22:20:57 -03:00
ReinUsesLisp
8b11368671 gl_shader_util: Add parameter to handle retrievable programs 2019-02-06 22:20:57 -03:00
ReinUsesLisp
0ed5d728ca rasterizer_interface: Add disk cache entry for the rasterizer 2019-02-06 22:20:57 -03:00
ReinUsesLisp
049050856f shader_decode: Implement LDG and basic cbuf tracking 2019-02-06 22:20:57 -03:00
bunnei
10ab714fe0
Merge pull request #2042 from ReinUsesLisp/nouveau-tex
maxwell_3d: Allow texture handles with TIC id zero
2019-02-06 20:19:20 -05:00
bunnei
40ac058557
Merge pull request #2071 from ReinUsesLisp/dsa-texture
gl_rasterizer: Use DSA for textures and move swizzling to texture state
2019-02-06 20:17:59 -05:00
bunnei
7aa7d8f4ff
Merge pull request #2085 from ReinUsesLisp/cube-minus-one
video_core/texture: Fix BitField size for depth_minus_one
2019-02-05 17:15:26 -05:00
bunnei
72c70d6808
Merge pull request #2081 from ReinUsesLisp/lmem-64
shader_ir/memory: Add LD_L 64 bits loads
2019-02-05 09:17:48 -05:00
ReinUsesLisp
b5e685b297 video_core/texture: Fix BitField size for depth_minus_one 2019-02-05 04:32:06 -03:00
bunnei
bb4549a73d
Merge pull request #2082 from FernandoS27/txq-stl
Fix TXQ not using the component mask.
2019-02-04 20:22:32 -05:00
Mat M
a568cd805b
Update src/video_core/engines/shader_bytecode.h
Co-Authored-By: FernandoS27 <fsahmkow27@gmail.com>
2019-02-03 21:27:26 -04:00
Fernando Sahmkow
0306c50339 Fix TXQ not using the component mask. 2019-02-03 18:17:18 -04:00
ReinUsesLisp
dfa7be5ddf shader_ir/memory: Add ST_L 64 and 128 bits stores 2019-02-03 19:08:10 -03:00
ReinUsesLisp
0d1d755086 shader/track: Search inside of conditional nodes
Some games search conditionally use global memory instructions. This
allows the heuristic to search inside conditional nodes for the source
constant buffer.
2019-02-03 17:21:20 -03:00
ReinUsesLisp
42b75e8be8 shader_ir: Rename BasicBlock to NodeBlock
It's not always used as a basic block. Rename it for consistency.
2019-02-03 17:21:20 -03:00
ReinUsesLisp
6a6fabea58 shader_ir: Pass decoded nodes as a whole instead of per basic blocks
Some games call LDG at the top of a basic block, making the tracking
heuristic to fail. This commit lets the heuristic the decoded nodes as a
whole instead of per basic blocks.

This may lead to some false positives but allows it the heuristic to
track cases it previously couldn't.
2019-02-03 17:21:20 -03:00
ReinUsesLisp
2bdbb90af7 video_core: Assert on invalid GPU to CPU address queries 2019-02-03 04:58:40 -03:00
ReinUsesLisp
04e68e9738 maxwell_3d: Allow sampler handles with TSC id zero 2019-02-03 04:58:40 -03:00
ReinUsesLisp
390721a561 maxwell_3d: Allow texture handles with TIC id zero
Also remove "enabled" field from Tegra::Texture::FullTextureInfo because
it would become unused.
2019-02-03 04:58:24 -03:00
ReinUsesLisp
e01a9de35f memory_manager: Check for reserved page status 2019-02-03 04:58:24 -03:00
ReinUsesLisp
f61c1ed246 shader_ir/memory: Add LD_L 128 bits loads 2019-02-03 00:35:34 -03:00
ReinUsesLisp
9feb68085d shader_bytecode: Rename BytesN enums to BitsN 2019-02-03 00:25:40 -03:00
ReinUsesLisp
0be835132c shader_ir/memory: Add LD_L 64 bits loads 2019-02-03 00:25:40 -03:00
bunnei
eceab45dac
Merge pull request #2074 from ReinUsesLisp/shader-ir-unify-offset
shader_ir: Unify constant buffer offset values
2019-02-01 13:24:04 -05:00
bunnei
2d226ff8ac
Merge pull request #2067 from ReinUsesLisp/workaround-fb
gl_rasterizer: Workaround invalid zeta clears
2019-02-01 12:50:09 -05:00
ReinUsesLisp
26f8a700a7 rasterizer_interface: Remove unused AccelerateFill operation 2019-02-01 03:02:22 -03:00
ReinUsesLisp
13222f94c0 video_core: Remove unused Fill surface type 2019-02-01 02:57:47 -03:00
ReinUsesLisp
3e80b08944 gl_rasterizer_cache: Fixup test clause 2019-01-30 19:10:35 -03:00
Mat M
911587fb8d gl_rasterizer_cache: Guard clause swizzle testing
Co-Authored-By: ReinUsesLisp <reinuseslisp@airmail.cc>
2019-01-30 19:10:35 -03:00
ReinUsesLisp
220df45b7d gl_state: Remove texture target tracking 2019-01-30 19:10:35 -03:00
ReinUsesLisp
704744bb72 gl_rasterizer_cache: Move swizzling to textures instead of state 2019-01-30 19:10:35 -03:00
ReinUsesLisp
3bbaa98c78 gl_state: Use DSA and multi bind to update texture bindings 2019-01-30 19:10:11 -03:00
ReinUsesLisp
4b676e7786 gl_rasterizer: Use DSA for textures 2019-01-30 19:10:11 -03:00
Hexagon12
35480167b1
Merge pull request #2076 from lioncash/enc
video_core/dma_pusher: Silence C4828 warnings
2019-01-30 19:42:15 +02:00
Lioncash
0b594f3344 video_core/dma_pusher: Silence C4828 warnings
This was previously causing:

warning C4828: The file contains a character starting at offset 0xa33
that is illegal in the current source character set (codepage 65001).

warnings on Windows when compiling yuzu.
2019-01-30 12:36:31 -05:00
bunnei
92b18345a8
Merge pull request #1485 from FernandoS27/render-info
Add more info into textures' object labels
2019-01-30 12:35:56 -05:00
ReinUsesLisp
477d616f7d shader_ir: Unify constant buffer offset values
Constant buffer values on the shader IR were using different offsets if
the access direct or indirect. cbuf34 has a non-multiplied offset while
cbuf36 does. On shader decoding this commit multiplies it by four on
cbuf34 queries.
2019-01-30 02:45:50 -03:00
bunnei
3c3d9afd61
Merge pull request #2070 from ReinUsesLisp/cubearray-view
gl_shader_cache: Fix texture view for cubemaps as cubemap arrays
2019-01-29 22:27:08 -05:00
ReinUsesLisp
52c326c301 gl_shader_cache: Use explicit bindings 2019-01-30 00:11:02 -03:00
ReinUsesLisp
9f803299de gl_rasterizer: Implement global memory management 2019-01-30 00:00:15 -03:00
ReinUsesLisp
3b84e04af1 shader_decode: Implement LDG and basic cbuf tracking 2019-01-30 00:00:15 -03:00
Kevin
ba38d91fe2 video_core/GPU Implemented the GPU PFIFO puller semaphore operations. (#1908)
* Implemented the puller semaphore operations.

* Nit: Fix 2 style issues

* Nit: Add Break to default case.

* Fix style.

* Update for comments. Added ReferenceCount method

* Forgot to remove GpuSmaphoreAddress union.

* Fix the clang-format issues.

* More clang formatting.

* two more white spaces for the Clang formatting.

* Move puller members into the regs union

* Updated to use Memory::WriteBlock instead of Memory::Write*

* Fix clang style issues

* White space clang error

* Removing unused funcitons and other pr comment

* Removing unused funcitons and other pr comment

* More union magic for setting regs value.

* union magic refcnt as well

*  Remove local var

* Set up the regs and regs_assert_positions up properly

* Fix clang error
2019-01-29 21:49:18 -05:00
ReinUsesLisp
f58a6152fc gl_shader_cache: Fix texture view for cubemaps as cubemap arrays
Cubemaps are considered layered and to create a texture view the texture
mustn't be a layered texture, resulting in cubemaps being bound as
cubemap arrays. To fix this issue this commit introduces an extra
surface parameter called "is_array" and uses this to query for texture
view creation.

Now that texture views for cubemaps are actually being created, this
also fixes the number of layers created for the texture view (since they
have to be 6 to create a texture view of cubemaps).
2019-01-29 23:49:02 -03:00
ReinUsesLisp
07692230ca gl_rasterizer: Workaround invalid zeta clears
Some games (like Xenoblade Chronicles 2) clear both depth and stencil
buffers while there's a depth-only texture attached (e.g. D16 Unorm).
This commit reads the zeta format of the bound surface on
ConfigureFramebuffers and returns if depth and/or stencil attachments
were set. This is ignored on DrawArrays but on Clear it's used to just
clear those attachments, bypassing an OpenGL error.
2019-01-29 23:47:33 -03:00
Lioncash
b2b98b2f44 shader/shader_ir: Amend three comment typos
Given we're in the area, these are three trivial typos that can be
corrected.
2019-01-28 07:52:04 -05:00
Lioncash
62e08c30b7 shader/shader_ir: Amend constructor initializer ordering for AbufNode
Orders the class members in the same order that they would actually be
initialized in. Gets rid of two compiler warnings.
2019-01-28 07:50:34 -05:00
Lioncash
3e1a9a45a6 shader/decode: Avoid a pessimizing std::move within DecodeRange()
std::moveing a local variable in a return statement has the potential to
prevent copy elision from occurring, so this can just be converted into
a regular return.
2019-01-28 07:43:23 -05:00
ReinUsesLisp
fc6d46c374 video_core: Silent implicit conversion warning 2019-01-26 02:27:14 -03:00
bunnei
1f4ca1e841
Merge pull request #1927 from ReinUsesLisp/shader-ir
video_core: Replace gl_shader_decompiler with an IR based decompiler
2019-01-25 23:42:14 -05:00
bunnei
045b0b70b6 frontend: Refactor ScopeAcquireWindowContext out of renderer_opengl. 2019-01-23 19:19:23 -05:00
ReinUsesLisp
9a82dec74a maxwell_3d: Set rt_separate_frag_data to 1 by default
Commercial games assume that this value is 1 but they never set it. On
the other hand nouveau manually sets this register. On
ConfigureFramebuffers we were asserting for what we are actually
implementing (according to envytools).
2019-01-22 04:14:29 -03:00
James Rowe
ea73ffe202 Rename step 1 and step 2 to be a little more descriptive 2019-01-20 18:40:25 -07:00
James Rowe
e8bd6b1fcc QT: Upgrade the Loading Bar to look much better 2019-01-20 14:47:35 -07:00
bunnei
197d0d9d24
Merge pull request #2008 from ReinUsesLisp/dirty-framebuffers
gl_rasterizer_cache: Use dirty flags for framebuffers
2019-01-20 14:06:26 -05:00
bunnei
cbf8bea9d5
Merge pull request #2002 from ReinUsesLisp/dsa-vao-buffer
gl_rasterizer: Use DSA for VAOs and buffers
2019-01-20 14:06:01 -05:00
ReinUsesLisp
a1b1ea47ed gl_rasterizer: Silent unsafe mix warning 2019-01-18 03:25:28 -03:00
ReinUsesLisp
a63d7c49fc shader_ir: Fixup clang build 2019-01-15 21:06:05 -03:00
ReinUsesLisp
1e40a4b343 gl_shader_decompiler: replace std::get<> with std::get_if<> for macOS compatibility 2019-01-15 17:54:53 -03:00
ReinUsesLisp
51de4e00a6 gl_shader_decompiler: Inline textureGather component 2019-01-15 17:54:53 -03:00
ReinUsesLisp
1c9c4eefeb shader_decode: Fixup XMAD 2019-01-15 17:54:53 -03:00
ReinUsesLisp
170c8212bb shader_ir: Pass to decoder functions basic block's code 2019-01-15 17:54:53 -03:00
ReinUsesLisp
2d6c064e66 shader_decode: Improve zero flag implementation 2019-01-15 17:54:53 -03:00
ReinUsesLisp
d911740e5d shader_ir: Remove composite primitives and use temporals instead 2019-01-15 17:54:53 -03:00
ReinUsesLisp
bb12f99b20 gl_shader_decompiler: Fixup AssignCompositeHalf 2019-01-15 17:54:53 -03:00
ReinUsesLisp
50195b1704 shader_decode: Use proper primitive names 2019-01-15 17:54:53 -03:00
ReinUsesLisp
2faad9bf23 shader_decode: Use BitfieldExtract instead of shift + and 2019-01-15 17:54:53 -03:00
ReinUsesLisp
52223313b1 shader_ir: Remove Ipa primitive 2019-01-15 17:54:53 -03:00
ReinUsesLisp
d6b173d5fe gl_shader_decompiler: Use rasterizer's UBO size limit 2019-01-15 17:54:53 -03:00
ReinUsesLisp
df74ff3c8b gl_shader_gen: Fixup code formatting 2019-01-15 17:54:53 -03:00
ReinUsesLisp
af5d7e2c49 video_core: Rename glsl_decompiler to gl_shader_decompiler 2019-01-15 17:54:53 -03:00
ReinUsesLisp
d9118d324a shader_ir: Remove RZ and use Register::ZeroIndex instead 2019-01-15 17:54:53 -03:00
ReinUsesLisp
5af82a8ed4 shader_decode: Implement TEXS.F16 2019-01-15 17:54:53 -03:00
ReinUsesLisp
c68c13e1aa shader_decode: Fixup R2P 2019-01-15 17:54:53 -03:00
ReinUsesLisp
8b5588e776 glsl_decompiler: Fixup TLDS 2019-01-15 17:54:53 -03:00
ReinUsesLisp
dbed6c6485 glsl_decompiler: Fixup geometry shaders 2019-01-15 17:54:53 -03:00
ReinUsesLisp
ea78c78253 shader_decode: Fixup WriteLogicOperation zero comparison 2019-01-15 17:54:53 -03:00
ReinUsesLisp
ab7f52b279 glsl_decompiler: Fixup permissive member function declarations 2019-01-15 17:54:53 -03:00
ReinUsesLisp
55a10d02e5 shader_decode: Fixup PSET 2019-01-15 17:54:53 -03:00
ReinUsesLisp
a2e22b4359 shader_decode: Fixup clang-format 2019-01-15 17:54:53 -03:00
ReinUsesLisp
e1fea1e0c5 video_core: Implement IR based geometry shaders 2019-01-15 17:54:53 -03:00
ReinUsesLisp
a1b845b651 shader_decode: Implement VMAD and VSETP 2019-01-15 17:54:53 -03:00
ReinUsesLisp
b11e0b94c7 shader_decode: Implement HSET2 2019-01-15 17:54:53 -03:00
ReinUsesLisp
2df55985b6 shader_decode: Rework HSETP2 2019-01-15 17:54:53 -03:00
ReinUsesLisp
8332482c24 shader_decode: Implement R2P 2019-01-15 17:54:53 -03:00
ReinUsesLisp
3f1136ac6f shader_decode: Implement CSETP 2019-01-15 17:54:52 -03:00
ReinUsesLisp
7e13e8bfcb shader_decode: Implement PSET 2019-01-15 17:54:52 -03:00
ReinUsesLisp
dd91650aaf shader_decode: Implement HFMA2 2019-01-15 17:54:52 -03:00
ReinUsesLisp
d6f76307fe glsl_decompiler: Remove HNegate inlining 2019-01-15 17:54:52 -03:00
ReinUsesLisp
027f443e69 shader_decode: Implement POPC 2019-01-15 17:54:52 -03:00
ReinUsesLisp
55e6786254 shader_decode: Implement TLDS (untested) 2019-01-15 17:54:52 -03:00
ReinUsesLisp
ec98e4d842 shader_decode: Update TLD4 reflecting #1862 changes 2019-01-15 17:54:52 -03:00
ReinUsesLisp
03e088a4f4 shader_ir: Fixup TEX and TEXS and partially fix TLD4 decompiling 2019-01-15 17:54:52 -03:00
ReinUsesLisp
2d9136cec6 shader_decode: Fixup FSET 2019-01-15 17:54:52 -03:00
ReinUsesLisp
af5c6e4ccb shader_decode: Implement IADD32I 2019-01-15 17:54:52 -03:00
ReinUsesLisp
4316eaf75c shader_decode: Fixup clang-format 2019-01-15 17:54:52 -03:00
ReinUsesLisp
fc46ecddb3 video_core: Return safe values after an assert hits 2019-01-15 17:54:52 -03:00
ReinUsesLisp
148a6418ed shader_decode: Implement FFMA 2019-01-15 17:54:52 -03:00
ReinUsesLisp
21aff36459 video_core: Address feedback 2019-01-15 17:54:52 -03:00
ReinUsesLisp
59b34b1d76 shader_ir: Fixup file inclusions and clang-format 2019-01-15 17:54:52 -03:00
Mat M
57a900cc45 shader_ir: Move comment node string
Co-Authored-By: ReinUsesLisp <reinuseslisp@airmail.cc>
2019-01-15 17:54:52 -03:00
ReinUsesLisp
d4fae3a699 shader_ir: Address feedback to avoid UB in bit casting 2019-01-15 17:54:52 -03:00
ReinUsesLisp
946c86f0bb shader_decode: Fixup clang-format 2019-01-15 17:54:52 -03:00
ReinUsesLisp
c9cf899d18 shader_decode: Implement LEA 2019-01-15 17:54:52 -03:00
ReinUsesLisp
4fd06efeb9 shader_decode: Implement IADD3 2019-01-15 17:54:52 -03:00
ReinUsesLisp
a40fd07516 shader_decode: Implement LOP3 2019-01-15 17:54:52 -03:00
ReinUsesLisp
b184ca9089 shader_decode: Implement ST_L 2019-01-15 17:54:52 -03:00
ReinUsesLisp
8d42feb09b shader_decode: Implement LD_L 2019-01-15 17:54:52 -03:00
ReinUsesLisp
21f9e9da09 shader_decode: Implement HSETP2 2019-01-15 17:54:52 -03:00
ReinUsesLisp
68c99d2597 shader_decode: Implement HADD2 and HMUL2 2019-01-15 17:54:52 -03:00
ReinUsesLisp
cf4a08d950 shader_decode: Implement HADD2_IMM and HMUL2_IMM 2019-01-15 17:54:52 -03:00
ReinUsesLisp
376a837511 shader_decode: Implement MOV_SYS 2019-01-15 17:54:52 -03:00
ReinUsesLisp
518a2bd206 shader_decode: Implement IMNMX 2019-01-15 17:54:52 -03:00
ReinUsesLisp
07944a2345 shader_decode: Implement F2F_C 2019-01-15 17:54:52 -03:00
ReinUsesLisp
e8235c0215 shader_decode: Implement I2I 2019-01-15 17:54:52 -03:00
ReinUsesLisp
6ca31f544a shader_decode: Implement BRA internal flag 2019-01-15 17:54:52 -03:00
ReinUsesLisp
210620ff31 shader_decode: Implement ISCADD 2019-01-15 17:54:52 -03:00
ReinUsesLisp
b0e7920838 shader_decode: Implement XMAD 2019-01-15 17:54:51 -03:00
ReinUsesLisp
becfdb8638 shader_decode: Implement PBK and BRK 2019-01-15 17:54:51 -03:00
ReinUsesLisp
8f37531f8e shader_decode: Implement LOP 2019-01-15 17:54:51 -03:00
ReinUsesLisp
8486e7f8c8 shader_decode: Implement SEL 2019-01-15 17:54:51 -03:00
ReinUsesLisp
ccb71bece9 shader_decode: Implement IADD 2019-01-15 17:54:51 -03:00
ReinUsesLisp
faadae5814 shader_decode: Implement ISETP 2019-01-15 17:54:51 -03:00
ReinUsesLisp
80183de884 shader_decode: Implement BFI 2019-01-15 17:54:51 -03:00
ReinUsesLisp
078ba28e13 shader_decode: Implement ISET 2019-01-15 17:54:51 -03:00
ReinUsesLisp
acdbbb8885 shader_decode: Implement LD_C 2019-01-15 17:54:51 -03:00
ReinUsesLisp
d79c462af0 shader_decode: Implement SHL 2019-01-15 17:54:51 -03:00
ReinUsesLisp
a2819c204f shader_decode: Implement SHR 2019-01-15 17:54:51 -03:00
ReinUsesLisp
39f1c6246a shader_decode: Implement LOP32I 2019-01-15 17:54:51 -03:00
ReinUsesLisp
501284a81a shader_decode: Implement BFE 2019-01-15 17:54:51 -03:00
ReinUsesLisp
e444a6553f shader_decode: Implement FSET 2019-01-15 17:54:51 -03:00
ReinUsesLisp
3052eae25e shader_decode: Implement F2I 2019-01-15 17:54:51 -03:00
ReinUsesLisp
8abe5ba2c8 shader_decode: Implement I2F 2019-01-15 17:54:51 -03:00
ReinUsesLisp
c849b5b320 shader_decode: Implement F2F 2019-01-15 17:54:50 -03:00
ReinUsesLisp
9118deb990 shader_decode: Stub DEPBAR 2019-01-15 17:54:50 -03:00
ReinUsesLisp
97f33f00cf shader_decode: Implement SSY and SYNC 2019-01-15 17:54:50 -03:00
ReinUsesLisp
abdbafbc20 shader_decode: Implement PSETP 2019-01-15 17:54:50 -03:00
ReinUsesLisp
802c23b8a8 shader_decode: Implement TMML 2019-01-15 17:54:50 -03:00
ReinUsesLisp
2b90637f4b shader_decode: Implement TEX and TXQ 2019-01-15 17:54:50 -03:00
ReinUsesLisp
878672f371 shader_decode: Implement TEXS (F32) 2019-01-15 17:54:50 -03:00
ReinUsesLisp
c703f0aee4 shader_decode: Implement FSETP 2019-01-15 17:54:50 -03:00
ReinUsesLisp
8215ae942c shader_decode: Partially implement BRA 2019-01-15 17:54:50 -03:00
ReinUsesLisp
4f95dc950e shader_decode: Implement IPA 2019-01-15 17:54:50 -03:00
ReinUsesLisp
cacb934f21 shader_decode: Implement EXIT 2019-01-15 17:54:50 -03:00
ReinUsesLisp
0c049e0a21 shader_decode: Implement ST_A 2019-01-15 17:54:50 -03:00
ReinUsesLisp
e3f1233ce1 shader_decode: Implement LD_A 2019-01-15 17:54:50 -03:00
ReinUsesLisp
ea358bd4bf shader_decode: Implement FADD32I 2019-01-15 17:54:50 -03:00
ReinUsesLisp
c9b2a1b051 shader_decode: Implement FMUL32_IMM 2019-01-15 17:54:50 -03:00
ReinUsesLisp
2edee801ce shader_decode: Implement MOV32_IMM 2019-01-15 17:54:50 -03:00
ReinUsesLisp
06cb910c6d shader_decode: Stub RRO_C, RRO_R and RRO_IMM 2019-01-15 17:54:50 -03:00
ReinUsesLisp
5e6a0a08c1 shader_decode: Implement FMNMX_C, FMNMX_R and FMNMX_IMM 2019-01-15 17:54:50 -03:00
ReinUsesLisp
964ddeeb90 shader_decode: Implement MUFU 2019-01-15 17:54:50 -03:00
ReinUsesLisp
4ccaa1402d shader_decode: Implement FADD_C, FADD_R and FADD_IMM 2019-01-15 17:54:50 -03:00
ReinUsesLisp
7c192ec43f shader_decode: Implement FMUL_C, FMUL_R and FMUL_IMM 2019-01-15 17:54:50 -03:00
ReinUsesLisp
4c70d5b8eb shader_decode: Implement MOV_C and MOV_R 2019-01-15 17:54:50 -03:00
ReinUsesLisp
a4f052f6b3 video_core: Replace gl_shader_decompiler 2019-01-15 17:54:50 -03:00
ReinUsesLisp
0c6fb456e0 glsl_decompiler: Implementation 2019-01-15 17:54:50 -03:00
ReinUsesLisp
fbc67a0563 shader_ir: Add condition code helper 2019-01-15 17:54:50 -03:00
ReinUsesLisp
a58abbcfc4 shader_ir: Add predicate combiner helper 2019-01-15 17:54:49 -03:00
ReinUsesLisp
bf07272695 shader_ir: Add comparison helpers 2019-01-15 17:54:49 -03:00
ReinUsesLisp
60f044df56 shader_ir: Add half float helpers 2019-01-15 17:54:49 -03:00
ReinUsesLisp
e3c55e31d7 shader_ir: Add integer helpers 2019-01-15 17:54:49 -03:00
ReinUsesLisp
833d0806f9 shader_ir: Add float helpers 2019-01-15 17:54:49 -03:00
ReinUsesLisp
6b9eea3fe5 shader_ir: Add setters 2019-01-15 17:54:49 -03:00
ReinUsesLisp
12a95ff453 shader_ir: Add local memory getters 2019-01-15 17:54:49 -03:00
ReinUsesLisp
2f87fd060d shader_ir: Add internal flag getters 2019-01-15 17:54:49 -03:00
ReinUsesLisp
15f431f0cb shader_ir: Add attribute getters 2019-01-15 17:54:49 -03:00
ReinUsesLisp
864e8f55cf shader_ir: Add constant buffer getters 2019-01-15 17:54:49 -03:00
ReinUsesLisp
5e639bfcf6 shader_ir: Add register getter 2019-01-15 17:54:49 -03:00
ReinUsesLisp
4aaa2192b9 shader_ir: Add immediate node constructors 2019-01-15 17:54:49 -03:00
ReinUsesLisp
15a0e1481d shader_ir: Initial implementation 2019-01-15 17:54:49 -03:00
ReinUsesLisp
294df41b86 shader_bytecode: Fixup encoding 2019-01-15 17:54:49 -03:00
ReinUsesLisp
a0c8c16d07 shader_header: Make local memory size getter constant 2019-01-15 17:54:49 -03:00
ReinUsesLisp
877a978a22 gl_rasterizer: Workaround Intel VAO DSA bug
There is a bug on Intel's blob driver where it fails to properly build a
vertex array object if it's not bound even after creating it with
glCreateVertexArrays. This workaround binds it after creating it to
bypass the issue.
2019-01-09 02:40:19 -03:00
ReinUsesLisp
3121408a90 gl_global_cache: Add dummy global cache manager 2019-01-08 17:47:45 -03:00
ReinUsesLisp
19cf995225 gl_rasterizer: Skip framebuffer configuration if rendertargets have not been changed 2019-01-07 16:23:23 -03:00
bunnei
23ebd4920e
Merge pull request #1999 from ReinUsesLisp/dirty-shader
gl_shader_cache: Use dirty flags for shaders
2019-01-07 14:22:30 -05:00
ReinUsesLisp
b683e41fca gl_rasterizer_cache: Use dirty flags for the depth buffer 2019-01-07 16:22:28 -03:00
ReinUsesLisp
179ee963db gl_rasterizer_cache: Use dirty flags for color buffers 2019-01-07 16:20:39 -03:00
ReinUsesLisp
0ab17ab406 gl_shader_cache: Use dirty flags for shaders 2019-01-07 16:13:12 -03:00
ReinUsesLisp
5933b3ea96 gl_stream_buffer: Use DSA for buffer management 2019-01-06 16:49:24 -03:00
ReinUsesLisp
35c095898b gl_rasterizer: Use DSA for vertex array objects 2019-01-06 16:49:24 -03:00
ReinUsesLisp
ea4928393f gl_state: Drop uniform buffer state tracking 2019-01-06 00:28:01 -03:00
ReinUsesLisp
fc8a8789da gl_rasterizer_cache: Use GL_STREAM_COPY for PBOs
Since the data is doing the path CPU -> GPU -> GPU copy is the most
approximate hint. Using GL_STREAM_DRAW generated a performance warning
on Nvidia's stack. Changing this hint removed the warning.
2019-01-05 02:27:55 -03:00
bunnei
c91d2bac45
Merge pull request #1961 from ReinUsesLisp/tex-view-2d
gl_rasterizer_cache: Texture view if shader samples array but OGL is not
2019-01-02 17:51:32 -05:00
ReinUsesLisp
97fb6179b9 gl_rasterizer_cache: Texture view if shader samples array but OGL is not
When a shader samples a texture array but that texture in OpenGL is
created without layers, use a texture view to increase the texture
hierarchy. For example, instead of binding a GL_TEXTURE_2D bind a
GL_TEXTURE_2D_ARRAY view.
2018-12-29 23:49:12 -03:00
bunnei
2020ba06e1 gpu: Remove PixelFormat G8R8U and G8R8S, as they do not seem to exist.
- Fixes UI rendering issues in The Legend of Zelda: Breath of the Wild.
2018-12-28 15:36:45 -05:00
Rodolfo Bogado
fbe900ba6d Add missing uintBitsToFloat to SetRegisterToHalfFloat 2018-12-27 14:39:10 -03:00
bunnei
fa9acc26d9
Merge pull request #1892 from Tinob/master
Improve Zero flag implementation
2018-12-27 11:06:59 -05:00
Lioncash
67fa21e143 renderer_opengl: Correct forward declaration of FramebufferLayout
This is actually a struct, not a class, which can lead to compilation
warnings.
2018-12-26 17:32:32 -05:00
Rodolfo Bogado
33056dd833 Apply CC test to the final value to be stored in the register 2018-12-26 18:16:31 -03:00
David
8047873a66 Fixed shader linking error due to TLDS (#1934)
* Fixed shader linking error due to TLDS

coord should be coords

* Fix remaining coords
2018-12-26 15:55:39 -05:00
ReinUsesLisp
aaa0e6c346 shader_bytecode: Fixup TEXS.F16 encoding 2018-12-26 01:35:44 -03:00
bunnei
9a22a94a51
Merge pull request #1886 from FearlessTobi/port-4164
Port citra-emu/citra#4164: "citra_qt, video_core: Screenshot functionality"
2018-12-23 14:36:51 -05:00
Rodolfo Bogado
bbf8d6bf01 Includde saturation in the evaluation of the control code 2018-12-22 19:19:18 -03:00
Rodolfo Bogado
946777601b Handle RZ cases evaluating the expression instead of the register value. 2018-12-22 19:19:18 -03:00
Rodolfo Bogado
7e72b5e453 complete emulation of ZeroFlag 2018-12-22 19:19:18 -03:00
bunnei
e75e8b9580
Merge pull request #1921 from ogniK5377/no-unit
Fixed uninitialized memory due to missing returns in canary
2018-12-21 14:12:54 -05:00
bunnei
42427b9c7a
Merge pull request #1920 from heapo/texture_format_selection
Texture format fixes for RGBA16UI for copies and R16U when used as depth
2018-12-21 13:46:17 -05:00
bunnei
3050f3a7ba
Merge pull request #1909 from heapo/shadow_sampling_fixes
Fix arrayed texture LOD selection and depth comparison ordering
2018-12-19 13:10:37 -05:00
David Marcec
20859802f0 hopefully fix clang format issue 2018-12-19 13:22:09 +11:00
David Marcec
fdd649e2ef Fixed uninitialized memory due to missing returns in canary
Functions which are suppose to crash on non canary builds usually don't return anything which lead to uninitialized memory being used.
2018-12-19 12:52:32 +11:00
zhupengfei
a2be49305d yuzu, video_core: Screenshot functionality
Allows capturing screenshot at the current internal resolution (native for software renderer), but a setting is available to capture it in other resolutions. The screenshot is saved to a single PNG in the current layout.
2018-12-18 22:54:41 +01:00
heapo
37280cf555 Texture format fixes: Flag RGBA16UI as GL_RGBA_INTEGER format, and interpret R16U as Z16 when depth_compare is enabled. 2018-12-18 11:34:51 -08:00
ReinUsesLisp
ef061481c5 shader_bytecode: Fixup half float's operator B encoding 2018-12-18 04:28:50 -03:00
heapo
72599cc667 Implement postfactor multiplication/division for fmul instructions 2018-12-17 07:56:25 -08:00
heapo
a6daed74f5 Fix arrayed shadow sampler array slice/depth comparison ordering, as well as invalid GLSL LOD selection. 2018-12-17 07:53:48 -08:00
bunnei
e1f28afb98
Merge pull request #1893 from lioncash/warn
gl_shader_cache: Resolve truncation compiler warning
2018-12-11 20:47:10 -05:00
bunnei
d63c883e66
Merge pull request #1888 from marcosvitali/glFrontFacing
gl_shader_decompiler: IPA fix FrontFacing.
2018-12-11 11:43:38 -05:00
Lioncash
4c2b94559b gl_shader_cache: Dehardcode constant in CalculateProgramSize()
This constant is related to the size of the instruction.
2018-12-10 23:47:20 -05:00
Lioncash
861bfdbf5d gl_shader_cache: Resolve truncation compiler warning
The previous code would cause a warning, as it was truncating size_t
(64-bit) to a u32 (32-bit) implicitly.
2018-12-10 23:44:18 -05:00
bunnei
5b5d0199fe
Merge pull request #1740 from FernandoS27/shader_props
Implemented Shader Unique Identifiers
2018-12-10 12:43:43 -05:00
Marcos Vitali
430e1f864b gl_shader_decompiler: IPA FrontFacing: the right value when is the front face is 0xFFFFFFFF. 2018-12-09 23:36:21 -03:00
Fernando Sahmkow
d5d77848e6 Implemented a shader unique identifier. 2018-12-09 17:33:33 -04:00
FernandoS27
7b9c982d29 Add more info into textures' object labels 2018-12-09 17:22:29 -04:00
Marcos Vitali
f4fa7ecb0e gl_shader_decompiler: TLDS/TLD4/TLD4S Reworked reflecting the source registers, bugs fixed and modularize. 2018-12-07 19:09:36 -03:00
bunnei
9390452195
Merge pull request #1824 from ReinUsesLisp/fbcache
gl_rasterizer: Implement a framebuffer cache
2018-12-06 11:56:59 -05:00
ReinUsesLisp
59a8df1b14 gl_shader_decompiler: Implement TEXS.F16 2018-12-05 02:06:34 -03:00
ReinUsesLisp
370980fdc3 gl_shader_decompiler: Fixup inverted if 2018-12-05 01:23:04 -03:00
heapo
7853e6b5d4 Improve msvc codegen for hot-path array LUTs
In some constexpr functions, msvc is building the LUT at runtime
(pushing each element onto the stack) out of an abundance of caution. Moving the
arrays into be file-scoped constexpr's avoids this and turns the functions into
simple look-ups as intended.
2018-12-04 17:13:07 -08:00
Marcos
ab2108fb2a Rewrited TEX/TEXS (TEX Scalar). (#1826)
* Rewrited TEX/TEXS (TEX Scalar).

* Style fixes.

* Styles issues.
2018-12-04 12:24:35 -05:00
bunnei
7f6bc284e9
Merge pull request #1854 from Subv/old_command_processor
Don't try to route PFIFO methods (0-0x40) to the other engines.
2018-12-04 08:49:22 -05:00
Subv
c4c19fa6c1 Removed unused file.
This is a leftover from #1792
2018-12-03 23:52:38 -05:00
Subv
b873253da1 GPU: Don't try to route PFIFO methods (0-0x40) to the other engines. 2018-12-03 23:52:18 -05:00
bunnei
8a12daac8c
Merge pull request #1822 from ReinUsesLisp/glsl-scope
gl_shader_decompiler: Introduce a scoped object and style changes
2018-12-03 17:10:02 -05:00
bunnei
7ce17b2cf6
Merge pull request #1827 from ReinUsesLisp/clip-and-shader
gl_rasterizer: Enable clip distances when set in register and in shader
2018-12-01 23:51:47 -05:00
bunnei
80aa124b1d
Merge pull request #1825 from ReinUsesLisp/shader-pipeline-cache
gl_shader_manager: Update pipeline when programs have changed
2018-12-01 23:48:55 -05:00
bunnei
a6805e58ce
Merge pull request #1795 from ReinUsesLisp/vc-cleanup
video_core: Minor style changes
2018-12-01 23:46:18 -05:00
bunnei
0e9be7be37
Merge pull request #1823 from bunnei/fix-surface-copy
gl_rasterizer_cache: Fix several surface copy issues.
2018-12-01 23:44:32 -05:00
Lioncash
e88cdcc912 Fix debug build
A non-existent parameter was left in some formatting calls (the logging
macro for which only does anything meaningful on debug builds)
2018-12-01 02:11:42 -05:00
bunnei
0f43564d09 gl_rasterizer_cache: Update AccurateCopySurface to flush complete source surface.
- Fixes issues with Breath of the Wild with use_accurate_gpu_emulation setting.
2018-11-29 20:10:11 -05:00
ReinUsesLisp
2908d30274 gl_rasterizer: Enable clip distances when set in register and in shader 2018-11-29 16:58:20 -03:00
ReinUsesLisp
1a2bb596db gl_rasterizer: Implement a framebuffer cache 2018-11-29 16:34:46 -03:00
ReinUsesLisp
e8620eaa9a gl_shader_manager: Update pipeline when programs have changed 2018-11-29 16:26:42 -03:00
bunnei
3d3cc35ee7 gl_rasterizer_cache: Remove BlitSurface and replace with more accurate copy.
- BlitSurface with different texture targets is inherently broken.
- When target is the same, we can just use FastCopySurface.
- Fixes rendering issues with Breath of the Wild.
2018-11-28 21:56:21 -05:00
ReinUsesLisp
eb700afcf0 gl_shader_decompiler: Remove texture temporal in TLD4 2018-11-28 23:46:16 -03:00
ReinUsesLisp
8d58e5da71 gl_shader_decompiler: Flip negated if else statement 2018-11-28 23:46:16 -03:00
ReinUsesLisp
f4abebd731 gl_shader_decompiler: Use GLSL scope on instructions unrelated to textures 2018-11-28 23:46:14 -03:00
ReinUsesLisp
78fc8f6b66 gl_shader_decompiler: Move texture code generation into lambdas 2018-11-28 23:45:53 -03:00
ReinUsesLisp
ab13b628d0 gl_shader_decompiler: Clean up texture instructions 2018-11-28 23:45:53 -03:00
ReinUsesLisp
6a642022dd gl_shader_decompiler: Scope GLSL variables with a scoped object 2018-11-28 23:45:51 -03:00
ReinUsesLisp
037449668e gl_rasterizer: Signal UNIMPLEMENTED when rt_separate_frag_data is not zero 2018-11-28 21:26:22 -03:00
ReinUsesLisp
653d7a3f0d gl_rasterizer_cache: Use brackets for two-line single-expresion blocks 2018-11-28 21:18:14 -03:00
ReinUsesLisp
432a9872ed gl_rasterizer: Remove unused struct declarations 2018-11-28 21:18:13 -03:00
ReinUsesLisp
22c7c710b4 gl_rasterizer: Remove extension booleans 2018-11-28 21:18:13 -03:00
bunnei
5a9a84994a
Merge pull request #1808 from Tinob/master
Fix clip distance and viewport
2018-11-28 17:47:28 -05:00
bunnei
3fe8ab0d99
Merge pull request #1786 from Tinob/DepthClamp
Add Depth Clamp Support
2018-11-28 17:46:55 -05:00
bunnei
6f849887c9
Merge pull request #1792 from bunnei/dma-pusher
gpu: Rewrite GPU command list processing with DmaPusher class.
2018-11-28 10:12:37 -05:00
bunnei
881f5ad70f
Merge pull request #1735 from FernandoS27/tex-spacing
Texture decoder: Implemented Tile Width Spacing
2018-11-27 19:21:17 -05:00
bunnei
ac74b71d75 dma_pushbuffer: Optimize to avoid loop and copy on Push. 2018-11-27 19:17:33 -05:00
bunnei
c568f5cea7 gpu: Move command list profiling to DmaPusher::DispatchCalls. 2018-11-27 18:42:21 -05:00
ReinUsesLisp
2e9b90abad gl_shader_decompiler: Fixup clip distance index 2018-11-27 15:35:26 -03:00
Markus Wick
8747f5fc0d gl_rasterizer: Fixup for #1723.
On invalidating the streaming buffer, we need to reupload all vertex buffers.
But we don't need to reconfigure the vertex format.
This was a (silly) misstake in #1723.

Thanks at Rodrigo for discovering the issue.

Fun fact, as configuring the vertex format also invalidate the vertex buffer,
this misstake had no affect on the behavior.
2018-11-27 10:32:41 +01:00
bunnei
abea6fa90c gpu: Rewrite GPU command list processing with DmaPusher class.
- More accurate impl., fixes Undertale (among other games).
2018-11-26 23:14:01 -05:00
Rodolfo Bogado
6710eb4892 remove viewport_transform_enabled as it seems to be inactive when valid transforms are used. 2018-11-27 00:04:33 -03:00
ReinUsesLisp
237c2026e9 morton: Fixup compiler warning 2018-11-26 23:22:57 -03:00
Rodolfo Bogado
dfdbfa69e5 Implement depth clamp 2018-11-26 20:56:32 -03:00
Rodolfo Bogado
8e971f5062 Add support for Clip Distance enabled register 2018-11-26 20:45:21 -03:00
bunnei
1856d0ee8a
Merge pull request #1794 from Tinob/master
Add support for viewport_transfom_enable register
2018-11-26 18:34:09 -05:00
bunnei
67a154e23d
Merge pull request #1723 from degasus/dirty_flags
gl_rasterizer: Skip VB upload if the state is clean.
2018-11-26 18:33:22 -05:00
Marcos
cb8d51e37e GPU States: Implement Polygon Offset. This is used in SMO all the time. (#1784)
* GPU States: Implement Polygon Offset. This is used in SMO all the time.

* Clang Format fixes.

* Initialize polygon_offset in the constructor.
2018-11-26 18:31:44 -05:00
bunnei
7684f4d0cf
Merge pull request #1713 from FernandoS27/bra-cc
Implemented BRA CC conditional and FSET CC Setting
2018-11-26 18:28:03 -05:00
bunnei
a41943dc55
Merge pull request #1798 from ReinUsesLisp/y-direction
gl_shader_decompiler: Implement S2R's Y_DIRECTION
2018-11-26 18:25:42 -05:00
FernandoS27
ddfbe0b58d Implemented Tile Width Spacing 2018-11-26 09:05:12 -04:00
bunnei
f9a211220c
Merge pull request #1763 from ReinUsesLisp/bfi
gl_shader_decompiler: Implement BFI_IMM_R
2018-11-25 23:04:57 -05:00
bunnei
d7d1ab15b6
Merge pull request #1760 from ReinUsesLisp/r2p
gl_shader_decompiler: Implement R2P_IMM
2018-11-25 22:38:42 -05:00
bunnei
0394813401
Merge pull request #1782 from FernandoS27/dc
Fixed Coordinate Encodings in TEX and TEXS instructions
2018-11-25 22:36:25 -05:00
bunnei
8ce90a4f0b
Merge pull request #1783 from ReinUsesLisp/clip-distances
gl_shader_decompiler: Implement clip distances
2018-11-25 22:35:30 -05:00
bunnei
ceb4bc22a4
Merge pull request #1796 from ReinUsesLisp/morton-move
video_core: Move morton functions out of gl_rasterizer_cache
2018-11-25 22:35:12 -05:00
Rodolfo Bogado
415e8383ba Limit the amount of viewports tested for state changes only to the usable ones 2018-11-25 12:18:29 -03:00
ReinUsesLisp
924e834b8f gl_shader_decompiler: Implement S2R's Y_DIRECTION 2018-11-25 04:37:29 -03:00
bunnei
7d544c1b9d
Merge pull request #1787 from bunnei/fix-gpu-mm
memory_manager: Do not allow 0 to be a valid GPUVAddr.
2018-11-24 23:45:00 -05:00
ReinUsesLisp
7ff2131cf9 morton: Style changes 2018-11-25 00:38:53 -03:00
ReinUsesLisp
dad3a6718e video_core: Move morton functions to their own file 2018-11-25 00:37:18 -03:00
FernandoS27
8c797464a2 Fix Texture Overlapping 2018-11-24 17:26:42 -04:00
FernandoS27
33afff1870 Implemented BRA CC conditional and FSET CC Setting 2018-11-24 13:25:54 -04:00
Rodolfo Bogado
13f6a603c2 Add support for viewport_transfom_enable register 2018-11-24 13:17:48 -03:00
bunnei
d01bf170c4
Merge pull request #1725 from FernandoS27/gl43
Update OpenGL's backend version from 3.3 to 4.3
2018-11-23 23:56:57 -05:00
bunnei
e23543918b
Merge pull request #1785 from Tinob/master
Add support for clear_flags register
2018-11-23 23:55:56 -05:00
bunnei
b6b78203cc
Merge pull request #1769 from ReinUsesLisp/cc
gl_shader_decompiler: Rename cc to condition code and name internal flags
2018-11-23 23:31:04 -05:00
Rodolfo Bogado
54c2a4cafc Add support for clear_flags register 2018-11-24 00:16:33 -03:00
FernandoS27
7668ef51d6 Fix TEXS Instruction encodings 2018-11-23 22:46:50 -04:00
FernandoS27
9c2127d5eb Fix one encoding in TEX Instruction 2018-11-23 22:46:49 -04:00
FernandoS27
487d805899 Corrected inputs indexing in TEX instruction 2018-11-23 22:46:48 -04:00
bunnei
69b3f98d3a
Merge pull request #1744 from degasus/shader_cache
shader_cache: Only lock covered instructions.
2018-11-23 21:09:36 -05:00
bunnei
0b1842294f memory_manager: Do not allow 0 to be a valid GPUVAddr.
- Fixes a bug with Undertale using 0 for a render target.
2018-11-23 12:58:55 -05:00
Hexagon12
3135dbc29c Added predicate comparison LessEqualWithNan (#1736)
* Added predicate comparison LessEqualWithNan

* oops

* Clang fix
2018-11-23 08:51:32 -08:00
bunnei
c4b5319446
Merge pull request #1756 from ReinUsesLisp/fix-textures
gl_shader_decompiler: Fix register overwriting on texture calls
2018-11-23 08:49:37 -08:00
bunnei
d77af9f8fd
Merge pull request #1766 from FernandoS27/fix-txq
Properly Implemented TXQ Instruction
2018-11-23 08:48:57 -08:00
ReinUsesLisp
b3853403b7 gl_shader_decompiler: Implement clip distances 2018-11-23 02:14:43 -03:00
ReinUsesLisp
c9ac23683b gl_shader_decompiler: Add a message for unimplemented cc generation 2018-11-22 16:12:27 -03:00
bunnei
50d2abaaa9
Merge pull request #1775 from bunnei/blend-eq
maxwell_3d: Implement alternate blend equations.
2018-11-22 08:44:05 -08:00
bunnei
e633021532
Merge pull request #1764 from bunnei/macrointerpreter
macro_interpreter: Implement AddWithCarry and SubtractWithBorrow.
2018-11-22 08:43:29 -08:00
bunnei
033b46253e macro_interpreter: Implement AddWithCarry and SubtractWithBorrow.
- Used by Undertale.
2018-11-22 00:58:00 -05:00
bunnei
0e6a608245 maxwell_3d: Implement alternate blend equations.
- Used by Undertale.
2018-11-22 00:51:01 -05:00
bunnei
b84f4cfb62
Merge pull request #1737 from FernandoS27/layer-copy
Implemented Fast Layered Copy
2018-11-21 21:39:16 -08:00
ReinUsesLisp
74eb16521f gl_shader_decompiler: Rename internal flag strings 2018-11-21 22:31:42 -03:00
ReinUsesLisp
8a5e6fce07 gl_shader_decompiler: Rename control codes to condition codes 2018-11-21 22:31:16 -03:00
ReinUsesLisp
864cbbaf4c gl_shader_decompiler: Fix register overwriting on texture calls 2018-11-21 21:21:19 -03:00
bunnei
ec38b4e883
Merge pull request #1753 from FernandoS27/ufbtype
Use default values for unknown framebuffer pixel format
2018-11-21 14:15:27 -08:00
bunnei
61586e8794
Merge pull request #1752 from ReinUsesLisp/unimpl-decompiler
gl_shader_decompiler: Use UNIMPLEMENTED when applicable
2018-11-21 14:13:28 -08:00
FernandoS27
4a6a9b6622 Properly Implemented TXQ Instruction 2018-11-21 18:12:36 -04:00
ReinUsesLisp
642dfeda2a gl_shader_decompiler: Implement BFI_IMM_R 2018-11-21 16:12:30 -03:00
bunnei
bb175ab430
Merge pull request #1754 from ReinUsesLisp/zero-register
gl_shader_decompiler: Remove UNREACHABLE when setting RZ
2018-11-21 08:06:29 -08:00
FernandoS27
0368260c99 Removed pre 4.3 ARB extensions 2018-11-21 11:43:17 -04:00
FernandoS27
0a9fedfac9 Use default values for unknown framebuffer pixel format 2018-11-21 07:33:34 -04:00
ReinUsesLisp
d92afc7493 gl_shader_decompiler: Implement R2P_IMM 2018-11-21 04:56:00 -03:00
ReinUsesLisp
423a3ed2c8 gl_shader_decompiler: Remove UNREACHABLE when setting RZ 2018-11-20 22:23:10 -03:00
ReinUsesLisp
bb893188eb gl_shader_decompiler: Use UNIMPLEMENTED instead of LOG+UNREACHABLE when applicable 2018-11-20 22:00:13 -03:00
bunnei
1a543723ab maxwell_3d: Initialize rasterizer color mask registers as enabled.
- Fixes rendering regression with Sonic Mania.
2018-11-20 19:58:06 -05:00
Markus Wick
cfbae58b2b shader_cache: Only lock covered instructions. 2018-11-20 21:58:31 +01:00
FernandoS27
eb36463e03 Implemented Fast Layered Copy 2018-11-19 19:51:13 -04:00
bunnei
f02b125ac8
Merge pull request #1717 from FreddyFunk/swizzle-gob
textures/decoders: Replace magic numbers
2018-11-18 20:13:00 -08:00
bunnei
6dc33fb812
Merge pull request #1693 from Tinob/master
Missing ogl states
2018-11-18 19:59:10 -08:00
Frederic L
11a1442229 Eliminated unnessessary memory allocation and copy (#1702) 2018-11-18 19:53:03 -08:00
ReinUsesLisp
29e7c76d66 gl_rasterizer: Remove default clip distance 2018-11-18 23:57:52 -03:00
Rodolfo Bogado
4d1a0a24cc drop support for non separate alpha as it seems to cause issues in some games 2018-11-18 03:44:48 -03:00
Rodolfo Bogado
81a9c5fe6f fix sampler configuration, thanks to Marcos for his investigation 2018-11-17 19:59:34 -03:00
Rodolfo Bogado
b312cca756 small type fix 2018-11-17 19:59:34 -03:00
Rodolfo Bogado
5297495c87 small fix for alphaToOne bit location 2018-11-17 19:59:34 -03:00
Rodolfo Bogado
e69eb3c760 fix for gcc compilation 2018-11-17 19:59:34 -03:00
Rodolfo Bogado
53b4a1af0f add AlphaToCoverage and AlphaToOne 2018-11-17 19:59:34 -03:00
Rodolfo Bogado
8ed7e1af2c add support for fragment_color_clamp 2018-11-17 19:59:33 -03:00
Rodolfo Bogado
02c22a3440 add missing MirrorOnceBorder support where supported 2018-11-17 19:59:33 -03:00
Rodolfo Bogado
1d60bb6544 set border color not depending on the wrap mode
only enable color mask for the first framebuffer id independent blending is disabled
2018-11-17 19:59:33 -03:00
Rodolfo Bogado
6a2aa6dbdb set default value for point size register 2018-11-17 19:59:33 -03:00
Rodolfo Bogado
1881e86c43 fix viewport and scissor behavior 2018-11-17 19:59:32 -03:00
Markus Wick
97f5c4ffd3 gl_rasterizer: Skip VB upload if the state is clean. 2018-11-17 14:28:54 +01:00
Frederic Laing
7a400e2191 textures/decoders: Replace magic numbers 2018-11-17 01:55:28 +01:00
bunnei
405dd03dae
Merge pull request #1700 from FreddyFunk/cleanup
gl_rasterizer_chache: Minor cleanup
2018-11-16 07:07:30 -08:00
bunnei
241646f423
Merge pull request #1701 from FreddyFunk/decoders
textures/decoders: Minor cleanup
2018-11-16 07:07:11 -08:00
bunnei
0b701751da
Merge pull request #1676 from lioncash/warn
gl_state: Amend compilation warnings
2018-11-16 07:00:03 -08:00
Frederic Laing
95d3965f31 textures/decoders: Minor cleanup 2018-11-15 21:04:17 +01:00
Frederic Laing
3844b5c0c5 gl_rasterizer_chache: Minor cleanup 2018-11-15 20:10:05 +01:00
bunnei
8a537a2021
Merge pull request #1637 from FernandoS27/cache
Improved GPU Caches lookup Speed
2018-11-14 19:07:52 -08:00
bunnei
3bd503d59c
Merge pull request #1662 from FreddyFunk/CopySurface-Optimization
gl_rasterizer_cache: CopySurface optimization
2018-11-13 18:58:12 -08:00
bunnei
d2b2b05b6a
Merge pull request #1685 from lioncash/base
video_core/renderer_base: Remove GL include from the renderer base class files
2018-11-13 18:53:59 -08:00
Lioncash
4ed9ef15c4 video_core/renderer_base: Remove GL include from the renderer base class files
Keeps the base class source files implementation-agnostic.
2018-11-13 14:38:13 -05:00
Frederic L
ab362aa7e5
gl_rasterizer: Minor cleanup
Minor code cleanup from unaddressed feedback in #1654
2018-11-13 14:07:23 +01:00
Lioncash
9a0fb7d9fb gl_state: Amend compilation warnings
Makes float -> integral conversions explicit via casts and also silences
a sign conversion warning.
2018-11-13 07:10:40 -05:00
bunnei
65bd03d74c
Merge pull request #1628 from greggameplayer/Texture2DArray
Implement SurfaceTarget Texture2DArray
2018-11-12 21:13:47 -08:00
greggameplayer
c8b3f09876 Implement ASTC_2D_10X8 & ASTC_2D_10X8_SRGB (#1666)
* Implement ASTC_2D_10X8 & ASTC_2D_10X8_SRGB
( needed by Mario+Rabbids Kingdom Battle )

* Small placement correction
2018-11-12 18:34:54 -08:00
bunnei
2c6efda235
Merge pull request #1660 from Tinob/master
Map more missing opengl states
2018-11-11 19:58:16 -08:00
bunnei
a264fac943
Merge pull request #1664 from FreddyFunk/cast2
gl_rasterizer: Fix compiler warnings
2018-11-11 12:18:29 -08:00
Rodolfo Bogado
72b1fae984 Use core extensions when available to set max anisotropic filtering level 2018-11-11 16:36:53 -03:00
Rodolfo Bogado
4e6c64bf8d Improve state management by splitting some of the states id separated function to avoid a full apply overhead 2018-11-11 16:36:53 -03:00
Rodolfo Bogado
4a6eff3b7b Try to fix problems with stencil test in some games, relax translation to opengl enums to avoid crashing and only generate logs of the errors. 2018-11-11 16:31:00 -03:00
Rodolfo Bogado
e9610ec0dd set sampler max lod, min lod, lod bias and max anisotropy 2018-11-11 16:31:00 -03:00
FernandoS27
3088e36237 Improved GPU Caches lookup Speed 2018-11-11 12:53:25 -04:00
bunnei
eaee73f95d
Merge pull request #1669 from ReinUsesLisp/fixup-gs
gl_shader_decompiler: Guard out of bound geometry shader input reads
2018-11-11 08:28:20 -08:00
bunnei
c82bccab56
Merge pull request #1663 from lioncash/raster
rasterizer_cache: Remove reliance on the System singleton
2018-11-11 08:20:27 -08:00
bunnei
1916213311
Merge pull request #1648 from FernandoS27/texs-3-array
Implement 3 coordinate array in TEXS instruction
2018-11-11 08:18:27 -08:00
bunnei
8ea6261547
Merge pull request #1654 from degasus/dirty_flags
gl_rasterizer: Skip VAO binding if the state is clean.
2018-11-11 08:17:57 -08:00
ReinUsesLisp
8d4bb10d44 gl_shader_decompiler: Guard out of bound geometry shader input reads
Geometry shaders follow a pattern that results in out of bound reads.
This pattern is:
- VSETP to predicate
- Use that predicate to conditionally set a register a big number
- Use the register to access geometry shaders
At the time of writing this commit I don't know what's the intent of
this number. Some drivers argue about these out of bound reads. To avoid
this issue, input reads are guarded limiting reads to the highest
posible vertex input of the current topology (e.g. points to 1 and
triangles to 3).
2018-11-10 03:10:50 -03:00
Frederic Laing
e2bf581e3a gl_rasterizer_cache: Remove unnecessary memory allocation and copy in CopySurface 2018-11-08 16:50:09 +01:00
Frederic Laing
1d36aec267 gl_rasterizer: Fix compiler warnings 2018-11-08 13:33:30 +01:00
Lioncash
9046f764bf rasterizer_cache: Remove reliance on the System singleton
Rather than have a transparent dependency, we can make it explicit in
the interface. This also gets rid of the need to put the core include in
a header.
2018-11-08 06:16:38 -05:00
Lioncash
9de523fd90 rasterizer_cache: Add missing virtual destructor to RasterizerCacheObject
Ensures that destruction will always do the right thing in any context.
2018-11-08 00:31:39 -05:00
Lioncash
29f082775b gl_resource_manager: Amend clang-format discrepancies
Fixes the buildbot.
2018-11-08 00:23:45 -05:00
FernandoS27
d347623d6f Correct issue where texturelod could not be applied to 2darrayshadow 2018-11-07 21:48:45 -04:00
FernandoS27
ad2f47b579 Implement 3 coordinate array in TEXS instruction 2018-11-07 17:04:30 -04:00
bunnei
81ff9e2473
Merge pull request #1630 from bunnei/fix-mapbufferex
memory_manager: Do not MapBufferEx over already in use memory.
2018-11-07 00:14:36 -08:00
bunnei
74bce4d68f
Merge pull request #1635 from Tinob/master
Implement multi-target viewports and blending
2018-11-07 00:11:49 -08:00
Markus Wick
359db6a673 gl_rasterizer: Skip VAO binding if the state is clean. 2018-11-06 22:31:33 +01:00
Markus Wick
0590dd2971 gl_rasterizer: Split VAO and VB setup functions. 2018-11-06 22:31:33 +01:00
greggameplayer
d3b9599b2d
Merge branch 'master' into Texture2DArray 2018-11-06 19:05:57 +01:00
Markus Wick
2c87f10267 gl_rasterizer_cache: Add profiles for Copy and Blit.
They were missed, and Copy is very high in profile here. It doesn't block the GPU,
but it stalls the driver thread. So with our bad GL instructions, this might block quite a while.
2018-11-06 17:45:32 +01:00
Markus Wick
7e59e907ef gl_resource_manager: Profile creation and deletion. 2018-11-06 17:45:32 +01:00
Markus Wick
80e4dbdce7 gl_stream_buffer: Profile orphaning of stream buffer.
This serialize to the driver thread and so it may block for a while.
So if it is in the benchmark, we get noticed if it happens too often.
2018-11-06 17:45:32 +01:00
Markus Wick
54df9fe29e gl_resource_manager: Split implementations in .cpp file.
Those implementations are quite costly, so there is no need to inline them to the caller.
Ressource deletion is often a performance bug, so in this way, we support to add breakpoints to them.
2018-11-06 14:40:39 +01:00
bunnei
cdb19e71fe
Merge pull request #1616 from FernandoS27/cube-array
Implement Cube Arrays
2018-11-05 15:28:48 -05:00
Rodolfo Bogado
19038db489 Add support to color mask to avoid issues in blending caused by wrong values in the alpha channel in some render targets. 2018-11-05 00:24:19 -03:00
Rodolfo Bogado
145ae36963 Implement multi-target viewports and blending 2018-11-04 20:49:48 -03:00
bunnei
38c1c500ab
Merge pull request #1625 from FernandoS27/astc
Implement ASTC Textures 5x5 and fix a bunch of ASTC texture problems
2018-11-04 18:47:26 -05:00
greggameplayer
9249fadb9e
correct syntax 2018-11-02 14:28:28 +01:00
greggameplayer
cb8e4a4633
Merge branch 'master' into Texture2DArray 2018-11-02 14:26:32 +01:00
FernandoS27
60a184455c Fix ASTC Decompressor to support depth parameter 2018-11-01 19:22:12 -04:00
bunnei
4aa9779ae1 memory_manager: Do not MapBufferEx over already in use memory.
- This fixes rendering when changing areas in Super Mario Odyssey.
2018-11-01 18:57:59 -04:00
bunnei
cc1fe93297
Merge pull request #1623 from Tinob/master
Improve OpenGL state handling
2018-11-01 15:53:33 -04:00
FernandoS27
aee93f98f9 Fix ASTC formats 2018-11-01 13:08:19 -04:00
FernandoS27
31930a3334 Implemented ASTC 5x5 2018-11-01 13:06:24 -04:00
FernandoS27
678c18aa5c Implement Cube Arrays 2018-11-01 11:56:19 -04:00
bunnei
9afcbba8e4
Merge pull request #1527 from FernandoS27/assert-flow
Assert Control Flow Instructions using Control Codes
2018-11-01 00:34:56 -04:00
bunnei
de0ab806df maxwell_3d: Restructure macro upload to use a single macro code memory.
- Fixes an issue where macros could be skipped.
- Fixes rendering of distant objects in Super Mario Odyssey.
2018-10-31 23:29:21 -04:00
bunnei
86e70cf302
Merge pull request #1528 from FernandoS27/assert-control-codes
Assert Control Codes Generation on Shader Instructions
2018-10-31 22:34:18 -04:00
greggameplayer
9ae972ab4e Implement SurfaceTarget Texture2DArray
( needed by Mario+Rabbids Kingdom Battle )
2018-10-31 04:29:15 +01:00
Rodolfo Bogado
aca218aea0 Improve OpenGL state handling 2018-10-30 21:19:04 -03:00
ReinUsesLisp
76754f5705 video_core: Move surface declarations out of gl_rasterizer_cache 2018-10-30 16:07:20 -03:00
FernandoS27
5bb80ab009 Assert Control Codes Generation 2018-10-30 13:37:55 -04:00
Frederic L
7a5eda5914 global: Use std::optional instead of boost::optional (#1578)
* get rid of boost::optional

* Remove optional references

* Use std::reference_wrapper for optional references

* Fix clang format

* Fix clang format part 2

* Adressed feedback

* Fix clang format and MacOS build
2018-10-30 00:03:25 -04:00
bunnei
c5a849212f
Merge pull request #1580 from FernandoS27/mm-impl
Implemented Mipmaps
2018-10-29 22:34:00 -04:00
bunnei
0270906dbf
Merge pull request #1613 from ReinUsesLisp/gl-utils
video_core: Move OpenGL specific utils to its renderer
2018-10-29 13:22:14 -04:00
bunnei
5d7167dfca
Merge pull request #1610 from slashiee/dxt1-alpha
renderer_opengl: Enable alpha channel for DXT1 texture format
2018-10-28 21:29:43 -04:00
ReinUsesLisp
80cbd81276 video_core: Move OpenGL specific utils to its renderer 2018-10-28 22:22:30 -03:00
Rodolfo Bogado
e8b565b239 renderer_opengl: Correct bpp value for ASTC_2D_8X5_SRGB 2018-10-28 20:52:57 -03:00
FernandoS27
3aa8b644a9 Assert Control Flow Instructions using Control Codes 2018-10-28 19:16:41 -04:00
FernandoS27
dde3094058 Fixed black textures, pixelation and we no longer require to auto-generate mipmaps 2018-10-28 19:00:49 -04:00
FernandoS27
f0e902a7d6 Fixed mipmap block autosizing algorithm 2018-10-28 19:00:05 -04:00
FernandoS27
87f8181405 Fixed Invalid Image size and Mipmap calculation 2018-10-28 19:00:04 -04:00
FernandoS27
f4432b5d0c Fixed Block Resizing algorithm and Clang Format 2018-10-28 19:00:03 -04:00
FernandoS27
258f0f5c31 Implement Mip Filter 2018-10-28 19:00:01 -04:00
FernandoS27
dc85e3bff1 Zero out memory region of recreated surface before flushing 2018-10-28 19:00:00 -04:00
FernandoS27
bbf3b2da0c Implement Mipmaps 2018-10-28 18:59:59 -04:00
Michael
635d1e5651 Enable alpha channel for DXT1 texture format 2018-10-28 14:11:04 -07:00
Tobias
351d5a2227
Correct bpp value for ASTC_2D_8X5 2018-10-28 19:49:10 +01:00
bunnei
aa1cf608ed
Merge pull request #1601 from FernandoS27/shader-precision
Improved Shader accuracy on Vertex and Geometry Shaders.
2018-10-28 13:06:21 -04:00
FernandoS27
e5ca097e32 Refactor precise usage and add FMNMX, MUFU, FMUL32 and FADD332 2018-10-28 11:38:40 -04:00
Rodolfo Bogado
0287b2be6d Implement sRGB Support, including workarounds for nvidia driver issues and QT sRGB support 2018-10-28 01:13:55 -03:00
bunnei
d63f5acb15
Merge pull request #1594 from FreddyFunk/static-cast
gl_rasterizer_cache: Fix compiler warning
2018-10-27 21:09:06 -04:00
FernandoS27
d8d557df86 Improved Shader accuracy on Vertex and Geometry Shaders with FFMA, FMUL and FADD 2018-10-27 20:09:26 -04:00
bunnei
ed95ce6bb7
Merge pull request #1592 from bunnei/prim-restart
gl_rasterizer: Implement primitive restart.
2018-10-27 13:25:00 -04:00
FernandoS27
705300992e Implement Default Block Height for each format 2018-10-27 10:17:39 -04:00
Frederic Laing
0bf24d310e gl_rasterizer_cache: Fix compiler warning 2018-10-27 13:06:26 +02:00
bunnei
58444a0376 gl_rasterizer: Implement primitive restart. 2018-10-26 00:42:57 -04:00
bunnei
d278f25bda
Merge pull request #1533 from FernandoS27/lmem
Implemented Shader Local Memory
2018-10-26 00:16:25 -04:00
bunnei
949d9a7136 maxwell_3d: Add code for initializing register defaults. 2018-10-25 23:42:39 -04:00
bunnei
8cea598158 gl_rasterizer: Implement depth range. 2018-10-25 21:53:24 -04:00
bunnei
f7a173de6c
Merge pull request #1524 from FernandoS27/layers-fix
rasterizer: Fix Layered Textures Loading and Cubemaps
2018-10-25 00:29:18 -04:00
FernandoS27
ca142f35c0 Implemented LD_L and ST_L 2018-10-24 17:51:53 -04:00
FernandoS27
abefe29398 Implement Shader Local Memory 2018-10-24 17:50:43 -04:00
bunnei
69b35d7615
Merge pull request #1554 from FernandoS27/pointsize
Implement PointSize Output Attribute.
2018-10-24 17:38:38 -04:00
Lioncash
257b7bbfee
decoders: Remove unused variable within SwizzledData() 2018-10-23 23:51:13 -04:00
Lioncash
a97cdb5eb4
maxwell_3d: Remove unused variable within ProcessQueryGet() 2018-10-23 23:50:16 -04:00
FernandoS27
ed8ca608a0 Implement PointSize 2018-10-23 15:08:00 -04:00
FernandoS27
e0ea2f5f6e Fixed Layered Textures Loading and Cubemaps 2018-10-23 14:27:36 -04:00
bunnei
5716496239
Merge pull request #1519 from ReinUsesLisp/vsetp
gl_shader_decompiler: Implement VSETP
2018-10-23 10:22:37 -04:00
bunnei
0f3d8c2574
Merge pull request #1539 from lioncash/dma
maxwell_dma: Silence compilation warnings
2018-10-23 10:22:12 -04:00
bunnei
75d807788c
Merge pull request #1470 from FernandoS27/alpha_testing
Implemented Alpha Test using Shader Emulation
2018-10-23 10:21:30 -04:00
ReinUsesLisp
7d6dca0d0a gl_shader_decompiler: Implement VSETP 2018-10-23 01:07:20 -03:00
ReinUsesLisp
5dfb43531c gl_shader_decompiler: Abstract VMAD into a video subset 2018-10-23 01:07:20 -03:00
bunnei
848a49112a
Merge pull request #1512 from ReinUsesLisp/brk
gl_shader_decompiler: Implement PBK and BRK
2018-10-23 00:01:38 -04:00
bunnei
496d155d7b
Merge pull request #1550 from FernandoS27/fmul32
Added Saturation to FMUL32I
2018-10-22 23:58:09 -04:00
bunnei
4cccfb4190
Merge pull request #1537 from lioncash/shader
gl_shader_decompiler: Minor changes
2018-10-22 22:49:49 -04:00
FernandoS27
259da93567 Added Saturation to FMUL32I 2018-10-22 20:22:15 -04:00
FernandoS27
8e1239fbc5 Assert that multiple render targets are not set while alpha testing 2018-10-22 15:35:45 -04:00
FernandoS27
59a004f915 Use standard UBO and fix/stylize the code 2018-10-22 15:07:33 -04:00
FernandoS27
17315cee16 Cache uniform locations and restructure the implementation 2018-10-22 15:07:32 -04:00
FernandoS27
bcb5b924fd Remove SyncAlphaTest and clang format 2018-10-22 15:07:31 -04:00
FernandoS27
7b39107e3a Added Alpha Func 2018-10-22 15:07:30 -04:00
FernandoS27
aa620c14af Implemented Alpha Testing 2018-10-22 15:07:30 -04:00
bunnei
1226a5706e
Merge pull request #1547 from FernandoS27/fix-fset
Fixed FSETP and FSET
2018-10-22 12:53:47 -04:00
FernandoS27
5c5b4e8e7d Fixed FSETP and FSET 2018-10-22 11:31:17 -04:00
FernandoS27
e2416bbd1f Fixed VAOs Float types only returning GL_FLOAT in cases that they had to return GL_HALF_FLOAT 2018-10-22 09:27:00 -04:00
Lioncash
c1e5525fc6 engines/maxwell_*: Use nested namespace specifiers where applicable
These three source files are the only ones within the engines directory
that don't use nested namespaces. We may as well change these over to
keep things consistent.
2018-10-20 15:58:09 -04:00
Lioncash
d53c73adaa maxwell_dma: Make variables const where applicable within HandleCopy()
These are never modified, so we can make that assumption explicit.
2018-10-20 15:56:01 -04:00
Lioncash
dd1ee39426 maxwell_dma: Make FlushAndInvalidate's size parameter a u64
This prevents truncation warnings at the lambda's usage sites.
2018-10-20 15:54:45 -04:00
Lioncash
08e574eec4 maxwell_dma: Remove unused variables in HandleCopy()
These pointer variables are never used, so we can get rid of them.
2018-10-20 15:53:24 -04:00
Lioncash
8a86c8d48b gl_shader_decompiler: Allow std::move to function in SetPredicate
If the variable being moved is const, then std::move will always perform
a copy (since it can't actually move the data).
2018-10-20 14:25:15 -04:00
Lioncash
381baf783d gl_shader_decompiler: Get rid of variable shadowing warnings
A variable with the same name was previously declared in an outer scope.
2018-10-20 14:22:37 -04:00
Lioncash
61ef8af1e2 gl_shader_decompiler: Fix a few comment typos 2018-10-20 14:19:28 -04:00
ReinUsesLisp
3ec795d95e gl_shader_decompiler: Move position varying declaration back to gl_shader_gen
The intention of declaring them in gl_shader_decompiler was to be able
to use blocks to implement geometry shaders. But that wasn't needed in
the end and it caused issues when both vertex stages were being used,
resulting in a redeclaration of "position".
2018-10-20 02:19:30 -03:00
bunnei
b1f8bff7db
Merge pull request #1501 from ReinUsesLisp/half-float
gl_shader_decompiler: Implement H* instructions
2018-10-19 23:47:19 -04:00
bunnei
7e665c2721 GPU: Improved implementation of maxwell DMA (Subv). 2018-10-18 22:41:53 -04:00
bunnei
bcde71d4d9 decoders: Introduce functions for un/swizzling subrects. 2018-10-18 22:41:43 -04:00
bunnei
a5d853a9f8 GPU: Invalidate destination address of kepler_memory writes. 2018-10-18 22:41:13 -04:00
bunnei
6b333d862b fermi_2d: Add support for more accurate surface copies. 2018-10-18 22:41:12 -04:00
bunnei
6acd8d166a
Merge pull request #1505 from FernandoS27/tex-3d
Implemented 3D Textures
2018-10-18 11:50:42 -04:00
ReinUsesLisp
41fb25349a gl_shader_decompiler: Implement PBK and BRK 2018-10-17 21:30:45 -03:00
bunnei
77e2d68df7
Merge pull request #1489 from FernandoS27/fix-tlds
shader_decompiler: Fix TLDS
2018-10-17 18:58:38 -04:00
FernandoS27
caaa9914fd Clang format and other fixes 2018-10-17 18:52:11 -04:00
FernandoS27
cb9fdc7a26 Implement Reinterpret Surface, to accurately blit 3D textures 2018-10-17 18:52:10 -04:00
FernandoS27
dbc34db6ce Implement GetInRange in the Rasterizer Cache 2018-10-17 18:52:10 -04:00
FernandoS27
fd9e2d0073 Implement 3D Textures 2018-10-17 18:52:08 -04:00
bunnei
f912a82a8e
Merge pull request #1497 from bunnei/flush-framebuffers
Implement flushing in the rasterizer cache
2018-10-17 18:40:34 -04:00
bunnei
86dcf2942b
Merge pull request #1496 from FernandoS27/tex-array
Implement Arrays on Tex Instruction
2018-10-17 18:30:44 -04:00
bunnei
648b55c6b9 gl_rasterizer_cache: Remove unnecessary block_depth=1 on Flush. 2018-10-17 18:20:15 -04:00
bunnei
2a035a1f6f gl_rasterizer_cache: Remove unnecessary temporary buffer with unswizzle. 2018-10-17 18:19:35 -04:00
bunnei
43b9494a0f gl_rasterizer_cache: Use AccurateCopySurface for use_accurate_gpu_emulation. 2018-10-16 17:20:49 -04:00
bunnei
ee7c2dbf5a config: Rename use_accurate_framebuffers -> use_accurate_gpu_emulation.
- This will be used as a catch-all for slow-but-accurate GPU emulation paths.
2018-10-16 17:02:29 -04:00
bunnei
91602de7f2 rasterizer_cache: Refactor to support in-order flushing. 2018-10-16 16:51:53 -04:00
bunnei
0e59291310 gl_rasterizer_cache: Refactor to only call GetRegionEnd on surface creation. 2018-10-16 11:31:02 -04:00
bunnei
949d7832fa gl_rasterizer_cache: Only flush when use_accurate_framebuffers is enabled. 2018-10-16 11:31:02 -04:00
bunnei
5f79ba04bd gl_rasterizer_cache: Separate guest and host surface size managment. 2018-10-16 11:31:01 -04:00
bunnei
58be4dff79 gl_rasterizer_cache: Rename GetGLBytesPerPixel to GetBytesPerPixel.
- This does not really have anything to do with OpenGL.
2018-10-16 11:31:01 -04:00
bunnei
cf7b46c101 gl_rasterizer_cache: Remove unused FlushSurface method. 2018-10-16 11:31:01 -04:00
bunnei
3afdfd7bfa gl_rasterizer: Implement flushing. 2018-10-16 11:31:01 -04:00
bunnei
b4e29ccb81 gl_rasterizer_cache: Remove usage of Memory::Read/Write functions.
- These cannot be used within the cache, as they change cache state.
2018-10-16 11:31:00 -04:00
bunnei
4e9683e9d5 gl_rasterizer_cache: Clamp cached surface size to mapped GPU region size. 2018-10-16 11:31:00 -04:00
bunnei
37575eae65 memory_manager: Add a method for querying the end of a mapped GPU region. 2018-10-16 11:31:00 -04:00
bunnei
0be7e82289 rasterizer_cache: Reintroduce method for flushing. 2018-10-16 11:31:00 -04:00
bunnei
9b929e934b gl_rasterizer_cache: Reintroduce code for handling swizzle and flush to guest RAM. 2018-10-16 11:30:59 -04:00
ReinUsesLisp
936c36a514 shader_bytecode: Add Control Code enum 0xf
Control Code 0xf means to unconditionally execute the instruction. This
value is passed to most BRA, EXIT and SYNC instructions (among others)
but this may not always be the case.
2018-10-15 15:36:47 -03:00
ReinUsesLisp
b461342a84 gl_shader_decompiler: Fixup style inconsistencies 2018-10-15 15:35:26 -03:00
ReinUsesLisp
27916764b1 gl_rasterizer: Silence implicit cast warning in glBindBufferRange 2018-10-15 15:26:50 -03:00
ReinUsesLisp
6312eec5ef gl_shader_decompiler: Implement HSET2_R 2018-10-15 02:55:51 -03:00
ReinUsesLisp
4fc8ad67bf gl_shader_decompiler: Implement HSETP2_R 2018-10-15 02:55:51 -03:00
ReinUsesLisp
3d65aa4caf gl_shader_decompiler: Implement HFMA2 instructions 2018-10-15 02:55:51 -03:00
ReinUsesLisp
d93cdc2750 gl_shader_decompiler: Implement HADD2_IMM and HMUL2_IMM 2018-10-15 02:07:16 -03:00
ReinUsesLisp
d46e2a6e7a gl_shader_decompiler: Implement non-immediate HADD2 and HMUL2 instructions 2018-10-15 02:04:31 -03:00
ReinUsesLisp
08d751d882 gl_shader_decompiler: Setup base for half float unpacking and setting 2018-10-15 01:58:30 -03:00
bunnei
14286f70f0
Merge pull request #1488 from Hexagon12/astc-types
video_core: Added ASTC 5x4; 8x5 types
2018-10-14 14:44:24 -04:00
FernandoS27
1d6559fbd3 Implement Arrays on Tex Instruction 2018-10-14 13:31:02 -04:00
FernandoS27
d880b77698 Fix TLDS 2018-10-13 22:14:25 -04:00
FernandoS27
331ce2942c Shorten the implementation of 3D swizzle to only 3 functions 2018-10-13 20:58:00 -04:00
FernandoS27
1ff20d8538 Fix a Crash on Zelda BotW and Splatoon 2, and simplified LoadGLBuffer 2018-10-13 16:11:11 -04:00
FernandoS27
e0ca938b22 Propagate depth and depth_block on modules using decoders 2018-10-13 15:25:18 -04:00
FernandoS27
d4ae43f9c1 Remove old Swizzle algorithms and use 3d Swizzle 2018-10-13 15:25:17 -04:00
FernandoS27
4d959c6bdc Implement Precise 3D Swizzle 2018-10-13 15:25:16 -04:00
FernandoS27
736db284d2 Implement Fast 3D Swizzle 2018-10-13 15:25:15 -04:00
Hexagon12
cbf723896f Added ASTC 5x4; 8x5 2018-10-13 17:10:26 +03:00
FernandoS27
97b6405a17 Implemented helper function to correctly calculate a texture's size 2018-10-12 14:21:53 -04:00
ReinUsesLisp
17290a4416 gl_shader_decompiler: Implement VMAD 2018-10-11 04:15:10 -03:00
bunnei
6d82c4adf9
Merge pull request #1458 from FernandoS27/fix-render-target-block-settings
Fixed block height settings for RenderTargets and Depth Buffers
2018-10-10 21:24:07 -04:00
bunnei
03ec936ca0
Merge pull request #1460 from FernandoS27/scissor_test
Implemented Scissor Testing
2018-10-10 12:04:10 -04:00
bunnei
ee1b204749
Merge pull request #1425 from ReinUsesLisp/geometry-shaders
gl_shader_decompiler: Implement geometry shaders
2018-10-10 11:51:29 -04:00
FernandoS27
5f4ee6f0c8 Add memory Layout to Render Targets and Depth Buffers 2018-10-09 22:28:19 -04:00
FernandoS27
af653906d0 Fixed block height settings for RenderTargets and Depth Buffers, and added block width and block depth 2018-10-09 21:14:32 -04:00
Lioncash
6e27c5d4d1 gl_shader_decompiler: Remove unused variables in TMML's implementation
Given "y" isn't always used, but "x" is, we can rearrange this to avoid
unused variable warnings by changing the names of op_a and op_b
2018-10-09 15:44:37 -04:00
FernandoS27
be97fc884d Implement Scissor Test 2018-10-08 21:36:23 -04:00
FernandoS27
30ff42b8cc Assert Scissor tests 2018-10-08 20:49:36 -04:00
ReinUsesLisp
7c2d6ef210 gl_shader_decompiler: Move position varying location from 15 to 0 and apply an offset 2018-10-07 17:36:00 -03:00
ReinUsesLisp
ee4d538850 gl_shader_decompiler: Implement geometry shaders 2018-10-07 17:36:00 -03:00
ReinUsesLisp
4d0c682468 video_core: Allow LabelGLObject to use extra info on any object 2018-10-07 17:27:49 -03:00
bunnei
2c0b0ad50d
Merge pull request #1446 from bunnei/fast_fermi_copy
gl_rasterizer: Implement accelerated Fermi2D copies.
2018-10-06 23:18:52 -04:00
bunnei
1cc5e6e9bc
Merge pull request #1437 from FernandoS27/tex-mode2
Implemented Depth Compare, Shadow Samplers and Texture Processing Modes for TEXS and TLDS
2018-10-06 23:14:27 -04:00
ReinUsesLisp
0ecd181cca gl_rasterizer: Fixup undefined behaviour in SetupDraw 2018-10-06 23:22:48 -03:00
FernandoS27
752faff2bc Implemented Depth Compare and Shadow Samplers 2018-10-06 11:27:54 -04:00
bunnei
9aec85d39c fermi_2d: Implement simple copies with AccelerateSurfaceCopy. 2018-10-06 03:20:04 -04:00
bunnei
011cf77796 gl_rasterizer: Add rasterizer cache code to handle accerated fermi copies. 2018-10-06 03:20:04 -04:00
bunnei
749aef3dd0 gl_rasterizer_cache: Implement a simpler surface copy using glCopyImageSubData. 2018-10-06 03:20:04 -04:00
ReinUsesLisp
3e2380327a gl_rasterizer: Implement quads topology 2018-10-04 00:03:44 -03:00
FernandoS27
f664437ae8 Implemented Texture Processing Modes in TEXS and TLDS 2018-10-03 08:41:12 -04:00
ReinUsesLisp
1835911425 gl_rasterizer: Fixup unassigned point sizes 2018-10-01 01:18:24 -03:00
bunnei
bc679c9b8c
Merge pull request #1330 from raven02/tlds
TLDS: Add 1D sampler
2018-09-30 21:53:38 -04:00
bunnei
df3799a008 gl_rasterizer_cache: Fixes to how we do render to cubemap.
- Fixes issues with Splatoon 2.
2018-09-30 15:10:14 -04:00
bunnei
29782273ec gl_rasterizer_cache: Add check for array rendering to cubemap texture. 2018-09-30 14:31:58 -04:00
bunnei
f543b43fd0 gl_rasterizer_cache: Implement render to cubemap. 2018-09-30 14:31:58 -04:00
bunnei
15cc729ebd gl_shader_decompiler: TEXS: Implement TextureType::TextureCube. 2018-09-30 14:31:58 -04:00
bunnei
ed2e0e85c9 gl_rasterizer_cache: Add support for SurfaceTarget::TextureCubemap. 2018-09-30 14:31:58 -04:00
bunnei
871580dcd8 gl_rasterizer_cache: Implement LoadGLBuffer for Texture2DArray. 2018-09-30 14:31:57 -04:00
bunnei
a9aa1db552 gl_rasterizer_cache: Update BlitTextures to support non-Texture2D ColorTexture surfaces. 2018-09-30 14:31:57 -04:00
bunnei
2e1cdde994 gl_rasterizer_cache: Track texture target and depth in the cache. 2018-09-30 14:31:57 -04:00
bunnei
fefb003b23 gl_rasterizer_cache: Workaround for Texture2D -> Texture2DArray scenario. 2018-09-30 14:31:56 -04:00
bunnei
ce452049d3 gl_rasterizer_cache: Keep track of surface 2D size separately from total size. 2018-09-30 14:31:56 -04:00
raven02
b92b4bbeaf Fix trailing whitespace 2018-09-30 23:51:10 +08:00
bunnei
fe5962e073
Merge pull request #1411 from ReinUsesLisp/point-size
video_core: Implement point_size and add point state sync
2018-09-29 11:58:39 -04:00
bunnei
8d8366b602
Merge pull request #1406 from ReinUsesLisp/multibind-samplers
gl_state: Pack sampler bindings into a single ARB_multi_bind
2018-09-29 10:55:45 -04:00
ReinUsesLisp
e3e51d3ddb video_core: Implement point_size and add point state sync 2018-09-28 02:13:29 -03:00
ReinUsesLisp
b8f1506aa5 gl_state: Pack sampler bindings into a single ARB_multi_bind 2018-09-28 02:04:22 -03:00
bunnei
62048edc15
Merge pull request #1377 from FernandoS27/faster-swizzle
Improved Fast Swizzle and Legacy Swizzle
2018-09-27 10:00:04 -04:00
ReinUsesLisp
ab65fde9f4 video_core: Add asserts for CS, TFB and alpha testing
Add asserts for compute shader dispatching, transform feedback being
enabled and alpha testing. These have in common that they'll probably break
rendering without logging.
2018-09-25 21:07:00 -03:00
David
9f3fc067bf Added glObjectLabels for renderdoc for textures and shader programs (#1384)
* Added glObjectLabels for renderdoc for textures and shader programs

* Changed hardcoded "Texture" name to reflect the texture type instead

* Removed string initialize
2018-09-23 17:55:41 -04:00
greggameplayer
b91e2d55f3 correct BC6H 2018-09-23 19:17:22 +02:00
bunnei
fb9f273e90
Merge pull request #1380 from lioncash/const
shader_bytecode: Make operator== and operator!= of IpaMode const qualified
2018-09-22 01:37:36 -04:00
bunnei
73eea61614
Merge pull request #1382 from lioncash/inc
gl_state: Remove unused type alias
2018-09-22 01:36:21 -04:00
Lioncash
90746c33c7 gl_state: Remove unused type alias
This isn't used anywhere within the header, so we can remove it, along
with the include that was previously necessary. This also uncovers an
indirect include in the cpp file for the assertion macros.
2018-09-21 18:22:43 -04:00
Lioncash
a8f5fd787f shader_bytecode: Lay out the Ipa-related enums better
This is more consistent with the surrounding enums.
2018-09-21 16:17:31 -04:00
Lioncash
272517cf7e shader_bytecode: Make operator== and operator!= of IpaMode const qualified
These don't affect the state of the struct and can be const member
functions.
2018-09-21 16:17:27 -04:00
bunnei
4f186de069
Merge pull request #1379 from lioncash/bitwise
gl_stream_buffer: Fix use of bitwise OR instead of logical OR in Map()
2018-09-21 14:02:00 -04:00
FernandoS27
57b44200a2 Reverse stride align restriction on FastSwizzle due to lost performance 2018-09-21 12:09:59 -04:00
FernandoS27
d2dd1289bd Join both Swizzle methods within one interface function 2018-09-21 11:42:34 -04:00
FernandoS27
41c6c4593a Standarized Legacy Swizzle to look alike FastSwizzle and use a Swizzling Table instead 2018-09-21 11:34:54 -04:00
FernandoS27
f020319a45 Remove same output bpp restriction on FastSwizzle 2018-09-21 11:10:44 -04:00
FernandoS27
68aaa83836 Improved Legacy Swizzler to be better documented and work better 2018-09-21 10:57:12 -04:00
Lioncash
ba02dd9ebc gl_stream_buffer: Fix use of bitwise OR instead of logical OR in Map()
This was very likely intended to be a logical OR based off the
conditioning and testing of inversion in one case.

Even if this was intentional, this is the kind of non-obvious thing one
should be clarifying with a comment.
2018-09-21 07:59:03 -04:00
Subv
9cd5c61fcf RasterizerGL: Use the correct framebuffer when clearing via the CLEAR_BUFFERS register.
Previously we were clearing the default backbuffer framebuffer.

Found thanks to a Piglit test :)
2018-09-20 22:31:53 -05:00
FernandoS27
bf2f2a715f Improved fast swizzle and removed restrictions to it 2018-09-20 23:06:53 -04:00
raven02
c8f9bbbf85
Merge branch 'master' into tlds 2018-09-19 19:53:11 +08:00
Markus Wick
f465e4aaf2 gl_rasterizer: Fix StartAddress handling with indexed draw calls.
We uploaded the wrong data before. So the offset on the host GPU pointer may work for the first vertices, the last ones run out bounds.
Let's just offset the upload instead.
2018-09-19 09:22:30 +02:00
bunnei
bd88d4108f
Merge pull request #1342 from lioncash/trunc
gl_shader_decompiler: Avoid truncation warnings within LD_A and ST_A code
2018-09-18 22:11:48 -04:00
bunnei
0284cbe7ec
Merge pull request #1279 from FernandoS27/csetp
shader_decompiler: Implemented (Partialy) Control Codes and CSETP
2018-09-18 22:10:48 -04:00
bunnei
6415f81bb8
Merge pull request #1299 from FernandoS27/texture-sanatize
shader_decompiler: Asserts for Texture Instructions
2018-09-18 22:10:09 -04:00
FernandoS27
567a5524b9 Implemented Internal Flags 2018-09-17 20:50:54 -04:00
Lioncash
9a8dbba1e5 gl_shader_decompiler: Avoid truncation warnings within LD_A and ST_A code
These are internally stored as u64 values, so using u32 here causes
truncation warnings. Instead, we can just use u64 and preserve the bit
width.
2018-09-17 19:25:55 -04:00
bunnei
fafc80d72e
Merge pull request #1290 from FernandoS27/shader-header
Implemented (Partialy) Shader Header
2018-09-17 18:53:14 -04:00
FernandoS27
e4bb759c4b Implemented I2I.CC on the NEU control code, used by SMO 2018-09-17 17:42:46 -04:00
FernandoS27
e2ac8fb36d Implemented CSETP 2018-09-17 17:42:44 -04:00
FernandoS27
aac77bbd18 Implemented Control Codes 2018-09-17 17:42:43 -04:00
FernandoS27
31e52113b3 Added asserts for texture misc modes to texture instructions 2018-09-17 12:56:36 -04:00
FernandoS27
55a4756766 Added texture misc modes to texture instructions 2018-09-17 12:51:05 -04:00
bunnei
a94b623dfb
Merge pull request #1311 from FernandoS27/fast-swizzle
Optimized Texture Swizzling
2018-09-17 12:39:34 -04:00
bunnei
27fe8159c5
Merge pull request #1316 from lioncash/shadow
gl_shader_decompiler: Get rid of variable shadowing within LEA instructions
2018-09-17 12:27:35 -04:00
raven02
b91f7d5d67 Add 1D sampler for TLDS - TexelFetch (Mario Rabbids) 2018-09-17 23:25:18 +08:00
bunnei
076add4ccd
Merge pull request #1326 from FearlessTobi/port-4182
Port #4182 from Citra: "Prefix all size_t with std::"
2018-09-17 09:51:47 -04:00
bunnei
3be048e50a
Merge pull request #1329 from raven02/bgr5a1u
Implement RenderTargetFormat::BGR5A1_UNORM
2018-09-17 09:49:00 -04:00
raven02
2845348608 Implement ASTC_2D_8X8 (Bayonetta 2) 2018-09-17 01:04:27 +08:00
bunnei
ba480ea2fb
Merge pull request #1273 from Subv/ld_sizes
Shaders: Implemented multiple-word loads and stores to and from attribute memory.
2018-09-15 15:27:12 -04:00
bunnei
daee15b058
Merge pull request #1271 from Subv/kepler_engine
GPU: Basic implementation of the Kepler Inline Memory engine (p2mf).
2018-09-15 13:27:07 -04:00
raven02
0019a36b41 Implement RenderTargetFormat::BGR5A1_UNORM (Pokken Tournament DX) 2018-09-16 00:21:42 +08:00
Subv
c878a819d7 Shaders: Implemented multiple-word loads and stores to and from attribute memory.
This seems to be an optimization performed by nouveau.
2018-09-15 11:21:21 -05:00
fearlessTobi
63c2e32e20 Port #4182 from Citra: "Prefix all size_t with std::" 2018-09-15 15:21:06 +02:00
FernandoS27
f8e994354f Optimized Texture Swizzling 2018-09-14 12:45:49 -04:00
Lioncash
ae128f0375 gl_shader_decompiler: Get rid of variable shadowing within LEA instructions
These variables are already defined within an outer scope.
2018-09-13 21:53:23 -04:00
ReinUsesLisp
a42376dfad Use ARB_multi_bind for uniform buffers (#1287)
* gl_rasterizer: use ARB_multi_bind for uniform buffers

* address feedback
2018-09-12 20:27:43 -04:00
bunnei
4a43fb7e1d gl_rasterizer_cache: B5G6R5U should use GL_RGB8 as an internal format.
- Fixes a regression with Sonic Mania with ARB_texture_storage.
2018-09-12 18:09:31 -04:00
bunnei
cc50857460
Merge pull request #1263 from FernandoS27/tex-mode
shader_decompiler:  Implemented (Partially) Texture Processing Modes
2018-09-12 16:03:34 -04:00
Subv
bb5eb4f20a GPU: Basic implementation of the Kepler Inline Memory engine (p2mf).
This engine writes data from a FIFO register into the configured address.
2018-09-12 13:57:08 -05:00
FernandoS27
a99d9db32f Implemented Texture Processing Modes 2018-09-12 12:28:22 -04:00
bunnei
89825766ee
Merge pull request #1278 from tech4me/bg-color-fix
Port Citra #4047 & #4052: add change background color support
2018-09-11 23:13:11 -04:00
bunnei
522a11a11f
Merge pull request #1295 from bunnei/accurate-copies
gl_rasterizer_cache: Improve accuracy of caching and copies.
2018-09-11 23:12:15 -04:00
bunnei
4a9acc87f9
Merge pull request #1294 from degasus/optimizations
gl_rasterizer: Use ARB_texture_storage.
2018-09-11 23:11:36 -04:00
bunnei
7bb226f22d gl_rasterizer_cache: Always blit on recreate, regardless of format.
- Fixes several rendering issues with Super Mario Odyssey.
2018-09-11 22:54:46 -04:00
bunnei
cdddd71d08 gl_shader_cache: Remove cache_width/cache_height.
- This was once an optimization, but we no longer need it with the cache reserve.
- This is also inaccurate.
2018-09-11 20:12:29 -04:00
Markus Wick
3e973bc4c6 gl_rasterizer: Use ARB_texture_storage.
It allows us to use texture views and it reduces the overhead within the GPU driver.

But it disallows us to reallocate the texture, but we don't do so anyways.

In the end, it is the new way to allocate textures, so there is no need to use the old way.
2018-09-11 22:18:46 +02:00
FernandoS27
5c676dc884 Implemented LEA and PSET 2018-09-11 12:50:52 -04:00
FernandoS27
3f0922715a Implemented encodings for LEA and PSET 2018-09-11 12:50:25 -04:00
FernandoS27
2b48cfd44b Replace old FragmentHeader for the new Header 2018-09-11 12:48:19 -04:00
FernandoS27
e926757c8f Implemented (Partialy) Shader Header 2018-09-11 12:34:27 -04:00
David Marcec
4c3bd33be2 Fixed renderdoc input/output textures not working due to render targets 2018-09-11 13:31:20 +10:00
bunnei
d6e8e16a66
Merge pull request #1286 from bunnei/multi-clear
gl_rasterizer: Implement clear for non-zero render targets.
2018-09-10 20:30:14 -04:00
bunnei
12445b476d
Merge pull request #1285 from bunnei/depth-fix
gl_rasterizer_cache: Only use depth for applicable texture formats.
2018-09-10 20:28:40 -04:00
bunnei
d884e805c5
Merge pull request #1284 from bunnei/bgra8_srgb
gl_rasterizer_cache: Implement RenderTargetFormat::BGRA8_SRGB.
2018-09-10 20:28:00 -04:00
Markus Wick
c1b8cd9058 video_core: Refactor command_processor.
Inline the WriteReg helper as it is called ~20k times per frame.
2018-09-10 22:06:16 +02:00
Markus Wick
0cfb0bacb2 video_core: Move command buffer loop.
This moves the hot loop into video_core. This refactoring shall reduce the CPU overhead of calling ProcessCommandList.
2018-09-10 22:06:13 +02:00
Markus Wick
c560043581 rasterizer: Drop unused handler.
This virtual function is called in a very hot spot, and it does nothing.

If this kind of feature is required, please be more specific and add callbacks
in the switch statement within Maxwell3D::WriteReg. There is no point in having
another switch statement within the rasterizer.
2018-09-10 22:03:10 +02:00
bunnei
4c0b1cc1ae gl_rasterizer_cache: Only use depth for applicable texture formats.
- Fixes an issue with Octopath Traveler leaving stale data here.
2018-09-10 00:50:38 -04:00
bunnei
035e6bd407 gl_rasterizer: Implement clear for non-zero render targets.
- Several misc. changes to ConfigureFramebuffers in support of this.
2018-09-10 00:41:20 -04:00
bunnei
1c34498368 gl_rasterizer_cache: Implement RenderTargetFormat::BGRA8_SRGB.
- Used by Octopath Traveler (with multiple render targets).
2018-09-10 00:37:52 -04:00
bunnei
49b15af054 gl_rasterizer: Implement multiple color attachments. 2018-09-09 22:48:28 -04:00
bunnei
e58855c7a4
Merge pull request #1268 from FernandoS27/tmml
shader_decompiler: Implemented TMML
2018-09-09 21:39:39 -04:00
FernandoS27
00131e752d Implemented TMML 2018-09-09 20:46:31 -04:00
bunnei
223ddb2008
Merge pull request #1272 from Subv/dma_2d
GPU/DMA: Partially implemented the 'enable_2d' bit in the DMA engine.
2018-09-09 19:53:17 -04:00
bunnei
fcf81147e7
Merge pull request #1280 from zero334/improvements
video_core: fixed arithmetic overflow warnings & improved code style
2018-09-09 19:51:46 -04:00
FernandoS27
073a21ac0b Implemented TXQ dimension query type, used by SMO. 2018-09-09 11:59:01 -04:00
Patrick Elsässer
64e45b04e0 video_core: fixed arithmetic overflow warnings & improved code style
- Fixed all warnings, for renderer_opengl items, which were indicating a
possible incorrect behavior from integral promotion rules and types
larger than those in which arithmetic is typically performed.
- Added const for variables where possible and meaningful.
- Added constexpr where possible.
2018-09-09 17:51:43 +02:00
tech4me
3dcedb36b4 Port Citra #4047 & #4052: add change background color support 2018-09-08 17:00:21 -07:00
FernandoS27
82a313a14c Change name of TEXQ to TXQ, in order to match NVIDIA's naming 2018-09-08 18:08:57 -04:00
Subv
fdb199290b GPU/DMA: Partially implemented the 'enable_2d' bit in the DMA engine.
When not set, this tells the GPU to only use the X size when performing a DMA copy.
This is only implemented for linear->linear and tiled->tiled copies. Conversion copies still retain the assert.

This bit is unset by some games for various purposes, and by nouveau when copying the vertex buffers.
2018-09-08 16:02:16 -05:00
bunnei
af074ee422
Merge pull request #1256 from bunnei/tex-target-support
Initial support for non-2D textures
2018-09-08 16:14:46 -04:00
bunnei
8b08cb925b gl_rasterizer: Use baseInstance instead of moving the buffer points.
This hopefully helps our cache not to redundant upload the vertex buffer.

# Conflicts:
#	src/video_core/renderer_opengl/gl_rasterizer.cpp
2018-09-08 04:05:56 -04:00
Patrick Elsässer
a8974f0556 video_core: Arithmetic overflow warning fix for gl_rasterizer (#1262)
* video_core: Arithmetic overflow fix for gl_rasterizer

- Fixed warnings, which were indicating incorrect behavior from integral
promotion rules and types larger than those in which arithmetic is
typically performed.

- Added const for variables where possible and meaningful.

* Changed the casts from C to C++ style

Changed the C-style casts to C++ casts as proposed.
Took also care about signed / unsigned behaviour.
2018-09-08 02:59:59 -04:00
bunnei
23ae7cf9db gl_rasterizer_cache: Improve accuracy of RecreateSurface for non-2D textures. 2018-09-08 02:53:39 -04:00
bunnei
fdd5c97a14 maxwell_3d: Remove assert that no longer applies. 2018-09-08 02:53:39 -04:00
bunnei
f165a85398 gl_rasterizer_cache: Partially implement several non-2D texture types. 2018-09-08 02:53:38 -04:00
bunnei
0731383124 gl_shader_decompiler: Partially implement several non-2D texture types (Subv). 2018-09-08 02:53:38 -04:00
bunnei
05f6f59ffb gl_rasterizer: Implement texture wrap mode p. 2018-09-08 02:53:38 -04:00
bunnei
ce8291f6c5 gl_rasterizer_cache: Track texture depth. 2018-09-08 02:53:38 -04:00
bunnei
9dccf7e1fa gl_rasterizer_cache: Remove impl. of FlushGLBuffer.
- Will not work for non-2d textures, and was not used anyways.
2018-09-08 02:53:37 -04:00
bunnei
030676b95d gl_rasterizer_cache: Keep track of texture type per surface. 2018-09-08 02:53:37 -04:00
bunnei
a439f7b6e1 gl_rasterizer_cache: Remove unused DownloadGLTexture. 2018-09-08 02:53:37 -04:00
bunnei
b56e5edafc gl_state: Keep track of texture target. 2018-09-08 02:53:37 -04:00
bunnei
9947c6ad59
Merge pull request #1252 from lioncash/header
video_core/CMakeLists: Add missing gl_buffer_cache.h
2018-09-06 19:19:43 -04:00
bunnei
9b50dca2bb
Merge pull request #1253 from lioncash/explicit
video_core/gl_buffer_cache: Minor tidying changes
2018-09-06 19:19:35 -04:00
bunnei
009a2cc9cc
Merge pull request #1255 from bunnei/minor-opt
gl_rasterizer: Call state.Apply only once on SetupShaders.
2018-09-06 19:19:16 -04:00
bunnei
820f646458 gl_rasterizer: Call state.Apply only once on SetupShaders. 2018-09-06 17:41:53 -04:00
bunnei
948f6c0738 gl_shader_decompiler: Implement saturate mode for IPA. 2018-09-06 17:40:03 -04:00
Lioncash
ddcdbce067 gl_buffer_cache: Default initialize member variables
Ensures that the cache always has a deterministic initial state.
2018-09-06 15:07:15 -04:00
Lioncash
8d685a29bc gl_buffer_cache: Make GetHandle() a const member function
GetHandle() internally calls GetHandle() on the stream_buffer instance,
which is a const member function, so this can be made const as well.
2018-09-06 15:07:15 -04:00
Lioncash
14230fe2af gl_buffer_cache: Remove unnecessary includes 2018-09-06 15:05:52 -04:00
Lioncash
68296d9474 gl_buffer_cache: Make constructor explicit
Implicit conversions during construction isn't desirable here.
2018-09-06 14:54:49 -04:00
Lioncash
8f4e09ba07 video_core/CMakeLists: Add missing gl_buffer_cache.h
Without this, the header file won't show up by default within IDEs such
as Visual Studio.
2018-09-06 14:49:51 -04:00
Markus Wick
a781042700 gl_shader_gen: Initialize position.
IMO the old code is fine, but nvidia raises shader compiler warnings.
Trivial fix through...
2018-09-06 13:37:50 +02:00
bunnei
77554ac773
Merge pull request #1243 from degasus/VAO_cache
gl_rasterizer: Implement a VAO cache.
2018-09-05 22:50:52 -04:00
bunnei
6f09c5b128
Merge pull request #1244 from FernandoS27/ipa
shader_decompiler: Implemented IPA Properly (Stage 1)
2018-09-05 21:20:40 -04:00
FernandoS27
e63b229f4a Implemented IPA Properly 2018-09-05 20:15:47 -04:00
Markus Wick
7f15306f78 gl_rasterizer: Skip TODO log.
This is called ~3k times per frame in SMO ingame.
My laptop spends ~3ms per frame on allocating and freeing this string.

Let's just stop printing this kind of redundant information.
2018-09-05 20:20:20 +02:00
Markus Wick
d3ad9469a1 gl_rasterizer: Implement a VAO cache.
This patch caches VAO objects instead of re-emiting all pointers per draw call.
Configuring this pointers is known as a fast task, but it yields too many GL
calls. So for better performance, just bind the VAO instead of 16 pointers.
2018-09-05 18:46:35 +02:00
Markus Wick
50a806ea67 renderer_opengl: Implement a buffer cache.
The idea of this cache is to avoid redundant uploads. So we are going
to cache the uploaded buffers within the stream_buffer and just reuse
the old pointers.
The next step is to implement a VBO cache on GPU memory, but for now,
I want to check the overhead of the cache management. Fetching the
buffer over PCI-E should be quite fast.
2018-09-05 08:03:50 +02:00
Markus Wick
99a71580c4 gl_shader_cache: Use an u32 for the binding point cache.
The std::string generation with its malloc and free requirement
was a noticeable overhead. Also switch to an ordered_map to
avoid the std::hash call. As those maps usually have a size of
two elements, the lookup time shall not matter.
2018-09-04 21:04:41 +02:00
bunnei
9a07e9f805
Merge pull request #1237 from degasus/optimizations
Optimizations
2018-09-04 12:16:06 -04:00
bunnei
26e96d16d0
Merge pull request #1232 from lioncash/copy
gl_shader_decompiler: Use used_shaders member variable directly within GenerateDeclarations()
2018-09-04 11:52:25 -04:00
Markus Wick
2081ed7db2 command_processor: Use std::array for bound_engines.
subchannel is a 3 bit field. So there must not be more than 8 bound engines.
And using a hashmap for up to 8 values is a bit overpowered.
2018-09-04 14:10:05 +02:00
Markus Wick
10bc725944 Update microprofile scopes.
Blame the subsystems which deserve the blame :)

The updated list is not complete, just the ones I've spotted on random sampling the stack trace.
2018-09-04 11:04:26 +02:00
Lioncash
18a89931a9 gl_shader_decompiler: Use used_shaders member variable directly within GenerateDeclarations()
Using the getter function intended for external code here makes an
unnecessary copy of the already-accessible used_shaders vector.
2018-09-02 13:10:11 -04:00
bunnei
325f3e0693
Merge pull request #1213 from DarkLordZach/octopath-fs
filesystem/maxwell_3d: Various changes to boot Project Octopath Traveller
2018-09-02 10:49:18 -04:00
bunnei
89be49d2f3
Merge pull request #1215 from ogniK5377/texs-nodep-assert
Added assert for TEXS nodep
2018-09-02 10:48:27 -04:00
bunnei
177c45e97d
Merge pull request #1214 from ogniK5377/ipa-assert
Added better asserts to IPA, Renamed IPA modes to match mesa
2018-09-02 10:44:43 -04:00
bunnei
9c206fe94d
Merge pull request #1216 from ogniK5377/ffma-assert
Added FFMA asserts and missing fields
2018-09-02 10:44:13 -04:00
David Marcec
60754b4728 Removed saturate assert
Unneeded as we already implement it
2018-09-01 19:33:32 +10:00
David Marcec
2edab4e840 Removed saturate assert
Saturate already implemented
2018-09-01 19:29:20 +10:00
David Marcec
2bc6abb9a1 Changed tab5980_0 default from 0 -> 1 2018-09-01 19:15:03 +10:00
David Marcec
6f8ed9508d Added FMUL asserts 2018-09-01 19:05:10 +10:00
David Marcec
b89fc407d7 Added FFMA asserts 2018-09-01 18:45:14 +10:00
David Marcec
948bc87a59 Added assert for TEXS nodep 2018-09-01 17:00:01 +10:00
David Marcec
ad3dca7e62 Added better asserts to IPA, Renamed IPA modes to match mesa
IpaMode is changed to IpaInterpMode
IpaMode is suppose to be 2 bits not 3
Added IpaSampleMode
Added Saturate

Renamed modes based on
d27c791891/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp (L2530)
2018-09-01 16:34:27 +10:00
Zach Hilman
f32e28c7b8 maxwell_3d: Use CoreTiming for query timestamp 2018-08-31 23:25:18 -04:00
Lioncash
4a587b81b2 core/core: Replace includes with forward declarations where applicable
The follow-up to e2457418da, which
replaces most of the includes in the core header with forward declarations.

This makes it so that if any of the headers the core header was
previously including change, then no one will need to rebuild the bulk
of the core, due to core.h being quite a prevalent inclusion.

This should make turnaround for changes much faster for developers.
2018-08-31 16:30:14 -04:00
bunnei
7f7eb29323 gl_rasterizer_cache: Use accurate framebuffer setting for accurate copies. 2018-08-31 13:07:28 -04:00
bunnei
123c065086 gl_rasterizer_cache: Also use reserve cache for RecreateSurface. 2018-08-31 13:07:28 -04:00
bunnei
9bc71fcc5f rasterizer_cache: Use boost::interval_map for a more accurate cache. 2018-08-31 13:07:28 -04:00
bunnei
d647d9550c gl_renderer: Cache textures, framebuffers, and shaders based on CPU address. 2018-08-31 13:07:27 -04:00
bunnei
16d65182f9 gl_rasterizer: Fix issues with the rasterizer cache.
- Use a single cached page map.
- Fix calculation of ending page.
2018-08-31 13:07:27 -04:00
greggameplayer
06578e89b2 Implement BC6H_UF16 & BC6H_SF16 (#1092)
* Implement BC6H_UF16 & BC6H_SF16
Require by ARMS

* correct coding style

* correct coding style part 2
2018-08-31 12:11:19 -04:00
bunnei
f08d24e9c0
Merge pull request #1204 from lioncash/pimpl
core: Make the main System class use the PImpl idiom
2018-08-31 11:31:20 -04:00
bunnei
6683bf50b5
Merge pull request #1207 from degasus/hotfix
Report correct shader size.
2018-08-31 11:21:15 -04:00
Lioncash
e2457418da core: Make the main System class use the PImpl idiom
core.h is kind of a massive header in terms what it includes within
itself. It includes VFS utilities, kernel headers, file_sys header,
ARM-related headers, etc. This means that changing anything in the
headers included by core.h essentially requires you to rebuild almost
all of core.

Instead, we can modify the System class to use the PImpl idiom, which
allows us to move all of those headers to the cpp file and forward
declare the bulk of the types that would otherwise be included, reducing
compile times. This change specifically only performs the PImpl portion.
2018-08-31 07:16:57 -04:00
Markus Wick
5be8b7a362 Report correct shader size.
Seems like this was an oversee in regards to 1fd979f50a
It changed GLShader::ProgramCode to a std::vector, so sizeof is wrong.
2018-08-31 09:56:37 +02:00
Hexagon12
d626bc8c62 Added predicate comparison GreaterEqualWithNan 2018-08-31 10:40:18 +03:00
Laku
915ab81ec2 gl_shader_decompiler: Implement POPC (#1203)
* Implement POPC

* implement invert
2018-08-30 21:32:58 -04:00
bunnei
d6accf96ff
Merge pull request #1200 from bunnei/improve-ipa
gl_shader_decompiler: Improve IPA for Pass mode with Position attribute.
2018-08-30 10:31:26 -04:00
tech4me
a6dd577d02 Shaders: Implemented IADD3 2018-08-29 13:44:41 -04:00
bunnei
b1ccd88434 gl_shader_decompiler: Improve IPA for Pass mode with Position attribute. 2018-08-29 00:37:29 -04:00
bunnei
4d7e1662c8
Merge pull request #1193 from lioncash/priv
gpu: Make memory_manager private
2018-08-28 12:28:57 -04:00
bunnei
eb4f2d5596
Merge pull request #1192 from lioncash/unused
gl_rasterizer: Remove unused variables
2018-08-28 12:28:13 -04:00
Lioncash
2e7dc4cac9 gl_shader_cache: Remove unused program_code vector in GetShaderAddress()
Given std::vector is a type with a non-trivial destructor, this
variable cannot be optimized away by the compiler, even if unused.
Because of that, something that was intended to be fairly lightweight,
was actually allocating 32KB and deallocating it at the end of the
function.
2018-08-28 11:20:41 -04:00
Lioncash
45fb74d262 gpu: Make memory_manager private
Makes the class interface consistent and provides accessors for
obtaining a reference to the memory manager instance.

Given we also return references, this makes our more flimsy uses of
const apparent, given const doesn't propagate through pointers in the
way one would typically expect. This makes our mutable state more
apparent in some places.
2018-08-28 11:11:50 -04:00
Lioncash
6771a18c6c gl_rasterizer: Remove unused variables 2018-08-28 10:46:29 -04:00
bunnei
b55d8111e6 renderer_opengl: Implement a new shader cache. 2018-08-27 18:26:46 -04:00
bunnei
a0e1566dc5 gl_rasterizer_cache: Update to use RasterizerCache base class. 2018-08-27 18:26:46 -04:00
bunnei
382852418b video_core: Add RasterizerCache class for common cache management code. 2018-08-27 18:26:45 -04:00
bunnei
2f5ed3877c
Merge pull request #1169 from Lakumakkara/sel
shader_bytecode: fix SEL_IMM bitstring
2018-08-27 18:24:57 -04:00
bunnei
f96ded9815
Merge pull request #1174 from lioncash/debug
debug_utils: Minor individual interface changes
2018-08-27 15:44:29 -04:00
bunnei
be2f1eabd7
Merge pull request #1173 from lioncash/batch
maxwell3d: Move FinishedPrimitiveBatch event after AcceleratedDrawBatch()
2018-08-25 10:59:54 -04:00
bunnei
23b86fd3ea
Merge pull request #1167 from lioncash/assert
gl_rasterizer: Correct assertion condition in SyncLogicOpState()
2018-08-25 10:50:59 -04:00
Lioncash
c65713832c debug_utils: Remove unused includes
Quite a bit of these aren't necessary directly within the debug_utils
header and can be removed or included where actually necessary.
2018-08-24 20:49:14 -04:00
Lioncash
1e6a209649 debug_utils: Make BreakpointObserver class' constructor explicit
Avoids implicit conversions.
2018-08-24 20:49:14 -04:00
Lioncash
b6425c0511 debug_utils: Initialize active_breakpoint member of DebugContext
Ensures that all class members are initialized.
2018-08-24 20:15:50 -04:00
Lioncash
20800f2df7 maxwell3d: Move FinishedPrimitiveBatch event after AcceleratedDrawBatch()
The start and finish events should likely not be right after one another
like this, otherwise the batch will appear to complete immediately
2018-08-24 19:58:05 -04:00
Laku
36093a3e4d fix SEL_IMM bitstring 2018-08-24 07:18:12 +03:00
Lioncash
8fd9eb71b4 gl_rasterizer: Correct assertion condition in SyncLogicOpState()
Previously the assert would always be hit, since it was the equivalent
of: array == nullptr, which is never true.
2018-08-23 23:00:54 -04:00
tech4me
ba2972bc64 Shaders: Added decodings for IADD3 instructions 2018-08-23 15:46:59 -04:00
bunnei
0dce6d7008
Merge pull request #1160 from bunnei/surface-reserve
gl_rasterizer_cache: Several improvements
2018-08-23 12:04:37 -04:00
bunnei
d65f079cc1 gl_rasterizer_cache: Blit when possible on RecreateSurface. 2018-08-23 11:27:01 -04:00
bunnei
fee8bdd90c gl_rasterizer_cache: Reserve surfaces that have already been created for later use. 2018-08-23 11:27:01 -04:00
bunnei
fde2017a3f gl_rasterizer_cache: Remove assert for RecreateSurface type. 2018-08-23 11:27:00 -04:00
bunnei
ebf5768340 gl_rasterizer_cache: Implement compressed texture copies. 2018-08-23 11:27:00 -04:00
bunnei
a4ac3bed6c gl_rasterizer: Implement stencil test.
- Used by Splatoon 2.
2018-08-23 11:08:49 -04:00
bunnei
da3da6be90 gl_rasterizer: Implement partial color clear and stencil clear. 2018-08-23 11:08:48 -04:00
bunnei
2a472ff54d maxwell_3d: Update to include additional stencil registers. 2018-08-23 11:08:47 -04:00
bunnei
c4ed0b16b1 gl_state: Update to handle stencil front/back face separately. 2018-08-23 11:08:46 -04:00
bunnei
c7f2fb2151
Merge pull request #1157 from lioncash/vec
gl_shader_gen: Use a std::vector to represent program code instead of std::array
2018-08-23 02:19:00 -04:00
bunnei
232b0d9d2a
Merge pull request #1156 from Lakumakkara/lop3
gl_shader_decompiler: Implement LOP3
2018-08-23 02:16:49 -04:00
Lioncash
12ba80a86c gl_shader_gen: Make ShaderSetup's constructor explicit
Prevents implicit conversions.
2018-08-22 17:04:44 -04:00
Lioncash
1fd979f50a gl_shader_gen: Use a std::vector to represent program code instead of std::array
While convenient as a std::array, it's also quite a large set of data as
well (32KB). It being an array also means data cannot be std::moved. Any
situation where the code is being set or relocated means that a full
copy of that 32KB data must be done.

If we use a std::vector we do need to allocate on the heap, however, it
does allow us to std::move the data we have within the std::vector into
another std::vector instance, eliminating the need to always copy the
program data (as std::move in this case would just transfer the pointers
and bare necessities over to the new vector instance).
2018-08-22 17:04:44 -04:00
Laku
b2ca8089ce more fixes 2018-08-23 00:01:40 +03:00
Laku
e70a3c5a5d fixes 2018-08-22 21:33:32 +03:00
Lioncash
dd35b4b18a renderer_opengl: Namespace OpenGL code
Namespaces all OpenGL code under the OpenGL namespace.

Prevents polluting the global namespace and allows clear distinction
between other renderers' code in the future.
2018-08-22 06:14:47 -04:00
Laku
4877e6c2f6 remove debug logging 2018-08-22 11:45:28 +03:00
Laku
8e8326595f implement lop3 2018-08-22 10:09:44 +03:00
bunnei
cea627b0fc
Merge pull request #840 from FearlessTobi/port-3353
Port #3353 from Citra: "citra-qt: Add customizable speed limit target "
2018-08-22 01:19:50 -04:00
bunnei
5abf71fe65
Merge pull request #1154 from OatmealDome/topology-lines
maxwell_to_gl: Implement PrimitiveTopology::Lines
2018-08-22 01:08:34 -04:00
bunnei
125d7122ac
Merge pull request #1124 from Subv/logic_ops
GPU: Implemented logic ops.
2018-08-22 01:05:25 -04:00
OatmealDome
ad1220e1b3
maxwell_to_gl: Implement PrimitiveTopology::Lines
Used by Splatoon 2's debug menu.
2018-08-22 01:01:06 -04:00
bunnei
d63b1d21f1 Revert "Shader: Use the right sampler type in the TEX, TEXS and TLDS instructions."
- This reverts commit 3ef4b3d4b4.
- This commit had broken a lot of games. We really should do a full implementation of this in one change.
2018-08-21 20:07:40 -04:00
Lioncash
a0e2bd85a5 shader_bytecode: Parenthesize conditional expression within GetTextureType()
Resolves a -Wlogical-op-parentheses warning.
2018-08-21 15:08:35 -04:00
bunnei
bf89a99839
Merge pull request #1123 from lioncash/screen
rasterizer_interface: Remove renderer-specific ScreenInfo type from AccelerateDraw() in RasterizerInterface
2018-08-21 01:18:34 -04:00
bunnei
b0f7713fce
Merge pull request #1132 from Subv/gl_FragDepth
Shaders: Implement depth writing in fragment shaders.
2018-08-21 01:17:53 -04:00
bunnei
8c9abe1d41
Merge pull request #1134 from lioncash/log
renderer_opengl: Use LOG_DEBUG for GL_DEBUG_SEVERITY_NOTIFICATION and GL_DEBUG_SEVERITY_LOW logs
2018-08-21 01:17:31 -04:00
bunnei
ca58929eb0
Merge pull request #1121 from Subv/tex_reinterpret
Rasterizer: Use PBOs to reinterpret texture formats when games re-use the same memory.
2018-08-21 01:06:40 -04:00
Lioncash
523e4be02c renderer_opengl: Use LOG_DEBUG for GL_DEBUG_SEVERITY_NOTIFICATION and GL_DEBUG_SEVERITY_LOW logs
LOG_TRACE is only enabled on debug builds which can be quite slow when
trying to debug graphics issues. Instead we can log the messages to the
debug log, which is available on both release and debug builds.
2018-08-21 00:23:09 -04:00
bunnei
fde3b1b6f2
Merge pull request #1133 from lioncash/guard
gl_stream_buffer: Add missing header guard
2018-08-20 23:37:55 -04:00
Lioncash
93a4097e9d gl_stream_buffer: Add missing header guard
Prevents potential compilation errors from occuring due to multiple
inclusions
2018-08-20 23:25:08 -04:00
Subv
e3bddf8137 Shaders: Implement depth writing in fragment shaders.
We'll write <last color output reg + 2> to gl_FragDepth.
2018-08-20 21:57:56 -05:00
bunnei
e33452f7e8
Merge pull request #1131 from bunnei/impl-tex3d-texcube
gl_shader_decompiler: Implement TextureCube/Texture3D for TEX/TEXS.
2018-08-20 22:15:18 -04:00
bunnei
5aaee2ff8d
Merge pull request #1106 from Subv/multiple_rendertargets
Shaders: Write all the enabled color outputs when a fragment shader exits.
2018-08-20 21:56:06 -04:00
bunnei
2ae88feea7 shader_bytecode: Replace some UNIMPLEMENTED logs. 2018-08-20 21:53:49 -04:00
bunnei
16db8b9d9f gl_shader_decompiler: Implement Texture3D for TEXS. 2018-08-20 21:53:18 -04:00
bunnei
948002635f gl_shader_decompiler: Implement TextureCube for TEX. 2018-08-20 21:53:00 -04:00
Subv
eac3cf301c Shaders: Fixed the coords in TEX with Texture2D.
The X and Y coordinates should be in gpr8 and gpr8+1, respectively.

This fixes the cutscene rendering in Sonic Mania.
2018-08-20 20:45:46 -05:00
Subv
fc5b489b0f Shaders: Log and crash when using an unimplemented texture type in a texture sampling instruction. 2018-08-20 20:44:56 -05:00
Subv
2b9eee4d1e GPU: Implemented the logic op functionality of the GPU.
This will ASSERT if blending is enabled at the same time as logic ops.
2018-08-20 18:44:47 -05:00
Subv
f24ab6d9e6 GLState: Allow enabling/disabling GL_COLOR_LOGIC_OP independently from blending. 2018-08-20 18:43:11 -05:00
Lioncash
46ef072cf9 rasterizer_interface: Remove ScreenInfo from AccelerateDraw()'s signature
This is an OpenGL renderer-specific data type. Given that, this type
shouldn't be used within the base interface for the rasterizer. Instead,
we can pass this information to the rasterizer via reference.
2018-08-20 19:43:05 -04:00
Subv
6bcdf37d4f GPU: Added registers for the logicop functionality. 2018-08-20 18:42:36 -05:00
Lioncash
bc16f7f3cc renderer_base: Make creation of the rasterizer, the responsibility of the renderers themselves
Given we use a base-class type within the renderer for the rasterizer
(RasterizerInterface), we want to allow renderers to perform more
complex initialization if they need to do such a thing. This makes it
important to reserve type information.

Given the OpenGL renderer is quite simple settings-wise, this is just a
simple shuffling of the initialization code. For something like Vulkan
however this might involve doing something like:

// Initialize and call rasterizer-specific function that requires
// the full type of the instance created.
auto raster = std::make_unique<VulkanRasterizer>(some, params);
raster->CallSomeVulkanRasterizerSpecificFunction();

// Assign to base class variable
rasterizer = std::move(raster)
2018-08-20 19:28:00 -04:00
fearlessTobi
ba8ff096fd Port #3353 from Citra 2018-08-21 01:14:06 +02:00
Subv
7784ce1854 Shaders: Write all the enabled color outputs when a fragment shader exits.
We were only writing to the first render target before.
Note that this is only the GLSL side of the implementation, supporting multiple render targets requires more changes in the OpenGL renderer.

Dual Source blending is not implemented and stuff that uses it might not work at all.
2018-08-20 17:31:25 -05:00
Subv
d7c68fbb12 Rasterizer: Reinterpret the raw texture bytes instead of blitting (and thus doing format conversion) to a new texture when a game requests an old texture address with a different format. 2018-08-20 15:20:35 -05:00
Subv
3fe77be392 Rasterizer: Don't attempt to copy over the old texture's data when doing a format reinterpretation if we're only going to clear the framebuffer. 2018-08-20 15:20:35 -05:00
bunnei
028d90eb79
Merge pull request #1104 from Subv/instanced_arrays
GLRasterizer: Implemented instanced vertex arrays.
2018-08-20 14:32:50 -04:00
bunnei
296e57fa0e
Merge pull request #1115 from Subv/texs_mask
Shaders/TEXS: Write to the correct output register when swizzling.
2018-08-20 14:31:33 -04:00
bunnei
b20ed93884
Merge pull request #1112 from Subv/sampler_types
Shaders: Use the correct shader type when sampling textures.
2018-08-20 14:30:45 -04:00
David Marcec
23d45715dc Implemented RGBA8_UINT
Needed by kirby
2018-08-20 22:26:54 +10:00
Subv
6cf719a4ab Shaders/TEXS: Fixed the component mask in the TEXS instruction.
Previously we could end up with a TEXS that didn't write any outputs, this was wrong.
2018-08-19 17:09:40 -05:00
bunnei
51ddb130c5
Merge pull request #1089 from Subv/neg_bits
Shaders: Corrected the 'abs' and 'neg' bit usage in the float arithmetic instructions.
2018-08-19 17:01:48 -04:00
bunnei
9b17486be6
Merge pull request #1105 from Subv/convert_neg
Shader: Remove an unneeded assert, the negate bit is implemented for conversion instructions.
2018-08-19 17:01:20 -04:00
bunnei
0a1d4fbc5c
Merge pull request #1113 from Subv/texs_mask
Shaders/TEXS: Fixed the component mask in the TEXS instruction.
2018-08-19 17:00:59 -04:00
Subv
f7edbcd7a3 Shaders/TEXS: Fixed the component mask in the TEXS instruction.
Previously we could end up with a TEXS that didn't write any outputs, this was wrong.
2018-08-19 14:00:12 -05:00
bunnei
b0eb580931
Merge pull request #1102 from ogniK5377/mirror-clamp-edge
Added WrapMode MirrorOnceClampToEdge
2018-08-19 13:59:41 -04:00
bunnei
85da529f15
Merge pull request #1101 from Subv/ssy_stack
Shaders: Implemented a stack for the SSY/SYNC instructions.
2018-08-19 13:58:45 -04:00
Subv
7fb406c3fc Shader: Implemented the TLD4 and TLD4S opcodes using GLSL's textureGather.
It is unknown how TLD4S determines the sampler type, more research is needed.
2018-08-19 12:57:58 -05:00
Subv
3ef4b3d4b4 Shader: Use the right sampler type in the TEX, TEXS and TLDS instructions.
Different sampler types have their parameters in different registers.
2018-08-19 12:57:54 -05:00
Subv
73b937b190 Shader: Added bitfields for the texture type of the various sampling instructions. 2018-08-19 12:57:51 -05:00
Subv
656758fd81 Shaders: Added decodings for TLD4 and TLD4S 2018-08-19 12:57:08 -05:00
bunnei
29d4f8c2dd
Merge pull request #1109 from Subv/ldg_decode
Shaders: Added decodings for  the LDG and STG instructions.
2018-08-19 13:31:19 -04:00
bunnei
9baf5de90c
Merge pull request #1108 from Subv/front_facing
Shaders: Implemented the gl_FrontFacing input attribute (attr 63).
2018-08-19 13:21:14 -04:00
Subv
1b92ae136f Shaders: Added decodings for the LDG and STG instructions. 2018-08-19 00:46:34 -05:00
Subv
731701a2d2 Shaders: Implemented the gl_FrontFacing input attribute (attr 63). 2018-08-19 00:14:34 -05:00
Subv
9b1c49a9cf Shader: Remove an unneeded assert, the negate bit is implemented for conversion instructions. 2018-08-18 14:48:05 -05:00
Subv
e0f66c1fbf GLRasterizer: Implemented instanced vertex arrays.
Before each draw call, for every enabled vertex array configured as instanced, we take the current instance id and divide it by its configured divisor, then we multiply that by the corresponding stride and increment the start address by the resulting amount. This way we can simulate the vertex array being incremented once per instance without actually using OpenGL's instancing functions.
2018-08-18 14:42:26 -05:00
Subv
8335b2f115 Shader: Implemented the predicate and mode arguments of LOP.
The mode can be used to set the predicate to true depending on the result of the logic operation. In some cases, this means discarding the result (writing it to register 0xFF (Zero)).

This is used by Super Mario Odyssey.
2018-08-18 14:36:37 -05:00
David Marcec
71cc482bbd Added WrapMode MirrorOnceClampToEdge
Used by splatoon 2
2018-08-19 02:26:50 +10:00
Subv
ff358d97e8 Shaders: Implemented a stack for the SSY/SYNC instructions.
The SSY instruction pushes an address into the stack, and the SYNC instruction pops it. The current stack depth is 20, we should figure out if this is enough or not.
2018-08-18 10:48:12 -05:00
Subv
2e95ba2e9c Shaders: Corrected the 'abs' and 'neg' bit usage in the float arithmetic instructions.
We should definitely audit our shader generator for more errors like this.
2018-08-18 10:22:42 -05:00
David Marcec
63dff47e22 Added predcondition GreaterThanWithNan 2018-08-18 17:49:59 +10:00
bunnei
504cff2b7a
Merge pull request #1096 from bunnei/supported-blits
gl_rasterizer_cache: Remove asserts for supported blits.
2018-08-17 22:41:53 -04:00
bunnei
e341d868ee gl_rasterizer_cache: Remove asserts for supported blits. 2018-08-17 00:10:08 -04:00
bunnei
da7226442f renderer_opengl: Treat OpenGL errors as critical. 2018-08-17 00:09:27 -04:00
bunnei
727136a9c9
Merge pull request #1019 from Subv/vertex_divisor
Rasterizer: Manually implemented instanced rendering.
2018-08-17 00:07:06 -04:00
bunnei
89c3d6a2a3 gl_rasterizer_cache: Treat Depth formats differently from DepthStencil. 2018-08-15 21:24:04 -04:00
Subv
91140f6c0a Shader/Conversion: Implemented the negate bit in F2F and I2I instructions. 2018-08-15 09:27:43 -05:00
Subv
38592a3b5e Shader/I2F: Implemented the negate I2F_C instruction variant. 2018-08-15 09:25:02 -05:00
Subv
40ecdda19e Shader/F2I: Implemented the negate bit in the I2F instruction 2018-08-15 09:18:55 -05:00
Subv
5ef447cc0e Shader/F2I: Implemented the F2I_C instruction variant. 2018-08-15 09:16:35 -05:00
Subv
11c221cc62 Shader/F2I: Implemented the negate bit in the F2I instruction. 2018-08-15 09:15:55 -05:00
bunnei
40f83fee6a
Merge pull request #1077 from bunnei/rgba16u
gl_rasterizer_cache: Add RGBA16U to PixelFormatFromTextureFormat.
2018-08-15 09:25:15 -04:00
bunnei
b1148d269d gl_rasterizer_cache: Cleanup some PixelFormat names and logging. 2018-08-14 23:31:45 -04:00
Subv
c5284efd4f Rasterizer: Implemented instanced rendering.
We keep track of the current instance and update an uniform in the shaders to let them know which instance they are.

Instanced vertex arrays are not yet implemented.
2018-08-14 22:25:07 -05:00
bunnei
8599e1e4fc gl_rasterizer_cache: Add RGBA16U to PixelFormatFromTextureFormat.
- Used by Breath of the Wild.
2018-08-14 23:18:34 -04:00
bunnei
3aad82b1a3
Merge pull request #1069 from bunnei/vtx-sz
maxwell_to_gl: Properly handle UnsignedInt/SignedInt sizes.
2018-08-14 23:14:44 -04:00
bunnei
2a42dea568
Merge pull request #1070 from bunnei/cbuf-sz
gl_rasterizer: Fix upload size for constant buffers.
2018-08-14 23:14:24 -04:00
bunnei
c8cd1785e6
Merge pull request #1071 from bunnei/fix-ldc
gl_shader_decompiler: Several fixes for indirect constant buffer loads.
2018-08-14 23:14:09 -04:00
bunnei
991eb4824c
Merge pull request #1068 from bunnei/g8r8s
gl_rasterizer_cache: Implement G8R8S format.
2018-08-14 23:13:43 -04:00
greggameplayer
6eda9ebbdb Implement Z16_UNORM in PixelFormatFromTextureFormat function
Require by Zelda Breath Of The Wild
2018-08-15 04:14:15 +02:00
bunnei
5e66a24423 gl_shader_decompiler: Several fixes for indirect constant buffer loads. 2018-08-14 20:47:50 -04:00
bunnei
290439a6a5 gl_rasterizer: Fix upload size for constant buffers. 2018-08-14 20:44:19 -04:00
bunnei
dc876fd63a maxwell_to_gl: Properly handle UnsignedInt/SignedInt sizes. 2018-08-14 20:43:02 -04:00
bunnei
d8fd3ef4fe gl_rasterizer_cache: Implement G8R8S format.
- Used by Super Mario Odyssey.
2018-08-14 20:41:49 -04:00
bunnei
4dacb8a4b1
Merge pull request #1058 from greggameplayer/BC7U_Fix
Fix BC7U
2018-08-14 08:03:07 -04:00
greggameplayer
6bfcf13187 Fix BC7U 2018-08-14 02:36:00 +02:00
bunnei
6e52f37d5b renderer_opengl: Implement RenderTargetFormat::RGBA16_UNORM.
- Used by Breath of the Wild.
2018-08-13 18:20:07 -04:00
bunnei
46fbf6dd92
Merge pull request #1052 from ogniK5377/xeno
Implement RG32UI and R32UI
2018-08-13 12:31:39 -04:00
David Marcec
45cc022ea9 Implement RG32UI and R32UI
Needed for xenoblade
2018-08-13 22:55:16 +10:00
bunnei
41b77c4e0a maxwell_to_gl: Implement VertexAttribute::Size::Size_8.
- Used by Breath of the Wild.
2018-08-13 01:34:21 -04:00
bunnei
bdf17fe0cc renderer_opengl: Implement RenderTargetFormat::RGBA16_UINT.
- Used by Breath of the Wild.
2018-08-13 00:06:22 -04:00
bunnei
54ef9302a2
Merge pull request #1045 from bunnei/rg8-unorm
renderer_opengl: Implement RenderTargetFormat::RG8_UNORM.
2018-08-13 00:05:25 -04:00
bunnei
8fe118bcaa maxwell_to_gl: Implement PrimitiveTopology::LineStrip.
- Used by Breath of the Wild.
2018-08-12 23:09:32 -04:00
bunnei
c56a0e3c34 renderer_opengl: Implement RenderTargetFormat::RG8_UNORM.
- Used by Breath of the Wild.
2018-08-12 23:08:50 -04:00
bunnei
534abf9d97 gl_shader_decompiler: Implement XMAD instruction. 2018-08-12 18:30:24 -04:00
Markus Wick
0eb39922f6 gl_rasterizer: Use a shared helper to upload from CPU memory. 2018-08-12 16:10:26 +02:00
Markus Wick
0af7e93763 gl_state: Don't track constant buffer mappings. 2018-08-12 16:10:26 +02:00
Markus Wick
6ff7906ddc gl_rasterizer: Use the stream buffer for constant buffers. 2018-08-12 16:10:26 +02:00
Markus Wick
ce722e317b gl_rasterizer: Use the streaming buffer itself for the constant buffer.
Don't emut copies, especially not for data, which is used once. They just end in a huge GPU overhead.
2018-08-12 15:48:59 +02:00
Markus Wick
6f6bba3ff1 gl_rasterizer: Use a helper for aligning the buffer. 2018-08-12 15:47:35 +02:00
Markus Wick
d7298ec262 Update the stream_buffer helper from Citra.
Please see https://github.com/citra-emu/citra/pull/3666 for more details.
2018-08-12 15:47:35 +02:00
bunnei
639ebb39f6 gl_shader_decompiler: Fix SetOutputAttributeToRegister empty check. 2018-08-12 02:22:42 -04:00
bunnei
c68aa65226 gl_shader_decompiler: Fix GLSL compiler error with KIL instruction. 2018-08-12 00:06:48 -04:00
bunnei
ee07041b3a
Merge pull request #1020 from lioncash/namespace
core: Namespace EmuWindow
2018-08-11 22:40:08 -04:00
bunnei
9c977d2215
Merge pull request #1021 from lioncash/warn
gl_rasterizer: Silence implicit truncation warning in SetupShaders()
2018-08-11 22:39:46 -04:00
bunnei
f2c7b5dcd6
Merge pull request #1024 from Subv/blend_gl
GPU/Maxwell3D: Implemented an alternative set of blend factors.
2018-08-11 22:39:02 -04:00
bunnei
d37da52cb3
Merge pull request #1023 from Subv/invalid_attribs
RasterizerGL: Ignore invalid/unset vertex attributes.
2018-08-11 22:18:40 -04:00
Subv
969326bd58 GPU/Maxwell3D: Implemented an alternative set of blend factors.
These are used by nouveau and some games like SMO.
2018-08-11 20:57:16 -05:00
greggameplayer
224071a652 Implement R8_UINT RenderTargetFormat & PixelFormat (#1014)
- Used by Go Vacation
2018-08-11 21:44:42 -04:00
Subv
2dad1204e8 RasterizerGL: Ignore invalid/unset vertex attributes.
This should make the es2gears example not crash anymore.
2018-08-11 20:36:40 -05:00
Lioncash
28e90fa0e0 gl_rasterizer: Silence implicit truncation warning in SetupShaders()
Previously this would warn of truncating a std::size_t to a u32. This is
safe because we'll obviously never have more than UINT32_MAX amount of
uniform buffers.
2018-08-11 20:32:03 -04:00
Lioncash
0a93b45b6a core: Namespace EmuWindow
Gets the class out of the global namespace.
2018-08-11 20:20:21 -04:00
bunnei
403dfd68fc
Merge pull request #1010 from bunnei/unk-vert-attrib-shader
gl_shader_decompiler: Improve handling of unknown input/output attributes.
2018-08-11 19:56:28 -04:00
bunnei
c519354506
Merge pull request #1009 from bunnei/rg8-rgba8-snorm
Implement render target formats RGBA8_SNORM and RG8_SNORM.
2018-08-11 19:55:41 -04:00
bunnei
0b668d5ff3 gl_shader_decompiler: Improve handling of unknown input/output attributes. 2018-08-11 19:26:45 -04:00
bunnei
670a2c1f80
Merge pull request #1018 from Subv/ssy_sync
GPU/Shader: Implemented SSY and SYNC as a set_target/jump pair.
2018-08-11 19:10:02 -04:00
bunnei
88ffa422d4 gl_rasterizer: Implement render target format RG8_SNORM.
- Used by Super Mario Odyssey.
2018-08-11 19:06:42 -04:00
bunnei
0471976b48 gl_rasterizer: Implement render target format RGBA8_SNORM.
- Used by Super Mario Odyssey.
2018-08-11 18:59:14 -04:00
Subv
c1ad973881 GPU/Shader: Don't predicate instructions that don't have a predicate field (SSY). 2018-08-11 16:00:14 -05:00
Subv
305a05f820 GPU/Shaders: Implemented SSY and SYNC as a way to modify control flow during shader execution.
SSY sets the target label to jump to when the SYNC instruction is executed.
2018-08-11 15:55:11 -05:00
bunnei
d64303d185
Merge pull request #1016 from lioncash/video
video_core: Get rid of global variable g_toggle_framelimit_enabled
2018-08-11 14:10:55 -04:00
bunnei
b8b9f41b6b
Merge pull request #1003 from lioncash/var
video_core: Use variable template variants of type_traits interfaces where applicable
2018-08-11 14:08:12 -04:00
greggameplayer
dfcde52f39 Implement R16S & R16UI & R16I RenderTargetFormats & PixelFormats and more (R16_UNORM needed by Fate Extella) (#848)
* Implement R16S & R16UI & R16I RenderTargetFormats & PixelFormats


Do a separate function in order to get Bytes Per Pixel of DepthFormat


Apply the new function in gpu.h


delete unneeded white space

* correct merging error
2018-08-11 14:01:50 -04:00
Lioncash
20c2928c2b video_core; Get rid of global g_toggle_framelimit_enabled variable
Instead, we make a struct for renderer settings and allow the renderer
to update all of these settings, getting rid of the need for
global-scoped variables.

This also uncovered a few indirect inclusions for certain headers, which
this commit also fixes.
2018-08-10 19:00:09 -04:00
Lioncash
f380496728 renderer_base: Remove unused kFramebuffer enumeration
This is entirely unused and can be removed.
2018-08-10 18:31:13 -04:00
Lioncash
2e80e7480d video_core: Remove unused Renderer enumeration
Currently we only have an OpenGL renderer, so this is unused in code
(and occupies the Renderer identifier in the VideoCore namespace).
2018-08-10 18:27:40 -04:00
bunnei
6b0bc48a42 maxwell_to_gl: Implement VertexAttribute::Size::Size_8_8.
- Used by Super Mario Odyssey.
2018-08-10 12:47:00 -04:00
bunnei
a5b65df9cf maxwell_to_gl: Implement VertexAttribute::Size::Size_32_32_32.
- Used by Super Mario Odyssey.
2018-08-10 12:46:49 -04:00
bunnei
57626fda7b
Merge pull request #1004 from lioncash/unused
gl_rasterizer_cache: Remove unused viewport parameter of GetFramebufferSurfaces()
2018-08-10 12:13:32 -04:00
bunnei
6313d54cef
Merge pull request #1008 from yuzu-emu/revert-697-disable-depth-cull
Revert "gl_state: Temporarily disable culling and depth test."
2018-08-10 12:13:09 -04:00
bunnei
2156cb3cbe
Revert "gl_state: Temporarily disable culling and depth test." 2018-08-10 10:39:46 -04:00
Lioncash
0e1510ac29 gl_rasterizer_cache: Remove unused viewport parameter of GetFramebufferSurfaces() 2018-08-09 20:55:41 -04:00
Lioncash
b8c43b6080 video_core: Use variable template variants of type_traits interfaces where applicable 2018-08-09 20:45:48 -04:00
bunnei
3a67876252 textures: Refactor out for Texture/Depth FormatFromPixelFormat. 2018-08-09 20:36:03 -04:00
bunnei
6828c25498
Merge pull request #995 from bunnei/gl-buff-bounds
gl_rasterizer_cache: Add bounds checking for gl_buffer copies.
2018-08-09 20:23:30 -04:00
bunnei
05c33d89a1
Merge pull request #1001 from lioncash/reserve
gl_shader_decompiler: Reserve element memory beforehand in BuildRegisterList()
2018-08-09 19:27:35 -04:00
bunnei
e8c52d4c89 gl_rasterizer_cache: Add bounds checking for gl_buffer copies. 2018-08-09 19:20:17 -04:00
bunnei
37e1ed3744
Merge pull request #991 from bunnei/ignore-mac
maxwell_3d: Ignore macros that have not been uploaded yet.
2018-08-09 19:16:28 -04:00
Khangaroo
75e12a33ae Implement SNORM for BC5/DXN2 (#998)
* Implement BC5/DXN2 (#996)

- Used by Kirby Star Allies.

* Implement BC5/DXN2 SNORM

UNORM for Kirby Star Allies
SNORM for Super Mario Odyssey
2018-08-09 19:15:32 -04:00
Lioncash
6ef027b958 gl_shader_decompiler: Reserve element memory beforehand in BuildRegisterList()
Avoids potentially perfoming multiple reallocations when we know the
total amount of memory we need beforehand.
2018-08-09 17:29:11 -04:00
Lioncash
59ea37daa7 gl_rasterizer_cache: Avoid iterator invalidation issues within InvalidateRegion()
A range-based for loop can't be used when the container being iterated
is also being erased from.
2018-08-09 15:30:20 -04:00
bunnei
0bfe974281
Merge pull request #992 from bunnei/declr-pred
gl_shader_decompiler: Declare predicates on use.
2018-08-09 14:36:52 -04:00
bunnei
88b18b9ba4
Merge pull request #994 from lioncash/const
gl_rasterizer_cache: Use std::vector::assign vs resize() then copy for the non-tiled case
2018-08-09 14:36:06 -04:00
bunnei
b125137493
Merge pull request #993 from bunnei/smo-vtx-pts
Implement VertexAttribute::Size::Size_16_16_16_16 and PrimitiveTopology::Points.
2018-08-09 13:28:14 -04:00
bunnei
f765a6b902
Merge pull request #984 from bunnei/rt-none
gl_rasterizer: Do not render when no render target is configured.
2018-08-09 13:12:28 -04:00
Khangaroo
5cb6eceecf Implement BC5/DXN2 (#996)
- Used by Kirby Star Allies.
2018-08-09 12:57:13 -04:00
bunnei
c333bfc193
Merge pull request #977 from bunnei/bgr565
gl_rasterizer_cached: Implement RenderTargetFormat::B5G6R5_UNORM.
2018-08-08 23:43:04 -04:00
Lioncash
e831b80d69 gl_rasterizer_cache: Invert conditional in LoadGLBuffer()
It's generally easier to follow code using conditionals that operate in
terms of the true case followed by the false case (no chance of
overlooking the exclamation mark).
2018-08-08 23:34:57 -04:00
Lioncash
434f352eb3 gl_rasterizer_cache: Use std::vector::assign in LoadGLBuffer() for the non-tiled case
resize() causes the vector to expand and zero out the added members to
the vector, however we can avoid this zeroing by using assign().

Given we have the pointer to the data we want to copy, we can calculate
the end pointer and directly copy the range of data without the
need to perform the resize() beforehand.
2018-08-08 23:34:58 -04:00
bunnei
dfc3eed0cb maxwell_to_gl: Implement VertexAttribute::Size::Size_16_16_16_16.
- Used by Super Mario Odyssey (in game).
2018-08-08 23:28:17 -04:00
bunnei
06d0b96ca9 maxwell_to_gl: Implement PrimitiveTopology::Points.
- Used by Super Mario Odyssey (in game).
2018-08-08 23:28:00 -04:00
bunnei
4283019aa0 gl_shader_decompiler: Declare predicates on use.
- Used by Super Mario Odyssey (when going in game).
2018-08-08 23:26:31 -04:00
bunnei
efe6b473c5 maxwell_3d: Ignore macros that have not been uploaded yet.
- Used by Super Mario Odyssey (in game).
2018-08-08 23:25:37 -04:00
Lioncash
557c466994 gl_rasterizer_cache: Make pointer const in LoadGLBuffer()
This is only ever read from, so we can make the data it's pointing to
const.
2018-08-08 23:14:57 -04:00
bunnei
25ba4d1b68
Merge pull request #982 from bunnei/stub-unk-63
gl_shader_decompiler: Stub input attribute Unknown_63.
2018-08-08 22:28:18 -04:00
bunnei
ddec200290 gl_rasterizer: Do not render when no render target is configured.
- Used by Super Mario Odyssey.
2018-08-08 19:29:45 -04:00
bunnei
cf917a5e93
Merge pull request #976 from bunnei/shader-imm
gl_shader_decompiler: Let OpenGL interpret floats.
2018-08-08 19:17:01 -04:00
bunnei
9ceceb212f
Merge pull request #981 from bunnei/cbuf-corrupt
maxwell_3d: Use correct const buffer size and check bounds.
2018-08-08 19:16:34 -04:00
bunnei
cc2526dd51
Merge pull request #985 from bunnei/rt-r11g11b10
gpu: Add R11G11B10_FLOAT to RenderTargetBytesPerPixel.
2018-08-08 18:21:34 -04:00
bunnei
096b04f1a4
Merge pull request #979 from bunnei/vtx88
maxwell_to_gl: Implement VertexAttribute::Size::Size_8_8.
2018-08-08 18:18:50 -04:00
bunnei
7bf422d58c gpu: Add R11G11B10_FLOAT to RenderTargetBytesPerPixel.
- Used by Super Mario Odyssey.
2018-08-08 02:42:14 -04:00
bunnei
7f0d0a93f7 gl_shader_decompiler: Stub input attribute Unknown_63. 2018-08-08 02:35:59 -04:00
bunnei
57982df105 maxwell_3d: Use correct const buffer size and check bounds.
- Fixes mem corruption with Super Mario Odyssey and Pokkén Tournament DX.
2018-08-08 02:10:25 -04:00
bunnei
8c6338b6f9 renderer_opengl: Use trace log in a few places. 2018-08-08 01:53:23 -04:00
bunnei
c120ed7d18 maxwell_to_gl: Implement VertexAttribute::Size::Size_8_8. 2018-08-08 01:50:53 -04:00
bunnei
aaf8d9ac2f gl_rasterizer_cached: Implement RenderTargetFormat::B5G6R5_UNORM.
- Used by Super Mario Odyssey.
2018-08-08 01:48:27 -04:00
bunnei
e542356d0c gl_shader_decompiler: Let OpenGL interpret floats.
- Accuracy is lost in translation to string, e.g. with NaN.
- Needed for Super Mario Odyssey.
2018-08-08 01:45:23 -04:00
bunnei
4fa3511a63
Merge pull request #964 from Hexagon12/lower-logs
Lowered down the logging for command processor methods
2018-08-07 19:00:19 -04:00
Hexagon12
7139f05fc5 Fixed the sRGB pixel format (#963)
* Changed the sRGB pixel format return

* Add a message about SRGBA -> RGBA conversion
2018-08-07 18:59:50 -04:00
Hexagon12
bc6d91a103 Lowered down the logging for methods 2018-08-07 19:51:40 +03:00
bunnei
904d7eaa94 maxwell_3d: Remove outdated assert. 2018-08-05 23:57:19 -04:00
bunnei
57eb936200 gl_rasterizer_cache: Avoid superfluous surface copies. 2018-08-05 23:40:03 -04:00
bunnei
c8e5c74092
Merge pull request #927 from bunnei/fix-texs
gl_shader_decompiler: Fix TEXS mask and dest.
2018-08-05 16:42:21 -04:00
bunnei
c0af42d6eb
Merge pull request #912 from lioncash/global-var
video_core: Eliminate the g_renderer global variable
2018-08-05 16:37:39 -04:00
bunnei
fd715e54a1 gl_shader_decompiler: Fix TEXS mask and dest. 2018-08-05 01:47:09 -04:00
David Marcec
b96010bfa9 added braces for conditions 2018-08-05 11:36:55 +10:00
David Marcec
6d1e30e041 fix the attrib format for ints 2018-08-05 11:29:21 +10:00
bunnei
13d6593753
Merge pull request #919 from lioncash/sign
gl_shader_manager: Amend sign differences in an assertion comparison in SetShaderUniformBlockBinding()
2018-08-04 14:29:59 -04:00
Lioncash
3b678b9e8e gl_shader_manager: Invert conditional in SetShaderUniformBlockBinding()
This lets us indent the majority of the code and places the error case
first.
2018-08-04 02:57:11 -04:00
Lioncash
dde5dce736 gl_shader_manager: Amend sign differences in an assertion comparison in SetShaderUniformBlockBinding()
Ensures both operands have the same sign in the comparison.

While we're at it, we can get rid of the redundant casting of ub_size to
an int. This type will always be trivial and alias a built-in type (not
doing so would break backwards compatibility at a standard level).
2018-08-04 02:55:03 -04:00
Lioncash
2665457f4a renderer_base: Make Rasterizer() return the rasterizer by reference
All calling code assumes that the rasterizer will be in a valid state,
which is a totally fine assumption. The only way the rasterizer wouldn't
be is if initialization is done incorrectly or fails, which is checked
against in System::Init().
2018-08-04 02:36:58 -04:00
Lioncash
6030c5ce41 video_core: Eliminate the g_renderer global variable
We move the initialization of the renderer to the core class, while
keeping the creation of it and any other specifics in video_core. This
way we can ensure that the renderer is initialized and doesn't give
unfettered access to the renderer. This also makes dependencies on types
more explicit.

For example, the GPU class doesn't need to depend on the
existence of a renderer, it only needs to care about whether or not it
has a rasterizer, but since it was accessing the global variable, it was
also making the renderer a part of its dependency chain. By adjusting
the interface, we can get rid of this dependency.
2018-08-04 02:36:57 -04:00
bunnei
762fcaf5de
Merge pull request #911 from lioncash/prototype
video_core: Remove unimplemented Start() function prototype
2018-08-04 02:18:38 -04:00
bunnei
29f31356d8
Merge pull request #910 from lioncash/unused
gl_shader_decompiler: Remove unused variable in GenerateDeclarations()
2018-08-03 15:54:11 -04:00
Lioncash
b4e050e6c4 video_core: Remove unimplemented Start() function prototype
Given this has no definition, we can just remove it entirely.
2018-08-03 12:48:14 -04:00
Lioncash
b45e5c2399 gl_shader_decompiler: Remove unused variable in GenerateDeclarations()
This variable was being incremented, but we were never actually using
it.
2018-08-03 12:18:31 -04:00
Lioncash
555d76d065 gl_shader_manager: Make ProgramManager's GetCurrentProgramStage() a const member function
This function doesn't modify class state, so it can be made const.
2018-08-03 12:08:17 -04:00
bunnei
00ba704a7f
Merge pull request #892 from lioncash/global
video_core: Make global EmuWindow instance part of the base renderer …
2018-08-03 00:31:32 -04:00
bunnei
52da0ce399
Merge pull request #901 from lioncash/ref
gl_shader_manager: Take ShaderSetup instances by const reference in UseProgrammableVertexShader() and UseProgrammableFragmentShader()
2018-08-02 23:00:56 -04:00
bunnei
bae1822aed
Merge pull request #902 from lioncash/array
gl_state: Make texture_units a std::array
2018-08-02 14:57:42 -04:00
greggameplayer
fe64e1d38e Implement RGB32F PixelFormat (#886) (used by Go Vacation) 2018-08-02 14:56:38 -04:00
Lioncash
6b32e24161 gl_state: Make texture_units a std::array
Gets rid of the use of a raw C array.
2018-08-02 11:19:58 -04:00
Lioncash
d92e8ab062 gl_shader_manager: Take ShaderSetup instances by const reference in UseProgrammableVertexShader() and UseProgrammableFragmentShader()
Avoids performing unnecessary copies of 65560 byte sized ShaderSetup
instances, considering it's only used as part of lookup and not
modified.

Given the parameters were already const, it's likely taking these
parameters by reference was intended but the ampersand was forgotten.
2018-08-02 11:09:46 -04:00
Lioncash
0f2ac928f2 video_core: Make global EmuWindow instance part of the base renderer class
Makes the global a member of the RendererBase class. We also change this
to be a reference. Passing any form of null pointer to these functions
is incorrect entirely, especially given the code itself assumes that the
pointer would always be in a valid state.

This also makes it easier to follow the lifecycle of instances being
used, as we explicitly interact the renderer with the rasterizer, rather
than it just operating on a global pointer.
2018-08-01 21:40:30 -04:00
Unknown
0d8fcab136 Implement R32_FLOAT RenderTargetFormat 2018-08-01 15:31:42 +02:00
bunnei
3575c076cb
Merge pull request #869 from Subv/ubsan
Corrected a few error cases detected by asan/ubsan
2018-07-31 09:24:13 -07:00
Subv
8191273a3d MacroInterpreter: Avoid left shifting negative values.
The branch target is signed, so multiply by 4 instead of left shifting by 2
2018-07-30 20:38:24 -05:00
bunnei
e013fdc2b2
Merge pull request #808 from lioncash/mem-dedup
video_core/memory_manager: Avoid repeated unnecessary page slot lookups
2018-07-26 11:50:27 -07:00
Subv
f85cff0f48 GPU: Allow using R16F as a render target format. 2018-07-26 08:52:21 -05:00
Unknown
4672a01cbf Implement R16_G16
correct trailing white spaces


Delete tabs


correct placement
Add RG16F & RG16UI & RG16I & RG16S PixelFormats
Return correct data according to changes done previously
correct PixelFormat declaration
correct coding style error
correct coding style error part 2
correct RG16S Declaration error
correct alignment
2018-07-26 02:01:29 +02:00
bunnei
c88382517c
Merge pull request #819 from Subv/srgb
GPU: Use the right texture format for sRGBA framebuffers.
2018-07-25 14:47:26 -07:00
Subv
c5b838aeef GPU: Use the right texture format for sRGBA framebuffers. 2018-07-25 09:52:39 -05:00
Subv
ee8123bf13 GPU: Allow the use of Z24S8 as a texture format. 2018-07-25 09:41:24 -05:00
bunnei
0686183c3e
Merge pull request #816 from Subv/z32_s8
GPU: Implemented the Z32_S8_X24 depth buffer format.
2018-07-25 07:37:00 -07:00
bunnei
af787744ab
Merge pull request #815 from Subv/z32f_tex
GPU: Allow using Z32 as a texture format.
2018-07-25 07:33:09 -07:00
bunnei
704824d50a
Merge pull request #814 from Subv/rt_r8
GPU: Allow the usage of R8 as a render target format.
2018-07-25 07:32:18 -07:00
bunnei
a6ea6febc9
Merge pull request #809 from lioncash/rasterizer
gl_rasterizer: Minor cleanup
2018-07-24 19:31:34 -07:00
bunnei
e0106a7d68
Merge pull request #811 from Subv/code_address_assert
GPU: Remove the assert that required the CODE_ADDRESS to be 0.
2018-07-24 19:31:09 -07:00
Subv
daf2504d31 GPU: Implemented the Z32_S8_X24 depth buffer format. 2018-07-24 20:41:40 -05:00
Subv
f747a7e35d GPU: Allow using Z32 as a texture format. 2018-07-24 19:54:23 -05:00
Subv
4f574201ea GPU: Allow the usage of R8 as a render target format. 2018-07-24 19:49:36 -05:00
Subv
8f2c4191ab GPU: Remove the assert that required the CODE_ADDRESS to be 0.
Games usually just leave it at 0 but nouveau sets it to something else. This already works fine, the assert is useless.
2018-07-24 13:54:12 -05:00
Subv
4cc1e180ec GPU: Implemented the R16 and R16F texture formats. 2018-07-24 13:39:16 -05:00
Lioncash
0162f8b3a7 gl_rasterizer: Replace magic number with GL_INVALID_INDEX in SetupConstBuffers()
This is just the named constant that OpenGL provides, so we can use that
instead of using a literal -1
2018-07-24 12:24:49 -04:00
Lioncash
16139ed53b gl_rasterizer: Use std::string_view instead of std::string when checking for extensions
We can avoid heap allocations here by just using a std::string_view
instead of performing unnecessary copying of the string data.
2018-07-24 12:10:37 -04:00
Lioncash
b5eb3905cd gl_rasterizer: Use in-class member initializers where applicable
We can just assign to the members directly in these cases.
2018-07-24 12:08:12 -04:00
Lioncash
bf608f125e video_core/memory_manager: Replace a loop with std::array's fill() function in PageSlot()
We already have a function that does what this code was doing, so let's
use that instead.
2018-07-24 11:56:30 -04:00
Lioncash
d71e19fd75 video_core/memory_manager: Avoid repeated unnecessary page slot lookups
We don't need to keep calling the same function over and over again in a
loop, especially when the behavior is slightly non-trivial. We can just
keep a reference to the looked up location and do all the checking and
assignments based off it instead.
2018-07-24 11:19:54 -04:00
bunnei
0f830d08f1
Merge pull request #799 from Subv/tex_r32f
GPU: Implement texture format R32F.
2018-07-24 04:46:07 -07:00
bunnei
b70f757913
Merge pull request #796 from bunnei/gl-uint
maxwell_to_gl: Implement VertexAttribute::Type::UnsignedInt.
2018-07-24 04:44:56 -07:00
bunnei
69c45ce71c gl_rasterizer: Implement texture border color. 2018-07-23 23:34:42 -04:00
bunnei
6b3e54621f maxwell_to_gl: Implement Texture::WrapMode::Border. 2018-07-23 23:34:41 -04:00
Subv
ccc42702b5 GPU: Implement texture format R32F. 2018-07-23 22:21:31 -05:00
bunnei
1ff3bea6c7
Merge pull request #791 from bunnei/rg32f-rgba32f-bgra8
gl_rasterizer_cache: Implement formats BGRA8_UNORM/RGBA32_FLOAT/RG32_FLOAT
2018-07-23 20:13:19 -07:00
bunnei
2ff86f5765 maxwell_to_gl: Implement VertexAttribute::Type::UnsignedInt. 2018-07-23 23:12:14 -04:00
bunnei
92304181d5
Merge pull request #792 from lioncash/retval
gl_shader_decompiler: Correct return value of WriteTexsInstruction()
2018-07-23 20:06:48 -07:00
bunnei
47ac369180
Merge pull request #790 from bunnei/shader-print-instr
gl_shader_decompiler: Print instruction value in shader comments.
2018-07-23 19:48:47 -07:00
bunnei
c2b4ff5d48
Merge pull request #788 from bunnei/shader-check-zero
gl_shader_decompiler: Check if SetRegister result is ZeroIndex.
2018-07-23 19:44:05 -07:00
Lioncash
33e2033af5 gl_shader_decompiler: Correct return value of WriteTexsInstruction()
This should be returning void, not a std::string
2018-07-23 22:31:58 -04:00
bunnei
9505283989 gl_shader_decompiler: Implement shader instruction TLDS. 2018-07-23 22:02:51 -04:00
bunnei
a27c0099ed gl_rasterizer_cache: Implement RenderTargetFormat RG32_FLOAT. 2018-07-23 21:22:54 -04:00
bunnei
3a19c1098d gl_rasterizer_cache: Implement RenderTargetFormat RGBA32_FLOAT. 2018-07-23 21:22:53 -04:00
bunnei
bcc184acfa gl_rasterizer_cache: Implement RenderTargetFormat BGRA8_UNORM. 2018-07-23 21:22:44 -04:00
bunnei
89db8c2171 gl_rasterizer_cache: Add missing log statements. 2018-07-23 21:20:09 -04:00
bunnei
c4322ce87e gl_shader_decompiler: Print instruction value in shader comments. 2018-07-23 21:11:05 -04:00
bunnei
81aa02424b gl_shader_decompiler: Check if SetRegister result is ZeroIndex. 2018-07-23 21:08:40 -04:00
Lioncash
3b88ce3dcb gl_shader_decompiler: Simplify GetCommonDeclarations() 2018-07-23 17:11:18 -04:00
bunnei
e85a528bb9
Merge pull request #769 from bunnei/shader-mask-fixes
shader_bytecode: Implement other TEXS masks.
2018-07-22 18:03:31 -07:00
Lioncash
0797657bc0 gl_shader_decompiler: Remove redundant Subroutine construction in AddSubroutine()
We don't need to toss away the Subroutine instance after the find() call
and reconstruct another instance with the same data right after it.
Particularly give Subroutine contains a std::set.
2018-07-22 03:30:35 -04:00
bunnei
148a5bef7e shader_bytecode: Implement other TEXS masks. 2018-07-22 03:23:15 -04:00
bunnei
af4bde8cd1
Merge pull request #767 from bunnei/shader-cleanup
gl_shader_decompiler: Remove unused state tracking and minor cleanup.
2018-07-22 00:03:17 -07:00
bunnei
f5a2944ab6 gl_shader_decompiler: Remove unused state tracking and minor cleanup. 2018-07-22 01:00:44 -04:00
bunnei
c43eaa94f3 gl_shader_decompiler: Implement SEL instruction. 2018-07-22 00:37:12 -04:00
bunnei
63fbf9a7d3 gl_rasterizer_cache: Blit surfaces on recreation instead of flush and load. 2018-07-21 21:51:06 -04:00
bunnei
4301f0b539 gl_rasterizer_cache: Use GPUVAddr as cache key, not parameter set. 2018-07-21 21:51:06 -04:00
bunnei
cd47391c2d gl_rasterizer_cache: Use zeta_width and zeta_height registers for depth buffer. 2018-07-21 21:51:06 -04:00
bunnei
d8c60029d6 gl_rasterizer: Use zeta_enable register to enable depth buffer. 2018-07-21 21:51:06 -04:00
bunnei
5287991a36 maxwell_3d: Add depth buffer enable, width, and height registers. 2018-07-21 21:51:05 -04:00
bunnei
3ac736c003
Merge pull request #748 from lioncash/namespace
video_core: Use nested namespaces where applicable
2018-07-21 18:50:14 -07:00
bunnei
ff8754f921
Merge pull request #747 from lioncash/unimplemented
gl_shader_manager: Remove unimplemented function prototype
2018-07-21 10:54:58 -07:00
Lioncash
d5bc9aef4e gl_shader_manager: Replace unimplemented function prototype
This was just a linker error waiting to happen.
2018-07-20 18:39:54 -04:00
Lioncash
863579736c gpu: Rename Get3DEngine() to Maxwell3D()
This makes it match its const qualified equivalent.
2018-07-20 18:34:49 -04:00
Lioncash
bb960c8cb4 video_core: Use nested namespaces where applicable
Compresses a few namespace specifiers to be more compact.
2018-07-20 18:23:54 -04:00
bunnei
29f49bd3c1
Merge pull request #738 from lioncash/sign
gl_state: Get rid of mismatched sign conversions in Apply()
2018-07-20 09:21:57 -07:00
bunnei
fbc2bcd4a9
Merge pull request #735 from lioncash/video-unused
maxwell_3d: Remove unused variable within GetStageTextures()
2018-07-20 09:16:15 -07:00
bunnei
204d707ce7
Merge pull request #731 from lioncash/shadow
gl_shader_decompiler: Eliminate variable and declaration shadowing
2018-07-20 09:13:36 -07:00
Lioncash
0faa13baeb gl_state: Make references const where applicable in Apply() 2018-07-20 01:12:29 -04:00
Lioncash
e6b3d3a9ea gl_state: Get rid of mismatched sign conversions
While we're at it, amend the loop variable type to be the same width as
that returned by the .size() call.
2018-07-20 01:11:20 -04:00
Lioncash
8b08f82dc7 maxwell_3d: Remove unused variable within GetStageTextures() 2018-07-19 22:38:28 -04:00
Lioncash
f26866ff6a gl_shader_decompiler: Eliminate variable and declaration shadowing
Ensures that no identifiers are being hidden, which also reduces
compiler warnings.
2018-07-19 20:32:49 -04:00
Lioncash
c2121cb059 gl_shader_decompiler: Remove unnecessary const from return values
This adds nothing from a behavioral point of view, and can inhibit the
move constructor/RVO
2018-07-19 20:11:04 -04:00
bunnei
cf30c4be22 gl_state: Temporarily disable culling and depth test. 2018-07-18 23:21:43 -04:00
bunnei
49b0966003
Merge pull request #687 from lioncash/instance
core: Don't construct instance of Core::System, just to access its live instance
2018-07-18 18:55:58 -07:00
bunnei
b496a9eefe decoders: Fix calc of swizzle image_width_in_gobs. 2018-07-18 21:42:52 -04:00
Lioncash
3a4841e403 core: Don't construct instance of Core::System, just to access its live instance
This would result in a lot of allocations and related object
construction, just to toss it all away immediately after the call.

These are definitely not intentional, and it was intended that all of
these should have been accessing the static function GetInstance()
through the name itself, not constructed instances.
2018-07-18 18:18:27 -04:00
bunnei
b87a71b3fe
Merge pull request #678 from lioncash/astc
astc: Minor changes
2018-07-17 22:06:20 -07:00
Lioncash
6a03badcbc astc: Initialize vector size directly in Decompress
There's no need to perform a separate resize.
2018-07-17 23:58:14 -04:00
Lioncash
0f148548f3 astc: Mark functions as internally linked where applicable 2018-07-17 23:58:14 -04:00
Lioncash
c5803e30d3 astc: const-correctness changes where applicable
A few member functions didn't actually modify class state, so these can
be amended as necessary.
2018-07-17 23:58:14 -04:00
Lioncash
e3fadb9616 astc: Delete Bits' copy contstructor and assignment operator
This also potentially avoids warnings, considering the copy assignment
operator is supposed to have a return value.
2018-07-17 23:58:14 -04:00
Lioncash
4cd52a34b9 astc: In-class initialize member variables where appropriate 2018-07-17 23:58:10 -04:00
bunnei
c3dd456d51 vi: Partially implement buffer crop parameters. 2018-07-17 20:13:17 -04:00
Subv
3d3b10adc7 GPU: Added register definitions for the stencil parameters. 2018-07-17 15:00:21 -05:00
bunnei
3a96670f2d gl_rasterizer_cache: Implement texture format G8R8. 2018-07-15 01:33:42 -04:00
bunnei
aaec0b7e70
Merge pull request #665 from bunnei/fix-z24-s8
gl_rasterizer_cache: Fix incorrect offset in ConvertS8Z24ToZ24S8.
2018-07-14 22:18:55 -07:00
bunnei
3145114190 gl_rasterizer_cache: Fix incorrect offset in ConvertS8Z24ToZ24S8. 2018-07-15 00:02:05 -04:00
bunnei
e21190f47f gl_rasterizer_cache: Implement depth format Z16_UNORM. 2018-07-14 23:43:28 -04:00
bunnei
2cb3fdca86
Merge pull request #598 from bunnei/makedonecurrent
OpenGL: Use MakeCurrent/DoneCurrent for multithreaded rendering.
2018-07-14 20:18:11 -07:00
bunnei
05cb10530f OpenGL: Use MakeCurrent/DoneCurrent for multithreaded rendering. 2018-07-14 02:50:35 -04:00
Subv
b37354cca8 GPU: Always enable the depth write when clearing the depth buffer.
The GPU ignores that register when clearing, but OpenGL obeys the glDepthMask parameter, so we set the depth mask to GL_TRUE when clearing the depth buffer. It will be restored to the correct value automatically on the next draw call.
2018-07-14 00:52:23 -05:00
bunnei
8aeff9cf8e gl_rasterizer: Fix check for if a shader stage is enabled. 2018-07-12 22:57:57 -04:00
bunnei
c4015cd93a gl_shader_gen: Implement dual vertex shader mode.
- When VertexA shader stage is enabled, we combine with VertexB program to make a single Vertex Shader stage.
2018-07-12 22:25:36 -04:00
bunnei
64b5e5d5d9
Merge pull request #655 from bunnei/pred-lt-nan
gl_shader_decompiler: Implement PredCondition::LessThanWithNan.
2018-07-12 18:59:15 -07:00
bunnei
49c0c081c4 gl_shader_decompiler: Implement PredCondition::LessThanWithNan. 2018-07-12 20:04:35 -04:00
bunnei
4757ffdcce gl_shader_decompiler: Use FlowCondition field in EXIT instruction. 2018-07-12 20:00:37 -04:00
Sebastian Valle
274d1fb0fc
Merge pull request #652 from Subv/fadd32i
GPU: Implement the FADD32I shader instruction.
2018-07-12 17:36:51 -05:00
bunnei
3ff21345b4
Merge pull request #651 from Subv/ffma_decode
GPU: Corrected the decoding of FFMA for immediate operands.
2018-07-12 12:42:58 -07:00
Subv
c1ae841f47 GPU: Implement the FADD32I shader instruction. 2018-07-12 12:00:31 -05:00
Subv
0cad310e12 GPU: Corrected the decoding of FFMA for immediate operands. 2018-07-12 10:15:48 -05:00
bunnei
854f474f52 gl_rasterizer: Flip triangles when regs.viewport_transform[0].scale_y is negative.
- Fixes a regression with Binding of Isaac.
2018-07-08 16:16:24 -04:00
bunnei
639346bcfb
Merge pull request #625 from Subv/imnmx
GPU: Implemented the IMNMX shader instruction.
2018-07-07 19:33:50 -07:00
Subv
4633dd9505 GPU: Implemented the BC7U texture format.
Note: Our version of glad exports GL_COMPRESSED_RGBA_BPTC_UNORM as GL_COMPRESSED_RGBA_BPTC_UNORM_ARB, maybe it's time we update it.
2018-07-07 09:17:48 -05:00
bunnei
51bd76a5fd
Merge pull request #629 from Subv/depth_test
GPU: Allow using the old NV04 values for the depth test function.
2018-07-05 16:43:10 -04:00
Subv
9f6a5660e8 GPU: Allow using the old NV04 values for the depth test function.
These seem to be just a valid as the GL token values. Thanks @ReinUsesLisp

This restores graphical output to Disgaea 5
2018-07-05 13:01:31 -05:00
bunnei
762bf6a522
Merge pull request #626 from Subv/shader_sync
GPU: Stub the shader SYNC and DEPBAR instructions.
2018-07-05 12:54:19 -04:00
bunnei
637f9d780a
Merge pull request #624 from Subv/f2f_round
GPU: Implemented the F2F 'round' rounding mode.
2018-07-05 11:30:29 -04:00
bunnei
956b5db52e
Merge pull request #623 from Subv/vertex_types
GPU: Implement the Size_16_16 and Size_10_10_10_2 vertex attribute types
2018-07-05 11:30:01 -04:00
bunnei
8b815877a6
Merge pull request #622 from Subv/unused_tex
GPU: Ignore unused textures and corrected the TEX shader instruction decoding.
2018-07-05 11:29:17 -04:00
bunnei
1b0a74e23f
Merge pull request #621 from Subv/psetp_
GPU: Implemented the PSETP shader instruction.
2018-07-05 11:28:50 -04:00
bunnei
9a3c0b161e
Merge pull request #620 from Subv/depth_z32f
GPU: Implemented the 32 bit float depth buffer format.
2018-07-05 11:09:15 -04:00
Subv
b0c92b80b1 GPU: Implemented the IMNMX shader instruction.
It's similar to the FMNMX instruction but it works on integers.
2018-07-04 15:44:37 -05:00
Subv
d800a02b4b GPU: Implemented the F2F 'round' rounding mode.
It's implemented via the GLSL 'roundEven()' function.
2018-07-04 15:43:21 -05:00
Subv
77cfe4f027 GPU: Stub the shader SYNC and DEPBAR instructions.
It is unknown at this moment if we actually need to do something with these instructions or if the GLSL compiler takes care of that for us.
2018-07-04 15:29:51 -05:00
Subv
ce39ae3e57 GPU: Implement the Size_16_16 and Size_10_10_10_2 vertex attribute types.
Both signed and unsigned variants.
2018-07-04 15:22:34 -05:00
Subv
4bda9693be GPU: Ignore textures that the GLSL compiler deemed unused when binding textures to the shaders. 2018-07-04 15:20:12 -05:00
Subv
c42b818cf9 GPU: Corrected the decoding for the TEX shader instruction. 2018-07-04 15:19:20 -05:00
Subv
53a55bd751 GPU: Implemented the PSETP shader instruction.
It's similar to the isetp and fsetp instructions but it works on predicates instead.
2018-07-04 15:15:03 -05:00
Subv
016e357c75 GPU: Implemented the 32 bit float depth buffer format. 2018-07-04 10:42:33 -05:00
Subv
c1bebdef5e GPU: Flip the triangle front face winding if the GPU is configured to not flip the triangles.
OpenGL's default behavior is already correct when the GPU is configured to flip the triangles.

This fixes 1-2 Switch's splash screen.
2018-07-04 10:26:46 -05:00
Subv
5a9df3c675 GPU: Only configure the used framebuffers during clear.
Don't try to configure the color buffer if it is not being cleared, it may not be completely valid at this point.
2018-07-03 22:32:59 -05:00
bunnei
c996787d84
Merge pull request #609 from Subv/clear_buffers
GPU: Implemented the CLEAR_BUFFERS register.
2018-07-03 19:34:34 -04:00
Subv
78443a7f29 GPU: Factor out the framebuffer configuration code for both Clear and Draw commands. 2018-07-03 16:56:47 -05:00
Subv
c1811ed3d1 GPU: Support clears that don't clear the color buffer. 2018-07-03 16:56:47 -05:00
Subv
be51120d23 GPU: Bind and clear the render target when the CLEAR_BUFFERS register is written to. 2018-07-03 16:56:44 -05:00
Subv
827bb08c91 GPU: Added registers for the CLEAR_BUFFERS and CLEAR_COLOR methods. 2018-07-03 16:56:31 -05:00
bunnei
9da1552417 gl_rasterizer_cache: Implement PixelFormat S8Z24. 2018-07-03 14:58:13 -04:00
bunnei
15e68cdbaa
Merge pull request #607 from jroweboy/logging
Logging - Customizable backends
2018-07-03 00:26:45 -04:00
bunnei
e3ca561ea0
Merge pull request #612 from bunnei/fix-cull
gl_rasterizer: Only set cull mode and front face if enabled.
2018-07-02 23:48:52 -04:00
bunnei
ddb767f1b6
Merge pull request #611 from Subv/enabled_depth_test
GPU: Don't try to parse the depth test function if the depth test is disabled and use only the least significant 3 bits in the depth test func
2018-07-02 23:47:11 -04:00
bunnei
5410b4659d
Merge pull request #610 from Subv/mufu_8
GPU: Implemented MUFU suboperation 8, sqrt.
2018-07-02 22:26:42 -04:00
bunnei
a9cacd03f6 gl_rasterizer: Only set cull mode and front face if enabled. 2018-07-02 22:22:25 -04:00
Subv
6e0eba9917 GPU: Use only the least significant 3 bits when reading the depth test func.
Some games set the full GL define value here (including nouveau), but others just seem to set those last 3 bits.
2018-07-02 21:06:36 -05:00
Subv
65c664560c GPU: Don't try to parse the depth test function if the depth test is disabled. 2018-07-02 21:02:46 -05:00
James Rowe
0d46f0df12 Update clang format 2018-07-02 21:45:47 -04:00
James Rowe
638956aa81 Rename logging macro back to LOG_* 2018-07-02 21:45:47 -04:00
bunnei
92c7135065
Merge pull request #608 from Subv/depth
GPU: Implemented the depth buffer and depth test + culling
2018-07-02 21:24:43 -04:00
Subv
a6d4903aaf GPU: Set up the culling configuration on each draw. 2018-07-02 19:51:29 -05:00
Subv
6e4e0b2b41 GPU: Implemented MUFU suboperation 8, sqrt. 2018-07-02 19:48:15 -05:00
Sebastian Valle
055f1546d7
Merge pull request #606 from Subv/base_vertex
GPU: Fixed the index offset and implement BaseVertex when doing indexed rendering.
2018-07-02 14:07:38 -05:00
Sebastian Valle
9685dd5840
Merge pull request #605 from Subv/dma_copy
GPU: Directly copy the pixels when performing a same-layout DMA.
2018-07-02 14:06:56 -05:00
Subv
18c8ae7750 GPU: Set up the depth test state on every draw. 2018-07-02 13:33:06 -05:00
Subv
d480b63e0d MaxwellToGL: Added conversion functions for depth test and cull mode. 2018-07-02 13:31:49 -05:00
Subv
c1f55c32c8 GPU: Added registers for depth test and cull mode. 2018-07-02 13:31:20 -05:00
Subv
0f929762b3 GPU: Implemented the Z24S8 depth format and load the depth framebuffer. 2018-07-02 12:42:04 -05:00
Subv
4c59105adf GPU: Implement offsetted rendering when using non-indexed drawing. 2018-07-02 11:23:36 -05:00
Subv
fca3d1cc65 GPU: Fixed the index offset rendering, and implemented the base vertex functionality.
This fixes Stardew Valley.
2018-07-02 11:22:17 -05:00
Subv
cc73bad293 GPU: Added register definitions for the vertex buffer base element. 2018-07-02 11:21:23 -05:00
bunnei
3d41fdfbba
Merge pull request #604 from Subv/invalid_textures
GPU: Ignore invalid and disabled textures when drawing.
2018-07-02 11:48:18 -04:00
Subv
ca633a5a3c GPU: Directly copy the pixels when performing a same-layout DMA. 2018-07-02 09:46:33 -05:00
Subv
80c5e8ae99 GPU: Ignore disabled textures and textures with an invalid address. 2018-07-02 09:43:38 -05:00
Subv
e9d147349b GPU: Allow GpuToCpuAddress to return boost::none for unmapped addresses. 2018-07-02 09:42:48 -05:00
bunnei
066d6184d4
Merge pull request #602 from Subv/mufu_subop
GPU: Corrected the size of the MUFU subop field, and removed incorrect "min" operation.
2018-07-01 11:06:04 -04:00
bunnei
b611d852db
Merge pull request #601 from Subv/rgba32_ui
GPU: Implement the RGBA32_UINT rendertarget format.
2018-07-01 03:22:38 -04:00
Subv
f33e406ff2 GPU: Corrected the size of the MUFU subop field, and removed incorrect "min" operation. 2018-06-30 14:48:25 -05:00
Subv
c0e2d52758 GPU: Implemented the RGBA32_UINT rendertarget format. 2018-06-30 14:23:13 -05:00
Subv
b11072d54a GLCache: Specify the component type along the texture type in the format tuple. 2018-06-30 14:08:51 -05:00
bunnei
c96da97630 gl_shader_decompiler: Implement predicate NotEqualWithNan. 2018-06-30 03:01:25 -04:00
bunnei
50ef2beb58
Merge pull request #595 from bunnei/raster-cache
Rewrite the OpenGL rasterizer cache
2018-06-29 14:07:28 -04:00
bunnei
c18425ef98 gl_rasterizer_cache: Only dereference color_surface/depth_surface if valid. 2018-06-29 13:08:08 -04:00
bunnei
7fa9177830
gl_shader_decompiler: Add a return path for unknown instructions. 2018-06-27 01:14:34 -04:00
bunnei
1dd754590f gl_rasterizer_cache: Implement caching for texture and framebuffer surfaces.
gl_rasterizer_cache: Improved cache management based on Citra's implementation.

gl_surface_cache: Add some docstrings.
2018-06-27 00:15:44 -04:00
bunnei
8af1ae46aa gl_rasterizer_cache: Various fixes for ASTC handling. 2018-06-27 00:08:04 -04:00
bunnei
c7c379bd19 gl_rasterizer_cache: Use SurfaceParams as a key for surface caching. 2018-06-27 00:08:04 -04:00
bunnei
6a28a66832 maxwell_3d: Add a struct for RenderTargetConfig. 2018-06-27 00:08:04 -04:00
bunnei
3f9f047375 gl_rasterizer: Implement AccelerateDisplay to forward textures to framebuffers. 2018-06-27 00:08:03 -04:00
bunnei
ff6785f3e8 gl_rasterizer_cache: Cache size_in_bytes as a const per surface. 2018-06-27 00:08:03 -04:00
bunnei
9f2f819bb6 gl_rasterizer_cache: Refactor to make SurfaceParams members const. 2018-06-27 00:08:03 -04:00
bunnei
5f57ab1b2a gl_rasterizer_cache: Remove Citra's rasterizer cache, always load/flush surfaces. 2018-06-27 00:08:03 -04:00
bunnei
10422f3c18 gl_rasterizer: Workaround for when exceeding max UBO size. 2018-06-26 23:07:34 -04:00
bunnei
dfac394e60
Merge pull request #593 from bunnei/fix-swizzle
gl_state: Fix state management for texture swizzle.
2018-06-26 22:05:49 -04:00
bunnei
73de9bab1a
Merge pull request #592 from bunnei/cleanup-gl-state
gl_state: Remove unused state management from 3DS.
2018-06-26 22:05:03 -04:00
bunnei
8447d20a11 gl_state: Fix state management for texture swizzle. 2018-06-26 17:15:58 -04:00
bunnei
20b58bab9c gl_state: Remove unused state management from 3DS. 2018-06-26 17:09:25 -04:00
bunnei
41b3725d28 gl_rasterizer_cache: Fix inverted B5G6R5 format. 2018-06-26 17:07:36 -04:00
bunnei
36dedae842
Merge pull request #554 from Subv/constbuffer_ubo
Rasterizer: Use UBOs instead of SSBOs for uploading const buffers.
2018-06-26 10:25:56 -04:00
mailwl
ad39bab271 Fix crash at exit 2018-06-25 18:01:08 +03:00
Subv
a3d82ef5d9 Build: Fixed some MSVC warnings in various parts of the code. 2018-06-20 11:39:10 -05:00
bunnei
7a0bb406d5
Merge pull request #574 from Subv/shader_abs_neg
GPU: Perform negation after absolute value in the float shader instructions.
2018-06-18 22:24:57 -04:00
Subv
38989bef43 GPU: Perform negation after absolute value in the float shader instructions. 2018-06-18 19:56:29 -05:00
Subv
eab7457c00 GPU: Don't mark uniform buffers and registers as used for instructions which don't have them.
Like the MOV32I and FMUL32I instructions.
This fixes a potential crash when using these instructions.
2018-06-18 19:50:35 -05:00
bunnei
0e13d9cb7b
Merge pull request #570 from bunnei/astc
gl_rasterizer: Implement texture format ASTC_2D_4X4.
2018-06-18 19:08:49 -04:00
bunnei
ea080501fb
Merge pull request #571 from Armada651/loose-blend
gl_rasterizer: Get loose on independent blending.
2018-06-18 11:36:50 -04:00
Jules Blok
7c7f4a9be2 gl_rasterizer: Get loose on independent blending. 2018-06-18 09:27:06 +02:00
bunnei
61779fa072 gl_rasterizer: Implement texture format ASTC_2D_4X4. 2018-06-18 01:56:59 -04:00
bunnei
fe906fff36 gl_rasterizer_cache: Loosen things up a bit. 2018-06-18 00:55:59 -04:00
bunnei
afdd657d30 gl_shader_decompiler: Implement LOP instructions. 2018-06-17 15:27:48 -04:00
bunnei
5673ce39c7 gl_shader_decompiler: Refactor LOP32I instruction a bit in support of LOP. 2018-06-17 13:31:39 -04:00
bunnei
d383043e07 gl_shader_decompiler: Implement integer size conversions for I2I/I2F/F2I. 2018-06-15 22:42:02 -04:00
bunnei
fb5bd0920d
Merge pull request #564 from bunnei/lop32i_passb
gl_shader_decompiler: Implement LOP32I LogicOperation PassB.
2018-06-15 22:04:03 -04:00
bunnei
55c49d5bf4 gl_shader_gen: Set position.w to 1. 2018-06-15 20:47:04 -04:00
bunnei
61f9d9c4ab gl_shader_decompiler: Implement LOP32I LogicOperation PassB. 2018-06-15 20:43:33 -04:00
bunnei
019d7208c8
Merge pull request #556 from Subv/dma_engine
GPU: Partially implemented the Maxwell DMA engine.
2018-06-12 14:25:17 -04:00
bunnei
2015a1b180
Merge pull request #558 from Subv/iadd32i
GPU: Implemented the iadd32i shader instruction.
2018-06-12 14:19:25 -04:00
Subv
db0497b808 GPU: Implemented the iadd32i shader instruction. 2018-06-12 11:46:45 -05:00
Subv
987a170665 GPU: Partially implemented the Maxwell DMA engine.
Only tiled->linear and linear->tiled copies that aren't offsetted are supported for now. Queries are not supported. Swizzled copies are not supported.
2018-06-12 11:27:36 -05:00
bunnei
5f3d6c85db gl_shader_decompiler: Implement saturate for float instructions. 2018-06-11 21:46:34 -04:00
Subv
004b1b3830 GPU: Convert the gl_InstanceId and gl_VertexID variables to floats when reading from them.
This corrects the invalid position values in some games when doing attribute-less rendering.
2018-06-10 13:50:19 -05:00
Subv
2a7653142d Rasterizer: Use UBOs instead of SSBOs for uploading const buffers.
This should help a bit with GPU performance once we're GPU-bound.
2018-06-09 18:02:05 -05:00
Subv
b366b885a1 GPU: Implement the iset family of shader instructions. 2018-06-09 16:19:13 -05:00
Subv
3cb753eeb1 GPU: Added decodings for the ISET family of instructions. 2018-06-09 15:56:50 -05:00
bunnei
d81aaa3ed3
Merge pull request #550 from Subv/ssy
GPU: Stub the SSY shader instruction.
2018-06-09 00:42:53 -04:00
bunnei
e2176dc7ce
Merge pull request #551 from bunnei/shr
gl_shader_decompiler: Implement SHR instruction.
2018-06-09 00:42:44 -04:00
bunnei
5440b9c634 gl_shader_decompiler: Implement SHR instruction. 2018-06-09 00:01:17 -04:00
Subv
abec5f82e2 GPU: Stub the SSY shader instruction.
This instruction tells the GPU where the flow reconverges in a non-uniform control flow scenario, we can ignore this when generating GLSL code.
2018-06-08 22:46:10 -05:00
bunnei
bbc4f369ed gl_shader_decompiler: Implement IADD instruction. 2018-06-08 23:25:22 -04:00
bunnei
79e9c2e237 gl_shader_decompiler: Add missing asserts for saturate_a instructions. 2018-06-08 23:24:10 -04:00
Subv
c011b6f67e GPU: Synchronize the blend state on every draw call.
Only independent blending on render target 0 is implemented for now.

This fixes the elongated squids in Splatoon 2's boot screen.
2018-06-08 17:05:52 -05:00
Subv
c712dafaee GPU: Added registers for normal and independent blending. 2018-06-08 17:04:41 -05:00
bunnei
a931cf9e8b
Merge pull request #547 from Subv/compressed_alignment
GLCache: Align compressed texture sizes to their compression ratio, and then align that compressed size to the block height for tiled textures.
2018-06-08 16:40:49 -04:00
Subv
8d9534d830 GLCache: Align compressed texture sizes to their compression ratio, and then align that compressed size to the block height for tiled textures.
This fixes issues with retrieving non-block-aligned tiled compressed textures from the cache.
2018-06-08 12:27:19 -05:00
Subv
47dc5e0dab Rasterizer: Flush the written region when writing shader uniform data before copying it to the uniform buffers.
This fixes the flip_viewport uniform having invalid values when drawing.
2018-06-08 12:22:39 -05:00
bunnei
ee318d4015
Merge pull request #543 from Subv/uniforms
GLRenderer: Write the shader stage configuration UBO data *before* copying it to the GPU.
2018-06-07 11:21:36 -04:00
Subv
86146ef819 GLRenderer: Write the shader stage configuration UBO data *before* copying it to the GPU.
This should fix the bug with the vs_config UBO being uninitialized during shader execution.
2018-06-07 08:33:23 -05:00
bunnei
0639e03055
Merge pull request #542 from bunnei/bfe_imm
gl_shader_decompiler: Implement BFE_IMM instruction.
2018-06-07 01:49:45 -04:00
bunnei
930487c7fb
Merge pull request #541 from Subv/blittextures
GLCache: Fixed copying compressed textures in the rasterizer cache.
2018-06-07 01:35:01 -04:00
bunnei
92209f905f gl_shader_decompiler: Implement BFE_IMM instruction. 2018-06-07 00:58:12 -04:00
Subv
f22e090b86 GLCache: Use the full uncompressed size when blitting from one texture to another.
This avoids the problem of only copying a tiny piece of the textures when they are compressed.
2018-06-06 23:26:36 -05:00
Subv
218a08df93 GLCache: Simplify the logic to copy from one texture to another in BlitTextures.
We now use glCopyImageSubData, this should avoid errors with trying to attach a compressed texture as a framebuffer's color attachment and then blitting to it.

Maybe in the future we can change this to glCopyTextureSubImage which only requires GL_ARB_direct_state_access.
2018-06-06 23:25:24 -05:00
bunnei
128aeba0f3 gl_shader_decompiler: F2F: Implement rounding modes. 2018-06-06 22:21:29 -04:00
bunnei
03f877919d
Merge pull request #537 from bunnei/misc-shader
gl_shader_decompiler: Additional decodings, remove unused stuff from TEX
2018-06-06 21:44:37 -04:00
bunnei
37f50c8773
Merge pull request #535 from Subv/gpu_swizzle
GPU: Support changing the texture swizzles for Maxwell textures.
2018-06-06 21:39:47 -04:00
bunnei
00c830405b gl_shader_decompiler: Remove some attribute stuff that has nothing to do with TEX/TEXS. 2018-06-06 19:47:41 -04:00
bunnei
4b114e1b8a shader_bytecode: Add instruction decodings for BFE, IMNMX, and XMAD. 2018-06-06 19:47:34 -04:00
bunnei
0a49c46353 gl_shader_decompiler: Implement ISETP_IMM instruction. 2018-06-06 19:45:58 -04:00
Subv
47629c89a8 GPU: Support changing the texture swizzles for Maxwell textures. 2018-06-06 18:36:15 -05:00
Subv
89e81a9be2 GLState: Support changing the GL_TEXTURE_SWIZZLE parameter of each texture unit. 2018-06-06 18:36:13 -05:00
bunnei
0ff2929644
Merge pull request #534 from Subv/multitexturing
GPU: Implement sampling multiple textures in the generated glsl shaders.
2018-06-06 19:12:52 -04:00
bunnei
4669f15f8b gl_shader_decompiler: Implement LD_C instruction. 2018-06-06 18:09:06 -04:00
bunnei
4112aa68a6 gl_shader_gen: Add uniform handling for indirect const buffer access. 2018-06-06 18:09:05 -04:00
bunnei
6e386a334b gl_shader_decompiler: Refactor uniform handling to allow different decodings. 2018-06-06 17:57:15 -04:00
Subv
dbfc39d214 GPU: Implement sampling multiple textures in the generated glsl shaders.
All tested games that use a single texture show no regression.

Only Texture2D textures are supported right now, each shader gets its own "tex_fs/vs/gs" sampler array to maintain independent textures between shader stages, the textures themselves are reused if possible.
2018-06-06 12:58:16 -05:00
Sebastian Valle
ce026332a5
Merge pull request #531 from bunnei/fix-shl
gl_shader_decompiler: Fix un/signed mismatch with SHL.
2018-06-06 08:28:42 -05:00
Sebastian Valle
fa220dd709
Merge pull request #530 from bunnei/wrap-mirror
maxwell_to_gl: Implement WrapMode Mirror.
2018-06-06 08:28:27 -05:00
bunnei
9a85277d83
Merge pull request #527 from Subv/rgba32f_texcopy
GPU: Allow the usage of RGBA32_FLOAT and RGBA16_FLOAT in the texture copy engine.
2018-06-06 00:24:13 -04:00
bunnei
05dc93399b
Merge pull request #528 from Subv/rg11b10f
GPU: Implemented the R11FG11FB10F texture and rendertarget formats.
2018-06-06 00:22:54 -04:00
bunnei
566f97b580 gl_shader_decompiler: Fix un/signed mismatch with SHL. 2018-06-05 23:58:06 -04:00
bunnei
bf0543af23 maxwell_to_gl: Implement WrapMode Mirror. 2018-06-05 23:56:45 -04:00
Subv
adf47cd59a GPU: Allow the usage of RGBA16_FLOAT in the texture copy engine. 2018-06-05 22:01:20 -05:00
Subv
c531a92eda GPU: Implemented the R11FG11FB10F texture and rendertarget formats. 2018-06-05 21:57:16 -05:00
Subv
14afc704d4 GPU: Fixed the compression factor for RGBA16F textures.
They're not compressed.
2018-06-05 21:55:17 -05:00
Subv
8d70d1ea45 GPU: Allow the usage of RGBA32_FLOAT in the texture copy engine. 2018-06-05 21:07:40 -05:00
bunnei
5fb99e6a16
Merge pull request #516 from Subv/f2i_r
GPU: Implemented the F2I_R shader instruction.
2018-06-05 22:01:29 -04:00
bunnei
38eb33f150
Merge pull request #521 from Subv/bra
GPU: Corrected the branch targets for the shader bra instruction.
2018-06-05 10:09:35 -04:00
bunnei
b54a72afc0
Merge pull request #520 from bunnei/shader-shl
gl_shader_decompiler: Implement SHL instruction.
2018-06-05 10:08:42 -04:00
Subv
e7dfcdde74 GPU: Corrected the branch targets for the shader bra instruction. 2018-06-04 22:56:28 -05:00
Subv
4b89348c00 GPU: Implemented the F2I_R shader instruction. 2018-06-04 22:06:50 -05:00
bunnei
8c99dd055c
Merge pull request #518 from Subv/incomplete_shaders
GPU: Implemented predicated exit instructions in the shader programs.
2018-06-04 22:43:46 -04:00
bunnei
799e632ccb gl_shader_decompiler: Fix typo with ISCADD instruction. 2018-06-04 22:41:10 -04:00
bunnei
c23c30c76f gl_shader_decompiler: Implement SHL instruction. 2018-06-04 22:36:49 -04:00
bunnei
6ea1576513 gl_shader_decompiler: Implement PredCondition::NotEqual. 2018-06-04 22:00:47 -04:00
Subv
23b1e6eded GPU: Implement the ISCADD shader instructions. 2018-06-04 20:17:41 -05:00
Subv
438a9b70cc GPU: Added decodings for the ISCADD instructions. 2018-06-04 20:17:39 -05:00
bunnei
e8bfff7b4b
Merge pull request #514 from Subv/lop32i
GPU: Implemented the LOP32I instruction.
2018-06-04 20:48:15 -04:00
bunnei
f564822e78
Merge pull request #510 from Subv/isetp
GPU: Implemented the ISETP_R and ISETP_C instructions
2018-06-04 20:47:11 -04:00
Subv
6cf6fa2842 GPU: Implement predicated exit instructions in the shader programs. 2018-06-04 19:18:11 -05:00
Subv
d27279092f GPU: Take into account predicated exits when performing shader control flow analysis. 2018-06-04 19:14:23 -05:00
bunnei
37fd4e6d9b
Merge pull request #512 from Subv/fset
GPU: Corrected the FSET and I2F instructions.
2018-06-04 19:04:20 -04:00
bunnei
cdd92dc692
Merge pull request #501 from Subv/shader_bra
GPU: Partially implemented the bra shader instruction
2018-06-04 18:31:07 -04:00
bunnei
38d25a4cb2
Merge pull request #515 from Subv/viewport_fix
GPU: Calculate the correct viewport dimensions based on the scale and translate registers.
2018-06-04 18:11:36 -04:00
Subv
2933521a08 GPU: Use the bf bit in FSET to determine whether to write 0xFFFFFFFF or 1.0f. 2018-06-04 16:41:28 -05:00
Subv
f6679ce422 GPU: Corrected the I2F_R implementation. 2018-06-04 16:41:27 -05:00
Subv
5d55403f94 GPU: Calculate the correct viewport dimensions based on the scale and translate registers.
This is how nouveau calculates the viewport width and height. For some reason some games set 0xFFFF in the VIEWPORT_HORIZ and VIEWPORT_VERT registers, maybe those are a misnomer and actually refer to something else?
2018-06-04 16:36:54 -05:00
Subv
0c688b421c GPU: Implemented the LOP32I instruction. 2018-06-04 13:56:31 -05:00
Subv
cb47abecc6 GLCache: Corrected a mismatch between storing compressed sizes and verifying the uncompressed alignment in GetSurface. 2018-06-04 13:01:53 -05:00
Subv
90cddf1996 GPU: Use explicit types when retrieving the uniform values for fsetp/fset and isetp instead of the type of an invalid output register. 2018-06-04 11:22:26 -05:00
Subv
7c181fd4f4 GPU: Implemented the ISETP_R and ISETP_C shader instructions. 2018-06-04 11:12:03 -05:00
Subv
b481d8a00d GPU: Partially implemented the shader BRA instruction. 2018-06-03 22:26:36 -05:00
Subv
06c72b4fcf GPU: Added decoding for the BRA instruction. 2018-06-03 22:14:00 -05:00
bunnei
ba117854f9
Merge pull request #500 from Subv/long_queries
GPU: Partial implementation of long GPU queries.
2018-06-03 21:24:50 -04:00
Subv
d57333406d GPU: Partial implementation of long GPU queries.
Long queries write a 128-bit result value to memory, which consists of a 64 bit query value and a 64 bit timestamp.

In this implementation, only select=Zero of the Crop unit is implemented, this writes the query sequence as a 64 bit value, and a 0u64 value for the timestamp, since we emulate an infinitely fast GPU.

This specific type was hwtested, but more rigorous tests should be performed in the future for the other types.
2018-06-03 19:17:31 -05:00
bunnei
1efcba346a gl_shader_decompiler: Implement TEXS component mask. 2018-06-03 12:08:17 -04:00
bunnei
bb9d39b8fe
Merge pull request #494 from bunnei/shader-tex
gl_shader_decompiler: Implement TEX, fixes for TEXS.
2018-06-03 12:05:38 -04:00
bunnei
27c0f9e02d
Merge pull request #495 from bunnei/improve-rro
gl_shader_decompiler: Implement RRO as a register move.
2018-06-03 12:05:26 -04:00
bunnei
e54ea773fc gl_shader_decompiler: Implement RRO as a register move. 2018-06-03 11:14:31 -04:00
Subv
99f9d47d16 GPU: Implemented the DXN1 (BC4) texture format. 2018-06-02 13:17:09 -05:00
bunnei
888eb345c0 gl_shader_decompiler: Implement TEX instruction. 2018-05-31 23:36:45 -04:00
bunnei
4c727d0ba8 gl_shader_decompiler: Support multi-destination for TEXS. 2018-05-31 22:57:32 -04:00
bunnei
49309b5848 gl_rasterizer_cache: Assert that component type is UNorm or format is RGBA16F. 2018-05-30 22:50:41 -04:00
bunnei
ca5a4a704b gl_rasterizer_cache: Implement PixelFormat RGBA16F. 2018-05-30 22:24:07 -04:00
bunnei
15086a22be
Merge pull request #489 from Subv/vertexid
Shaders: Implemented reading the gl_InstanceID and gl_VertexID variables in the vertex shader.
2018-05-30 14:10:48 -04:00
Subv
99f12b05fa Shaders: Implemented reading the gl_InstanceID and gl_VertexID variables in the vertex shader. 2018-05-30 10:58:03 -05:00
Sebastian Valle
8df011a57f
Merge pull request #483 from bunnei/sonic
Several GPU fixes to boot Sonic Mania
2018-05-30 07:31:46 -05:00
bunnei
6fcc7e9c36 gl_shader_decompiler: F2F_R instruction: Implement abs. 2018-05-29 23:52:54 -04:00
bunnei
68937a662d gl_shader_decompiler: Partially implement F2F_R instruction. 2018-05-29 23:10:44 -04:00
Subv
734106dcb9 GPU: Implemented the R8 texture format (0x1D) 2018-05-29 21:49:37 -05:00
bunnei
0d843eaba6 gl_rasterize_cache: Invert order of tex format RGB565. 2018-05-29 22:16:18 -04:00
greggameplayer
220d4672df add all the known TextureFormat (#474) 2018-05-28 19:26:17 -04:00
bunnei
d809f65827
Merge pull request #472 from bunnei/greater-equal
gl_shader_decompiler: Implement GetPredicateComparison GreaterEqual.
2018-05-27 12:14:30 -04:00
bunnei
7f155ba713
Merge pull request #476 from Subv/a1bgr5
GPU: Implemented the A1B5G5R5 texture format (0x14)
2018-05-27 12:14:08 -04:00
Subv
7ddc872b52 GPU: Implemented the A1B5G5R5 texture format (0x14) 2018-05-27 09:02:05 -05:00
bunnei
c23ce3365d gl_shader_decompiler: Implement GetPredicateComparison GreaterEqual. 2018-05-25 23:21:29 -04:00
bunnei
ee53688ca7 shader_bytecode: Implement other variants of FMNMX. 2018-05-25 23:18:50 -04:00
bunnei
aee356bd10
Merge pull request #468 from Subv/compound_preds
Shader: Implemented compound predicates in the fset and fsetp instructions
2018-05-25 22:28:47 -04:00
Subv
e2cdf54177 Shader: Implemented compound predicates in fset.
You can specify a predicate in the fset instruction:

Result = ((Value1 Comp Value2) OP P0) ? 1.0 : 0.0;
2018-05-24 17:39:59 -05:00
Subv
e2db7a83f6 GPU: Allow command lists to rebind a channel to another engine in the middle of the command list. 2018-05-24 17:32:46 -05:00
Subv
126270d963 Shader: Implemented compound predicates in fsetp.
You can specify three predicates in an fsetp instruction:

P1 = (Value1 Comp Value2) OP P0;
P2 = !(Value1 Comp Value2) OP P0;
2018-05-24 17:22:36 -05:00
bunnei
58857b9f46
Merge pull request #456 from Subv/unmap_buffer
Implemented nvhost-as-gpu's UnmapBuffer and nvmap's Free ioctls.
2018-05-20 23:54:50 -04:00
bunnei
898f0fa029
Merge pull request #458 from Subv/fmnmx
Shaders: Implemented the FMNMX shader instruction.
2018-05-20 23:44:07 -04:00
Sebastian Valle
6486544e09
Merge pull request #452 from Subv/psetp
ShadersDecompiler: Added decoding for the PSETP instruction.
2018-05-20 20:00:55 -05:00
Sebastian Valle
2dbfcd32d7
Merge pull request #451 from Subv/gl_array_size
GLRenderer: Remove unused vertex buffer and increase the size of the stream buffer to 128 MB.
2018-05-20 20:00:40 -05:00
Subv
8440cef223 Shaders: Implemented the FMNMX shader instruction. 2018-05-20 17:53:06 -05:00
Subv
72b5c448cf GPU: Implemented nvhost-as-gpu's UnmapBuffer ioctl.
It removes a mapping previously created with the MapBufferEx ioctl.
2018-05-20 14:25:56 -05:00
Subv
a056d5ad8c ShadersDecompiler: Added decoding for the PSETP instruction. 2018-05-19 11:41:14 -05:00
Subv
98b143c2d6 GLRenderer: Remove unused hw_vao_enabled_attributes variable. 2018-05-19 11:36:38 -05:00
Subv
370ab5df9b GLRenderer: Remove unused vertex buffer and increase the size of the stream buffer to 128 MB.
The stream buffer is where all the vertex data is copied, some games require this to be much bigger than the 4 MB we used to have.
2018-05-19 11:36:09 -05:00
Subv
21959ddfef GLRenderer: Log the shader source code when program linking fails. 2018-05-19 11:19:34 -05:00
Lioncash
7c9644646f
general: Make formatting of logged hex values more straightforward
This makes the formatting expectations more obvious (e.g. any zero padding specified
is padding that's entirely dedicated to the value being printed, not any pretty-printing
that also gets tacked on).
2018-05-02 09:49:36 -04:00
bunnei
225ff1130f
Merge pull request #422 from bunnei/shader-mov
Shader instructions MOV_C, MOV_R, and several minor GPU things
2018-04-29 21:47:42 -04:00
bunnei
f41eb95e13 maxwell_3d: Reset vertex counts after drawing. 2018-04-29 16:23:31 -04:00
bunnei
08b8fcbe6d gl_shader_decompiler: Implement MOV_R. 2018-04-29 16:05:18 -04:00
bunnei
316327f487 maxwell_to_gl: Implement type SignedNorm, Size_8_8_8_8. 2018-04-29 16:05:17 -04:00
bunnei
c7ce472eeb shader_bytecode: Add decoding for FMNMX instruction. 2018-04-29 16:05:17 -04:00
Subv
da32c648bf Shaders: Implemented predicate condition 3 (LessEqual) in the fset and fsetp instructions. 2018-04-29 12:49:41 -05:00
bunnei
a71346cd7c gl_shader_decompiler: Implement MOV_C. 2018-04-29 13:13:13 -04:00
bunnei
6c464a2a4a
Merge pull request #416 from bunnei/shader-ints-p3
gl_shader_decompiler: Implement MOV32I, partially implement I2I, I2F
2018-04-29 12:56:16 -04:00
bunnei
f87ea8fa8b fermi_2d: Fix surface copy block height. 2018-04-28 20:40:03 -04:00
bunnei
0c01c34eff gl_shader_decompiler: Partially implement I2I_R, and I2F_R. 2018-04-28 20:03:19 -04:00
bunnei
e73927cfc2 gl_shader_decompiler: More cleanups, etc. with how we handle register types. 2018-04-28 20:03:19 -04:00
bunnei
c691fa4074 GLSLRegister: Simplify register declarations, etc. 2018-04-28 20:03:19 -04:00
bunnei
f2dcb39049 shader_bytecode: Add decodings for i2i instructions. 2018-04-28 20:03:18 -04:00
bunnei
a7b5ab4d9a gl_shader_decompiler: Implement MOV32_IMM instruction. 2018-04-28 20:03:18 -04:00
bunnei
6b365f7703
Merge pull request #408 from bunnei/shader-ints-p2
gl_shader_decompiler: Add GLSLRegisterManager class to track register state.
2018-04-27 16:06:09 -04:00
Lioncash
16198f979e
renderer_opengl: Replace usages of LOG_GENERIC with fmt-capable equivalents 2018-04-27 12:09:35 -04:00
bunnei
e6242ab5e6 gl_shader_decompiler: Add GLSLRegisterManager class to track register state. 2018-04-27 11:49:26 -04:00
Lioncash
8475496630
general: Convert assertion macros over to be fmt-compatible 2018-04-27 10:04:02 -04:00
bunnei
c9d7abe9c9 gl_shader_decompiler: Boilerplate for handling integer instructions. 2018-04-26 14:38:42 -04:00
bunnei
37fa9a15cd gl_shader_decompiler: Move color output to EXIT instruction. 2018-04-26 14:38:41 -04:00
bunnei
f81b915fd8
Merge pull request #396 from Subv/shader_ops
Shaders: Implemented the FSET instruction.
2018-04-25 22:42:54 -04:00
Subv
20d86d8a36 GPU: Partially implemented the Fermi2D surface copy operation.
The hardware allows for some rather complicated operations to be performed on the data during the copy, this is not implemented.
Only same-format same-size raw copies are implemented for now.
2018-04-25 12:54:26 -05:00
Subv
e9ad8e9185 Shaders: Added bit decodings for the I2I instruction. 2018-04-25 12:52:55 -05:00
Subv
1740aa5444 Shaders: Implemented the FSET instruction.
This instruction is similar to the FSETP instruction, but it doesn't set a predicate, it sets the destination register to 1.0 if the condition holds, and 0 otherwise.
2018-04-25 12:52:32 -05:00
Subv
1dd4861d38 GPU: Make the Textures::CopySwizzledData function accessible from the outside of the file. 2018-04-25 11:55:30 -05:00
Subv
a6da2b93c1 GPU: Added a function to retrieve the bytes per pixel of the render target formats. 2018-04-25 11:55:29 -05:00
Subv
378c881427 GPU: Added surface copy registers to Fermi2D 2018-04-25 11:55:29 -05:00
Subv
b1109931b9 GPU: Added boilerplate code for the Fermi2D engine 2018-04-25 11:55:29 -05:00
Subv
c16cfbbc6c GPU: Reduce the number of registers of Maxwell3D to 0xE00.
The rest are just macro shim registers.
2018-04-25 11:55:28 -05:00
Subv
a994446b6e GPU: Move the Maxwell3D macro uploading code to the inside of the Maxwell3D processor.
It doesn't belong in the PFIFO handler.
2018-04-25 11:55:27 -05:00
Subv
e2f2a49d2d GPU: Corrected the upper bound of the PFIFO method ids in the command processor. 2018-04-25 11:53:54 -05:00
Lioncash
b7551e457b
video-core: Move logging macros over to new fmt-capable ones 2018-04-25 09:13:57 -04:00
Subv
0369ee7248 Shaders: Added decodings for the FSET instructions. 2018-04-24 22:42:54 -05:00
bunnei
c30cd898fc renderer_opengl: Use correct byte order for framebuffer pixel format ABGR8. 2018-04-24 22:31:46 -04:00
bunnei
f1a4a004fb gl_rasterizer_cache: Use CHAR_BIT for bpp conversions instead of 8. 2018-04-24 22:31:46 -04:00
bunnei
0a023cfb4f gl_rasterizer_cache: Use GPU PAGE_BITS/SIZE, not CPU. 2018-04-24 22:31:46 -04:00
bunnei
9022d926eb gl_rasterizer_cache: Use new logger. 2018-04-24 22:31:46 -04:00
bunnei
fbb3cd110c gl_rasterizer_cache: Add a function for finding framebuffer GPU address. 2018-04-24 22:31:46 -04:00
bunnei
bc0f1896fc gl_rasterizer_cache: Handle compressed texture sizes. 2018-04-24 22:31:46 -04:00
bunnei
4415e00181 gl_rasterizer_cache: Update to be based on GPU addresses, not CPU addresses. 2018-04-24 22:31:45 -04:00
bunnei
10c6d89119 memory_manager: Add implement CpuToGpuAddress. 2018-04-24 17:49:20 -04:00
bunnei
239ac8abe2 memory_manager: Make GpuToCpuAddress return an optional. 2018-04-24 17:49:19 -04:00
bunnei
9e11a76e92 memory_manager: Use GPUVAdddr, not PAddr, for GPU addresses. 2018-04-24 17:40:43 -04:00
bunnei
e8c2bb24b2
Merge pull request #386 from Subv/gpu_query
GPU: Added asserts to our code for handling the QUERY_GET GPU command.
2018-04-24 16:13:51 -04:00
Lioncash
d1b23b2b51
renderer_opengl: Silence a -Wdangling-else warning in DrawScreenTriangles() 2018-04-24 11:13:08 -04:00
bunnei
07dc0bbf3e
Merge pull request #379 from Subv/multi_buffers
GPU: Support multiple enabled vertex arrays.
2018-04-24 01:09:02 -04:00
Subv
f208953585 GPU: Added asserts to our code for handling the QUERY_GET GPU command.
This is based on research from nouveau. Many things are currently unknown and will require hwtests in the future.
This commit also stubs QueryMode::Write2 to do the same as Write. Nouveau code treats them interchangeably, it is currently unknown what the difference is.
2018-04-23 17:06:57 -05:00
bunnei
3967f9c6ef
Merge pull request #383 from Subv/gpu_mmu
GPU: Make the GPU virtual memory manager use 16 page bits and 10 pagetable bits.
2018-04-23 14:00:52 -04:00
Subv
9531a29283 GPU: Support multiple enabled vertex arrays.
The vertex arrays will be copied to the stream buffer one after the other, and the attributes will be set using the ARB_vertex_attrib_binding extension.

yuzu now thus requires OpenGL 4.3 or the ARB_vertex_attrib_binding extension.
2018-04-23 11:34:50 -05:00
Subv
f823c1d599 GPU: Make the GPU virtual memory manager use 16 page bits and 10 page table bits.
Also removed some dead code and added memory map consistency asserts.
2018-04-23 10:57:12 -05:00
Subv
010227e149 GPU: Implement the RGB10_A2 RenderTarget format, it will use the same format as the A2BGR10 texture format. 2018-04-23 10:50:28 -05:00
Subv
c079cf4eec GPU: Implement the A2BGR10 texture format. 2018-04-21 17:32:25 -05:00
bunnei
f8764bb5d3
Merge pull request #376 from bunnei/shader-decoder
Shader opcode decoding
2018-04-21 00:04:51 -04:00
bunnei
f8a037ead4
Merge pull request #375 from lioncash/header
opengl: Remove unnecessary header inclusions
2018-04-20 23:08:47 -04:00
bunnei
d08fd7e86d gl_shader_decompiler: Skip RRO instruction. 2018-04-20 22:30:56 -04:00
bunnei
8b28dc55e6 gl_shader_decompiler: Cleanup error logging. 2018-04-20 22:30:56 -04:00
bunnei
e1630c4d43 shader_bytecode: Add several more instruction decodings. 2018-04-20 22:30:56 -04:00
bunnei
9f6d305eab shader_bytecode: Decode instructions based on bit strings. 2018-04-20 22:30:56 -04:00
bunnei
8ac3a3f45e
Merge pull request #369 from Subv/shader_instr2
ShaderGen: Implemented fsetp/kil and predicated instruction execution.
2018-04-20 22:29:39 -04:00
bunnei
634d9ee18b
Merge pull request #374 from lioncash/noexcept
gl_resource_manager: Add missing noexcept specifiers to move constructors and assignment operators
2018-04-20 22:28:47 -04:00
Subv
17a0ef1e1e ShaderGen: Implemented the KIL instruction, which is equivalent to 'discard'. 2018-04-20 21:09:34 -05:00
Subv
c3a8ea76f1 ShaderGen: Implemented predicated instruction execution.
Each predicated instruction will be wrapped in an `if (predicate) { instruction_body; }` in the GLSL, where `predicate` is one of the predicate boolean variables previously set by fsetp.
2018-04-20 21:09:33 -05:00
Subv
0a5e01b710 ShaderGen: Implemented the fsetp instruction.
Predicate variables are now added to the generated shader code in the form of 'pX' where X is the predicate id.
These predicate variables are initialized to false on shader startup and are set via the fsetp instructions.

TODO:

* Not all the comparison types are implemented.
* Only the single-predicate version is implemented.
2018-04-20 21:09:33 -05:00
Lioncash
eafdcc1b8a opengl: Remove unnecessary header inclusions 2018-04-20 20:19:37 -04:00
Lioncash
ab71997b2c gl_resource_manager: Add missing noexcept specifiers to move constructors and assignment operators
Standard library containers may use std::move_if_noexcept to perform
move operations. If a move cannot be performed under these
circumstances, then a copy is attempted. Given we only intend for these
types to be move-only this can be somewhat problematic. By defining
these to be noexcept we prevent cases where copies may be attempted.
2018-04-20 20:04:00 -04:00
Lioncash
7db0b8d74f gl_rasterizer_cache: Make MatchFlags an enum class
Prevents implicit conversions and scope pollution.
2018-04-20 19:50:05 -04:00
Subv
d03fc77475 ShaderGen: Register id 255 is special and is hardcoded to return 0 (SR_ZERO). 2018-04-20 14:57:40 -05:00
Subv
2e0a9f66a0 ShaderGen: Ignore the 'sched' instruction when generating shaders.
The 'sched' instruction has a very convoluted encoding, but fortunately it seems to only appear on a fixed interval (once every 4 instructions).
2018-04-20 14:57:40 -05:00
bunnei
326b044c19
Merge pull request #367 from lioncash/clamp
math_util: Remove the Clamp() function
2018-04-20 14:18:03 -04:00
Lioncash
fae2dd0344
math_util: Remove the Clamp() function
C++17 adds clamp() to the standard library, so we can remove ours in
favor of it.
2018-04-20 10:14:13 -04:00
bunnei
701dd649e6
Merge pull request #363 from lioncash/array-size
common_funcs: Remove ARRAY_SIZE macro
2018-04-20 09:43:02 -04:00
Lioncash
d9e316e353 common_funcs: Remove ARRAY_SIZE macro
C++17 has non-member size() which we can just call where necessary.
2018-04-19 22:36:52 -04:00
Lioncash
3841ec4200 renderer_opengl: Add missing header guards 2018-04-19 21:13:59 -04:00
bunnei
17ad56c1dc
Merge pull request #356 from lioncash/shader
glsl_shader_decompiler: Minor API changes to ShaderWriter
2018-04-19 21:09:25 -04:00
Lioncash
e3b6f6c016 glsl_shader_decompiler: Use std::string_view instead of std::string for AddLine()
This function doesn't need to take ownership of the string data being
given to it, considering all we do is append the characters to the
internal string instance.

Instead, use a string view to simply reference the string data without
any potential heap allocation.

Now anything that is a raw const char* won't need to be converted to a
std::string before appending.
2018-04-19 20:12:58 -04:00
Lioncash
412b31ad72 glsl_shader_decompiler: Add AddNewLine() function to ShaderWriter
Avoids constructing a std::string just to append a newline character
2018-04-19 20:09:27 -04:00
Lioncash
aa26baa3db glsl_shader_decompiler: Add char overload for ShaderWriter's AddLine()
Avoids constructing a std::string just to append a character.
2018-04-19 20:04:09 -04:00
Lioncash
4ef392906b glsl_shader_decompiler: Append indentation without constructing a separate std::string
The interface of std::string already lets us append N copies of a
character to an existing string.
2018-04-19 19:59:25 -04:00
Subv
fe84842137 ShaderGen: Implemented the fmul32i shader instruction. 2018-04-19 13:46:32 -05:00
Subv
5367935d35 ShaderGen: Fixed a case where the TEXS instruction would use the same registers for the input and the output.
It will now save the coords before writing the outputs in a subscope.
2018-04-19 13:33:17 -05:00
Subv
057170928c GPU: Add support for the DXT23 and DXT45 compressed texture formats. 2018-04-18 20:48:53 -05:00
bunnei
60e6e8953e
Merge pull request #351 from Subv/tex_formats
GPU: Implemented the B5G6R5 format.
2018-04-18 20:20:51 -04:00
Subv
2985056340 GPU: Implemented the B5G6R5 format. 2018-04-18 18:16:45 -05:00
bunnei
ce4f159b1c
gl_shader_gen: Support vertical/horizontal viewport flipping. (#347)
* gl_shader_gen: Support vertical/horizontal viewport flipping.

* fixup! gl_shader_gen: Support vertical/horizontal viewport flipping.
2018-04-18 16:42:40 -04:00
Subv
43d98ca8fe GLCache: Added boilerplate code to make supporting configurable texture component types.
For now only the UNORM type is supported.
2018-04-18 14:17:28 -05:00
Subv
5b3fab6766 GLCache: Unify texture and framebuffer formats when converting to OpenGL. 2018-04-18 14:17:28 -05:00
Subv
b2c1672e10 GPU: Texture format 8 and framebuffer format 0xD5 are actually ABGR8. 2018-04-18 14:17:27 -05:00
Subv
48d4efbd69 GPU: Pitch textures are now supported, don't assert when encountering them. 2018-04-18 12:52:53 -05:00
Subv
a3e82e8e1f GLCache: Take into account the texture's block height when caching and unswizzling. 2018-04-18 12:52:53 -05:00
Subv
ac09b5a2e9 GLCache: Added a function to convert cached PixelFormats back to texture formats.
TODO: The way we handle cached formats must change, framebuffer and texture formats are too different to keep them in the same place.
2018-04-18 12:52:52 -05:00
Subv
6b63aaa5b4 GPU: Allow using a configurable block height when unswizzling textures. 2018-04-18 12:52:51 -05:00
Subv
db5f2bfa7e GPU/TIC: Added the pitch and block height fields to the TIC structure. 2018-04-18 11:38:39 -05:00
bunnei
c93ea96366
Merge pull request #346 from bunnei/misc-gpu-improvements
Misc gpu improvements
2018-04-17 22:17:07 -04:00
bunnei
71b4a3b9f6
Merge pull request #344 from bunnei/shader-decompiler-p2
Shader decompiler changes part 2
2018-04-17 22:10:53 -04:00
bunnei
7222d9a4c3 gl_rasterizer_cache: Add missing LOG statements. 2018-04-17 21:44:36 -04:00
bunnei
9df8e924fb texture: Add missing formats. 2018-04-17 21:41:36 -04:00
bunnei
3ed8a1cac7 gpu: Add several framebuffer formats to RenderTargetFormat. 2018-04-17 21:40:38 -04:00
bunnei
4a8eb6745e maxwell3d: Allow Texture2DNoMipmap as Texture2D. 2018-04-17 21:39:15 -04:00
bunnei
531c25386e shader_bytecode: Make ctor's constexpr and explicit. 2018-04-17 21:27:07 -04:00
bunnei
174cba5c58 renderer_opengl: Implement BlendEquation and BlendFunc. 2018-04-17 18:11:48 -04:00
bunnei
1f6fe062ca gl_shader_decompiler: Fix warnings with MarkAsUsed. 2018-04-17 16:36:44 -04:00
bunnei
ed542a7309 gl_shader_decompiler: Cleanup logging, updating to NGLOG_*. 2018-04-17 16:36:44 -04:00
bunnei
ef2d5ab0c1 gl_shader_decompiler: Implement several MUFU subops and abs_d. 2018-04-17 16:36:43 -04:00
bunnei
59f4ff4659 gl_shader_decompiler: Fix swizzle in GetRegister. 2018-04-17 16:36:42 -04:00
bunnei
5a28dce9eb gl_shader_decompiler: Implement FMUL/FADD/FFMA immediate instructions. 2018-04-17 16:36:42 -04:00
bunnei
8d4899d6ea gl_shader_decompiler: Allow vertex position to be used in fragment shader. 2018-04-17 16:36:40 -04:00
bunnei
95144cc39c gl_shader_decompiler: Implement IPA instruction. 2018-04-17 16:36:39 -04:00
bunnei
8b4443c966 gl_shader_decompiler: Add support for TEXS instruction. 2018-04-17 16:36:38 -04:00
bunnei
5ba71369ac gl_shader_decompiler: Use fragment output color for GPR 0-3. 2018-04-17 15:25:54 -04:00
bunnei
5d529698c9 gl_shader_decompiler: Partially implement MUFU. 2018-04-17 15:25:54 -04:00
bunnei
2b082e2710
Merge pull request #343 from Subv/tex_wrap_4
GPU: Implement some wrap modes
2018-04-17 12:25:24 -04:00
Subv
636ad34707 MaxwellToGL: Implemented tex wrap mode 1 (Wrap, GL_REPEAT). 2018-04-17 10:17:18 -05:00
Subv
7fc516cc1a MaxwellToGL: Added a TODO and partial implementation of maxwell wrap mode 4 (Clamp, GL_CLAMP).
This clamp mode was removed from OpenGL as of 3.1, we can emulate it by using GL_CLAMP_TO_BORDER to get the border color of the texture, and then manually sampling the edge to mix them in the fragment shader.
2018-04-17 10:16:50 -05:00
bunnei
77bdc49343 gl_rendering: Use NGLOG* for changed code. 2018-04-16 21:23:28 -04:00
bunnei
1a1af3fda3 gl_rasterizer: Implement indexed vertex mode. 2018-04-16 21:10:15 -04:00
Subv
477aab5960 GPU: Use the same buffer names in the generated GLSL and the buffer uploading code. 2018-04-15 15:02:50 -05:00
Subv
14ac40436e GPU: Don't use explicit binding points when uploading the constbuffers to opengl.
The bindpoints will now be dynamically calculated based on the number of buffers used by the previous shader stage.
2018-04-15 14:14:57 -05:00
Subv
e128e90350 GPU: Don't use GetPointer when uploading the constbuffer data to the GPU. 2018-04-15 11:18:09 -05:00
Subv
7da47da66e GPU: Use the buffer hints from the shader decompiler to upload only the necessary const buffers for each shader stage. 2018-04-15 11:15:54 -05:00
bunnei
73d9c494ea shaders: Expose hints about used const buffers. 2018-04-15 11:50:10 -04:00
Subv
c9b511da08 GPU: Upload the entirety of each constbuffer for each shader stage as SSBOs.
We're going to need the shader generator to give us a mapping of the actual used const buffers to properly bind them to the shader.
2018-04-14 23:02:05 -05:00
Subv
1957640ea2 GPU: Allow configuring ssbos in the opengl state manager. 2018-04-14 22:54:23 -05:00
Subv
ae58e46036 GPU: Added a function to determine whether a shader stage is enabled or not. 2018-04-14 22:54:23 -05:00
bunnei
1b41b875dc shaders: Add NumTextureSamplers const, remove unused #pragma. 2018-04-14 18:50:06 -04:00
bunnei
e6224fec27 shaders: Address PR review feedback. 2018-04-14 16:01:41 -04:00
bunnei
eabeedf6af gl_shader_decompiler: Cleanup log statements. 2018-04-14 16:01:41 -04:00
bunnei
0d408b965b shaders: Fix GCC and clang build issues. 2018-04-14 16:01:40 -04:00
bunnei
86135864da gl_shader_decompiler: Implement negate, abs, etc. and lots of cleanup. 2018-04-14 16:01:40 -04:00
bunnei
7639667562 shader_bytecode: Add FSETP and KIL to GetInfo. 2018-04-14 16:01:40 -04:00
bunnei
5a47832221 shader_bytecode: Add SubOp decoding. 2018-04-14 16:01:40 -04:00
bunnei
50023bdae7 gl_shader_decompiler: Add shader stage hint. 2018-04-14 16:01:39 -04:00
bunnei
a992aac5eb renderer_opengl: Fix Morton copy byteswap, etc. 2018-04-14 16:01:39 -04:00
bunnei
0ca8fce9d0 gl_shader_manager: Implement SetShaderSamplerBindings. 2018-04-13 23:48:30 -04:00
bunnei
beddc8afd2 gl_rasterizer: Generate shaders and upload uniforms. 2018-04-13 23:48:29 -04:00
bunnei
85d77a3d24 gl_shader_decompiler: Basic impl. for very simple vertex shaders.
- Tested with Puyo Puyo Tetris and Cave Story+
2018-04-13 23:48:28 -04:00
bunnei
51f37f5061 gl_shader_manager: Cleanup and consolidate uniform handling. 2018-04-13 23:48:28 -04:00
bunnei
35aca0bf1f maxwell_3d: Make memory_manager public. 2018-04-13 23:48:27 -04:00
bunnei
33bb53571b maxwell_3d: Fix shader_config decodings. 2018-04-13 23:48:26 -04:00
bunnei
5617831d5f gl_rasterizer: Use shader program manager, remove test shader. 2018-04-13 23:48:26 -04:00
bunnei
459826a705 renderer_opengl: Add gl_shader_manager class. 2018-04-13 23:48:25 -04:00
bunnei
8aa21a03b3 maxwell_to_gl: Add a few types, etc. 2018-04-13 23:48:24 -04:00
bunnei
10953495c1 gl_shader_gen: Add hashable setup/config structs. 2018-04-13 23:48:23 -04:00
bunnei
2fcbb35ad2 gl_shader_util: Add missing includes. 2018-04-13 23:48:23 -04:00
bunnei
da1114ca59 renderer_opengl: Use OGLProgram instead of OGLShader. 2018-04-13 23:48:21 -04:00
bunnei
4f2b2d0bc5 gl_shader_util: Grab latest upstream. 2018-04-13 23:48:21 -04:00
bunnei
dbfd106ba0 gl_resource_manager: Grab latest upstream. 2018-04-13 23:48:20 -04:00
bunnei
ed7e597b44 gl_shader_decompiler: Add skeleton code from Citra for shader analysis. 2018-04-13 23:48:20 -04:00
bunnei
4e7e0f8112 shader_bytecode: Add initial module for shader decoding. 2018-04-13 23:48:19 -04:00
James Rowe
0b855f1c21 Fix clang format issues 2018-04-06 22:00:48 -06:00
Subv
dcc27d6dc1 GPU: Assert when finding a texture with a format type other than UNORM. 2018-04-06 20:44:46 -06:00
Subv
b0ca330e14 GL: Set up the textures used for each draw call.
Each Maxwell shader stage can have an arbitrary number of textures, but we're limited to a certain number in OpenGL. We try to only use the minimum amount of host textures by not keeping a 1:1 relation between guest texture ids and host texture ids, ie, guest texture id 8 can be host texture id 0 if it's the only texture used in the guest shader program.
This mapping will have to be passed to the shader decompiler so it can rewrite the texture accesses.
2018-04-06 20:44:46 -06:00
Subv
cb3183212d GL: Bind the textures to the shaders used for drawing. 2018-04-06 20:44:46 -06:00
Subv
65faeb9b2a GLCache: Specialize the MortonCopy function for the DXT1 texture format.
It will now use the UnswizzleTexture function instead of the MortonCopyPixels128, which doesn't seem to work for textures.
2018-04-06 20:44:46 -06:00
Subv
b258403f0d GLCache: Implemented GetTextureSurface. 2018-04-06 20:44:45 -06:00
Subv
65ea52394b GLCache: Support uploading compressed textures to the GPU.
Compressed texture formats like DXT1, DXT2, DXT3, etc will use this to ease the load on the CPU.
2018-04-06 20:44:45 -06:00
Subv
73eaef9c05 GL: Remove remaining references to 3DS-specific pixel formats 2018-04-06 20:44:42 -06:00
Subv
b305646c44 RasterizerCache: Remove 3DS-specific pixel formats.
We're only left with RGB8 and DXT1 for now. More will be added as they are needed.
2018-04-06 20:40:24 -06:00
Subv
c28ed85875 GL: Create the sampler objects when starting up the GL rasterizer. 2018-04-06 20:40:24 -06:00
Subv
ca96b04a0c GL: Ported the SamplerInfo struct from citra. 2018-04-06 20:40:24 -06:00
Subv
0171ec606b GL: Rename PicaTexture to MaxwellTexture. 2018-04-06 20:40:24 -06:00
Subv
f73a280eeb GL: Added functions to convert Maxwell tex filters and wrap modes to OpenGL. 2018-04-06 20:40:23 -06:00
Subv
ad1810e895 Textures: Added a helper function to know if a texture is blocklinear or pitch. 2018-04-06 20:40:23 -06:00
N00byKing
d1d7582a5b
rasterizer_interface.h: Update from citra to yuzu 2018-04-04 23:07:58 +02:00
N00byKing
27dbbd8227
gl_rasterizer_cache.cpp: Update from citra to yuzu 2018-04-04 23:05:10 +02:00
N00byKing
cfc28e0c1a
gl_rasterizer_cache.h: Update from citra to yuzu 2018-04-04 23:04:24 +02:00
N00byKing
ca17f581f5
renderer_opengl.h: Update from citra to yuzu 2018-04-04 23:03:02 +02:00
Subv
11b4ab9685 GPU: Use the MacroInterpreter class to execute the GPU macros instead of HLEing them. 2018-04-01 12:07:26 -05:00
Subv
1ec8d2123d GPU: Implemented a gpu macro interpreter.
The Ryujinx macro interpreter and envydis were used as reference.

Macros are programs that are uploaded by the games during boot and can later be called by writing to their method id in a GPU command buffer.
2018-04-01 12:07:26 -05:00
bunnei
5e343edc9e renderer_opengl: Use better naming for DrawScreens and DrawSingleScreen. 2018-03-26 21:17:07 -04:00
bunnei
c33abac275 gl_rasterizer: Move code to bind framebuffer surfaces before draw to its own function. 2018-03-26 21:17:05 -04:00
bunnei
d30110348b gl_rasterizer: Add a SyncViewport method. 2018-03-26 21:17:04 -04:00
bunnei
67bc2f5ecd gl_rasterizer: Move PrimitiveTopology check to MaxwellToGL. 2018-03-26 21:17:03 -04:00
bunnei
666d53299c graphics_surface: Fix merge conflicts. 2018-03-26 21:17:03 -04:00
bunnei
ac19e3d061 gl_rasterizer: Use ReadBlock instead of GetPointer for SetupVertexArray. 2018-03-26 21:17:02 -04:00
bunnei
a6cab532f8 gl_rasterizer: Normalize vertex array data as appropriate. 2018-03-26 21:17:02 -04:00
bunnei
527ce12ce4 maxwel_to_gl: Fix string formatting in log statements. 2018-03-26 21:17:01 -04:00
bunnei
d89bfec5f5 rasterizer: Rename DrawTriangles to DrawArrays. 2018-03-26 21:17:00 -04:00
bunnei
1bfc0dc2db gl_rasterizer: Use passthrough shader for SetupVertexShader. 2018-03-26 21:17:00 -04:00
bunnei
0a5832798a renderer_opengl: Logging, etc. cleanup. 2018-03-26 21:16:59 -04:00
bunnei
7504df52fc renderer_opengl: Remove framebuffer RasterizerFlushVirtualRegion hack. 2018-03-26 21:16:58 -04:00
bunnei
c1ccbf332f gl_rasterizer_cache: Implement UpdatePagesCachedCount. 2018-03-26 21:16:58 -04:00
bunnei
c2dbdefedf gl_rasterizer: Implement SetupVertexArray. 2018-03-26 21:16:56 -04:00
bunnei
cd8bb6ea9b gl_rasterizer_cache: Fix an ASSERT_MSG. 2018-03-26 21:16:56 -04:00
bunnei
4369af6b7e maxwell_to_gl: Add module and function for decoding VertexType. 2018-03-26 21:16:55 -04:00
bunnei
3754e0fdfd maxwell_3d: Use names that match envytools for VertexType. 2018-03-26 21:16:55 -04:00
bunnei
15925b8293 maxwell_3d: Add VertexAttribute struct and cleanup. 2018-03-26 21:16:54 -04:00
bunnei
0ee38e1363 gl_rasterizer: Use 32 texture units instead of 3. 2018-03-26 21:16:53 -04:00