Commit graph

251 commits

Author SHA1 Message Date
bunnei 8a76f816a4
Merge pull request #3228 from ReinUsesLisp/ptp
shader/texture: Implement AOFFI and PTP for TLD4 and TLD4S
2019-12-26 21:43:44 -05:00
bunnei 16dcfacbfc
Merge pull request #3235 from ReinUsesLisp/ldg-u8
shader/memory: Implement LDG.U8 and unaligned U8 loads
2019-12-21 22:50:28 -05:00
bunnei 7be65c6a68
Merge pull request #3234 from ReinUsesLisp/i2f-u8-selector
shader/conversion: Implement byte selector in I2F
2019-12-19 22:36:26 -05:00
ReinUsesLisp ae8d4b6c0c
shader/memory: Implement LDG.U8 and unaligned U8 loads
LDG can load single bytes instead of full integers or packs of integers.
These have the advantage of loading bytes that are not aligned to 4
bytes.

To emulate these this commit gets the byte being referenced (by doing
"address & 3" and then using that to extract the byte from the loaded
integer:

result = bitfieldExtract(loaded_integer, (address % 4) * 8, 8)
2019-12-18 01:21:46 -03:00
ReinUsesLisp a7d6bd1ef1
shader/conversion: Implement byte selector in I2F
I2F's byte selector is used to choose what bytes to convert to float.
e.g. if the input is 0xaabbccdd and the selector is ".B3" it will
convert 0xaa. The default (when it's not shown in nvdisasm) is ".B0", in
that example the default would convert 0xdd to float.
2019-12-18 00:41:22 -03:00
ReinUsesLisp 15a753b9a5
shader/texture: Properly shrink unused entries in size mismatches
When a image format mismatches we were inserting zeroes to the texture
itself. This was not handling cases were the mismatch uses less
coordinates than the guest shader code. Address that by resizing the
vector.
2019-12-17 23:38:10 -03:00
ReinUsesLisp e09c1fbc1f
shader/texture: Implement TLD4.PTP 2019-12-16 04:09:24 -03:00
ReinUsesLisp 844e4a297b
shader/texture: Enable arrayed TLD4 2019-12-16 02:37:21 -03:00
ReinUsesLisp 3d2c44848b
shader/texture: Implement AOFFI for TLD4S 2019-12-16 02:06:42 -03:00
ReinUsesLisp 3d9fff82c0
shader/texture: Remove unnecesary parenthesis 2019-12-16 01:52:33 -03:00
Fernando Sahmkow c0ee0aa1a8 Shader_IR: Correct TLD4S Depth Compare. 2019-12-11 19:53:17 -04:00
Fernando Sahmkow af89723fa3 Shader_Ir: Correct TLD4S encoding and implement f16 flag. 2019-12-11 19:53:17 -04:00
Fernando Sahmkow 271a3264f3 Shader_Ir: default failed tracks on bindless samplers to null values. 2019-12-11 19:53:16 -04:00
ReinUsesLisp 425a254fa2
shader: Implement MEMBAR.GL
Implement using memoryBarrier in GLSL and OpMemoryBarrier on SPIR-V.
2019-12-10 16:45:03 -03:00
ReinUsesLisp 0b5b93053d
shader_ir/other: Implement S2R InvocationId 2019-12-09 23:52:28 -03:00
ReinUsesLisp 9ad6327fbd
shader: Keep track of shaders using warp instructions 2019-12-09 23:40:41 -03:00
ReinUsesLisp 6233b1db08
shader_ir/memory: Implement patch stores 2019-12-09 23:25:21 -03:00
bunnei e36814d6d5
Merge pull request #3109 from FernandoS27/new-instr
Implement FLO & TXD Instructions on GPU Shaders
2019-12-06 18:18:16 -05:00
ReinUsesLisp dc9961f341
shader/texture: Handle TLDS texture type mismatches
Some games like "Fire Emblem: Three Houses" bind 2D textures to offsets
used by instructions of 1D textures. To handle the discrepancy this
commit uses the the texture type from the binding and modifies the
emitted code IR to build a valid backend expression.

E.g.: Bound texture is 2D and instruction is 1D, the emitted IR samples
a 2D texture in the coordinate ivec2(X, 0).
2019-11-22 21:28:47 -03:00
ReinUsesLisp 32c1bc6a67
shader/texture: Deduce texture buffers from locker
Instead of specializing shaders to separate texture buffers from 1D
textures, use the locker to deduce them while they are being decoded.
2019-11-22 21:28:47 -03:00
ReinUsesLisp 24f4198cee
shader/other: Reduce DEPBAR log severity
While DEPBAR is stubbed it doesn't change anything from our end. Shading
languages handle what this instruction does implicitly. We are not
getting anything out fo this log except noise.
2019-11-19 21:26:40 -03:00
Fernando Sahmkow c8473f399e Shader_IR: Address Feedback 2019-11-18 07:34:34 -04:00
Fernando Sahmkow cd0f5dfc17 Shader_IR: Implement TXD instruction. 2019-11-14 11:15:27 -04:00
Fernando Sahmkow f3d1b370aa Shader_IR: Implement FLO instruction. 2019-11-14 11:15:27 -04:00
ReinUsesLisp 56e237d1f9
shader_ir/warp: Implement FSWZADD 2019-11-07 20:08:41 -03:00
ReinUsesLisp 08b2b1080a
gl_shader_decompiler: Reimplement shuffles with platform agnostic intrinsics 2019-11-07 20:08:41 -03:00
ReinUsesLisp 39c66abd91
shader/decode: Reduce severity of arithmetic rounding warnings 2019-11-07 01:43:38 -03:00
ReinUsesLisp c4374d0d41
shader/arithmetic: Reduce RRO stub severity 2019-11-07 01:43:38 -03:00
ReinUsesLisp 35d40b74b3
shader/texture: Remove NODEP warnings
These warnings don't offer meaningful information while decoding
shaders. Remove them.
2019-11-07 01:43:38 -03:00
Rodrigo Locatti 654b77d2ec
Merge pull request #3039 from ReinUsesLisp/cleanup-samplers
shader/node: Unpack bindless texture encoding
2019-11-06 04:54:11 +00:00
Fernando Sahmkow 23cabc98db Shader_IR: Fix regression on TLD4
Originally on the last commit I thought TLD4 acted the same as TLD4S and 
didn't have a mask. It actually does have a component mask. This commit 
corrects that.
2019-10-30 21:14:57 -04:00
Fernando Sahmkow 9293c3a0f2 Shader_IR: Fix TLD4 and add Bindless Variant.
This commit fixes an issue where not all 4 results of tld4 were being
written, the color component was defaulted to red, among other things.
It also implements the bindless variant.
2019-10-30 12:02:03 -04:00
ReinUsesLisp a993df1ee2
shader/node: Unpack bindless texture encoding
Bindless textures were using u64 to pack the buffer and offset from
where they come from. Drop this in favor of separated entries in the
struct.

Remove the usage of std::set in favor of std::list (it's not std::vector
to avoid reference invalidations) for samplers and images.
2019-10-29 20:53:48 -03:00
Rodrigo Locatti 26f3e18c5c
Merge pull request #2976 from FernandoS27/cache-fast-brx-rebased
Implement Fast BRX, fix TXQ and addapt the Shader Cache for it
2019-10-26 16:56:13 -03:00
Fernando Sahmkow be856a38d6 Shader_IR: Address Feedback. 2019-10-26 15:38:30 -04:00
Rodrigo Locatti d52598173d
Merge pull request #3013 from FernandoS27/tld4s-fix
Shader_Ir: Fix TLD4S from using a component mask.
2019-10-25 20:06:26 -03:00
Fernando Sahmkow 33fcec3502 Shader_IR: allow lookup of texture samplers within the shader_ir for instructions that don't provide it 2019-10-25 09:01:30 -04:00
Lioncash 1f5401c89c video_core/shader: Resolve instances of variable shadowing
Silences a few -Wshadow warnings.
2019-10-23 23:00:31 -04:00
Fernando Sahmkow 1509d2ffbd Shader_Ir: Fix TLD4S from using a component mask.
TLD4S always outputs 4 values, the previous code checked a component 
mask and omitted those values that weren't part of it. This commit 
corrects that and makes sure all 4 values are set.
2019-10-22 10:59:07 -04:00
ReinUsesLisp 1ea07954fb shader_ir/memory: Ignore global memory when tracking fails
Ignore global memory operations instead of invoking undefined behaviour
when constant buffer tracking fails and we are blasting through asserts,
ignore the operation.

In the case of LDG this means filling the destination registers with
zeroes; for STG this means ignore the instruction as a whole.

The default behaviour is still to abort execution on failure.
2019-10-22 02:49:17 -03:00
ReinUsesLisp 3d0f357307
shader/half_set_predicate: Fix HSETP2 for constant buffers
HSETP2 when used with a constant buffer parses the second operand type
as F32. This is not configurable.
2019-10-07 14:49:47 -03:00
ReinUsesLisp 632c9e4ee3
shader/half_set_predicate: Reduce DEBUG_ASSERT to LOG_DEBUG 2019-10-07 14:48:58 -03:00
bunnei 376f1a4432
Merge pull request #2869 from ReinUsesLisp/suld
shader/image: Implement SULD and fix SUATOM
2019-09-23 21:47:03 -04:00
Rodrigo Locatti 9286976948
Merge pull request #2878 from FernandoS27/icmp
shader_ir: Implement ICMP
2019-09-21 18:06:07 -03:00
ReinUsesLisp 44000971e2
gl_shader_decompiler: Use uint for images and fix SUATOM
In the process remove implementation of SUATOM.MIN and SUATOM.MAX as
these require a distinction between U32 and S32. These have to be
implemented with imageCompSwap loop.
2019-09-21 17:33:52 -03:00
ReinUsesLisp 675f23aedc
shader/image: Implement SULD and remove irrelevant code
* Implement SULD as float.
* Remove conditional declaration of GL_ARB_shader_viewport_layer_array.
2019-09-21 17:32:48 -03:00
Fernando Sahmkow 527b841c15 Shader_IR: ICMP corrections and fixes 2019-09-21 14:28:03 -04:00
bunnei 88d857499b
Merge pull request #2855 from ReinUsesLisp/shfl
shader_ir/warp: Implement SHFL for Nvidia devices
2019-09-20 17:10:42 -04:00
Fernando Sahmkow 4b81d19a1a Shader_IR: Implement ICMP. 2019-09-19 20:56:29 -04:00
bunnei b31880dc5e
Merge pull request #2784 from ReinUsesLisp/smem
shader_ir: Implement shared memory
2019-09-18 16:26:05 -04:00