Commit graph

3508 commits

Author SHA1 Message Date
ReinUsesLisp 287ae2b9e8
gl_shader_cache: Specialize local memory size for compute shaders
Local memory size in compute shaders was stubbed with an arbitary size.
This commit specializes local memory size from guest GPU parameters.
2019-11-22 21:28:48 -03:00
ReinUsesLisp dbeb523879
gl_shader_cache: Specialize shared memory size
Shared memory was being declared with an undefined size. Specialize from
guest GPU parameters the compute shader's shared memory size.
2019-11-22 21:28:47 -03:00
ReinUsesLisp 4f5d8e4342
gl_shader_cache: Specialize shader workgroup
Drop the usage of ARB_compute_variable_group_size and specialize compute
shaders instead. This permits compute to run on AMD and Intel
proprietary drivers.
2019-11-22 21:28:47 -03:00
ReinUsesLisp dc9961f341
shader/texture: Handle TLDS texture type mismatches
Some games like "Fire Emblem: Three Houses" bind 2D textures to offsets
used by instructions of 1D textures. To handle the discrepancy this
commit uses the the texture type from the binding and modifies the
emitted code IR to build a valid backend expression.

E.g.: Bound texture is 2D and instruction is 1D, the emitted IR samples
a 2D texture in the coordinate ivec2(X, 0).
2019-11-22 21:28:47 -03:00
ReinUsesLisp 32c1bc6a67
shader/texture: Deduce texture buffers from locker
Instead of specializing shaders to separate texture buffers from 1D
textures, use the locker to deduce them while they are being decoded.
2019-11-22 21:28:47 -03:00
ReinUsesLisp 73aaf365e7
buffer_cache: Remove brace initialized for objects with default constructor 2019-11-20 16:00:40 -03:00
Fernando Sahmkow cc81c0ce64 Texture_Cache: Redo invalid Surfaces handling.
This commit aims to redo the full setup of invalid textures and
guarantee correct behavior across backends in the case of finding one by
using black dummy textures that match the target of the expected
texture.
2019-11-20 14:59:35 -04:00
ReinUsesLisp 24f4198cee
shader/other: Reduce DEPBAR log severity
While DEPBAR is stubbed it doesn't change anything from our end. Shading
languages handle what this instruction does implicitly. We are not
getting anything out fo this log except noise.
2019-11-19 21:26:40 -03:00
ReinUsesLisp bc10714dcf
gl_shader_gen: Apply default value to gl_Position
Nvidia has sane default output values for varyings, but the other
vendors don't apply these. To properly emulate this we would have to
analyze the shader header. For the time being, apply the same default
Nvidia applies so we get the same behaviour on non-Nvidia drivers.
2019-11-19 20:32:01 -03:00
bunnei b0819e2ffb
Merge pull request #3086 from ReinUsesLisp/format-lookups
texture_cache: Use a flat table instead of switch for texture format lookups
2019-11-19 18:29:17 -05:00
Fernando Sahmkow c8473f399e Shader_IR: Address Feedback 2019-11-18 07:34:34 -04:00
bunnei a8295d2c53
Merge pull request #3047 from ReinUsesLisp/clip-control
gl_rasterizer: Emulate viewport flipping with ARB_clip_control
2019-11-15 12:09:19 -05:00
ReinUsesLisp 4681381a34
format_lookup_table: Address feedback
format_lookup_table: Drop bitfields

format_lookup_table: Use std::array for definition table

format_lookup_table: Include <limits> instead of <numeric>
2019-11-14 20:57:30 -03:00
ReinUsesLisp 80eacdf89b
texture_cache: Use a table instead of switch for texture formats
Use a large flat array to look up texture formats. This allows us to
properly implement formats with different component types. It should
also be faster.
2019-11-14 20:57:10 -03:00
ReinUsesLisp 48a1687f51
texture_cache: Drop abstracted ComponentType
Abstracted ComponentType was not being used in a meaningful way.
This commit drops its usage.

There is one place where it was being used to test compatibility between
two cached surfaces, but this one is implied in the pixel format.
Removing the component type test doesn't change the behaviour.
2019-11-14 18:21:42 -03:00
greggameplayer c6bc13d0aa correct the implementation of RGBA16UI 2019-11-14 21:37:39 +01:00
Fernando Sahmkow cd0f5dfc17 Shader_IR: Implement TXD instruction. 2019-11-14 11:15:27 -04:00
Fernando Sahmkow f3d1b370aa Shader_IR: Implement FLO instruction. 2019-11-14 11:15:27 -04:00
Fernando Sahmkow 95137a04e1 Shader_Bytecode: Add encodings for FLO, SHF and TXD 2019-11-14 11:15:26 -04:00
Fernando Sahmkow b6f6733131
Merge pull request #3081 from ReinUsesLisp/fswzadd-shuffles
shader: Implement FSWZADD and reimplement SHFL
2019-11-14 10:27:27 -04:00
ReinUsesLisp 7990220df7
maxwell_3d: Fix stencil_back_func_mask offset
stencil_back_func_mask and stencil_back_mask were misplaced. This commit
addresses that issue.
2019-11-13 16:35:17 -03:00
Rodrigo Locatti cf770a68a5
Merge pull request #3084 from ReinUsesLisp/cast-warnings
video_core: Treat implicit conversions as errors
2019-11-13 02:16:22 -03:00
Rodrigo Locatti fb9418798d
video_core: Enable sign conversion warnings
Enable sign conversion warnings but don't treat them as errors.
2019-11-11 18:00:37 -03:00
bunnei 0fc596de6e
Merge pull request #3082 from ReinUsesLisp/fix-lockers
gl_shader_cache: Fix locker constructors
2019-11-09 13:58:36 -05:00
ReinUsesLisp 18c1cb68fd video_core: Treat implicit conversions as errors 2019-11-08 22:49:39 +00:00
ReinUsesLisp 096f339a2a video_core: Silence implicit conversion warnings 2019-11-08 22:48:50 +00:00
bunnei a056d8de16
Merge pull request #3080 from FernandoS27/glsl-fix
GLSLDecompiler: Correct Texture Gather Offset.
2019-11-08 15:56:29 -05:00
ReinUsesLisp bfa973a62b
gl_shader_cache: Fix locker constructors
Properly pass engine when a shader is being constructed from memory.
2019-11-07 20:43:31 -03:00
ReinUsesLisp 3ab0514698
gl_shader_cache: Enable extensions only when available
Silence GLSL compilation warnings.
2019-11-07 20:08:42 -03:00
ReinUsesLisp cd66395944
gl_shader_decompiler: Add safe fallbacks when ARB_shader_ballot is not available 2019-11-07 20:08:42 -03:00
ReinUsesLisp 56e237d1f9
shader_ir/warp: Implement FSWZADD 2019-11-07 20:08:41 -03:00
ReinUsesLisp 08b2b1080a
gl_shader_decompiler: Reimplement shuffles with platform agnostic intrinsics 2019-11-07 20:08:41 -03:00
Fernando Sahmkow 3d7c284e0f GLSLDecompiler: Correct Texture Gather Offset.
This commit corrects the argument ordering in textureGatherOffset.
2019-11-07 11:43:56 -04:00
bunnei b6ae48966d
Merge pull request #3032 from ReinUsesLisp/simplify-control-flow-brx
shader/control_flow: Abstract repeated code chunks in BRX tracking
2019-11-07 01:30:01 -05:00
Morph 0e8a3bf3e5 buffer_cache: Add missing includes (#3079)
`boost::make_iterator_range` is available when `boost/range/iterator_range.hpp` is included.
Also include `boost/icl/interval_map.hpp` and `boost/icl/interval_set.hpp`.
2019-11-07 06:25:53 +00:00
bunnei 344d15f61e
Merge pull request #3070 from ReinUsesLisp/shader-warnings
shader_ir: Reduce severity of warnings
2019-11-07 00:47:24 -05:00
ReinUsesLisp e9d2fad984
gl_rasterizer: Remove front facing hack 2019-11-07 01:52:18 -03:00
ReinUsesLisp f1facaeaef
gl_shader_decompiler: Fix typo "y_negate"->"y_direction" 2019-11-07 01:52:18 -03:00
ReinUsesLisp e2ea0c3e11
gl_shader_manager: Remove unused variable in SetFromRegs 2019-11-07 01:52:18 -03:00
ReinUsesLisp f019817f8f
gl_rasterizer: Emulate viewport flipping with ARB_clip_control
Emulates negative y viewports with ARB_clip_control. This allows us to
more easily emulated pipelines with tessellation and/or geometry shader
stages. It also avoids corrupting games with transform feedbacks and
negative viewports (gl_Position.y was being modified).
2019-11-07 01:52:18 -03:00
Rodrigo Locatti ff5a0f370c
shader/control_flow: Specify constness on caller lambdas
Update src/video_core/shader/control_flow.cpp

Co-Authored-By: Mat M. <mathew1800@gmail.com>

Update src/video_core/shader/control_flow.cpp

Co-Authored-By: Mat M. <mathew1800@gmail.com>

Update src/video_core/shader/control_flow.cpp

Co-Authored-By: Mat M. <mathew1800@gmail.com>

Update src/video_core/shader/control_flow.cpp

Co-Authored-By: Mat M. <mathew1800@gmail.com>

Update src/video_core/shader/control_flow.cpp

Co-Authored-By: Mat M. <mathew1800@gmail.com>

Update src/video_core/shader/control_flow.cpp

Co-Authored-By: Mat M. <mathew1800@gmail.com>
2019-11-07 01:44:09 -03:00
ReinUsesLisp 7b069252f8
shader/control_flow: Use callable template instead of std::function 2019-11-07 01:44:08 -03:00
ReinUsesLisp 46c3047283
shader/control_flow: Abstract repeated code chunks in BRX tracking
Remove copied and pasted for cycles into a common templated function.
2019-11-07 01:44:08 -03:00
ReinUsesLisp ae7dfa93be
shader/control_flow: Silence Intellisense cast warnings 2019-11-07 01:44:08 -03:00
ReinUsesLisp deb1b54eed
shader/control_flow: Remove brace initializer in std containers
These containers have a default constructor.
2019-11-07 01:44:08 -03:00
ReinUsesLisp 39c66abd91
shader/decode: Reduce severity of arithmetic rounding warnings 2019-11-07 01:43:38 -03:00
ReinUsesLisp c4374d0d41
shader/arithmetic: Reduce RRO stub severity 2019-11-07 01:43:38 -03:00
ReinUsesLisp 35d40b74b3
shader/texture: Remove NODEP warnings
These warnings don't offer meaningful information while decoding
shaders. Remove them.
2019-11-07 01:43:38 -03:00
bunnei 468576284d
Merge pull request #3057 from ReinUsesLisp/buffer-sub-data
gl_rasterizer: Upload constant buffers with glNamedBufferSubData
2019-11-06 10:08:55 -05:00
Rodrigo Locatti 654b77d2ec
Merge pull request #3039 from ReinUsesLisp/cleanup-samplers
shader/node: Unpack bindless texture encoding
2019-11-06 04:54:11 +00:00
bunnei 21e07df7b7
Merge pull request #2914 from FernandoS27/fermi-fix
Fermi2D: limit blit area to only available area
2019-11-05 20:45:24 -05:00
bunnei 1bdae0fe29 common_func: Use std::array for INSERT_PADDING_* macros.
- Zero initialization here is useful for determinism.
2019-11-03 22:22:41 -05:00
ReinUsesLisp 442a1cc021
gl_rasterizer: Re-enable stream buffer memory due to global memory
Global memory is still using the stream buffer when it shouldn't. As a
temporary fix re-enable the stream buffer on compute.
2019-11-02 13:19:19 -03:00
ReinUsesLisp 76ca2a5f82
gl_rasterizer: Upload constant buffers with glNamedBufferSubData
Nvidia's OpenGL driver maps gl(Named)BufferSubData with some requirements
to a fast. This path has an extra memcpy but updates the buffer without
orphaning or waiting for previous calls. It can be seen as a better
model for "push constants" that can upload a whole UBO instead of 256
bytes.

This path has some requirements established here:
http://on-demand.gputechconf.com/gtc/2014/presentations/S4379-opengl-44-scene-rendering-techniques.pdf#page=24

Instead of using the stream buffer, this commits moves constant buffers
uploads to calls of glNamedBufferSubData and from my testing it brings a
performance improvement. This is disabled when the vendor is not Nvidia
since it brings performance regressions.
2019-11-02 05:05:34 -03:00
Fernando Sahmkow 23cabc98db Shader_IR: Fix regression on TLD4
Originally on the last commit I thought TLD4 acted the same as TLD4S and 
didn't have a mask. It actually does have a component mask. This commit 
corrects that.
2019-10-30 21:14:57 -04:00
Rodrigo Locatti 658489ebf7
Merge pull request #3050 from FernandoS27/fix-tld4
shader_ir: Fix TLD4 and add bindless variant
2019-10-30 18:37:17 +00:00
Fernando Sahmkow 9293c3a0f2 Shader_IR: Fix TLD4 and add Bindless Variant.
This commit fixes an issue where not all 4 results of tld4 were being
written, the color component was defaulted to red, among other things.
It also implements the bindless variant.
2019-10-30 12:02:03 -04:00
bunnei 2382bbe3ac
Merge pull request #3046 from ReinUsesLisp/clean-gl-state
gl_state: Miscellaneous clean up
2019-10-29 22:50:04 -04:00
bunnei b5138f3c35
Merge pull request #3035 from ReinUsesLisp/rasterizer-accelerated
rasterizer_accelerated: Add intermediary for GPU rasterizers
2019-10-29 22:06:41 -04:00
Rodrigo Locatti 3d0cde6a75
gl_state: Use std::array::fill instead of std::fill
Co-Authored-By: Mat M. <mathew1800@gmail.com>
2019-10-30 01:30:31 +00:00
ReinUsesLisp ce20ed8e4e
gl_state: Move dirty checks to individual apply calls instead of Apply
This requires removing constness from some methods, but for consistency
it's removed in all methods.
2019-10-29 21:27:25 -03:00
ReinUsesLisp 3c6557c235
gl_state: Remove ApplyDefaultState
OpenGL has defaults values we can trust. Remove these.
2019-10-29 21:27:25 -03:00
ReinUsesLisp d3651b0b82
gl_state: Change SetDefaultViewports to use default constructor 2019-10-29 21:27:24 -03:00
ReinUsesLisp c7698d0bc8
gl_state: Minor style changes 2019-10-29 21:27:24 -03:00
ReinUsesLisp a14d202ac2
gl_state: Remove unused Citra TextureUnits 2019-10-29 21:27:24 -03:00
ReinUsesLisp 28fece8e9b
gl_state: Move initializers from constructor to class declaration 2019-10-29 21:27:23 -03:00
ReinUsesLisp a993df1ee2
shader/node: Unpack bindless texture encoding
Bindless textures were using u64 to pack the buffer and offset from
where they come from. Drop this in favor of separated entries in the
struct.

Remove the usage of std::set in favor of std::list (it's not std::vector
to avoid reference invalidations) for samplers and images.
2019-10-29 20:53:48 -03:00
Rodrigo Locatti 2ec5b55ee3
Merge pull request #3004 from ReinUsesLisp/maxwell3d-cleanup
maxwell_3d: Remove unused entries
2019-10-29 23:46:33 +00:00
Rodrigo Locatti c5d9589942
Merge pull request #3037 from FernandoS27/new-formats
video_core: Implement texture format E5B9G9R9_SHAREDEXP.
2019-10-28 01:36:58 -03:00
ReinUsesLisp fa31e5b868
maxwell_3d/kepler_compute: Remove unused arguments in GetTexture 2019-10-28 00:23:42 -03:00
ReinUsesLisp 538ddd220e
video_core/textures: Remove unused index entry in FullTextureInfo 2019-10-28 00:14:38 -03:00
ReinUsesLisp 961fe4d19b
maxwell_3d: Remove unused method GetStageTextures 2019-10-28 00:14:29 -03:00
Fernando Sahmkow 3f9262195b Video_Core: Implement texture format E5B9G9R9_SHAREDEXP.
This commit implements the E5B9G9R9 Texture format into the general 
system and OpenGL backend.
2019-10-27 16:44:09 -04:00
bunnei 6909b2f0f9
Merge pull request #3034 from ReinUsesLisp/w4244-maxwell3d
maxwell_3d: Silence implicit conversion warnings
2019-10-27 15:08:59 -04:00
ReinUsesLisp 3e469cecc1
maxwell_3d: Silence implicit conversion warnings
While we are at it, unify types for dirty reg pointers.
2019-10-27 15:22:17 -03:00
ReinUsesLisp bd2aff3e26
rasterizer_accelerated: Add intermediary for GPU rasterizers
Add an intermediary class that implements common functions across GPU
accelerated rasterizers. This avoids code repetition on different
backends.
2019-10-27 03:40:08 -03:00
ReinUsesLisp a5aa1bb174
astc: Silence implicit conversion warnings 2019-10-27 03:04:50 -03:00
Rodrigo Locatti 26f3e18c5c
Merge pull request #2976 from FernandoS27/cache-fast-brx-rebased
Implement Fast BRX, fix TXQ and addapt the Shader Cache for it
2019-10-26 16:56:13 -03:00
Fernando Sahmkow be856a38d6 Shader_IR: Address Feedback. 2019-10-26 15:38:30 -04:00
Rodrigo Locatti a0d79085c4
Merge pull request #3027 from lioncash/lookup
shader_ir: Use std::array with std::pair instead of std::unordered_map
2019-10-26 05:49:15 -03:00
Rodrigo Locatti d52598173d
Merge pull request #3013 from FernandoS27/tld4s-fix
Shader_Ir: Fix TLD4S from using a component mask.
2019-10-25 20:06:26 -03:00
Fernando Sahmkow e3afd6595a Shader_IR: Clang format 2019-10-25 09:01:32 -04:00
ReinUsesLisp 78f3e8a757 gl_shader_cache: Implement locker variants invalidation 2019-10-25 09:01:32 -04:00
ReinUsesLisp ec85648af3 gl_shader_disk_cache: Store and load fast BRX 2019-10-25 09:01:31 -04:00
ReinUsesLisp fa2c297f3e const_buffer_locker: Minor style changes 2019-10-25 09:01:31 -04:00
ReinUsesLisp 7b81ba4d8a gl_shader_decompiler: Move entries to a separate function 2019-10-25 09:01:31 -04:00
Fernando Sahmkow 1244f2d368 Shader_IR: Implement Fast BRX and allow multi-branches in the CFG. 2019-10-25 09:01:31 -04:00
Fernando Sahmkow a05120ec0b Shader_IR: Correct typo in Consistent method. 2019-10-25 09:01:30 -04:00
Fernando Sahmkow 33fcec3502 Shader_IR: allow lookup of texture samplers within the shader_ir for instructions that don't provide it 2019-10-25 09:01:30 -04:00
Fernando Sahmkow 8909f52166 Shader_IR: Implement Fast BRX and allow multi-branches in the CFG. 2019-10-25 09:01:30 -04:00
Fernando Sahmkow acd6441134 Shader_Cache: setup connection of ConstBufferLocker 2019-10-25 09:01:29 -04:00
Fernando Sahmkow 1a58f45d76 VideoCore: Unify const buffer accessing along engines and provide ConstBufferLocker class to shaders. 2019-10-25 09:01:29 -04:00
Fernando Sahmkow 2ef696c85a Shader_IR: Implement BRX tracking. 2019-10-25 09:01:29 -04:00
Rodrigo Locatti 5062728669
Merge pull request #3028 from lioncash/constexpr
shader_bytecode: Make Matcher constexpr capable
2019-10-24 15:10:40 -03:00
Lioncash 7fdf991097 shader_bytecode: Make Matcher constexpr capable
Greatly shrinks the amount of generated code for GetDecodeTable().

Collapses an assembly output of 9000+ lines down to ~3621 with Clang,
and 6513 down to ~2616 with GCC, given it's now allowed to construct all
the entries as a sequence of constant data.
2019-10-24 01:10:10 -04:00
Lioncash 382717172e shader_ir: Use std::array with pair instead of unordered_map
Given the overall size of the maps are very small, we can use arrays of
pairs here instead of always heap allocating a new map every time the
functions are called. Given the small size of the maps, the difference
in container lookups are negligible, especially given the entries are
already sorted.
2019-10-24 00:25:38 -04:00
Lioncash 1f5401c89c video_core/shader: Resolve instances of variable shadowing
Silences a few -Wshadow warnings.
2019-10-23 23:00:31 -04:00
Fernando Sahmkow c4a0aa9207
Merge pull request #2995 from ReinUsesLisp/ignore-gmem
shader_ir/memory: Ignore global memory when tracking fails
2019-10-22 13:22:43 -04:00
Fernando Sahmkow 7ecf9f7228
Merge pull request #2983 from lioncash/fallthrough
gl_shader_decompiler/vk_shader_decompiler: Resolve implicit fallthrough cases
2019-10-22 13:16:46 -04:00
Fernando Sahmkow 1509d2ffbd Shader_Ir: Fix TLD4S from using a component mask.
TLD4S always outputs 4 values, the previous code checked a component 
mask and omitted those values that weren't part of it. This commit 
corrects that and makes sure all 4 values are set.
2019-10-22 10:59:07 -04:00