Commit graph

1796 commits

Author SHA1 Message Date
Subv
fdb199290b GPU/DMA: Partially implemented the 'enable_2d' bit in the DMA engine.
When not set, this tells the GPU to only use the X size when performing a DMA copy.
This is only implemented for linear->linear and tiled->tiled copies. Conversion copies still retain the assert.

This bit is unset by some games for various purposes, and by nouveau when copying the vertex buffers.
2018-09-08 16:02:16 -05:00
bunnei
8b08cb925b gl_rasterizer: Use baseInstance instead of moving the buffer points.
This hopefully helps our cache not to redundant upload the vertex buffer.

# Conflicts:
#	src/video_core/renderer_opengl/gl_rasterizer.cpp
2018-09-08 04:05:56 -04:00
Patrick Elsässer
a8974f0556 video_core: Arithmetic overflow warning fix for gl_rasterizer (#1262)
* video_core: Arithmetic overflow fix for gl_rasterizer

- Fixed warnings, which were indicating incorrect behavior from integral
promotion rules and types larger than those in which arithmetic is
typically performed.

- Added const for variables where possible and meaningful.

* Changed the casts from C to C++ style

Changed the C-style casts to C++ casts as proposed.
Took also care about signed / unsigned behaviour.
2018-09-08 02:59:59 -04:00
bunnei
9947c6ad59
Merge pull request #1252 from lioncash/header
video_core/CMakeLists: Add missing gl_buffer_cache.h
2018-09-06 19:19:43 -04:00
bunnei
9b50dca2bb
Merge pull request #1253 from lioncash/explicit
video_core/gl_buffer_cache: Minor tidying changes
2018-09-06 19:19:35 -04:00
bunnei
009a2cc9cc
Merge pull request #1255 from bunnei/minor-opt
gl_rasterizer: Call state.Apply only once on SetupShaders.
2018-09-06 19:19:16 -04:00
bunnei
820f646458 gl_rasterizer: Call state.Apply only once on SetupShaders. 2018-09-06 17:41:53 -04:00
bunnei
948f6c0738 gl_shader_decompiler: Implement saturate mode for IPA. 2018-09-06 17:40:03 -04:00
Lioncash
ddcdbce067 gl_buffer_cache: Default initialize member variables
Ensures that the cache always has a deterministic initial state.
2018-09-06 15:07:15 -04:00
Lioncash
8d685a29bc gl_buffer_cache: Make GetHandle() a const member function
GetHandle() internally calls GetHandle() on the stream_buffer instance,
which is a const member function, so this can be made const as well.
2018-09-06 15:07:15 -04:00
Lioncash
14230fe2af gl_buffer_cache: Remove unnecessary includes 2018-09-06 15:05:52 -04:00
Lioncash
68296d9474 gl_buffer_cache: Make constructor explicit
Implicit conversions during construction isn't desirable here.
2018-09-06 14:54:49 -04:00
Lioncash
8f4e09ba07 video_core/CMakeLists: Add missing gl_buffer_cache.h
Without this, the header file won't show up by default within IDEs such
as Visual Studio.
2018-09-06 14:49:51 -04:00
Markus Wick
a781042700 gl_shader_gen: Initialize position.
IMO the old code is fine, but nvidia raises shader compiler warnings.
Trivial fix through...
2018-09-06 13:37:50 +02:00
bunnei
77554ac773
Merge pull request #1243 from degasus/VAO_cache
gl_rasterizer: Implement a VAO cache.
2018-09-05 22:50:52 -04:00
bunnei
6f09c5b128
Merge pull request #1244 from FernandoS27/ipa
shader_decompiler: Implemented IPA Properly (Stage 1)
2018-09-05 21:20:40 -04:00
FernandoS27
e63b229f4a Implemented IPA Properly 2018-09-05 20:15:47 -04:00
Markus Wick
7f15306f78 gl_rasterizer: Skip TODO log.
This is called ~3k times per frame in SMO ingame.
My laptop spends ~3ms per frame on allocating and freeing this string.

Let's just stop printing this kind of redundant information.
2018-09-05 20:20:20 +02:00
Markus Wick
d3ad9469a1 gl_rasterizer: Implement a VAO cache.
This patch caches VAO objects instead of re-emiting all pointers per draw call.
Configuring this pointers is known as a fast task, but it yields too many GL
calls. So for better performance, just bind the VAO instead of 16 pointers.
2018-09-05 18:46:35 +02:00
Markus Wick
50a806ea67 renderer_opengl: Implement a buffer cache.
The idea of this cache is to avoid redundant uploads. So we are going
to cache the uploaded buffers within the stream_buffer and just reuse
the old pointers.
The next step is to implement a VBO cache on GPU memory, but for now,
I want to check the overhead of the cache management. Fetching the
buffer over PCI-E should be quite fast.
2018-09-05 08:03:50 +02:00
Markus Wick
99a71580c4 gl_shader_cache: Use an u32 for the binding point cache.
The std::string generation with its malloc and free requirement
was a noticeable overhead. Also switch to an ordered_map to
avoid the std::hash call. As those maps usually have a size of
two elements, the lookup time shall not matter.
2018-09-04 21:04:41 +02:00
bunnei
9a07e9f805
Merge pull request #1237 from degasus/optimizations
Optimizations
2018-09-04 12:16:06 -04:00
bunnei
26e96d16d0
Merge pull request #1232 from lioncash/copy
gl_shader_decompiler: Use used_shaders member variable directly within GenerateDeclarations()
2018-09-04 11:52:25 -04:00
Markus Wick
2081ed7db2 command_processor: Use std::array for bound_engines.
subchannel is a 3 bit field. So there must not be more than 8 bound engines.
And using a hashmap for up to 8 values is a bit overpowered.
2018-09-04 14:10:05 +02:00
Markus Wick
10bc725944 Update microprofile scopes.
Blame the subsystems which deserve the blame :)

The updated list is not complete, just the ones I've spotted on random sampling the stack trace.
2018-09-04 11:04:26 +02:00
Lioncash
18a89931a9 gl_shader_decompiler: Use used_shaders member variable directly within GenerateDeclarations()
Using the getter function intended for external code here makes an
unnecessary copy of the already-accessible used_shaders vector.
2018-09-02 13:10:11 -04:00
bunnei
325f3e0693
Merge pull request #1213 from DarkLordZach/octopath-fs
filesystem/maxwell_3d: Various changes to boot Project Octopath Traveller
2018-09-02 10:49:18 -04:00
bunnei
89be49d2f3
Merge pull request #1215 from ogniK5377/texs-nodep-assert
Added assert for TEXS nodep
2018-09-02 10:48:27 -04:00
bunnei
177c45e97d
Merge pull request #1214 from ogniK5377/ipa-assert
Added better asserts to IPA, Renamed IPA modes to match mesa
2018-09-02 10:44:43 -04:00
bunnei
9c206fe94d
Merge pull request #1216 from ogniK5377/ffma-assert
Added FFMA asserts and missing fields
2018-09-02 10:44:13 -04:00
David Marcec
60754b4728 Removed saturate assert
Unneeded as we already implement it
2018-09-01 19:33:32 +10:00
David Marcec
2edab4e840 Removed saturate assert
Saturate already implemented
2018-09-01 19:29:20 +10:00
David Marcec
2bc6abb9a1 Changed tab5980_0 default from 0 -> 1 2018-09-01 19:15:03 +10:00
David Marcec
6f8ed9508d Added FMUL asserts 2018-09-01 19:05:10 +10:00
David Marcec
b89fc407d7 Added FFMA asserts 2018-09-01 18:45:14 +10:00
David Marcec
948bc87a59 Added assert for TEXS nodep 2018-09-01 17:00:01 +10:00
David Marcec
ad3dca7e62 Added better asserts to IPA, Renamed IPA modes to match mesa
IpaMode is changed to IpaInterpMode
IpaMode is suppose to be 2 bits not 3
Added IpaSampleMode
Added Saturate

Renamed modes based on
d27c791891/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp (L2530)
2018-09-01 16:34:27 +10:00
Zach Hilman
f32e28c7b8 maxwell_3d: Use CoreTiming for query timestamp 2018-08-31 23:25:18 -04:00
Lioncash
4a587b81b2 core/core: Replace includes with forward declarations where applicable
The follow-up to e2457418da, which
replaces most of the includes in the core header with forward declarations.

This makes it so that if any of the headers the core header was
previously including change, then no one will need to rebuild the bulk
of the core, due to core.h being quite a prevalent inclusion.

This should make turnaround for changes much faster for developers.
2018-08-31 16:30:14 -04:00
bunnei
7f7eb29323 gl_rasterizer_cache: Use accurate framebuffer setting for accurate copies. 2018-08-31 13:07:28 -04:00
bunnei
123c065086 gl_rasterizer_cache: Also use reserve cache for RecreateSurface. 2018-08-31 13:07:28 -04:00
bunnei
9bc71fcc5f rasterizer_cache: Use boost::interval_map for a more accurate cache. 2018-08-31 13:07:28 -04:00
bunnei
d647d9550c gl_renderer: Cache textures, framebuffers, and shaders based on CPU address. 2018-08-31 13:07:27 -04:00
bunnei
16d65182f9 gl_rasterizer: Fix issues with the rasterizer cache.
- Use a single cached page map.
- Fix calculation of ending page.
2018-08-31 13:07:27 -04:00
greggameplayer
06578e89b2 Implement BC6H_UF16 & BC6H_SF16 (#1092)
* Implement BC6H_UF16 & BC6H_SF16
Require by ARMS

* correct coding style

* correct coding style part 2
2018-08-31 12:11:19 -04:00
bunnei
f08d24e9c0
Merge pull request #1204 from lioncash/pimpl
core: Make the main System class use the PImpl idiom
2018-08-31 11:31:20 -04:00
bunnei
6683bf50b5
Merge pull request #1207 from degasus/hotfix
Report correct shader size.
2018-08-31 11:21:15 -04:00
Lioncash
e2457418da core: Make the main System class use the PImpl idiom
core.h is kind of a massive header in terms what it includes within
itself. It includes VFS utilities, kernel headers, file_sys header,
ARM-related headers, etc. This means that changing anything in the
headers included by core.h essentially requires you to rebuild almost
all of core.

Instead, we can modify the System class to use the PImpl idiom, which
allows us to move all of those headers to the cpp file and forward
declare the bulk of the types that would otherwise be included, reducing
compile times. This change specifically only performs the PImpl portion.
2018-08-31 07:16:57 -04:00
Markus Wick
5be8b7a362 Report correct shader size.
Seems like this was an oversee in regards to 1fd979f50a
It changed GLShader::ProgramCode to a std::vector, so sizeof is wrong.
2018-08-31 09:56:37 +02:00
Hexagon12
d626bc8c62 Added predicate comparison GreaterEqualWithNan 2018-08-31 10:40:18 +03:00