Commit graph

2545 commits

Author SHA1 Message Date
degasus
b35fa222f5 VideoCommon: perf querys by async events 2015-02-22 08:41:15 +01:00
degasus
edbd402101 VideoCommon: bbox by async events 2015-02-22 08:41:15 +01:00
degasus
ad7264da7d VideoCommon: implement swap requests in the full async way 2015-02-22 08:41:15 +01:00
degasus
bc248f8941 VideoCommon: use a new async event system for efb access 2015-02-22 08:41:15 +01:00
Jules Blok
ff4127cf50 VertexShaderManager: Turn the epsilon hack back on for 3D Vision.
The bug is fixed in version 347.52 of the drivers.
2015-02-21 12:09:49 +01:00
Jules Blok
139ad3b2b9 TextureConversionShader: Use a Texture2DArray to match the shader resource view. 2015-02-21 11:50:20 +01:00
Markus Wick
6bbf774507 Merge pull request #2075 from magumagu/titantron-fix
Partially fix WWE12 titantron videos.
2015-02-21 10:09:47 +01:00
Scott Mansell
355be1719e Fix regression with directx when zfreeze=true and ztest=false. 2015-02-21 10:52:29 +13:00
magumagu
074397c12d Explicitly set up AllocateTexture configuration for palette conversion.
No functional change.
2015-02-19 15:57:05 -08:00
magumagu
ddc815dd7a Remove TextureAddress struct. 2015-02-19 15:36:32 -08:00
magumagu
c0a4760f0e Decode EFB copies used as paletted textures.
A number of games make an EFB copy in I4/I8 format, then use it as a
texture in C4/C8 format.  Detect when this happens, and decode the copy on
the GPU using the specified palette.

This has a few advantages: it allows using EFB2Tex for a few more games,
it, it preserves the resolution of scaled EFB copies, and it's probably a
bit faster.

D3D only at the moment, but porting to OpenGL should be straightforward..
2015-02-19 15:09:27 -08:00
magumagu
4cdf9f543f Partially fix WWE12 titantron videos.
The obvious question here is, why does it matter if we round or truncate?
The key is that GC/Wii does fixed-point interpolation, where PC GPUs do
floating-point interpolation. Discarding fractional bits makes the conversion
from floating-point to fixed point give more consistent results.

I'm not confident this is really the right fix, or that my explanation is
completely correct; ideally, we don't want to depend on floating-point
interpolation at all.
2015-02-18 19:41:00 -08:00
mimimi085181
2f8e0c9bb9 Allow multiple texture cache entries for textures at the same address
This is the same trick which is used for Metroid's fonts/texts, but for all textures. If 2 different textures at the same address are loaded during the same frame, create a 2nd entry instead of overwriting the existing one. If the entry was overwritten in this case, there wouldn't be any caching, which results in a big performance drop.

The restriction to textures, which are loaded during the same frame, prevents creating lots of textures when textures are used in the regular way. This restriction is new. Overwriting textures, instead of creating new ones is faster, if the old ones are unlikely to be used again.

Since this would break efb copies, don't do it for efb copies.

Castlevania 3 goes from 80 fps to 115 fps for me.

There might be games that need a higher texture cache accuracy with this, but those games should also see a performance boost from this PR.

Some games, which use paletted textures, which are not efb copies, might be faster now. And also not require a higher texture cache accuracy anymore. (similar sitation as PR https://github.com/dolphin-emu/dolphin/pull/1916)
2015-02-18 23:54:40 +01:00
Ryan Houdek
3aa605236d Merge pull request #2041 from Sonicadvance1/AArch64_vertex_loader
[AArch64] Vertex loader and things
2015-02-17 00:51:51 -06:00
Ryan Houdek
ed008c3a69 [AArch64] Change the vertex loader over to using unscaled loadstores.
In nearly all direct loadstore cases we can use unscaled loadstores.
Still have a fallback in case we hit a situation that we /can't/ do a unscaled loadstore.
2015-02-16 22:03:09 -06:00
Ryan Houdek
b4b03641b3 [AArch64] Implement vertex loader recompiler.
Shows a noticeable reduction in time spent in the vertex loader.
2015-02-16 16:51:32 -06:00
Pierre Bourdon
3500740dd4 Windows AVIDump: support "silent" frame dumping
When enabled, the silent option will avoid popping up dialog boxes for
overwrite confirmation or codec selection. The codec selection defaults to
uncompressed RGB.

This is required for FifoCI on Windows which needs to drive Dolphin from the
command line exclusively.
2015-02-14 23:38:14 +01:00
Markus Wick
405444d4fe Merge pull request #1803 from lioncash/rgb
OnScreenDisplay: Allow for different colored messages
2015-02-14 10:47:47 +01:00
Tillmann Karras
1a52cff1c9 VertexLoaderManager: reset stats properly 2015-02-14 02:30:05 +01:00
Ryan Houdek
15e41c67f8 Change RunVertices' function arguments.
This reduces some dumb state shuffling when calling the emitted vertex loaders.
2015-02-13 12:16:06 -06:00
degasus
c404e87226 ShaderGen: Fix pixel offset correction
We want to move the vertex by 1/12 pixel, but the old code
did miss the perspective division. So by multiplying with pos.w,
the position is moved correctly after the perspective division.
2015-02-11 20:54:15 +01:00
magumagu
d9988ee9b5 Merge pull request #1987 from magumagu/thread-safety
Cleanup usage of atomic/threadsafe functions
2015-02-10 13:48:12 -08:00
Lioncash
9d5c6c55fe OnScreenDisplay: Allow for different colored messages 2015-02-07 17:35:21 -05:00
Lioncash
e07679114b Use emplace_* functions where in-place construction is preferable 2015-02-04 11:39:08 -05:00
magumagu
57d94de2ad Fix regression for D3D EFB depth copies.
On D3D, we read from the depth buffer using the format
DXGI_FORMAT_R24_UNORM_X8_TYPELESS (essentially, the "r" component contains
the depth, and the other components contain nothing).
2015-02-03 11:27:27 -08:00
Markus Wick
3c475b91ea Merge pull request #1993 from Armada651/line-perspective
GeometryShaderGen: Perspective divide the line coordinates before comparing the angle.
2015-01-31 23:45:54 +01:00
Jules Blok
8c55ec0d51 GeometryShaderGen: Perspective divide the line coordinates before comparing the angle. 2015-01-31 23:32:23 +01:00
Tillmann Karras
1aac65f988 VertexLoaderManager: assimilate GetVertexSize() 2015-01-31 09:23:50 +01:00
magumagu
47be9d8e6b Clean up usage of ScheduleEvent_Threadsafe. 2015-01-30 14:48:23 -08:00
degasus
20628b6e5d OpcodeDecoder: Calculate decoding time for vertices 2015-01-29 19:55:28 +01:00
Gabriel Corona
a4adfe194a JitRegister: overload Register with a [start,end) variant 2015-01-28 09:50:19 +01:00
Markus Wick
beaa9905a6 Merge pull request #1966 from magumagu/unify-efb-encode
Unify EFB encoding shader generation
2015-01-27 23:14:18 +01:00
Markus Wick
43605f8716 Merge pull request #1948 from magumagu/remove-efb-cache
Remove EFB to RAM cache, and simplify code.
2015-01-27 09:42:15 +01:00
Tillmann Karras
3dbd6cd384 VertexLoaderX64: save XMM0 if the ABI requires it 2015-01-26 22:24:06 +01:00
Tillmann Karras
8416a86b6d VertexLoaderBase: fix crash on invalid formats 2015-01-26 22:24:06 +01:00
Tillmann Karras
66f28707e7 VertexLoader: small clean up 2015-01-26 22:24:06 +01:00
magumagu
b56025e6eb Don't use boolean negation. 2015-01-25 23:28:59 -08:00
magumagu
92189823f3 Fix RGBA8 encoding. 2015-01-25 22:53:30 -08:00
magumagu
b0b99b6922 Fix shader so it's possible to use with D3D Map().
Well, that's not strictly true, but trying to memcpy between two buffers
using different row lengths and different strides is at minimum extremely
unintuitive.
2015-01-25 19:57:09 -08:00
magumagu
6c1bdfe04c More work. 2015-01-25 19:57:07 -08:00
magumagu
ef75f3005d WIP. 2015-01-25 15:49:35 -08:00
Jules Blok
5c4ee2f71e PostProcessing: Move default pixel shader to PostProcessingShaderConfiguration.
Reduces code complexity and fixes a bug where the shader is not properly invalidated.
2015-01-25 23:08:49 +01:00
Jules Blok
262c3b19ec PostProcessing: Add support for user-supplied anaglyph shaders.
There are lots of different anaglyph glasses out there and there may be even more creative uses for stereoscopic post-processing shaders.
2015-01-25 22:07:03 +01:00
Scott Mansell
61215e7180 Fix a buffer underrun in CalculateZSlope. 2015-01-25 20:31:20 +13:00
Lioncash
9cdfe889af Coding style cleanup from the zfreeze merge 2015-01-24 15:16:48 -05:00
Markus Wick
ae514cb0f2 Merge pull request #1955 from degasus/master
TexCache: Rewrite the texID generation for paletted textures
2015-01-24 15:37:25 +01:00
degasus
51990fcdfa TexCache: Rewrite the texID generation for paletted textures
This changes the behavior if both texture are available. The old code did
try to load the modfied texID, the new code tries the unmodified texID first.
2015-01-24 13:58:20 +01:00
Markus Wick
4f6d0049a7 Merge pull request #1951 from Sonicadvance1/Remove_old_defines
Remove an old GLES define that I missed.
2015-01-24 13:38:26 +01:00
Tony Wasserka
43036af944 Merge pull request #1812 from phire/real_zfreeze
Add proper zfreeze support.
2015-01-24 13:29:57 +01:00
Scott Mansell
14baf038e7 Stop doing nastly shit to OpenGL stream buffers.
Instead we keep the loaded vertices in CPU memory.
2015-01-24 14:41:51 +13:00
Ryan Houdek
189528171b Remove an old GLES define that I missed. 2015-01-23 14:30:23 -06:00
magumagu
6659c15bed Remove EFB to RAM cache, and simplify code. 2015-01-23 10:48:15 -08:00
Scott Mansell
5510c86b81 Move Zfreeze code out individual backends into videoCommon
Also:
 * Implement support for per-vertex PosMatrixIndex
 * Only update zslope constant once when zfreeze is activated.
 * Added a bunch of comments.
2015-01-24 03:22:27 +13:00
Scott Mansell
daf760b202 A few small cleanups based on code review. 2015-01-23 04:38:36 +13:00
Scott Mansell
e88c02dece Ensure that ZSlopes save/restore state correctly.
Had to re-do *ShaderManager so they saved their constant arrays
instead of completly rebuilding them on restore state.
2015-01-23 03:32:31 +13:00
Scott Mansell
128d303656 Reduce number of divisions in screenspace transform.
This is closer to what the hardware does anyway.
2015-01-23 03:32:31 +13:00
NanoByte011
add59b3bea Fixes Mario Tennis Gimmick Courts and adds support for FastDepthCalc
- Calculate ZSlope every flush but only set PixelShader Constant on Reset Buffer when zfreeze
- Fixed another Pixel Shader bug in D3D that was giving me grief
2015-01-23 03:32:31 +13:00
Scott Mansell
6d5065c58d Fix pixelshader constant offsets. 2015-01-23 03:32:31 +13:00
Scott Mansell
88c7afd315 Make zfreeze use screenspace coordinates independant of IR.
OpenGL requires the y coordinates to be flipped.

Also refactored PixelGen code to remove duplicate code.
2015-01-23 03:32:31 +13:00
Scott Mansell
418296961c Fix various issues with zfreeze implemntation.
Results are still not correct, but things are getting closer.

 * Don't cull CULLALL primitives so early so they can be used as reference
        planes.
 * Convert CalculateZSlope to screenspace coordinates.
 * Convert Pixelshader to screenspace coordinates (instead of worldspace
        xy coordinates, which is totally wrong)
 * Divide depth by 2^24 instead of clamping to 0.0-1.0 as was done
        before.

Progress:
 * Rouge Squadron 2/3 appear correct in game (videos in rs2 save file
         selection are missing)
 * Shadows draw 100% correctly in NHL 2003.
 * Mario golf menu renders correctly.
 * NFS: HP2, shadows sometimes render on top of car or below the road.
 * Mario Tennis, courts and shadows render correctly, but at wrong depth
 * Blood Omen 2, doesn't work.
2015-01-23 03:32:31 +13:00
NanoByte011
613781c765 Cleanup and refactor of zfreeze port
Based on the feedback from pull request #1767 I have put in most of
degasus's suggestions in here now.

I think we have a real winner here as moving the code to
VertexManagerBase for a function has allowed OGL to utilize zfreeze now
:)

Correct use of the vertex pointer has also corrected most of the issue
found in pull request #1767 that JMC47 stated.  Which also for me now
has Mario Tennis working with no polygon spikes on the characters
anymore!  Shadows are still an issue and probably in the other games
with shadow problems.  Rebel Strike also seems better but random skybox
glitches can show up.
2015-01-23 03:32:31 +13:00
NanoByte011
937844b9e3 Initial port of zfreeze branch (3.5-1729)
Initial port of original zfreeze branch (3.5-1729) by neobrain into
most recent build of Dolphin.

Makes Rogue Squadron 2 very playable at full speed thanks to recent core
speedups made to Dolphin. Works on DirectX Video plugin only for now.

Enjoy!  and Merry Xmas!!
2015-01-23 03:31:54 +13:00
skidau
d27bd9d291 Merge pull request #1885 from degasus/custom_texture
CustomTexture: new name format
2015-01-23 00:43:39 +11:00
NanoByte011
0a9257ad37 Cleaned up whitespace
Fixed Directional Attenuation (assumed, data was light dir vector already, but it was not!)
2015-01-21 22:30:41 -07:00
NanoByte011
f475e367f2 Lighting Attenuation Fixes 2015-01-21 15:55:32 -07:00
degasus
7cf4dd63e4 CustomTexture: fix texture format 2015-01-21 23:33:42 +01:00
degasus
1d0557a5e6 CustomTexture: use xxhash 2015-01-21 21:47:18 +01:00
degasus
84c8645d22 CustomTexture: Convert old format automatically 2015-01-21 21:22:55 +01:00
degasus
f9ced4eb13 CustomTexture: also support the legacy format 2015-01-21 21:22:55 +01:00
degasus
62402efa6c CustomTexture: Mark textures with mipmaps 2015-01-21 21:22:55 +01:00
degasus
ee9d05d67f CustomTexture: Use another file name with wildcards 2015-01-21 21:22:55 +01:00
degasus
a353ead3cb CustomTexture: Use always safe texture hash 2015-01-21 21:22:55 +01:00
degasus
eeaad06a07 CustomTexture: check for min/max index on paletted textures 2015-01-21 21:22:55 +01:00
Ryan Houdek
1c62c2f935 Merge pull request #1924 from degasus/xxhash
VideoCommon: xxhash
2015-01-21 14:19:35 -06:00
Ryan Houdek
80e6367e46 Merge pull request #1869 from Stevoisiak/GeneralConsistency
Minor consistency changes
2015-01-21 13:46:53 -06:00
Ryan Houdek
50d495b581 Merge pull request #1916 from mimimi085181/master
Make efb to texture less broken for paletted textures that are efb copies
2015-01-21 13:40:36 -06:00
Ryan Houdek
7fba4856ce Merge pull request #1931 from Sonicadvance1/Fix_PP_Config
Fix the Post Processing shader configuration dialog.
2015-01-21 13:29:01 -06:00
degasus
402fb4bd20 xxhash: Add cmake + VS files
Based on riking's PR.
2015-01-21 07:35:34 +01:00
Ryan Houdek
d348bfea46 Fix the Post Processing shader configuration dialog.
On locales that don't use period as a separator this would break us.
For vector values in a configuration, we use comma as a separator which causes the configuration to balloon to massive sizes due to never saving them
correctly. Loading would then break since it would load a million configuration options.
Fixes issue #7569.
2015-01-20 16:40:46 -06:00
Jules Blok
f40cd04a29 PixelShaderGen: Fix uninitialized variables. 2015-01-20 23:15:01 +01:00
Tillmann Karras
1dcf49237b VertexLoaderX64: support VAT.ByteDequant=0 2015-01-20 09:23:15 +01:00
Tillmann Karras
46ab5d63d6 VertexLoader: never reset alpha in 8888 colors
Fixes the opening menu of Xenoblade Chronicles.
2015-01-20 09:22:55 +01:00
Tillmann Karras
80617ec6bd VertexLoader: remove weird line 2015-01-20 01:53:52 +01:00
Tillmann Karras
873902b4a3 VertexLoader: remove non-JIT SSE code 2015-01-20 01:51:07 +01:00
Markus Wick
0d0f7ec662 Merge pull request #1894 from Armada651/exclusive-fix
D3D: Fix Dolphin immediately exiting exclusive fullscreen.
2015-01-19 23:29:43 +01:00
Jules Blok
332d5888eb VideoConfig: Add exclusive mode flag.
Allows the UI to easily check the current exclusive mode state.
This simplifies a few checks and prevents the user from ever getting stuck in fullscreen.
2015-01-19 22:55:21 +01:00
Tillmann Karras
804341d4fe VertexLoader: fix position offset bug 2015-01-19 17:38:40 +01:00
Tillmann Karras
4b323096ec VertexLoader_Position: remove old JIT ideas 2015-01-19 17:36:24 +01:00
Ryan Houdek
7e64869185 Merge pull request #1887 from Tilka/vertex_loader_jit
VertexLoader: rewrite x64 JIT
2015-01-18 19:48:14 -06:00
mimimi085181
0d3343d093 Make efb to texture less broken for paletted textures that are efb copies
Don't change the texID depending on the tlut_hash for paletted textures that are efb copies and don't have an entry in the cache for texID ^ tlut_hash. This makes those textures less broken when using efb to texture.

This is not really fixing those textures, but it's a step forward. The mini map in Twilight Princess for example is in grayscales with this and is more or less usable.
2015-01-19 01:31:41 +01:00
Tillmann Karras
d3f49097c5 VertexLoaderX64: register symbol for code page 2015-01-18 23:20:44 +01:00
degasus
9f13a77799 TexCache: don't try to aggressive reuse the entry
As we pool them now, freeing and reallocating them is quite fast.
2015-01-18 19:58:33 +01:00
degasus
8565f02699 TexCache: use an unordered_multimap for the tex pool 2015-01-18 19:58:33 +01:00
degasus
4639d3b1bc TexCache: also incude textures within the render target pool 2015-01-18 19:47:48 +01:00
degasus
6cd6e6546f TexCache: merge texture and rendertarget factory function 2015-01-18 19:47:48 +01:00
degasus
615ae9f106 TexCache: remove PC_TexFormat
We only support rgba32 for a while now, so there is no need to have everything in common configureable.
2015-01-18 19:47:48 +01:00
Tillmann Karras
bc5cf10ad5 VertexLoaderX64: optimize color conversions 2015-01-18 17:47:18 +01:00
Tillmann Karras
7d0cff05e9 VertexLoaderX64: make table lookup deterministic 2015-01-18 16:22:21 +01:00
Tillmann Karras
1855d56f1a VertexLoaderX64: fix a bunch of stuff
Suggestions by @degasus and @FioraAeterna.
2015-01-18 13:31:28 +01:00
Tillmann Karras
dc01e261d1 VertexLoaderX64: fix duplicate register allocation
Thanks to @shuffle2 for noticing this.
2015-01-18 13:30:21 +01:00