
66 commits

Author SHA1 Message Date
Fiora
e3578683e3 Vertex loader: optimize texmtx_write_float4
Seems to be pretty high in the profile in some geometry-heavy games like The
Last Story, and the compiler-generated assembly is terrifyingly bad, so
SSE-ize it.
2014-12-03 11:17:05 -08:00
Fiora
7acd5eba17 Vertex loader: use ABI_CallFunction
Should result in faster/shorter code sequences on platforms where generated
code is close enough to the code segment (e.g. Windows).
2014-11-28 20:26:00 -08:00
Fiora
3ddf82a318 Vertex Loader: SSE implementations of more position/texcoord/normal formats
~35-45% faster NFS:HP2, possibly other vertex-bound games.
2014-11-20 02:13:19 -08:00
degasus
c211450b99 OGL: implement bounding box support with ssbo
This implementation tries to be as accurate as the old SW implementation, but it will remove the dependency of our vertexloader on videosw.
2014-11-17 21:20:32 +01:00
Lioncash
884ec2ed13 Host: Kill off Host_SysMessage
Equivalent facilities already exist.
2014-11-05 02:30:48 -05:00
comex
eb7f4dac50 Convert registersInUse to BitSet. 2014-10-25 16:57:25 -04:00
crudelios
d281b4d7e1 Remove setting to enable or disable Bounding Box calculation. 2014-10-15 19:02:54 +01:00
crudelios
176ea06e82 Get buildbot to compile. 2014-10-10 12:28:15 +01:00
crudelios
2d4b7e3f3f Reimplement Bounding Box calculation using the software renderer. 2014-10-10 12:27:06 +01:00
comex
f0131c2e09 Mechanical changes to move most CP state to a struct rather than separate globals.
The next commit will add a separate copy of the struct and the ability
for LoadCPReg to work on it.
2014-09-28 21:23:29 -04:00
comex
90638c6806 Switch to an unordered_map as a micro-optimization. 2014-09-28 21:23:29 -04:00
comex
f8452ff501 Fix threading issue with vertex loader JIT.
VertexLoader::VertexLoader was setting loop_counter, a *static*
variable, to 0.  This was nonsensical, but harmless until I started to
run it on a separate thread, where it had a chance of interfering with a
running vertex translator.

Switch to just using a register for the loop counter.
2014-09-28 21:23:28 -04:00
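The hazard described in this commit can be shown without any threads. The sketch below is a minimal illustration, not Dolphin's code: `SharedCounterLoader`, `ConstructionClobbersCounter`, and `ProcessVertices` are hypothetical names, and a plain C++ local variable stands in for the register the JIT now uses.

```cpp
// Buggy pattern: one counter shared by every instance, reset in the constructor.
class SharedCounterLoader
{
public:
  SharedCounterLoader() { s_loop_counter = 0; }
  static int s_loop_counter;
};
int SharedCounterLoader::s_loop_counter = 0;

// Simulates a loader being constructed while another one is mid-loop.
// Returns true if the in-progress count was overwritten.
bool ConstructionClobbersCounter()
{
  SharedCounterLoader running;
  SharedCounterLoader::s_loop_counter = 5;  // "running" still has 5 vertices left
  SharedCounterLoader other;                // e.g. created from another thread
  (void)running;
  (void)other;
  return SharedCounterLoader::s_loop_counter != 5;
}

// Fixed pattern: the counter is local to the call, so no other instance can touch it.
int ProcessVertices(int count)
{
  int processed = 0;
  for (int remaining = count; remaining > 0; --remaining)
    ++processed;
  return processed;
}
```

On a single thread the constructor reset is harmless because no loop is in flight when it runs; once construction can happen concurrently with a running loop, the shared counter becomes a data race.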
comex
63c62b277d Some changes to VertexLoaderManager:
- Lazily create the native vertex format (which involves GL calls) from
RunVertices rather than RefreshLoader itself, freeing the latter to be
run from the CPU thread (hopefully).

- In order to avoid useless allocations while doing so, store the native
format inside the VertexLoader rather than using a cache entry.

- Wrap the s_vertex_loader_map in a lock, for similar reasons.
2014-09-28 21:23:28 -04:00
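The three points in this commit (lazy creation, storing the native format inside the loader, locking the map) might look roughly like the following. This is a sketch under assumptions: `Loader`, `LoaderMap`, `NativeFormat`, and the integer `stride` are hypothetical stand-ins, and no real GL calls are made.

```cpp
#include <cstdint>
#include <memory>
#include <mutex>
#include <unordered_map>

// Stand-in for an object whose creation would need graphics API calls.
struct NativeFormat
{
  int stride;
};

class Loader
{
public:
  explicit Loader(int stride) : m_stride(stride) {}

  // Intended to be called only from the thread that owns the graphics context.
  // The native format is created on first use and kept inside the loader.
  NativeFormat& GetNativeFormat()
  {
    if (!m_native)
      m_native.reset(new NativeFormat{m_stride});
    return *m_native;
  }

  bool HasNativeFormat() const { return m_native != nullptr; }

private:
  int m_stride;
  std::unique_ptr<NativeFormat> m_native;
};

class LoaderMap
{
public:
  // Safe to call from any thread: only touches the map, never the graphics API.
  // Loaders are heap-allocated and never erased, so the reference stays valid
  // after the lock is released.
  Loader& Refresh(std::uint64_t key, int stride)
  {
    std::lock_guard<std::mutex> guard(m_lock);
    auto it = m_map.find(key);
    if (it == m_map.end())
      it = m_map.emplace(key, std::unique_ptr<Loader>(new Loader(stride))).first;
    return *it->second;
  }

private:
  std::mutex m_lock;
  std::unordered_map<std::uint64_t, std::unique_ptr<Loader>> m_map;
};
```

Keeping the native format as a member of the loader means the lookup path allocates nothing extra; the only allocation happens the first time `GetNativeFormat` is called.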
Rohit Nirmal
fbc64984ca Include CommonTypes.h instead of Common.h. 2014-09-08 15:39:58 -04:00
comex
c5c0b36046 Remove the inaccurately named ABI_PushAllCalleeSavedRegsAndAdjustStack (it didn't preserve FPRs!) and replace with ABI_PushRegistersAndAdjustStack.
To avoid FPRs being pushed unnecessarily, I checked the uses: DSPEmitter
doesn't use FPRs, and VertexLoader doesn't use anything but RAX, so I
specified the register list accordingly.  The regular JIT, however, does
use FPRs, and as far as I can tell, it was incorrect not to save them in
the outer routine.  Since the dispatcher loop is only exited when
pausing or stopping, this should have no noticeable performance impact.
2014-09-08 01:00:10 -04:00
comex
2dafbfb3ef Improve code and clarify parameters to ABI_Push/PopRegistersAndAdjustStack.
- Factor common work into a helper function.
- Replace confusingly named "noProlog" with "rsp_alignment".  Now that
x86 is not supported, we can just specify it explicitly as 8 for
clarity.
- Add the option to include more frame size, which I'll need later.
- Revert a change by magumagu in March which replaced MOVAPD with MOVUPD
on account of 32-bit Windows, since it's no longer supported.  True,
apparently recent processors don't execute the former any faster if the
pointer is, in fact, aligned, but there's no point using MOVUPD for
something that's guaranteed to be aligned...

(I discovered that GenFrsqrte and GenFres were incorrectly passing false
to noProlog - they were, in fact, functions without prologs, the
original meaning of the parameter - which caused the previous change to
break.  This is now fixed.)
2014-09-08 00:58:56 -04:00
Rohit Nirmal
629ceaf2b1 Split some parts of UpdateBoundingBox into multiple lines. Also,
fix issues causing failure on Lint.
2014-09-06 09:49:27 -05:00
Pierre Bourdon
494a60e41b VertexLoader: Change VtxDesc to use u64 instead of u32
This is required to make packing consistent between compilers: with u32, MSVC
would not allocate a bitfield that spans two u32s (it would leave a "hole").
2014-09-01 11:18:02 +02:00
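A small illustration of the packing concern: with a 64-bit underlying type, a field that straddles the 32-bit boundary still lands in the same storage unit. `PackedDesc` below is a hypothetical descriptor, not the real VtxDesc layout. Bit-field layout is implementation-defined, so the bit positions in the comments assume a little-endian x86-64 target with GCC, Clang, or MSVC.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical descriptor: 33 bits of fields, so last_coord occupies
// bits 31..32 and straddles the 32-bit boundary.
struct PackedDesc
{
  std::uint64_t matrix_flags : 9;   // bits 0..8
  std::uint64_t attr_formats : 22;  // bits 9..30
  std::uint64_t last_coord : 2;     // bits 31..32
};

static_assert(sizeof(PackedDesc) == sizeof(std::uint64_t),
              "descriptor must fit in one 64-bit word");

// Reads the object representation without union type punning.
// Only the 33 declared bits are kept; padding bits are unspecified.
std::uint64_t RawBits(const PackedDesc& desc)
{
  std::uint64_t raw;
  std::memcpy(&raw, &desc, sizeof(raw));
  return raw & ((std::uint64_t(1) << 33) - 1);
}
```

With `last_coord` set to 3 and everything else zero, `RawBits` yields `3 << 31` under the stated assumptions, which is the hole-free packing the commit is after.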
Lioncash
4af8d9d248 VideoCommon: Clean up brace placements 2014-08-30 18:06:45 -04:00
Pierre Bourdon
16f180524c VertexLoader: do not prepare for vertices if we need to skip them 2014-08-04 20:47:02 -07:00
Pierre Bourdon
4c42b38de1 Merge pull request #428 from Sonicadvance1/x86_32-removal
Remove x86_32 support from Dolphin.
2014-08-03 21:17:28 -07:00
Ryan Houdek
d9b5482840 Remove x86_32 from VertexLoader. 2014-08-03 13:44:37 -05:00
Pierre Bourdon
6f715a1fbe VertexLoader: Remove more global state dependencies (this time IndexGenerator and VertexManager) 2014-08-02 09:34:39 -07:00
Pierre Bourdon
73f9a22e2e VertexLoader: Remove global state dependency on g_nativeVertexFmt 2014-07-26 01:35:09 +02:00
Pierre Bourdon
78c3a22060 VertexLoader: take the VAT object directly for RunVertices 2014-07-24 01:51:37 +02:00
Pierre Bourdon
069801a7d1 VertexLoader: Simplify SetVAT 2014-07-24 01:25:23 +02:00
degasus
7e79806efc remove unused globals
Also change globals into statics which are only used in one file
2014-07-11 16:10:20 +02:00
degasus
22e1aa5bb4 mark all local functions as static 2014-07-11 16:07:23 +02:00
degasus
bb2fc8ecbb VideoCommon: Cache native vertex formats
We used to have a 1:1 mapping of GX vertex formats and the native (OGL + D3D) ones, but there are far more GX ones.
This new cache maps them directly so that we don't flush on GX vertex format changes as long as the native one doesn't change.

The idea is stolen from galop1n.
2014-07-04 14:39:27 +02:00
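The caching idea in this commit can be sketched as a map keyed on the native declaration rather than on the GX format. Everything here is an assumption made for illustration: `NativeDecl`, `NativeFormat`, `NativeFormatCache`, and the two fields used as the key are hypothetical and far simpler than a real vertex declaration.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <tuple>

// Hypothetical description of a native vertex layout.
struct NativeDecl
{
  int stride;
  unsigned attribute_mask;
};

inline bool operator<(const NativeDecl& a, const NativeDecl& b)
{
  return std::tie(a.stride, a.attribute_mask) < std::tie(b.stride, b.attribute_mask);
}

// Stand-in for the backend object built from a declaration.
struct NativeFormat
{
  NativeDecl decl;
};

class NativeFormatCache
{
public:
  // Returns the cached format for this declaration, creating it on first use.
  // Different GX formats that produce the same declaration share one entry.
  const NativeFormat* Get(const NativeDecl& decl)
  {
    std::unique_ptr<NativeFormat>& slot = m_cache[decl];
    if (!slot)
      slot.reset(new NativeFormat{decl});
    return slot.get();
  }

  std::size_t Size() const { return m_cache.size(); }

private:
  std::map<NativeDecl, std::unique_ptr<NativeFormat>> m_cache;
};
```

A caller can compare the returned pointer with the one currently bound and skip the flush when they match, even if the GX vertex format changed.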
Tillmann Karras
f8280401f6 x64Emitter: J_CC: use 32 bit offset automatically 2014-06-03 23:08:58 +02:00
magumagu
1357277f40 Video backends: mass-replace "xfregs" with "xfmem". 2014-05-16 18:58:07 -07:00
magumagu
8f5342c442 Video backend: merge global var xfmem into xfregs.
There isn't really any reason to keep them separate.
2014-05-16 18:55:31 -07:00
magumagu
818c89313e Video backends: unify xfregs/xfmem structures.
Removes the duplicate swxfregs global variable/struct from the software
backend in favor of the ones from VideoCommon.
2014-05-16 18:55:30 -07:00
Ryan Houdek
2d8cfb89d7 Changes posmtx vertex attribute to integer.
This makes it so we don't need to do some dumb casting from float to integer in our shaders.
Only tested in OpenGL, needs to be tested in D3D.
2014-04-30 19:11:06 -05:00
Tony Wasserka
cdf6172348 Merge pull request #213 from Jezze/vertexloader-cleanups
Vertexloader cleanups
2014-04-10 08:52:36 +02:00
Pierre Bourdon
664c8d30a0 Remove all trailing whitespaces from our codebase. 2014-03-29 11:05:44 +01:00
Jens Nyberg
73176d0333 VideoCommon/VertexLoader: Add more use of std::min and std::max 2014-03-27 00:33:41 +01:00
Jens Nyberg
478a27e052 VideoCommon/VertexLoader: Remove duplicate point min and max calculation 2014-03-27 00:24:48 +01:00
Jens Nyberg
0c62ae9c1a VideoCommon/VertexLoader: Remove NRM enum 2014-03-26 23:56:57 +01:00
Jens Nyberg
4a68550d01 Remove superfluous bit shift 2014-03-18 04:07:45 +01:00
Matthew Parlane
31cfc73a09 Fixes spacing for "for", "while", "switch" and "if"
Also moved && and || to ends of lines instead of start.
Fixed misc vertical alignments and some { needed newlining.
2014-03-11 00:35:07 +13:00
Tillmann Karras
d802d39281 clang-modernize -use-nullptr
and s/\bNULL\b/nullptr/g for *.cpp/h/mm files not compiled on my machine
2014-03-09 21:14:26 +01:00
Ryan Houdek
4f02132f93 Make our architecture defines less stupid.
Our defines never made clear whether they meant 64-bit or x86_64.
This makes a clear cut between bitness and architecture.
This commit also has the side effect of bringing up aarch64 compiling support.
2014-03-04 09:36:59 -06:00
Tillmann Karras
6914eca167 Fix various warnings reported by clang
- mostly remove unused variables
- rename some generic JIT identifiers
2014-02-28 12:28:19 +01:00
Pierre Bourdon
ffe588cc24 Fix more header sorting issues in VideoCommon/ (now check-includes clean). 2014-02-20 01:01:10 +01:00
Lioncash
2afe215271 Convert all includes to relative paths. 2014-02-18 02:19:10 -05:00
Lioncash
3fd87a7636 Second and final pass of clearing out tabs. 2014-02-17 02:19:41 -05:00
Lioncash
6c4ee1753a Fix some vertical alignments
i.e. use spaces for alignment.
2014-02-16 20:12:05 -05:00
Matthew Parlane
32bfcc034f Some tidy up of sprintf to StringFromFormat
Includes a small fix to SetupWiiMemory
2014-02-10 17:25:18 +13:00
Pierre Bourdon
e59f770ccb Revert "Merge pull request #49 from Parlane/sprintf_tidy"
Change broke the build on Debian stable.

This reverts commit 28755439b3, reversing
changes made to 64e01ec763.
2014-02-09 16:14:13 +01:00