~korth/gens-gs-ii.git
4 years ago[libcompat/tests] gtest_main.ogc.inc.cpp: Currently HW_RVL only. libgens-do-MMX-SSE2-audio-video github/libgens-do-MMX-SSE2-audio-video
David Korth [Sun, 30 Aug 2015 21:50:28 +0000 (17:50 -0400)] 
[libcompat/tests] gtest_main.ogc.inc.cpp: Currently HW_RVL only.

4 years ago[libzomg] Metadata::InitProgramMetadata(): Clear strings if the variable is nullptr.
David Korth [Sun, 30 Aug 2015 21:37:24 +0000 (17:37 -0400)] 
[libzomg] Metadata::InitProgramMetadata(): Clear strings if the variable is nullptr.

This was done previously, but I changed it as an optimization. However,
this optimization breaks if the function is called multiple times with
different parameters as nullptr, since the strings won't be cleared.

This partially reverts commit 59433c7106a450a8c811df0992e98d58a32b60b1.
([libgens] config.libgens.h.in: Provide the CMake version information.)
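
A minimal sketch of the clearing behavior described above. The struct, field, and function names here are hypothetical stand-ins, not the actual libzomg API; the point is only that a nullptr argument clears the stored string instead of leaving an older value behind.

```cpp
#include <string>

// Hypothetical stand-in for the program metadata fields.
struct ProgramMetadataSketch {
    std::string creator;
    std::string creatorVersion;
};

// Assign the string when a value is given; clear it when nullptr is passed,
// so a value from an earlier call cannot linger.
inline void setOrClear(std::string &dest, const char *value)
{
    if (value) {
        dest = value;
    } else {
        dest.clear();
    }
}

// Hypothetical initializer: safe to call repeatedly with different nullptr arguments.
inline void initProgramMetadataSketch(ProgramMetadataSketch &md,
                                      const char *creator,
                                      const char *creatorVersion)
{
    setOrClear(md.creator, creator);
    setOrClear(md.creatorVersion, creatorVersion);
}
```

Skipping the assignment when the argument is nullptr would save a little work on the first call, but a second call with a different argument set to nullptr would then keep the stale string.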

4 years ago[libgens] FastBlur.x86.inc.cpp: FIXME: 2FB version is slower on Core 2 T7200 than...
David Korth [Sun, 30 Aug 2015 21:28:31 +0000 (17:28 -0400)] 
[libgens] FastBlur.x86.inc.cpp: FIXME: 2FB version is slower on Core 2 T7200 than the C implementation.

4 years ago[libgens] FastBlur: Disabled the SSE2 code for now.
David Korth [Sun, 30 Aug 2015 21:25:28 +0000 (17:25 -0400)] 
[libgens] FastBlur: Disabled the SSE2 code for now.

It's generally slower than MMX due to unaligned access restrictions.

[libgens/tests] FastBlurTest: Disabled SSE2 tests, since the SSE2
Fast Blur implementation was disabled.

4 years ago[libgens] FastBlur: Added an SSE2-optimized 32-bit Fast Blur.
David Korth [Sun, 30 Aug 2015 21:17:08 +0000 (17:17 -0400)] 
[libgens] FastBlur: Added an SSE2-optimized 32-bit Fast Blur.

1-FB seems slightly faster than MMX, but 2-FB seems slightly slower:

                 | SSE2 1-FB | SSE2 2-FB |
+----------------+-----------+-----------+
| gcc-5.2.0, -Og |    265 ms |    434 ms |
| gcc-5.2.0, -O2 |    255 ms |    398 ms |
+----------------+-----------+-----------+

TODO: Use movdqa+shift instead of movdqu?

4 years ago[libgens] PausedEffect.x86.inc.cpp: Check framebuffer alignment before loading the...
David Korth [Sun, 30 Aug 2015 21:09:21 +0000 (17:09 -0400)] 
[libgens] PausedEffect.x86.inc.cpp: Check framebuffer alignment before loading the constants into xmm registers.

4 years ago[libgens] FastBlur.x86.inc.cpp: Minor formatting changes.
David Korth [Sun, 30 Aug 2015 20:52:33 +0000 (16:52 -0400)] 
[libgens] FastBlur.x86.inc.cpp: Minor formatting changes.

TODO: Figure out how to blur 8px at a time in the 15/16-bit color
version. I tried the same method I used for 32-bit, and it resulted
in a gigantic mess (and the tests failed).

4 years ago[libgens] FastBlur.x86.inc.cpp: 15/16-bit should use psrlw, not psrld.
David Korth [Sun, 30 Aug 2015 20:47:26 +0000 (16:47 -0400)] 
[libgens] FastBlur.x86.inc.cpp: 15/16-bit should use psrlw, not psrld.

No effective difference due to the mask, but psrlw makes more sense,
since we're working in units of 16-bit pixels.
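
To illustrate why the mask hides the difference, here is a scalar sketch that emulates both shifts on one 32-bit lane holding two 16-bit pixels. The RGB565 mask value and the shift-then-mask order are assumptions for illustration; the actual constants in FastBlur.x86.inc.cpp may differ.

```cpp
#include <cstdint>

// Assumed RGB565 "divide by 2" mask: clears the bit shifted into the top of
// each color component (bits 15, 10 and 4 of every 16-bit pixel).
constexpr uint32_t MASK_DIV2_16 = 0x7BEF7BEFu;

// Emulates psrld $1: the whole 32-bit lane is shifted, so the low bit of the
// upper pixel leaks into bit 15 of the lower pixel before masking.
constexpr uint32_t halveAsDword(uint32_t twoPx)
{
    return (twoPx >> 1) & MASK_DIV2_16;
}

// Emulates psrlw $1: each 16-bit lane is shifted independently.
constexpr uint32_t halveAsWords(uint32_t twoPx)
{
    return ((((twoPx >> 16) >> 1) << 16) | ((twoPx & 0xFFFFu) >> 1)) & MASK_DIV2_16;
}
```

The leaked bit lands on bit 15 of the lower pixel, which the mask clears anyway, so both variants agree for every input.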

4 years ago[libgens] FastBlur.x86.inc.cpp: 32-bit color: Blur 4px per iteration.
David Korth [Sun, 30 Aug 2015 20:40:40 +0000 (16:40 -0400)] 
[libgens] FastBlur.x86.inc.cpp: 32-bit color: Blur 4px per iteration.

Benchmark results: (32-bit, MMX)

                 | 2px 1-FB | 2px 2-FB | 4px 1-FB | 4px 2-FB |
+----------------+----------+----------+----------+----------+
| gcc-5.2.0, -Og |   320 ms |   254 ms |   294 ms |   212 ms |
| gcc-5.2.0, -O2 |   321 ms |   221 ms |   294 ms |   199 ms |
+----------------+----------+----------+----------+----------+

4 years ago[libgens] FastBlur.x86.inc.cpp: Added commented-out clobber lists.
David Korth [Sun, 30 Aug 2015 20:32:46 +0000 (16:32 -0400)] 
[libgens] FastBlur.x86.inc.cpp: Added commented-out clobber lists.

4 years ago[libgens/tests] FastBlurTest_benchmark.cpp: Added benchmarks.
David Korth [Sun, 30 Aug 2015 20:25:44 +0000 (16:25 -0400)] 
[libgens/tests] FastBlurTest_benchmark.cpp: Added benchmarks.

Benchmark results:

C version:

                 | 15-bit 1FB | 15-bit 2FB | 16-bit 1FB | 16-bit 2FB | 32-bit 1FB | 32-bit 2FB |
+----------------+------------+------------+------------+------------+------------+------------+
| gcc-5.2.0, -Og |     301 ms |     266 ms |     303 ms |     268 ms |     353 ms |     269 ms |
| gcc-5.2.0, -O2 |     309 ms |     222 ms |     308 ms |     219 ms |     351 ms |     228 ms |
+----------------+------------+------------+------------+------------+------------+------------+

MMX version:

                 | 15-bit 1FB | 15-bit 2FB | 16-bit 1FB | 16-bit 2FB | 32-bit 1FB | 32-bit 2FB |
+----------------+------------+------------+------------+------------+------------+------------+
| gcc-5.2.0, -Og |     163 ms |     130 ms |     163 ms |     129 ms |     320 ms |     254 ms |
| gcc-5.2.0, -O2 |     162 ms |     112 ms |     162 ms |     111 ms |     321 ms |     221 ms |
+----------------+------------+------------+------------+------------+------------+------------+

4 years ago[libgens/tests] FastBlurTest: s/Fast Blur/Fast Blur effect/g
David Korth [Sun, 30 Aug 2015 20:24:31 +0000 (16:24 -0400)] 
[libgens/tests] FastBlurTest: s/Fast Blur/Fast Blur effect/g

4 years ago[libgens/tests] FastBlurTest: s/Paused Effect/Fast Blur/g
David Korth [Sun, 30 Aug 2015 20:17:20 +0000 (16:17 -0400)] 
[libgens/tests] FastBlurTest: s/Paused Effect/Fast Blur/g

4 years ago[libgens/tests] FastBlurTest: New test for the Fast Blur effect.
David Korth [Sun, 30 Aug 2015 20:04:20 +0000 (16:04 -0400)] 
[libgens/tests] FastBlurTest: New test for the Fast Blur effect.

TODO: Add benchmarks.

4 years ago[libgens/tests] EffectTest: Added an abstract function renderType().
David Korth [Sun, 30 Aug 2015 19:44:09 +0000 (15:44 -0400)] 
[libgens/tests] EffectTest: Added an abstract function renderType().

This is used by subclasses to determine if we should read e.g. "SW" or
"SW-int" files. In the case of PausedEffect, "SW" is the old software
renderer that used floating-point, and "SW-int" is the new one that
uses integers for higher performance with slightly less accuracy.
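
A rough sketch of how such an abstract function could select the reference files. The class names carry a "Sketch" suffix because they are illustrative only, and the filename helper and its "effect.renderType.bpp.png" pattern are assumptions modeled on names like PausedEffect.SW-int.*.png.

```cpp
#include <string>

class EffectTestSketch {
public:
    virtual ~EffectTestSketch() = default;

    // Subclasses name the renderer whose reference images should be loaded,
    // e.g. "SW" or "SW-int".
    virtual const char *renderType() const = 0;

    // Hypothetical helper: build a reference filename from the effect name,
    // the render type, and the color depth.
    std::string referenceFilename(const std::string &effect, int bpp) const
    {
        return effect + "." + renderType() + "." + std::to_string(bpp) + ".png";
    }
};

class PausedEffectTestSketch : public EffectTestSketch {
public:
    // The integer software renderer is the one being compared against.
    const char *renderType() const override { return "SW-int"; }
};
```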

4 years ago[libgens/tests] PausedEffectTest.cpp: Removed some unused includes.
David Korth [Sun, 30 Aug 2015 19:39:02 +0000 (15:39 -0400)] 
[libgens/tests] PausedEffectTest.cpp: Removed some unused includes.

4 years ago[libgens/tests] EffectTest: Split the initialization code out of PausedEffectTest.
David Korth [Sun, 30 Aug 2015 19:37:09 +0000 (15:37 -0400)] 
[libgens/tests] EffectTest: Split the initialization code out of PausedEffectTest.

FastBlurTest will derive from EffectTest, so I won't have to create
a separate copy of the image loading code.

4 years ago[libgens/tests] Renamed PausedEffect.Normal.*.png to Effects.Normal.*.png .
David Korth [Sun, 30 Aug 2015 19:26:57 +0000 (15:26 -0400)] 
[libgens/tests] Renamed PausedEffect.Normal.*.png to Effects.Normal.*.png .

These images will be used as reference images for the Fast Blur test.

4 years ago[libgens] FastBlur: Added RESTRICT to the MMX functions.
David Korth [Sun, 30 Aug 2015 19:23:53 +0000 (15:23 -0400)] 
[libgens] FastBlur: Added RESTRICT to the MMX functions.

4 years ago[libgens] FastBlur.cpp: Split the MMX code into a separate file.
David Korth [Sun, 30 Aug 2015 19:21:30 +0000 (15:21 -0400)] 
[libgens] FastBlur.cpp: Split the MMX code into a separate file.

Consolidated the 1-FB and 2-FB functions into a single function using
DO_1FB and DO_2FB, similar to PausedEffect.

Use variable names for inline asm instead of indexes.

Moved the definition of MASK_DIV2_32_MMX into the 32-bit MMX function.

4 years ago[libgens/tests] Effects/CMakeLists.txt: Copy the reference images to the test directory.
David Korth [Sun, 30 Aug 2015 19:09:38 +0000 (15:09 -0400)] 
[libgens/tests] Effects/CMakeLists.txt: Copy the reference images to the test directory.

4 years ago[libgens] PausedEffect: Added an SSE2 version.
David Korth [Sun, 30 Aug 2015 19:01:42 +0000 (15:01 -0400)] 
[libgens] PausedEffect: Added an SSE2 version.

SSE2 adds the pshuf* opcodes (well, pshufw was added in SSE1),
which makes it easier to improve performance.

Benchmarks: (32-bit color)

                 | MMX 1-FB | MMX 2-FB | SSE2 1-FB | SSE2 2-FB |
+----------------+----------+----------+-----------+-----------+
| gcc-5.2.0, -Og |   602 ms |   515 ms |    443 ms |    370 ms |
| gcc-5.2.0, -O2 |   663 ms |   521 ms |    437 ms |    364 ms |
+----------------+----------+----------+-----------+-----------+

4 years ago[libgens] PausedEffect.x86.inc.cpp: s/px1/px2/
David Korth [Sun, 30 Aug 2015 18:43:19 +0000 (14:43 -0400)] 
[libgens] PausedEffect.x86.inc.cpp: s/px1/px2/

4 years ago[libgens] PausedEffect.x86.inc.cpp: Minor comment updates.
David Korth [Sun, 30 Aug 2015 18:42:29 +0000 (14:42 -0400)] 
[libgens] PausedEffect.x86.inc.cpp: Minor comment updates.

- Blue tint is applied.
- We're doing 2px at a time, not 1.

4 years ago[libgens] PausedEffect.x86.inc.cpp: Removed the braces from GRAY_32_MMX.
David Korth [Sun, 30 Aug 2015 18:37:24 +0000 (14:37 -0400)] 
[libgens] PausedEffect.x86.inc.cpp: Removed the braces from GRAY_32_MMX.

Not sure how this even compiled, since the variable isn't an array.

4 years ago[libgens] PausedEffect.cpp: Split the MMX code into an include file.
David Korth [Sun, 30 Aug 2015 18:35:37 +0000 (14:35 -0400)] 
[libgens] PausedEffect.cpp: Split the MMX code into an include file.

PausedEffect.x86.inc.cpp is compiled differently depending on whether
DO_1FB is defined or DO_2FB is defined. This allows us to easily build
both 1FB and 2FB versions without writing the functions twice.
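
For a plain C++ version, one possible way to get the same "write once, build both" effect is a template parameter rather than including a file twice. This is not what the commit does; the function name, signature, and placeholder effect below are all assumptions.

```cpp
#include <cstddef>
#include <cstdint>

// TwoFB = false: read and write outScreen in place (1-FB).
// TwoFB = true:  read mdScreen and write outScreen (2-FB).
template <bool TwoFB>
void applyEffectSketch(uint32_t *outScreen, const uint32_t *mdScreen, size_t pxCount)
{
    const uint32_t *src = TwoFB ? mdScreen : outScreen;
    for (size_t i = 0; i < pxCount; i++) {
        // Placeholder effect: halve each 8-bit channel.
        outScreen[i] = (src[i] >> 1) & 0x7F7F7F7Fu;
    }
}
```

Each instantiation compiles to its own function, much like the include file compiled once with DO_1FB and once with DO_2FB.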

TODO: Do that for the C++ version?

Benchmark results for 32-bit color, MMX-optimized:

                 |   1FB   |   2FB   |
+----------------+---------+---------+
| gcc-5.2.0, -Og |  604 ms |  516 ms |
| gcc-5.2.0, -O2 |  603 ms |  522 ms |
+----------------+---------+---------+

4 years ago[libgens] Moved the RESTRICT macro from PausedEffect.hpp to macros/common.h.
David Korth [Sun, 30 Aug 2015 18:23:58 +0000 (14:23 -0400)] 
[libgens] Moved the RESTRICT macro from PausedEffect.hpp to macros/common.h.

Added RESTRICT to DoPausedEffect_32_MMX().

4 years ago[libgens] PausedEffect.cpp: Use paddusb instead of paddusw.
David Korth [Sun, 30 Aug 2015 18:21:41 +0000 (14:21 -0400)] 
[libgens] PausedEffect.cpp: Use paddusb instead of paddusw.

The extra precision from paddusw can result in the blue value being
slightly more than double the grayscale value (usually +1), which
results in test failures.

The 32-bit 2-FB MMX test now passes.

4 years ago[libgens] PausedEffect: Tint the image blue.
David Korth [Sun, 30 Aug 2015 18:07:57 +0000 (14:07 -0400)] 
[libgens] PausedEffect: Tint the image blue.

This doesn't change the benchmarks much for 32-bit MMX:

                 | No tint | With tint |
+----------------+---------+-----------+
| gcc-5.2.0, -Og |  520 ms |    514 ms |
| gcc-5.2.0, -O2 |  527 ms |    538 ms |
+----------------+---------+-----------+

FIXME: PausedEffectTest is still reporting errors when comparing the
32-bit MMX output to the reference image.

4 years ago[libgens] PausedEffect: Write 2 pixels per MMX loop.
David Korth [Sun, 30 Aug 2015 17:57:35 +0000 (13:57 -0400)] 
[libgens] PausedEffect: Write 2 pixels per MMX loop.

It actually seems to be slightly worse than the 1px loop, but I'm keeping
it anyway:

                 | 32-bit MMX (1x) | 32-bit MMX (2x) |
+----------------+-----------------+-----------------+
| gcc-5.2.0, -Og |          509 ms |          520 ms |
| gcc-5.2.0, -O2 |          503 ms |          527 ms |
+----------------+-----------------+-----------------+

TODO:
- Optimize writing to outScreen.
- Tint it blue.

4 years ago[libgens] PausedEffect.cpp: Use integers instead of floats.
David Korth [Sun, 30 Aug 2015 17:38:31 +0000 (13:38 -0400)] 
[libgens] PausedEffect.cpp: Use integers instead of floats.

Similar to the MMX code, we're now using integer arithmetic in the
C code. This improves performance significantly at the cost of a
slight loss of precision.
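
As a sketch of the float-to-integer trade-off, the following computes a grayscale value with fixed-point weights. The weights (roughly 0.30, 0.59, 0.11 scaled by 256) are an assumption chosen for illustration and are not taken from PausedEffect.cpp.

```cpp
#include <cstdint>

// Fixed-point grayscale: the weights sum to 256 (77 + 151 + 28), so the
// weighted sum is divided by 256 with a shift instead of using floats.
constexpr uint8_t grayIntSketch(uint8_t r, uint8_t g, uint8_t b)
{
    return static_cast<uint8_t>((r * 77u + g * 151u + b * 28u) >> 8);
}
```

The shift truncates instead of rounding, which is where the small precision loss comes from: pure red yields 76 here, while 255 * 0.30 is 76.5.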

PausedEffect.SW-int.*.png: New reference images for the integer version.

[libgens/tests] PausedEffectTest.cpp: Use the integer reference images.

Benchmark results when using 'float': (2,000 iterations)
[Retested due to the column repeat issue from earlier.]

                 | 15-bit 1FB | 15-bit 2FB | 16-bit 1FB | 16-bit 2FB | 32-bit 1FB | 32-bit 2FB |
+----------------+------------+------------+------------+------------+------------+------------+
| gcc-5.2.0, -Og |   3,109 ms |   3,056 ms |   3,149 ms |   3,114 ms |   3,132 ms |   3,020 ms |
| gcc-5.2.0, -O2 |   2,352 ms |   2,308 ms |   2,222 ms |   2,187 ms |   2,026 ms |   1,916 ms |
+----------------+------------+------------+------------+------------+------------+------------+

Benchmark results when using integers: (2,000 iterations)

                 | 15-bit 1FB | 15-bit 2FB | 16-bit 1FB | 16-bit 2FB | 32-bit 1FB | 32-bit 2FB |
+----------------+------------+------------+------------+------------+------------+------------+
| gcc-5.2.0, -Og |   1,529 ms |   1,512 ms |   1,789 ms |   1,576 ms |   1,355 ms |   1,215 ms |
| gcc-5.2.0, -O2 |   1,470 ms |   1,435 ms |   1,388 ms |   1,378 ms |   1,118 ms |   1,043 ms |
+----------------+------------+------------+------------+------------+------------+------------+

4 years ago[WTF] [libgens/tests] PausedEffectTest.cpp: Fixed a rather major error in copyToFb15...
David Korth [Sun, 30 Aug 2015 17:08:26 +0000 (13:08 -0400)] 
[WTF] [libgens/tests] PausedEffectTest.cpp: Fixed a rather major error in copyToFb15() and copyToFb16().

pData isn't incremented during the loop, so we ended up with framebuffers
that contained the first column repeated throughout the image.

pSrc *is* incremented, so use that for the data source instead of pData.
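
A sketch of the corrected loop shape for the 15-bit case. The function names and the xRGB8888 to RGB555 conversion are assumptions for illustration; only the pData / pSrc distinction comes from the commit message.

```cpp
#include <cstddef>
#include <cstdint>

// Convert one xRGB8888 pixel to RGB555 by keeping the top 5 bits of each channel.
constexpr uint16_t toRgb555Sketch(uint32_t px)
{
    return static_cast<uint16_t>(((px >> 9) & 0x7C00u) |
                                 ((px >> 6) & 0x03E0u) |
                                 ((px >> 3) & 0x001Fu));
}

// Copy one row. pData marks the start of the row; pSrc is the moving cursor.
void copyRowTo15Sketch(uint16_t *pDest, const uint32_t *pData, size_t width)
{
    const uint32_t *pSrc = pData;
    for (size_t x = 0; x < width; x++) {
        // Reading *pData here instead would repeat the row's first pixel.
        *pDest++ = toRgb555Sketch(*pSrc++);
    }
}
```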

No changes to the test results for non-MMX.

4 years ago[libgens] Util/Screenshot.cpp: Made the 'rom' parameter optional.
David Korth [Sun, 30 Aug 2015 17:06:49 +0000 (13:06 -0400)] 
[libgens] Util/Screenshot.cpp: Made the 'rom' parameter optional.

LibGens::Screenshot is useful for debugging some things with Effects when
a ROM isn't loaded.

NOTE: While the parameter can be nullptr now, I'm not including a
'default' parameter, so it has to be specified explicitly.

Moved the filename parameter of toFile() to the beginning of the list.

4 years ago[libgens/tests] Effects/CMakeLists.txt: Forgot to add this file...
David Korth [Sun, 30 Aug 2015 17:05:53 +0000 (13:05 -0400)] 
[libgens/tests] Effects/CMakeLists.txt: Forgot to add this file...

4 years ago[libgens] PausedEffect: Initial 2-FB 32-bit MMX paused effect.
David Korth [Sun, 30 Aug 2015 16:38:07 +0000 (12:38 -0400)] 
[libgens] PausedEffect: Initial 2-FB 32-bit MMX paused effect.

Currently only does grayscale, and 1px per iteration, but the results
are quite good:

                 | 32-bit C | 32-bit MMX |
+----------------+----------+------------+
| gcc-5.2.0, -Og | 3,032 ms |     509 ms |
| gcc-5.2.0, -O2 | 1,900 ms |     503 ms |
+----------------+----------+------------+

Note that some of this could be attributed to the C version using floating-point,
whereas the MMX version uses integers. Using integers results in slightly less
precision, but that doesn't really matter for a simple effect.

4 years ago[libgens] PausedEffect.cpp: Removed a few more "MUST BE 336x240!" warnings.
David Korth [Sun, 30 Aug 2015 16:06:46 +0000 (12:06 -0400)] 
[libgens] PausedEffect.cpp: Removed a few more "MUST BE 336x240!" warnings.

4 years ago[libgens] PausedEffect.cpp: Removed the MD_Screen[] notice.
David Korth [Sun, 30 Aug 2015 16:03:28 +0000 (12:03 -0400)] 
[libgens] PausedEffect.cpp: Removed the MD_Screen[] notice.

The Paused Effect is applied to whatever framebuffer(s) are specified.

4 years ago[libgens] PausedEffect.cpp: Added a parameter 'pxCount' to the internal functions.
David Korth [Sun, 30 Aug 2015 16:02:17 +0000 (12:02 -0400)] 
[libgens] PausedEffect.cpp: Added a parameter 'pxCount' to the internal functions.

Previously, PausedEffect assumed a total framebuffer size of 336*240.
Now, it uses the actual framebuffer size. (It's still 336x240 right now,
but that might change later.)

4 years ago[libgens] PausedEffect: Moved the 1-FB function above the 2-FB function.
David Korth [Sun, 30 Aug 2015 15:58:01 +0000 (11:58 -0400)] 
[libgens] PausedEffect: Moved the 1-FB function above the 2-FB function.

4 years ago[libgens/tests] PausedEffectTest_benchmark.cpp: Benchmarks for Paused Effect.
David Korth [Sun, 30 Aug 2015 15:50:20 +0000 (11:50 -0400)] 
[libgens/tests] PausedEffectTest_benchmark.cpp: Benchmarks for Paused Effect.

Note that unlike the other benchmarks, this one derives from
PausedEffectTest. This is mostly because PausedEffectTest has
a ton of support code that I didn't want to simply copy over.

Benchmark results for the C implementation: (2,000 iterations)

                 | 15-bit 1FB | 15-bit 2FB | 16-bit 1FB | 16-bit 2FB | 32-bit 1FB | 32-bit 2FB |
+----------------+------------+------------+------------+------------+------------+------------+
| gcc-5.2.0, -Og |   2,930 ms |   2,863 ms |   2,916 ms |   2,897 ms |   3,186 ms |   2,990 ms |
| gcc-5.2.0, -O2 |   2,347 ms |   2,295 ms |   2,233 ms |   2,238 ms |   2,017 ms |   1,920 ms |
+----------------+------------+------------+------------+------------+------------+------------+

Interestingly, the 2FB versions seem to be slightly faster than the
1FB versions.

Next step: Implementing MMX for 32-bit. I probably won't implement
MMX versions of 15/16-bit right now. (Maybe later?)

4 years ago[libgens/tests] PausedEffectTest: Added ASSERT_NO_FATAL_FAILURE() to copyToFb*()...
David Korth [Sun, 30 Aug 2015 15:37:11 +0000 (11:37 -0400)] 
[libgens/tests] PausedEffectTest: Added ASSERT_NO_FATAL_FAILURE() to copyToFb*() calls.

We don't need to add this to the compareFb() calls, since that's the last
line in the test functions.

4 years ago[libgens/tests] PausedEffectTest.hpp: s/cpp/hpp/
David Korth [Sun, 30 Aug 2015 15:30:56 +0000 (11:30 -0400)] 
[libgens/tests] PausedEffectTest.hpp: s/cpp/hpp/

4 years ago[libgens/tests] PausedEffectTest: Parameterized the tests using CPU flags.
David Korth [Sun, 30 Aug 2015 15:29:11 +0000 (11:29 -0400)] 
[libgens/tests] PausedEffectTest: Parameterized the tests using CPU flags.

Note that PausedEffect doesn't have any MMX or SSE2 optimizations yet.

4 years ago[libgens/tests] PausedEffectTest: Added 15-bit and 16-bit color tests.
David Korth [Sun, 30 Aug 2015 15:21:59 +0000 (11:21 -0400)] 
[libgens/tests] PausedEffectTest: Added 15-bit and 16-bit color tests.

New functions copyToFb15() and copyToFb16() that convert the source image
to 15-bit or 16-bit color when copying.

Added do15bit_1FB(), do15bit_2FB(), do16bit_1FB(), and do16bit_2FB()
functions that test 15-bit and 16-bit color with the 1-FB and 2-FB
functions, respectively.

4 years ago[libgens/tests] PausedEffectTest::copyToFb32(): Only copy the visible part of the...
David Korth [Sun, 30 Aug 2015 15:14:43 +0000 (11:14 -0400)] 
[libgens/tests] PausedEffectTest::copyToFb32(): Only copy the visible part of the image.

'pitch' might include padding for alignment.
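
A minimal sketch of a row-by-row copy that skips the padding. The function name and the pitch-in-pixels parameters are assumptions; the real copyToFb32() works on Zomg_Img_Data_t and MdFb, which are not modeled here.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copy only the visible 'width' pixels of each row. Both pitches are given in
// pixels and may be larger than 'width' when rows are padded for alignment.
void copyVisible32Sketch(uint32_t *dest, size_t destPitchPx,
                         const uint32_t *src, size_t srcPitchPx,
                         size_t width, size_t height)
{
    for (size_t y = 0; y < height; y++) {
        std::memcpy(dest, src, width * sizeof(uint32_t));
        dest += destPitchPx;  // advance by the full pitch, not the width
        src += srcPitchPx;
    }
}
```

Copying pitch * height bytes in a single call would drag the padding along and misalign the rows whenever the two pitches differ.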

4 years ago[libgens/tests] PausedEffectTest: Split image loading into a separate function.
David Korth [Sun, 30 Aug 2015 15:10:08 +0000 (11:10 -0400)] 
[libgens/tests] PausedEffectTest: Split image loading into a separate function.

We're going to use parameterized tests for CPU flags, not color depth,
so in order to selectively load the 15/16/32 images, we have to use
a separate init() function.

do32bit_1FB(), do32bit_2FB(): Call init() using ASSERT_NO_FATAL_FAILURE().
That way, if an assertion occurs in init(), it will propagate, and the
unit test will stop running.

4 years ago[libgens/tests] PausedEffectTest: Added a 2-FB version of the 32-bit Paused Effect...
David Korth [Sun, 30 Aug 2015 14:58:37 +0000 (10:58 -0400)] 
[libgens/tests] PausedEffectTest: Added a 2-FB version of the 32-bit Paused Effect test.

TearDown(): Only unreference the MdFbs if they were allocated in the
first place, i.e. they're not nullptr.

4 years ago[libgens/tests] PausedEffectTest::SetUp(): Set the bpp on all three MdFbs, not just...
David Korth [Sun, 30 Aug 2015 14:56:27 +0000 (10:56 -0400)] 
[libgens/tests] PausedEffectTest::SetUp(): Set the bpp on all three MdFbs, not just fb_normal.

This didn't break anything yet because MdFb defaults to 32-bit color.

4 years ago[libgens/tests] PausedEffectTest: Added helper functions.
David Korth [Sun, 30 Aug 2015 14:53:21 +0000 (10:53 -0400)] 
[libgens/tests] PausedEffectTest: Added helper functions.

copyToFb32(): Copy a Zomg_Img_Data_t to a 32-bit MdFb.

compareFb(): Compare two MdFbs with the same dimensions and color depth.

Other changes:
- SetUp(): Copy the reference images into MdFbs for testing.
- Removed disabled SoundMgr code. This is a leftover from AudioWriteTest,
  which is what this test was initially based on.

4 years ago[libzomg] PngReader: Added a 'flags' parameter.
David Korth [Sun, 30 Aug 2015 14:21:42 +0000 (10:21 -0400)] 
[libzomg] PngReader: Added a 'flags' parameter.

Currently, the only flag available is whether or not to invert the
alpha channel (or filler byte for images that don't have alpha).

[libgens/tests] PausedEffectTest: Specify RF_INVERT_ALPHA when loading the
reference images so we don't have to clear the alpha channel manually.
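
A sketch of what the flag could do per pixel, assuming the alpha or filler byte is the high byte of a 32-bit pixel. Only the name RF_INVERT_ALPHA comes from the commit; the enum, its values, and the helper function are hypothetical.

```cpp
#include <cstdint>

// Hypothetical flag values; only the RF_INVERT_ALPHA name is from the commit.
enum ReadFlagsSketch : unsigned {
    RF_Default = 0,
    RF_INVERT_ALPHA = (1u << 0),
};

// Flip the high byte when the flag is set, so an opaque 0xFF from the PNG
// becomes the 0x00 that the internal rendering code expects.
constexpr uint32_t applyReadFlagsSketch(uint32_t argb, unsigned flags)
{
    return (flags & RF_INVERT_ALPHA) ? (argb ^ 0xFF000000u) : argb;
}
```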

4 years ago[libzomg] Metadata::toPngData(): Don't write "ROM CRC32" if it's zero.
David Korth [Sat, 29 Aug 2015 21:24:18 +0000 (17:24 -0400)] 
[libzomg] Metadata::toPngData(): Don't write "ROM CRC32" if it's zero.

ZOMG.ini has empty fields; PNG doesn't.

4 years ago[libgens/tests] Effects/PausedEffectTest.cpp: Initial Paused Effect test.
David Korth [Sat, 29 Aug 2015 21:19:29 +0000 (17:19 -0400)] 
[libgens/tests] Effects/PausedEffectTest.cpp: Initial Paused Effect test.

This test loads two reference images (normal and paused), applies the
Paused Effect to the normal image, and compares it to the paused image.

[libzomg] PngReader: Added a function readFromFile().

TODO:
- PngReader: Add a parameter to specify the value to use for the
  high byte. Normally we use 0xFF for compatibility with GL
  transparency, but the internal rendering code uses 0x00.
- Add support for 15-bit and 16-bit color.
- CMake: Copy the reference images to the binary directory.

4 years ago[libgens/tests] PausedEffect.Normal.*.png: Added overscan borders.
David Korth [Sat, 29 Aug 2015 21:10:24 +0000 (17:10 -0400)] 
[libgens/tests] PausedEffect.Normal.*.png: Added overscan borders.

We're going to compare the entire MdFb, so we need the overscan
borders that aren't normally saved in screenshots.

I added the borders manually using KolourPaint. Metadata wasn't
modified, so it still reflects the original screenshots.

4 years ago[libzomg] PngWriterPrivate::T_writePNG_rows_16<>(): Removed unused variables.
David Korth [Sat, 29 Aug 2015 20:37:32 +0000 (16:37 -0400)] 
[libzomg] PngWriterPrivate::T_writePNG_rows_16<>(): Removed unused variables.

These were used for debugging the template parameter changes.

This is a regression from commit 711cf8e28a00950ec07cca66a9da7d21a60dde0a.
([libzomg] PngWriterPrivate::T_writePNG_rows_16<>(): Fill in the unused bits with a copy of the MSBs.)

4 years ago[libzomg] PngWriter: Destroy the PNG write struct before closing the file.
David Korth [Sat, 29 Aug 2015 20:35:53 +0000 (16:35 -0400)] 
[libzomg] PngWriter: Destroy the PNG write struct before closing the file.

Just in case there's any "closing" stuff that needs to be written.

4 years ago[libzomg] PngWriter::writeToFile: Removed an obsolete W32U comment.
David Korth [Sat, 29 Aug 2015 20:32:17 +0000 (16:32 -0400)] 
[libzomg] PngWriter::writeToFile: Removed an obsolete W32U comment.

Since we're #including W32U_mini.h, fopen() is automatically
converted from UTF-8 to the required character set.

4 years ago[cmake] SplitDebugInformation.cmake: Set SPLIT_OK=0 if building with MSVC.
David Korth [Sat, 29 Aug 2015 19:28:23 +0000 (15:28 -0400)] 
[cmake] SplitDebugInformation.cmake: Set SPLIT_OK=0 if building with MSVC.

Otherwise, we attempt to use 'objcopy', which fails miserably.
(Actually, it might work if cygwin is installed, but it shouldn't
be used anyway, since MSVC stores debug symbols in .pdb files.)

This is a regression from commit 094a3fb256d11f5e3e33ecc22a75574c78357125.
([cmake] SplitDebugInformation.cmake: Use CMAKE_OBJCOPY and CMAKE_STRIP.)

4 years ago[libcompat] byteswap_x86.c: Declare dwptr at the top of the function.
David Korth [Sat, 29 Aug 2015 19:26:36 +0000 (15:26 -0400)] 
[libcompat] byteswap_x86.c: Declare dwptr at the top of the function.

This fixes the MSVC build.

byteswap.c: Same thing, even though this file isn't used in the MSVC build.

4 years ago[libgens/tests] Effects/: Added test images for the Paused Effect.
David Korth [Sat, 29 Aug 2015 19:24:11 +0000 (15:24 -0400)] 
[libgens/tests] Effects/: Added test images for the Paused Effect.

"Normal" files are the Sonic 2 title screen in various color depths.
"SW" files are the same images with the software Paused Effect applied.

These will be used to test the software Paused Effect, as well as
upcoming MMX and SSE2 implementations.

4 years ago[libzomg] Metadata: If CRC32 == 0, assume it isn't set.
David Korth [Sat, 29 Aug 2015 18:52:31 +0000 (14:52 -0400)] 
[libzomg] Metadata: If CRC32 == 0, assume it isn't set.

Of course, once I do this, someone will inevitably release a ROM that
has an artificial CRC32 of 0.

4 years ago[libzomg] PngWriterPrivate::T_writePNG_rows_16<>(): Fill in the unused bits with...
David Korth [Sat, 29 Aug 2015 18:35:41 +0000 (14:35 -0400)] 
[libzomg] PngWriterPrivate::T_writePNG_rows_16<>(): Fill in the unused bits with a copy of the MSBs.

This makes the images slightly more accurate, though we don't have the
full accuracy of the 32-bit palette.

Changed the template parameters to take color component bits instead of
masks and shifts. The masks and shifts can be computed at compile time.
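
The MSB-copy technique can be sketched as a small template taking the component bit count, in the spirit of the template change described above. The function name and signature are assumptions, not the actual T_writePNG_rows_16<>() code.

```cpp
#include <cstdint>

// Expand a Bits-wide color component (already shifted down to bit 0) to 8 bits.
// The value is moved to the top of the byte and its own MSBs are copied into
// the vacated low bits, so the maximum input maps to 255 instead of 248 or 252.
template <unsigned Bits>
constexpr uint8_t expandTo8Sketch(uint32_t c)
{
    static_assert(Bits >= 4 && Bits <= 8, "sketch only handles 4 to 8 bits");
    return static_cast<uint8_t>((c << (8 - Bits)) | (c >> (2 * Bits - 8)));
}
```

For a 5-bit component this reduces to `(c << 3) | (c >> 2)`, and for a 6-bit component to `(c << 2) | (c >> 4)`; both shift amounts are known at compile time.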

4 years ago[libcompat] byteswap_x86.c: Added a note that 'rol 8' was slightly faster on 64-bit.
David Korth [Sat, 29 Aug 2015 18:13:34 +0000 (14:13 -0400)] 
[libcompat] byteswap_x86.c: Added a note that 'rol 8' was slightly faster on 64-bit.

4 years ago[libcompat] byteswap_x86.c: Use swap_two_16_in_32() from byteswap.c.
David Korth [Sat, 29 Aug 2015 18:10:15 +0000 (14:10 -0400)] 
[libcompat] byteswap_x86.c: Use swap_two_16_in_32() from byteswap.c.

This swaps two 16-bit values in a single 32-bit DWORD using shifts
and masks. It turns out this is faster on x86 than 'rol 8':

                 | __swab16 |   rol 8  | 16 in 32 |
+----------------+----------+----------+----------+
| gcc-5.2.0, -Og | 3,678 ms | 3,162 ms | 3,018 ms |
| gcc-5.2.0, -O2 | 3,707 ms | 2,972 ms | 2,894 ms |
+----------------+----------+----------+----------+

Removed the 'rol 8' code.

Removed the __i386__ || __amd64__ check, since this file is only
compiled on i386 and amd64.
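
The shift-and-mask approach can be sketched as follows. The function name and signature are illustrative; the actual swap_two_16_in_32() in byteswap.c may be written differently.

```cpp
#include <cstdint>

// Byte-swap both 16-bit halves of a 32-bit value at once: the low byte of
// each half moves up by 8 bits and the high byte of each half moves down.
constexpr uint32_t swapTwo16In32Sketch(uint32_t dw)
{
    return ((dw & 0x00FF00FFu) << 8) | ((dw >> 8) & 0x00FF00FFu);
}
```

For example, 0x11223344 becomes 0x22114433: each 16-bit word is swapped independently, and the two words keep their positions.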

4 years ago[cmake] SplitDebugInformation.cmake: Use CMAKE_OBJCOPY and CMAKE_STRIP.
David Korth [Sat, 29 Aug 2015 17:53:45 +0000 (13:53 -0400)] 
[cmake] SplitDebugInformation.cmake: Use CMAKE_OBJCOPY and CMAKE_STRIP.

FIND_PROGRAM(OBJCOPY) finds the system objcopy, which fails when
cross-compiling for e.g. PowerPC. CMAKE_OBJCOPY is set to the
correct cross-compile version, so use that.

Use 'strip' instead of 'objcopy --strip-debug'. objcopy doesn't
remove the .symtab or .strtab sections, which can take up a good
amount of space. They're also copied to the .debug file, so there's
no reason to keep them in the main executable.

4 years agoCMakeLists.txt: New macro DO_SPLIT_DEBUG().
David Korth [Sat, 29 Aug 2015 17:52:36 +0000 (13:52 -0400)] 
CMakeLists.txt: New macro DO_SPLIT_DEBUG().

This eliminates several repeated lines for MSVC debug paths
and splitting debug information.

src/: Use DO_SPLIT_DEBUG().
- Added DO_SPLIT_DEBUG() to test suites in addition to regular executables.

4 years ago[libzomg] Metadata_ogc.cpp: New Metadata implementation for libogc.
David Korth [Sat, 29 Aug 2015 17:35:19 +0000 (13:35 -0400)] 
[libzomg] Metadata_ogc.cpp: New Metadata implementation for libogc.

On Wii and vWii (Wii U), it gets the Wii System Menu version
and IOS version, and uses that as OS. On GameCube, it just
prints "GCN".

CPU is currently hard-coded based on target. On HW_RVL, it prints
"Broadway" or "Espresso", depending on if it's running on a real Wii
or on a Wii U in Wii Mode (vWii).

TODO:
- Add the libogc version number.
- Can the CPU frequency be obtained programmatically instead of
  hard-coding it?

4 years ago[libzomg] CMakeLists.txt: Build the test suite if BUILD_TESTING is set.
David Korth [Sat, 29 Aug 2015 17:33:48 +0000 (13:33 -0400)] 
[libzomg] CMakeLists.txt: Build the test suite if BUILD_TESTING is set.

4 years ago[libgens] CMakeLists.txt: Only compile Timing_unix.cpp on Unix or Linux.
David Korth [Sat, 29 Aug 2015 17:31:54 +0000 (13:31 -0400)] 
[libgens] CMakeLists.txt: Only compile Timing_unix.cpp on Unix or Linux.

Mac OS X is handled in a prior case, so we don't have to explicitly
exclude it from this check.

4 years ago[libzomg] Metadata::InitProgramMetadata(): Save the CPU name from libcompat.
David Korth [Sat, 29 Aug 2015 17:31:23 +0000 (13:31 -0400)] 
[libzomg] Metadata::InitProgramMetadata(): Save the CPU name from libcompat.

4 years ago[libcompat/tests] gtest_main.ogc.inc.cpp: Wait for the user to press HOME before...
David Korth [Sat, 29 Aug 2015 17:11:51 +0000 (13:11 -0400)] 
[libcompat/tests] gtest_main.ogc.inc.cpp: Wait for the user to press HOME before exiting.

Added POWER and RESET callbacks, though they currently don't do anything
while tests are running.

4 years ago[libcompat/tests] Split gtest_main.inc.cpp's libogc code into gtest_main.ogc.inc...
David Korth [Sat, 29 Aug 2015 16:59:57 +0000 (12:59 -0400)] 
[libcompat/tests] Split gtest_main.inc.cpp's libogc code into gtest_main.ogc.inc.cpp.

I'm adding more libogc code for handling controllers, so we should use
separate files for "generic" and libogc versions.

4 years ago[cmake] devkitPPC.RVL.cmake: More fixes.
David Korth [Sat, 29 Aug 2015 16:56:34 +0000 (12:56 -0400)] 
[cmake] devkitPPC.RVL.cmake: More fixes.

Adding libraries to LDFLAGS worked for libogc, but fails miserably
for other libraries due to -Wl,--as-needed. If a library's symbols
aren't immediately needed, the library is removed, which can result
in undefined reference errors.

Fixed the libogc check. Check for libogc.a, not libogc's directory.
(libogc's directory is checked separately.)

Add the libraries, in reverse order, to CMAKE_C_STANDARD_LIBRARIES
and CMAKE_CXX_STANDARD_LIBRARIES. This variable adds the libraries
*after* the object files, which fixes the problem.

Added the BTE and WIIUSE libraries. I'm going to add a prompt to
gtest_main.cpp for the user to press the HOME button to exit.
This will be useful for libzomg's PrintMetadata test, since the
test would otherwise exit immediately after it displays the results.

4 years ago[cmake] devkitPPC.RVL.cmake: Force overwriting the cached value of CFLAGS and LDFLAGS.
David Korth [Sat, 29 Aug 2015 16:32:29 +0000 (12:32 -0400)] 
[cmake] devkitPPC.RVL.cmake: Force overwriting the cached value of CFLAGS and LDFLAGS.

This fixes an issue where cmake doesn't properly cache the toolchain
flags on the first run, but does cache it on subsequent runs.

4 years ago[libcompat] Added dummy LibCompat_GetCPUFlags() implementations.
David Korth [Sat, 29 Aug 2015 15:57:21 +0000 (11:57 -0400)] 
[libcompat] Added dummy LibCompat_GetCPUFlags() implementations.

The function is used in various places, so we need to at least provide
a dummy implementation in cpuflags.c and cpuflags_ppc.c.

4 years ago[libzomg/tests] PrintMetadata: New basic libzomg test to print ZOMG.ini metadata.
David Korth [Sat, 29 Aug 2015 15:51:52 +0000 (11:51 -0400)] 
[libzomg/tests] PrintMetadata: New basic libzomg test to print ZOMG.ini metadata.

No ROM information is set; only system metadata is being tested.

The metadata is printed with two sets of flags:
- Set 1: MF_Default (default flags)
- Set 2: INT_MAX (all flags)

TODO: Implement CreationTime.

4 years ago[libcompat] cpuflags_ppc.c: Preliminary PowerPC CPU flags implementation.
David Korth [Sat, 29 Aug 2015 15:35:38 +0000 (11:35 -0400)] 
[libcompat] cpuflags_ppc.c: Preliminary PowerPC CPU flags implementation.

Currently only supports HW_RVL and HW_DOL, and only to retrieve the
CPU's full name, which is hard-coded based on the compilation target.

4 years ago[libcompat] Split cpuflags.c into two files.
David Korth [Sat, 29 Aug 2015 15:24:14 +0000 (11:24 -0400)] 
[libcompat] Split cpuflags.c into two files.

The two files are:
- cpuflags.c: Generic version, essentially a nop.
- cpuflags_x86.c: i386/amd64 version.

4 years ago[libcompat] cpuflags.c: Removed some debugging code that printed CPU_FullName.
David Korth [Sat, 29 Aug 2015 15:18:26 +0000 (11:18 -0400)] 
[libcompat] cpuflags.c: Removed some debugging code that printed CPU_FullName.

4 years ago[libcompat] cpuflags.c::elimSpaces(): Break on NULL terminator in src.
David Korth [Sat, 29 Aug 2015 15:18:01 +0000 (11:18 -0400)] 
[libcompat] cpuflags.c::elimSpaces(): Break on NULL terminator in src.

4 years ago[libcompat] cpuflags.c: Split the space elimination algorithm into a separate function.
David Korth [Sat, 29 Aug 2015 15:16:15 +0000 (11:16 -0400)] 
[libcompat] cpuflags.c: Split the space elimination algorithm into a separate function.

Note that while libgenstext's function is called SpaceElim(),
this one is called elimSpaces(). It also doesn't require the
source string to be NULL-terminated, but the destination string
will be NULL-terminated, so sz_dest must be greater than sz_src.
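
A minimal sketch of the space-elimination idea described above. This is
not the actual elimSpaces() code: the name elim_spaces_sketch(), its
parameter order, its return value, and the choice to drop leading and
trailing spaces are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch, not the libcompat implementation.
 * Copies src to dest, collapsing each run of spaces into one space and
 * dropping leading and trailing spaces (an assumption of this sketch).
 * src does not have to be NUL-terminated: at most sz_src bytes are read,
 * and reading stops early if a NUL byte is found.
 * dest is always NUL-terminated, so sz_dest must be greater than sz_src.
 * Returns the number of characters written, excluding the terminator.
 */
size_t elim_spaces_sketch(char *dest, size_t sz_dest,
                          const char *src, size_t sz_src)
{
    size_t out = 0;
    int pending_space = 0;

    /* Worst case: nothing is removed, plus one byte for the terminator. */
    assert(sz_dest > sz_src);

    for (size_t i = 0; i < sz_src; i++) {
        char c = src[i];
        if (c == '\0') {
            /* Break on the terminator if src happens to have one. */
            break;
        }
        if (c == ' ') {
            /* Remember a space only if text was already written. */
            pending_space = (out > 0);
            continue;
        }
        if (pending_space) {
            /* Emit exactly one space between two words. */
            dest[out++] = ' ';
            pending_space = 0;
        }
        dest[out++] = c;
    }

    dest[out] = '\0';
    return out;
}
```

Because the output can never be longer than the input, a destination
buffer of sz_src + 1 bytes is always large enough.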

4 years ago[libcompat] cpuflags.c: Eliminate spaces from the brand string.
David Korth [Sat, 29 Aug 2015 15:10:29 +0000 (11:10 -0400)] 
[libcompat] cpuflags.c: Eliminate spaces from the brand string.

We can't use libgenstext because:
1. libcompat shouldn't depend on other Gens/GS II libraries.
2. StringManip is C++; this is C.

The CPU name on my ThinkPad T60p now shows up as:
- "Intel(R) Core(TM)2 CPU T7200 @ 2.00GHz"

4 years ago[gens-sdl] SdlHandler: Align m_segBuffer to a multiple of 16 bytes.
David Korth [Sat, 29 Aug 2015 15:08:24 +0000 (11:08 -0400)] 
[gens-sdl] SdlHandler: Align m_segBuffer to a multiple of 16 bytes.

While implementing some new cpuflags.c functionality to remove excess
spaces from the brand string, SdlHandler::update_audio() started crashing.
The cause was that m_segBuffer's address was xxxxxxx8 instead of
xxxxxxx0, which broke the SSE2 code.

aligned_malloc() will ensure that it's aligned to a multiple of 16 bytes.
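
A minimal sketch of a 16-byte aligned allocation, for illustration only.
It uses C11 aligned_alloc() rather than libcompat's aligned_malloc(),
whose signature isn't shown here; the names alloc_aligned16_sketch() and
is_aligned16() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Aligned SSE2 loads and stores require 16-byte aligned addresses. */
#define SEG_ALIGNMENT 16u

/* Returns nonzero if p is a multiple of 16 (address ends in 0, not 8). */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p & (SEG_ALIGNMENT - 1)) == 0;
}

/*
 * Hypothetical helper: allocate at least `size` bytes on a 16-byte
 * boundary. Release the buffer with free().
 */
void *alloc_aligned16_sketch(size_t size)
{
    /* C11 aligned_alloc() wants a size that is a multiple of the alignment. */
    size_t rounded = (size + (SEG_ALIGNMENT - 1)) & ~(size_t)(SEG_ALIGNMENT - 1);
    void *p;

    if (rounded < size) {
        return NULL;    /* size overflowed while rounding up */
    }
    if (rounded == 0) {
        rounded = SEG_ALIGNMENT;
    }

    p = aligned_alloc(SEG_ALIGNMENT, rounded);
    assert(p == NULL || is_aligned16(p));
    return p;
}
```

A plain malloc() is only guaranteed to be aligned for the platform's
fundamental types, which on some 32-bit targets means 8 bytes; that
would be consistent with the xxxxxxx8 address mentioned above.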

4 years ago[libcompat] cpuflags.c: Added CPU vendor ID and full name.
David Korth [Sat, 29 Aug 2015 14:56:32 +0000 (10:56 -0400)] 
[libcompat] cpuflags.c: Added CPU vendor ID and full name.

These fields are publicly retrievable with two functions:
- LibCompat_GetCPUVendorID()
- LibCompat_GetCPUFullName()

If the fields are empty, the functions will return NULL.

LibCompat_GetCPUFlags() now saves the vendor ID and retrieves
the brand string if it's available.

TODO: Eliminate extra spaces in the brand string. On my ThinkPad T60p,
the brand string shows up as:
- "Intel(R) Core(TM)2 CPU         T7200  @ 2.00GHz"

I suspect the spaces are there to make it easier to parse the
specific model and frequency, but it gets annoying when using
the brand string as metadata.
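
A rough sketch of how a vendor ID can be assembled from CPUID leaf 0
output, plus the "return NULL if empty" convention described above.
The register values are passed in as parameters so the example doesn't
depend on running on x86; the function names are hypothetical and are
not the LibCompat_* functions themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit register as 4 characters, least significant byte first. */
static void put_reg(char *dest, uint32_t reg)
{
    for (int i = 0; i < 4; i++) {
        dest[i] = (char)((reg >> (8 * i)) & 0xFF);
    }
}

/*
 * CPUID leaf 0 returns the 12-character vendor ID in EBX, EDX, ECX,
 * in that order. dest must have room for 13 bytes.
 */
void cpuid_vendor_sketch(char dest[13],
                         uint32_t ebx, uint32_t edx, uint32_t ecx)
{
    assert(dest != NULL);
    put_reg(&dest[0], ebx);
    put_reg(&dest[4], edx);
    put_reg(&dest[8], ecx);
    dest[12] = '\0';
}

/* Accessor convention from the commit: an empty field is reported as NULL. */
const char *nonempty_or_null(const char *field)
{
    return (field != NULL && field[0] != '\0') ? field : NULL;
}
```

The brand string works the same way, except that it spans three
extended leaves (0x80000002 through 0x80000004) with four registers
each, for 48 bytes in total.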

4 years ago[libcompat] aligned_malloc.h: memalign() is declared in malloc.h, not stdlib.h.
David Korth [Sat, 29 Aug 2015 14:27:01 +0000 (10:27 -0400)] 
[libcompat] aligned_malloc.h: memalign() is declared in malloc.h, not stdlib.h.

4 years ago[libgens] cpu/Z80_MD_Mem.cpp: Also check for __MACH__.
David Korth [Sat, 29 Aug 2015 14:23:48 +0000 (10:23 -0400)] 
[libgens] cpu/Z80_MD_Mem.cpp: Also check for __MACH__.

4 years ago[libgens] Util/Timing_mac.cpp: Also check for __MACH__.
David Korth [Sat, 29 Aug 2015 14:21:44 +0000 (10:21 -0400)] 
[libgens] Util/Timing_mac.cpp: Also check for __MACH__.

__APPLE__ && !__MACH__ is Classic Mac OS, which is unsupported.

Fixed the #error message; this was a copypasta from Timing_win32.cpp.

4 years ago[libgens] Util/Timing_unix.cpp: Explicitly check for Unix and Linux definitions.
David Korth [Sat, 29 Aug 2015 14:20:59 +0000 (10:20 -0400)] 
[libgens] Util/Timing_unix.cpp: Explicitly check for Unix and Linux definitions.

This ensures that certain embedded platforms that aren't Unix or Linux
compatible won't compile this file.

4 years ago[libcompat] reentrant.h: Only require getpwuid_r() on Unix, Linux, and Mac OS X.
David Korth [Sat, 29 Aug 2015 14:17:00 +0000 (10:17 -0400)] 
[libcompat] reentrant.h: Only require getpwuid_r() on Unix, Linux, and Mac OS X.

Some embedded platforms, e.g. RVL, don't have a concept of user accounts.
There's essentially no way to implement getpwuid_r() there, so a
platform-specific method will need to be used.

4 years ago[cmake] devkitPPC.RVL.cmake: Some more improvements.
David Korth [Sat, 29 Aug 2015 14:12:32 +0000 (10:12 -0400)] 
[cmake] devkitPPC.RVL.cmake: Some more improvements.

- Set ${LIBOGC_INCLUDE_DIR}, ${LIBOGC_PLATFORM},
  ${LIBOGC_LIBRARY_DIR}, and ${LIBOGC_LIBRARY}.

- Show errors if certain directories and files couldn't be found.

- Added definitions for GameCube, though they're disabled right now
  since I'm not targeting GameCube.

4 years ago[extlib] Added EXCLUDE_FROM_ALL and EXCLUDE_FROM_DEFAULT_BUILD to all extlib targets.
David Korth [Sat, 29 Aug 2015 13:56:46 +0000 (09:56 -0400)] 
[extlib] Added EXCLUDE_FROM_ALL and EXCLUDE_FROM_DEFAULT_BUILD to all extlib targets.

This tells CMake that the targets shouldn't be built if none of our own
targets in src/ depend on them.

NOTE: ADD_LIBRARY() has an option EXCLUDE_FROM_ALL, but it doesn't
have EXCLUDE_FROM_DEFAULT_BUILD. Hence, we're using SET_TARGET_PROPERTIES()
for both of them.

The second one, EXCLUDE_FROM_DEFAULT_BUILD, is needed if the user decides
to generate Visual Studio projects, since the first one doesn't handle
the case of selecting "Build Solution":
- http://www.cmake.org/pipermail/cmake/2013-February/053531.html

4 years ago[cmake] CheckSystemX8632.cmake: Use CHECK_X86_32 as the temporary variable.
David Korth [Sat, 29 Aug 2015 13:55:37 +0000 (09:55 -0400)] 
[cmake] CheckSystemX8632.cmake: Use CHECK_X86_32 as the temporary variable.

The temporary variable name is displayed in the CMake output,
and TMP_CHECK_X86_32 looks ugly.

4 years agoCMakeLists.txt: Remove second inclusion of options.cmake.
David Korth [Sat, 29 Aug 2015 13:50:28 +0000 (09:50 -0400)] 
CMakeLists.txt: Remove second inclusion of options.cmake.

4 years agoCMakeLists.txt: Removed the UNSET() macro due to issues with cache variables.
David Korth [Sat, 29 Aug 2015 13:50:15 +0000 (09:50 -0400)] 
CMakeLists.txt: Removed the UNSET() macro due to issues with cache variables.

The custom UNSET() macro didn't handle cache variables correctly.
The built-in UNSET() command does, if CACHE is specified.

4 years ago[cmake] CheckZLIB.cmake: Check if HAVE_ZLIB is not defined, not if it's 0.
David Korth [Sat, 29 Aug 2015 13:45:15 +0000 (09:45 -0400)] 
[cmake] CheckZLIB.cmake: Check if HAVE_ZLIB is not defined, not if it's 0.

4 years ago[cmake] libs/: Use ${CMAKE_SOURCE_DIR} and ${CMAKE_BINARY_DIR}.
David Korth [Sat, 29 Aug 2015 13:44:23 +0000 (09:44 -0400)] 
[cmake] libs/: Use ${CMAKE_SOURCE_DIR} and ${CMAKE_BINARY_DIR}.

These are always the top-level directories, whereas the "CURRENT"
directory will be different if we're processing a sub-project.

4 years ago[cmake] CheckOpenGL: Only check for GL and GLEW if HAVE_OPENGL isn't defined.
David Korth [Sat, 29 Aug 2015 13:43:32 +0000 (09:43 -0400)] 
[cmake] CheckOpenGL: Only check for GL and GLEW if HAVE_OPENGL isn't defined.

4 years ago[cmake] devkitPPC.RVL.cmake: Use ${DEVKITPRO} instead of $ENV{DEVKITPRO}.
David Korth [Fri, 28 Aug 2015 05:49:32 +0000 (01:49 -0400)] 
[cmake] devkitPPC.RVL.cmake: Use ${DEVKITPRO} instead of $ENV{DEVKITPRO}.

The environment variables are checked at the beginning, then they're
assigned to CMake variables.

4 years ago[cmake] CheckSystemX8632.cmake: Consistently use ${CHECKED_X86_32}.
David Korth [Fri, 28 Aug 2015 05:29:31 +0000 (01:29 -0400)] 
[cmake] CheckSystemX8632.cmake: Consistently use ${CHECKED_X86_32}.

Some instances were ${CHECK_X86_32}, which prevented the cache
from working correctly.

Use ${TMP_CHECK_X86_32} for the temporary variable instead of
reusing ${CHECK_X86_32}.

4 years ago[libcompat] CMakeLists.txt: Only show the arch-specific byteswap.c warning once.
David Korth [Fri, 28 Aug 2015 05:25:32 +0000 (01:25 -0400)] 
[libcompat] CMakeLists.txt: Only show the arch-specific byteswap.c warning once.

Set an internal cache variable with the byteswap implementation to use,
then add it to libcompat_SRCS.

4 years ago[cmake] devkitPPC.RVL.cmake: Updated with some settings from dolphin-emu.
David Korth [Fri, 28 Aug 2015 05:21:57 +0000 (01:21 -0400)] 
[cmake] devkitPPC.RVL.cmake: Updated with some settings from dolphin-emu.

Notably, CMAKE_SYSTEM_NAME is now "Generic", and
CMAKE_SYSTEM_PROCESSOR is now "powerpc-eabi".

Reference: https://github.com/dolphin-emu/hwtests/blob/master/toolchain-powerpc.cmake