N·K, lidnariq, Sour, and I have been doing a lot of research on sprite evaluation lately. Based on that work, we have a couple of new test ROMs here: sprite_eval_test_emu by N·K and arbitrary_sprite_0_test by me.
N·K's sprite_eval_test_emu tests the two OAM increment behaviors. It passes if the word PASS appears vertically in each of the two boxes. The PPU has two different kinds of primary OAM address increments: +1, which is just a standard increment of the 8-bit address in $2003, and +4, which has the side effect of clearing the bottom 2 bits. If $2003 is not sprite-aligned, which can happen due to mid-frame writes to $2003 or rendering toggles, then the +4 increment can realign it. +4 increments happen whenever a sprite's Y coordinate is not in range, and +1 increments happen otherwise.
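As a rough sketch of the two increment modes described above (not taken from either test ROM), the helpers below model them on an 8-bit address. The function names are made up for illustration, and the wraparound of +4 at the end of OAM is an assumption on my part.

```python
def inc1(oam_addr: int) -> int:
    """+1 mode: plain increment of the 8-bit OAM address."""
    return (oam_addr + 1) & 0xFF


def inc4(oam_addr: int) -> int:
    """+4 mode: step to the next sprite and clear the bottom 2 bits.

    Assumption: the result wraps within 8 bits, like the +1 increment.
    """
    return (oam_addr + 4) & 0xFC


def next_addr_after_y(oam_addr: int, y_in_range: bool) -> int:
    """Pick the increment used after reading a sprite's Y byte."""
    return inc1(oam_addr) if y_in_range else inc4(oam_addr)
```

For example, starting from the misaligned address $06, an out-of-range Y moves to $08 (realigned), while an in-range Y moves to $07 (still misaligned).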
However, there is a surprising quirk: the PPU also does this vertical in-range check against the sprite's X coordinate. If it is in range, it does a +1 increment, and otherwise does a +4 increment. These are normally equivalent operations because X coordinates are normally at OAM addresses ending in %11, so +1 and +4 (because of +4's bit-clearing behavior) both result in bit 2 incrementing and bits 1 and 0 clearing. However, when $2003 does not have its normal alignment, these operations are different, determining whether we continue evaluating from misaligned addresses or snap back to the standard alignment. This is what this test checks for.
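The equivalence claim is easy to check by brute force. This is only an arithmetic illustration with hypothetical helper names, assuming both increments wrap within 8 bits; it does not model the in-range check itself.

```python
def inc1(oam_addr: int) -> int:
    # +1 mode: plain 8-bit increment.
    return (oam_addr + 1) & 0xFF


def inc4(oam_addr: int) -> int:
    # +4 mode: add 4 and clear the bottom 2 bits (8-bit wrap is assumed).
    return (oam_addr + 4) & 0xFC


def differing_addresses() -> list[int]:
    """Addresses where the two increment modes give different results."""
    return [a for a in range(256) if inc1(a) != inc4(a)]


# Addresses ending in %11 are where X bytes normally live.
aligned_x = [a for a in range(256) if a & 3 == 3]
same_when_aligned = all(inc1(a) == inc4(a) for a in aligned_x)
only_misaligned_differ = all(a & 3 != 3 for a in differing_addresses())
```

Both `same_when_aligned` and `only_misaligned_differ` come out true: the modes agree at every address ending in %11 and disagree everywhere else. If the byte treated as X is read from $08, for instance, +1 continues to $09 and +4 snaps to $0C.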
This test and the other one described below both depend on changing $2003 at the end of the scanline. Changing $2003 often results in some form of row corruption (either actual corruption or a data copy from another row), which I'll describe on the wiki in more detail in the future. However, it can be safely changed in the dot range of 321-0 without causing any corruption, and this change sticks because it occurs after $2003 is forced to 0 for that scanline (dots 257-320).
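For anyone timing such a write, the dot ranges above can be written down as a small check. This only encodes the ranges stated in this post, with a 341-dot scanline (dots 0-340); the function names are hypothetical and the corruption behavior outside the window is not modeled at all.

```python
DOTS_PER_SCANLINE = 341  # dots 0..340


def oamaddr_forced_to_zero(dot: int) -> bool:
    """True during dots 257-320, when $2003 is forced to 0."""
    return 257 <= dot <= 320


def is_safe_oamaddr_write_dot(dot: int) -> bool:
    """True in the 321-0 window: dots 321..340, then dot 0 of the next line."""
    if not 0 <= dot < DOTS_PER_SCANLINE:
        raise ValueError(f"dot out of range: {dot}")
    return dot >= 321 or dot == 0
```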
The circuit responsible for selecting the increment mode can be seen here in Breaks. When the PPU detects that a sprite Y is in-range, it sends this result through a chain of latches, and it checks the entire chain to see if an in-range result is present in it. It uses +1 increment mode while there is a successful result for the current byte or in this delay chain. This allows it to increment by 1 to copy the full sprite. This chain also forces bytes to evaluate as not-in-range, allowing the chain to eventually empty. However, that chain is not long enough to fully cover the X byte, so it does a vertical check against that value and increments based on that. Notably, this result does not appear to make it into the delay chain (sprite rendering would be pretty broken if it did), so the check must occur after the current result is latched but before the increment happens.
My arbitrary_sprite_0_test draws a diagonal line across the screen made up entirely of sprite 0s, and sweeps a vertical background line across the screen, detecting the scanline on which sprite 0 hit happens on each frame. Sprite 0 is not a property of a sprite's position in OAM ($00 in OAM is *not* necessarily sprite 0) nor even the order of evaluation (the first sprite evaluated is *not* necessarily sprite 0). Rather, it is a property of the time at which a sprite is evaluated. Specifically, on dot 66 during rendering, a 'sprite 0 on next scanline' flag is set if the current sprite is in range and cleared otherwise. This flag is transferred to a 'sprite 0 on this scanline' flag during dots 257-320 during rendering. If the latter flag is set and a hit happens on the output of the first sprite shifter, then the sprite 0 hit flag is set.
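One way an emulator might model the two flags is sketched below. The class and method names are invented for illustration, the caller is assumed to supply the current in-range result and the first shifter's hit condition, and clearing of the hit flag is left out.

```python
from dataclasses import dataclass


@dataclass
class Sprite0Flags:
    on_next_scanline: bool = False
    on_this_scanline: bool = False
    hit: bool = False

    def tick(self, dot: int, rendering: bool, current_in_range: bool) -> None:
        """Advance one dot; flags only change while rendering is enabled."""
        if not rendering:
            return
        if dot == 66:
            # Latch whatever sprite is being evaluated at this moment.
            self.on_next_scanline = current_in_range
        if 257 <= dot <= 320:
            # Hand the result over for use on the upcoming scanline.
            self.on_this_scanline = self.on_next_scanline

    def first_shifter_hit(self) -> None:
        """Call when the first sprite shifter's output registers a hit."""
        if self.on_this_scanline:
            self.hit = True
```

Note that nothing here looks at the OAM address or the evaluation order: whichever sprite happens to be under evaluation on dot 66 decides the flag.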
The logic for this is in the same circuit linked above. The S_EV signal is used for sprite 0 and is true on dot 66 during rendering, latching the current in-range result into a flip flop ('sprite 0 on next scanline'). This flag is then latched into another flip flop ('sprite 0 on this scanline') by PAR_O during dots 257-320 during rendering. The signal has to be delayed to the end of scanline like this because the 'sprite 0 on this scanline' flag is used across the entire visible part of the scanline, so you can't just update it on dot 66 with state for the next scanline.
This test verifies that an emulator isn't treating $00 in OAM as sprite 0, but it doesn't distinguish between handling sprite 0 based on the order of evaluation (incorrect) and timing of evaluation (correct). Telling those two apart would require mid-screen rendering toggles. The test can also pass even if there are sprite rendering bugs, as currently seen in NesHawk. This is because the test only has visibility into sprite 0 hit timing, not what is actually drawn to the screen. A screenshot is included with the test so that an emulator's output can be compared against it to verify it is drawing correctly.
Statistics: Posted by Fiskbit — Fri Nov 01, 2024 6:25 pm — Replies 0 — Views 66