Monday, 10 August 2020

Re: [Rpcemu] Fix for crash on reset from within the emulator



On Sun, 9 Aug 2020 at 09:54, Sprow <webpages@sprow.co.uk> wrote:
Hi,
When in the emulator (either Interpreter or Recompiler), if you choose
"Shutdown" from the task manager's menu and then click "Restart" in the
resulting dialogue box on RISC OS 5, emulator 0.9.3 exits fatally
with

  Bad PC FC001000 FC001000

The reset code in the HAL does the usual trick of turning off the MMU and
jumping to the start of the ROM, where the turn off and jump are arranged to
fit in the CPU's pipeline so it doesn't matter that the image has been
remapped elsewhere - the instructions have already been fetched.

It's possible to work around this fatal exit by noting how getpccache()
decodes the physical address. You end up in the 0x1f000000 case, so provided
there is 256MB of RAM configured it falls into the

  ram1 != NULL

condition and manages to return a nonsense result, dodging the fatal exit.

However, the real cause appears to be a bug in the handling of writes to CP15
register 1 (control). For some reason there's a check for changes to

  CP15_CTRL_MMU | CP15_CTRL_ROM | CP15_CTRL_SYSTEM

which calls cp15_tlb_flush_all(). That would be harmless since RISC OS does
the flush itself, except that the call sets pccache = 0xffffffff as a side
effect, so back in the top-level decode loop it triggers an attempt to read
from the ROM we just mapped out.
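
To make the side effect concrete, here is a minimal toy model in C. It is a sketch only and not RPCEmu's source: the `ToyCpu` struct and every `toy_*` name are hypothetical, and the `CTRL_*` macros are local stand-ins for the `CP15_CTRL_*` constants (the bit positions follow the usual ARM control register layout, with M at bit 0, S at bit 8 and R at bit 9). The `flush_on_change` parameter exists purely to compare the two behaviours side by side.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the CP15_CTRL_* constants (assumed layout). */
#define CTRL_MMU    (1u << 0)
#define CTRL_SYSTEM (1u << 8)
#define CTRL_ROM    (1u << 9)

#define PCCACHE_INVALID 0xffffffffu

/* Hypothetical, heavily reduced CPU state. */
typedef struct {
    uint32_t control;     /* CP15 register 1 */
    uint32_t pccache;     /* cached page of the current PC */
    int      tlb_flushes; /* how many full TLB flushes have happened */
} ToyCpu;

/* Flushing the TLB also drops the cached PC page as a side effect. */
static void toy_tlb_flush_all(ToyCpu *cpu)
{
    cpu->tlb_flushes++;
    cpu->pccache = PCCACHE_INVALID;
}

/* flush_on_change = 1 models the behaviour described above;
 * flush_on_change = 0 models the handler with the flush call deleted. */
static void toy_write_control(ToyCpu *cpu, uint32_t value, int flush_on_change)
{
    assert(cpu != 0);
    uint32_t changed = cpu->control ^ value;
    cpu->control = value;
    if (flush_on_change && (changed & (CTRL_MMU | CTRL_ROM | CTRL_SYSTEM))) {
        toy_tlb_flush_all(cpu);
    }
}

/* Returns 1 if the next fetch can reuse pccache, 0 if the decode loop
 * has to translate the PC again (which is where the fatal exit occurs). */
static int toy_next_fetch_uses_cache(const ToyCpu *cpu)
{
    return cpu->pccache != PCCACHE_INVALID;
}
```

In this model, turning the MMU off with `flush_on_change` set invalidates `pccache` immediately, so the very next fetch is re-translated; with it clear, the already "fetched" page stays usable until the guest requests TLB maintenance itself.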

I checked the ARM ARM DDI100 rev E, section 2.4, and there's no mention at
all under the control register of anything to do with the TLB, so I don't
understand what the motivation was to flush it.

Deleting that call fixes the problem, and doesn't affect any of the other OS
releases I have to hand (3.x0, 4.0x) either.

Going back to 2006 in Mercurial there are vestiges of similar things, so
another approach would be to properly emulate the pipeline (i.e. set pccache =
0xffffffff after <pipeline depth> cycles have elapsed) but if we assume RISC
OS is sensible and does TLB maintenance when it is truly required that would
end up calling cp15_tlb_flush_all() anyway, so CP15 register 1 can stick to
just doing control type things.
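
For comparison, the pipeline-emulating alternative mentioned above might look roughly like the following sketch. It is hypothetical throughout: `ToyFetch`, the `toy_*` functions and the depth of 2 are illustrative assumptions rather than anything taken from RPCEmu, and the countdown is per executed instruction rather than per cycle for simplicity.

```c
#include <assert.h>
#include <stdint.h>

#define PCCACHE_INVALID    0xffffffffu
#define TOY_PIPELINE_DEPTH 2   /* assumed value, for illustration only */

/* Hypothetical fetch state with a pending, delayed invalidation. */
typedef struct {
    uint32_t pccache;       /* cached page of the current PC */
    int      invalidate_in; /* instructions left before pccache is dropped;
                               0 means nothing is pending */
} ToyFetch;

/* Called where the control register write would otherwise invalidate
 * pccache straight away. */
static void toy_schedule_invalidate(ToyFetch *f)
{
    assert(f != 0);
    f->invalidate_in = TOY_PIPELINE_DEPTH;
}

/* Called once per executed instruction: the instructions already "in the
 * pipeline" still run from the old mapping, then the cache is dropped. */
static void toy_step(ToyFetch *f)
{
    assert(f != 0);
    if (f->invalidate_in > 0 && --f->invalidate_in == 0) {
        f->pccache = PCCACHE_INVALID;
    }
}
```

The extra check in `toy_step()` runs on every instruction, which is the kind of overhead that makes simply dropping the flush from the control register write the cheaper option.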

Patch below,
Sprow.

Thanks for this.

The code in the emulator differs slightly from a real MMU because, for performance reasons, we cache some information that real hardware does not, or use caches that differ slightly from those in real hardware.

I do recall that without those code changes there were versions of RISC OS which would not boot at all with certain combinations of hardware settings (CPU/VRAM).

The ideal fix for this would be to emulate the pipeline precisely, but that would come with significant overhead and few benefits. A workaround like your patch would seem to be a pragmatic approach. I do have a fix which is slightly different to your patch, and I will test both against all the RISC OS ROMs I have, and with all the various hardware settings, to make sure this doesn't regress any other boot/reboot situations.

This testing probably won't get finished for a few days due to other commitments, but I'll make sure something gets included in the next release for this.

Matthew
