things to try:

* add some preset viewports that can be switched via number keys (1, 2, 3 etc)
* patch the entire expanded-ram imul8xe on top of imul8 to avoid the 3-cycle thunk penalty :D
* square-root special case of multiplication for zx*zx and zy*zy
* the hi1*hi2 and lo1*lo2 8-bit muls can be optimized into a 512-byte lookup table
  * jamey on mastodon tried this but had some problems. see what happens on our version!
* double-check rounding behavior is correct
* try 3.13 fixed point instead of 4.12 for more precision
  * can we get away without the extra bit?
* y-axis mirror optimization
* 'wide pixels' 2x and 4x for a fuller initial image in the tiered rendering
* rework the palette cycling to look more like an advancing flow
* extract viewport for display & re-input via keyboard
* fujinet screenshot/viewport uploader
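a rough way to explore the 3.13 vs 4.12 question before touching the assembly: the Python sketch below models signed 16-bit fixed point with a configurable number of fraction bits. All names here (`to_fixed`, `fixed_mul`, `fits`, `fixed_range`) are hypothetical helpers for illustration, not routines from this project, and the round-to-nearest multiply is an assumption about the rounding behavior rather than a description of what the real multiply does.

```python
import math

TOTAL_BITS = 16  # assumption: values are stored as signed 16-bit words


def fits(raw, total_bits=TOTAL_BITS):
    """True if raw is representable as a signed integer of total_bits."""
    return -(1 << (total_bits - 1)) <= raw <= (1 << (total_bits - 1)) - 1


def to_fixed(value, frac_bits, total_bits=TOTAL_BITS):
    """Convert a float to fixed point (round half up), raising on overflow."""
    raw = math.floor(value * (1 << frac_bits) + 0.5)
    if not fits(raw, total_bits):
        raise OverflowError(f"{value} does not fit with {frac_bits} fraction bits")
    return raw


def from_fixed(raw, frac_bits):
    """Convert a raw fixed-point integer back to a float."""
    return raw / (1 << frac_bits)


def fixed_range(frac_bits, total_bits=TOTAL_BITS):
    """Smallest and largest representable values as floats."""
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    return from_fixed(lo, frac_bits), from_fixed(hi, frac_bits)


def fixed_mul(a, b, frac_bits):
    """Multiply two raw values: full-width product, then shift back
    down with round-to-nearest. The result is NOT range-checked, so
    the caller can test it with fits()."""
    return (a * b + (1 << (frac_bits - 1))) >> frac_bits


if __name__ == "__main__":
    for frac_bits in (12, 13):
        lo, hi = fixed_range(frac_bits)
        step = 1 / (1 << frac_bits)
        two = to_fixed(2.0, frac_bits)
        squared = fixed_mul(two, two, frac_bits)
        print(f"{16 - frac_bits}.{frac_bits}: range [{lo}, {hi}], step {step}, "
              f"2.0*2.0 fits: {fits(squared)}")
```

in this model 4.12 covers [-8, 8) in steps of 1/4096, while 3.13 covers [-4, 4) in steps of 1/8192. Squaring 2.0 gives exactly 4.0, which fits in 4.12 but lands one step past the top of 3.13, so whether the extra bit can go depends on whether intermediate values that large are ever stored in 16 bits before the escape test. That part is a question about the real code that this sketch does not answer.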