Register numbering for dest1/source2 #1

Open
pcbbc opened this issue Mar 5, 2024 · 7 comments

@pcbbc
Contributor

pcbbc commented Mar 5, 2024

This is absolutely great work! Well done.

Just one issue I've spotted, and it's with dest1 in fig A-6. I think dmy is 0x14, not 0x11?

I have this code...

0E32: 3C800000 dp0 = r1l
0E33: 3C880000 dp4 = r1l
0E34: 38100001 dn0 = 0x0001
0E35: 38180001 dn4 = 0x0001
0E36: 38200FFF dmx = 0x0FFF
0E37: 38280FFF dmy = 0x0FFF

...and it makes absolutely zero sense to be setting STK here, as it disassembles in Ghidra...

       inst:0e32 00 00 80 3c     MOV        DP0 = R1L 
       inst:0e33 00 00 88 3c     MOV        DP4 = R1L 
       inst:0e34 01 00 10 38     LDI        DN0 = 0x1 
       inst:0e35 01 00 18 38     LDI        DN4 = 0x1 
       inst:0e36 ff 0f 20 38     LDI        DMX = 0xfff 
       inst:0e37 ff 0f 28 38     LDI        STK = 0xfff 

Which implies dest1/source2 in fig A-5 is also wrong. And that appears to be confirmed by this code which, from context, is clearly disabling/enabling the HOST IN IRQ in the SR...

0382: 3D410000 r2l = sr
0383: 5B480100 r2 = r2 | 0x0100					;disable HI_IRQ
0384: 3D400000 sr = r2l
....
03A7: 3DC10000 r3l = sr
03A8: 5A6CFEFF r3 = r3 & 0xFEFF					;enable HI_IRQ
03A9: 3DC00000 sr = r3l

...but we get this in Ghidra...

       inst:0382 00 00 41 3d     MOV        R2L = DP0 
       inst:0383 00 01 48 5b     IOR        R2 = R2 | 0x100 
       inst:0384 00 00 40 3d     MOV        DP0 = R2L 
....
       inst:03a7 00 00 c1 3d     MOV        R3L = DP0 
       inst:03a8 ff fe 6c 5a     IAND       R3 = R3 & 0xfeff 
       inst:03a9 00 00 c0 3d     MOV        DP0 = R3L 

It seems to me the register mappings for dest1/source2 contain two completely separate sets of registers, because index 0 maps to both dp0 and sr (if you assume 5 bits)?

Which set of registers is accessed depends on whether the instruction is conditional (accesses the various pointer registers) or unconditional (accesses system registers). Either that, or bit 22 needs to be treated as part of the register number, for a total of 6 bits?
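
To make the 5-bit versus 6-bit question concrete, here is a minimal Python sketch that pulls the register index out of the instruction words quoted above. It assumes the field positions used in the SLEIGH source (general-purpose register in bits 23-25, flag_22 in bit 22, dest1/source2 in bits 17-21, flag_16 in bit 16, with bit 0 as the LSB). The function names are made up for illustration and are not part of the SLEIGH implementation.

```python
# Hypothetical helper names; only the bit positions come from the SLEIGH source.

def reg_index_5bit(word: int) -> int:
    """dest1/source2 as a 5-bit field: bits 17..21 only."""
    return (word >> 17) & 0x1F

def reg_index_6bit(word: int) -> int:
    """Same field widened by bit 22: bits 17..22."""
    return (word >> 17) & 0x3F

def gp_reg(word: int) -> int:
    """General-purpose register number from bits 23..25."""
    return (word >> 23) & 0x7

def reg_is_source(word: int) -> bool:
    """flag_16 set means the main bus register is read (rl = reg)."""
    return bool((word >> 16) & 1)

# Instruction words taken from the listings above.
SAMPLES = [
    (0x3C800000, "dp0 = r1l"),
    (0x38280FFF, "dmy = 0x0FFF"),
    (0x3D410000, "r2l = sr"),
    (0x3D400000, "sr = r2l"),
]

for word, text in SAMPLES:
    print(f"{word:08X}  {text:<14} 5-bit={reg_index_5bit(word):#04x} "
          f"6-bit={reg_index_6bit(word):#04x}")
```

With only 5 bits, `dp0 = r1l` and `r2l = sr` both decode to index 0, which is the collision described above. With 6 bits they come out as 0x00 and 0x20, and dmy comes out as 0x14.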

@Pokechu22
Owner

Interesting. I don't really have enough data on hand to experiment with this but it does seem plausible. What did you use to get the other disassembly?

If you want to experiment, the relevant code is in these places:

source1 = (23, 25)
flag_22 = (22, 22)
dest1 = (17, 21)
flag_16 = (16, 16)
dest2 = (23, 25)
source2 = (17, 21)

# Note: sr the following registers are labeled as unconditional and only appear in A-5.
# A-6 gives an extra bit for dest1, but that would require 32 underscores here, so we ignore that.
# lc is labeled as source2 only, as well.
attach variables [ dest1 source2 ] [
# Conditional
DP0 DP1 DP2 DP3 DP4 DP5 DP6 DP7
DN0 DN1 DN2 DN3 DN4 DN5 DN6 DN7
DMX DMY
# Unconditional
SR EIR STK SP LC LSP LSR1 LSR2 LSR3 ESR _ _ _ _
];

#---------------------------------------------------------------------------------------------------
# Figure A-5. Inter-Register Transfer Instruction Format
# The only instruction here is MOV. Note that a general-purpose register is always on one side,
# while a main bus register is on the other.
#---------------------------------------------------------------------------------------------------
# dest = rl OR rl = source
# MOV (Inter-register transfer): Inter-register transfer
interopp:dest1 "=" source1 is source1 & dest1 & flag_16=0 {
dest1 = source1;
}
interopp:dest2 "=" source2 is source2 & dest2 & flag_16=1 {
dest2 = source2;
}
#---------------------------------------------------------------------------------------------------
# Figure A-6. Immediate Value Set Instruction Format
# The only instruction is LDI.
#---------------------------------------------------------------------------------------------------
# dest = imm
# LDI (Immediate value set): Immediate value set
immsetopp:dest1 "=" imm is dest1 & imm & flag_16=0 {
dest1 = imm;
}
immsetopp:dest2 "=" imm is dest2 & imm & flag_16=1 {
dest2 = imm;
}

# Figure A-5. Inter-Register Transfer Instruction Format
:"MOV" interopp loopjumpback is fixed_4=3 & opcode_2=3 & flag_22=1 & interopp & loopjumpback { build interopp; build loopjumpback; }
:"MOV" interopp loopjumpback is fixed_4=3 & opcode_2=3 & flag_22=0 & interopp & cond=0 & loopjumpback { build interopp; build loopjumpback; }
:"IF+MOV" condition interopp loopjumpback is fixed_4=3 & opcode_2=3 & flag_22=0 & interopp & condition & loopjumpback {
if (!condition) goto <done>;
build interopp;
<done>
build loopjumpback;
}
# Figure A-6. Immediate Value Set Instruction Format
:"LDI" immsetopp loopjumpback is fixed_4=3 & opcode_2=2 & immsetopp & loopjumpback { build immsetopp; build loopjumpback; }

I don't think I'll have the time to look into this much myself.

@pcbbc
Contributor Author

pcbbc commented Mar 5, 2024 via email

@Pokechu22
Owner

What was your motivation for writing the Ghidra SLEIGH implementation?

The Wii Speak microphone uses it, and some people working on Dolphin emulator wanted to look into it. I don't think anything particularly interesting came of it in the end, but the SLEIGH implementation now exists and it's nice that people are finding uses for it.

I can send you my binary if you want?

I don't think it'd be particularly useful for me, but thanks for the offer. (If there was a publicly available official disassembler or assembler then that'd be useful for determining encodings but it doesn't seem like that exists.)

@pcbbc
Contributor Author

pcbbc commented Mar 6, 2024

Cool. Agreed, an official assembler/disassembler would be the holy grail. I don't think one has escaped into the wild though. :(

Okay, I'm testing dest1 values 0x00-0x1F with some test code...

9C70: 38010101 r0l = 0x0101
9C71: 3C200000 dmx = r0l
9C72: 38010202 r0l = 0x0202
9C73: 3C280000 dmy = r0l
9C74: 38015000 r0l = 0x5000
9C75: 3C080000 dp4 = r0l
9C76: 14400008 loop 0x0008 9CB7
9C77: 3C010000 r0l = dp0
9C78: 4C100010 nop,    *dp4++ = r0l
9C79: 3C030000 r0l = dp1
9C7A: 4C100010 nop,    *dp4++ = r0l
9C7B: 3C050000 r0l = dp2
9C7C: 4C100010 nop,    *dp4++ = r0l
9C7D: 3C070000 r0l = dp3
9C7E: 4C100010 nop,    *dp4++ = r0l
9C7F: 3C090000 r0l = dp4
9C80: 4C100010 nop,    *dp4++ = r0l
9C81: 3C0B0000 r0l = dp5
9C82: 4C100010 nop,    *dp4++ = r0l
9C83: 3C0D0000 r0l = dp6
9C84: 4C100010 nop,    *dp4++ = r0l
9C85: 3C0F0000 r0l = dp7
9C86: 4C100010 nop,    *dp4++ = r0l
9C87: 3C110000 r0l = dn0
9C88: 4C100010 nop,    *dp4++ = r0l
9C89: 3C130000 r0l = dn1
9C8A: 4C100010 nop,    *dp4++ = r0l
9C8B: 3C150000 r0l = dn2
9C8C: 4C100010 nop,    *dp4++ = r0l
9C8D: 3C170000 r0l = dn3
9C8E: 4C100010 nop,    *dp4++ = r0l
9C8F: 3C190000 r0l = dn4
9C90: 4C100010 nop,    *dp4++ = r0l
9C91: 3C1B0000 r0l = dn5
9C92: 4C100010 nop,    *dp4++ = r0l
9C93: 3C1D0000 r0l = dn6
9C94: 4C100010 nop,    *dp4++ = r0l
9C95: 3C1F0000 r0l = dn7
9C96: 4C100010 nop,    *dp4++ = r0l
9C97: 3C210000 r0l = dmx
9C98: 4C100010 nop,    *dp4++ = r0l
9C99: 3C230000 r0l = _X11
9C9A: 4C100010 nop,    *dp4++ = r0l
9C9B: 3C250000 r0l = _X12
9C9C: 4C100010 nop,    *dp4++ = r0l
9C9D: 3C270000 r0l = _X13
9C9E: 4C100010 nop,    *dp4++ = r0l
9C9F: 3C290000 r0l = dmy
9CA0: 4C100010 nop,    *dp4++ = r0l
9CA1: 3C2B0000 r0l = _X15
9CA2: 4C100010 nop,    *dp4++ = r0l
9CA3: 3C2D0000 r0l = _X16
9CA4: 4C100010 nop,    *dp4++ = r0l
9CA5: 3C2F0000 r0l = _X17
9CA6: 4C100010 nop,    *dp4++ = r0l
9CA7: 3C310000 r0l = _X18
9CA8: 4C100010 nop,    *dp4++ = r0l
9CA9: 3C330000 r0l = _X19
9CAA: 4C100010 nop,    *dp4++ = r0l
9CAB: 3C350000 r0l = _X1A
9CAC: 4C100010 nop,    *dp4++ = r0l
9CAD: 3C370000 r0l = _X1B
9CAE: 4C100010 nop,    *dp4++ = r0l
9CAF: 3C390000 r0l = _X1C
9CB0: 4C100010 nop,    *dp4++ = r0l
9CB1: 3C3B0000 r0l = _X1D
9CB2: 4C100010 nop,    *dp4++ = r0l
9CB3: 3C3D0000 r0l = _X1E
9CB4: 4C100010 nop,    *dp4++ = r0l
9CB5: 3C3F0000 r0l = _X1F
9CB6: 4C100010 nop,    *dp4++ = r0l
9CB7: 38010001 r0l = 0x0001
9CB8: 3C200000 dmx = r0l
9CB9: 3C280000 dmy = r0l

And here's the dump of the registers I got...

0a52 0a75 28a7 4980 5004 29c8 29b8 2901
0010 0001 0001 0003 ffff 0002 0001 0002
0101 0101 0101 0101 0202 0202 0202 0202
0101 0101 0101 0101 0202 0202 0202 0202
....

The first line looks like valid DP0-7 values. 0x5004 is certainly what we'd expect for DP4 from the test code, since it was read during the first iteration of the loop, after four stores through dp4.
The second line looks like valid DN0-7 values from the rest of the stock DSP code.
The third line confirms that 0x0101 for DMX is readable with register index 0x10, and 0x0202 for DMY is readable with register index 0x14.
DMX and DMY are also aliased at other locations in the 0x10-0x1F range, the result of the DSP only partially decoding those register indexes (exactly as I was expecting TBH).

However from our disassemblies we do see 0x10 and 0x14 used consistently, so we can be confident those are the official published register indexes for DMX and DMY.
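
The aliasing pattern in the dump can be written down as a small model. This is only a sketch inferred from the values above: it assumes that, within 0x10-0x1F, bit 2 of the index alone picks DMX or DMY and the other low bits are ignored. The function name is hypothetical.

```python
def canonical_dm_index(index: int) -> int:
    """Map any index in 0x10..0x1F to the register it appears to alias.

    Assumption (from the dump only): bit 2 selects DMY (0x14) over DMX (0x10);
    bits 0, 1 and 3 are not decoded.
    """
    if not 0x10 <= index <= 0x1F:
        raise ValueError(f"index {index:#x} is outside 0x10..0x1F")
    return 0x10 | (index & 0x04)

# Values read back for indexes 0x10..0x1F (third and fourth lines of the dump).
OBSERVED = [0x0101] * 4 + [0x0202] * 4 + [0x0101] * 4 + [0x0202] * 4

# Values the test code wrote to DMX and DMY before the loop.
WRITTEN = {0x10: 0x0101, 0x14: 0x0202}

# Indexes where the model disagrees with the dump (none expected).
mismatches = [
    0x10 + offset
    for offset, value in enumerate(OBSERVED)
    if WRITTEN[canonical_dm_index(0x10 + offset)] != value
]
print("mismatches:", [hex(i) for i in mismatches])
```

The model reproduces all sixteen observed values, but it says nothing about whether writes through the aliased indexes behave the same way, since only reads were tested here.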

Now my attempt at a similar approach with "system" registers with indexes 0x20-0x3F was not successful. The DSP just crashed. Reading and writing the stack also pops and pushes, which would be fatal. I also tried writing the value back to the same register after reading it, but it didn't like that either. I'll work on a more subtle approach to test one register at a time...

@pcbbc
Contributor Author

pcbbc commented Mar 6, 2024

Okay, with this code I was able to test each register in turn by manipulating the contents of X[0x0AAA]...

9C76: 14640100 loop 0x0100 9CDB
9C77: 48110AAA r0l = X[0x0AAA]
9C78: 3C060000 dp3 = r0l
9C79: 28060000 jmp dp3
9C7A: 3C410000 r0l = _sr
9C7B: 3C400000 _sr = r0l
9C7C: 2C002E80 jmp 0x9CD9
9C7D: 3C430000 r0l = _eir
9C7E: 3C420000 _eir = r0l
9C7F: 2C002D00 jmp 0x9CD9
9C80: 3C450000 r0l = _stack
9C81: 3C440000 _stack = r0l
9C82: 2C002B80 jmp 0x9CD9
9C83: 3C470000 r0l = _X23
9C84: 3C460000 _X23 = r0l
9C85: 2C002A00 jmp 0x9CD9
9C86: 3C490000 r0l = _X24
9C87: 3C480000 _X24 = r0l
9C88: 2C002880 jmp 0x9CD9
9C89: 3C4B0000 r0l = _X25
9C8A: 3C4A0000 _X25 = r0l
9C8B: 2C002700 jmp 0x9CD9
9C8C: 3C4D0000 r0l = _???
9C8D: 3C4C0000 _??? = r0l
9C8E: 2C002580 jmp 0x9CD9
9C8F: 3C4F0000 r0l = _X27
9C90: 3C4E0000 _X27 = r0l
9C91: 2C002400 jmp 0x9CD9
9C92: 3C510000 r0l = _X28
9C93: 3C500000 _X28 = r0l
9C94: 2C002280 jmp 0x9CD9
9C95: 3C530000 r0l = _X29
9C96: 3C520000 _X29 = r0l
9C97: 2C002100 jmp 0x9CD9
9C98: 3C550000 r0l = _X2A
9C99: 3C540000 _X2A = r0l
9C9A: 2C001F80 jmp 0x9CD9
9C9B: 3C570000 r0l = _X2B
9C9C: 3C560000 _X2B = r0l
9C9D: 2C001E00 jmp 0x9CD9
9C9E: 3C590000 r0l = _X2C
9C9F: 3C580000 _X2C = r0l
9CA0: 2C001C80 jmp 0x9CD9
9CA1: 3C5B0000 r0l = _X2D
9CA2: 3C5A0000 _X2D = r0l
9CA3: 2C001B00 jmp 0x9CD9
9CA4: 3C5D0000 r0l = _X2E
9CA5: 3C5C0000 _X2E = r0l
9CA6: 2C001980 jmp 0x9CD9
9CA7: 3C5F0000 r0l = _X2F
9CA8: 3C5E0000 _X2F = r0l
9CA9: 2C001800 jmp 0x9CD9
9CAA: 3C610000 r0l = _X30
9CAB: 3C600000 _X30 = r0l
9CAC: 2C001680 jmp 0x9CD9
9CAD: 3C630000 r0l = _X31
9CAE: 3C620000 _X31 = r0l
9CAF: 2C001500 jmp 0x9CD9
9CB0: 3C650000 r0l = _X32
9CB1: 3C640000 _X32 = r0l
9CB2: 2C001380 jmp 0x9CD9
9CB3: 3C670000 r0l = _X33
9CB4: 3C660000 _X33 = r0l
9CB5: 2C001200 jmp 0x9CD9
9CB6: 3C690000 r0l = _X34
9CB7: 3C680000 _X34 = r0l
9CB8: 2C001080 jmp 0x9CD9
9CB9: 3C6B0000 r0l = _X35
9CBA: 3C6A0000 _X35 = r0l
9CBB: 2C000F00 jmp 0x9CD9
9CBC: 3C6D0000 r0l = _X36
9CBD: 3C6C0000 _X36 = r0l
9CBE: 2C000D80 jmp 0x9CD9
9CBF: 3C6F0000 r0l = _X37
9CC0: 3C6E0000 _X37 = r0l
9CC1: 2C000C00 jmp 0x9CD9
9CC2: 3C710000 r0l = _X38
9CC3: 3C700000 _X38 = r0l
9CC4: 2C000A80 jmp 0x9CD9
9CC5: 3C730000 r0l = _X39
9CC6: 3C720000 _X39 = r0l
9CC7: 2C000900 jmp 0x9CD9
9CC8: 3C750000 r0l = _X3A
9CC9: 3C740000 _X3A = r0l
9CCA: 2C000780 jmp 0x9CD9
9CCB: 3C770000 r0l = _X3B
9CCC: 3C760000 _X3B = r0l
9CCD: 2C000600 jmp 0x9CD9
9CCE: 3C790000 r0l = _X3C
9CCF: 3C780000 _X3C = r0l
9CD0: 2C000480 jmp 0x9CD9
9CD1: 3C7B0000 r0l = _X3D
9CD2: 3C7A0000 _X3D = r0l
9CD3: 2C000300 jmp 0x9CD9
9CD4: 3C7D0000 r0l = _X3E
9CD5: 3C7C0000 _X3E = r0l
9CD6: 2C000180 jmp 0x9CD9
9CD7: 3C7F0000 r0l = _X3F
9CD8: 3C7E0000 _X3F = r0l
9CD9: 4C100010 nop,    *dp4++ = r0l
9CDA: 00000000 nop
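
For anyone wanting to reproduce this, the stub layout above is regular enough to generate. The Python sketch below computes the address to store in X[0x0AAA] for a given register index and the two MOV words of that register's stub. The constants are read off this listing, and the encoding rule (index shifted into bits 17-22, bit 16 set for a read) is inferred from the words shown rather than taken from any official documentation. The names are invented for the example.

```python
STUB_BASE = 0x9C7A      # address of the first stub (index 0x20) in the listing
STUB_WORDS = 3          # read, write back, jmp to the common store
FIRST_INDEX = 0x20
MOV_R0L = 0x3C000000    # MOV with r0l as the general-purpose register

def stub_address(index: int) -> int:
    """Address to place in X[0x0AAA] so that 'jmp dp3' lands on this stub."""
    if not 0x20 <= index <= 0x3F:
        raise ValueError(f"index {index:#x} is outside 0x20..0x3F")
    return STUB_BASE + STUB_WORDS * (index - FIRST_INDEX)

def read_word(index: int) -> int:
    """Encoding of 'r0l = reg' (bit 16 set: register is the source)."""
    return MOV_R0L | (index << 17) | (1 << 16)

def write_word(index: int) -> int:
    """Encoding of 'reg = r0l' (bit 16 clear: register is the destination)."""
    return MOV_R0L | (index << 17)

for index in (0x20, 0x26, 0x3F):
    print(f"{index:#04x}: stub at {stub_address(index):04X}, "
          f"read {read_word(index):08X}, write {write_word(index):08X}")
```

Note that the last stub (0x3F) has no jmp and simply falls through to the store at 0x9CD9, so it is only two words long.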

And then with a few more tests I have...

Index      Value(s)                              Name      Notes
0x20       0x6010                                SR
0x21       0x7FFF                                EIR
0x22       0x07DF                                STACK     0x07DF = PC from previous CALL
0x23       0x0001                                SP        seems correct, we are only one CALL deep
0x24       0x9C77                                LSR1/LSA  loop start address (first instruction)
0x25       0x9CDA                                LSR2/LEA  loop end address (last instruction)
0x26       0x0100 0x00FF ...... 0x0002 0x0001    LSR3/LC   loop count
0x27       0x0001                                LSP       loop stack pointer (confirmed nesting loops = 0x0002)
0x28       0xFFFF                                          ??? all bits R/W
0x29       0xFFFF                                          ??? all bits R/W
0x2A       0xFFFF                                          ??? all bits R/W
0x2B       0x8000                                          writing 0x0000 or 0xFFFF causes host to crash
0x2C       0x0000                                ESR       bits 0-3 R/W, bits 4-F always 0, bit 3 = OVF as per µPD77210 Family architecture U15807EJ2V0UM00
0x2D       0xFFFB                                          ??? all bits R/W
0x2E       0x02AA                                          CPU ID? just a guess as all bits R/O
0x2F       0x07B5 0x0001 .... 0x00FD 0x00FE      LC???
0x30-0x3F  same as 0x2F                                    ???

A few changes here....

  • LSP comes at 0x27 after LSR1 (0x24), LSR2 (0x25) and LSR3 (0x26), and not before them as per A-5.
  • ESR is at 0x2C after 4 unknown registers.
  • The R/O LC, or what I assume is LC, is at 0x2F but counts up? It also seems to have a wacky count on entry (0x07B5), perhaps left over from the end of the last loop? It's certainly different from the LSR3 value of LC on the stack, which makes perfect sense with my test code for the currently executing loop.

Not sure what the registers at indexes 0x28, 0x29 and 0x2A are, but I could read and write all the bits.
0x2B caused a crash of my host system when I wrote either 0x0000 or 0xFFFF to it. I've no idea what the DSP did. Perhaps a reset/reboot flag?
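
As a compact summary of the findings so far, the map for 0x20-0x3F can be sketched as a Python lookup. Only the names listed in the table are filled in. Everything else gets a placeholder in the same `_Xnn` style used in the test listings, and the LC entry is marked tentative. The aliasing of 0x30-0x3F onto 0x2F is modelled as observed, not as a documented behaviour.

```python
# Names for the indexes that were identified; unknown ones are left out on purpose.
SYSTEM_REGS = {
    0x20: "SR",
    0x21: "EIR",
    0x22: "STK",
    0x23: "SP",
    0x24: "LSR1",   # loop start address
    0x25: "LSR2",   # loop end address
    0x26: "LSR3",   # loop count
    0x27: "LSP",
    0x2C: "ESR",
    0x2F: "LC?",    # tentative: read-only, counts up
}

def system_reg_name(index: int) -> str:
    """Best-guess name for a 6-bit register index in 0x20..0x3F."""
    if not 0x20 <= index <= 0x3F:
        raise ValueError(f"index {index:#x} is outside 0x20..0x3F")
    if index >= 0x30:
        index = 0x2F  # 0x30..0x3F all read back the same as 0x2F
    return SYSTEM_REGS.get(index, f"_X{index:02X}")

for index in (0x20, 0x27, 0x28, 0x2C, 0x35):
    print(f"{index:#04x} -> {system_reg_name(index)}")
```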

Pokechu22 added a commit that referenced this issue Mar 11, 2024
Updated dest1 and source2 register mappings. See #1.
@Pokechu22
Owner

Thanks! I've merged your PR (#2), and also updated the comment (3687d97).

Do 0x28/0x29/0x2A have different values, or are they aliased to each other?

@pcbbc
Contributor Author

pcbbc commented Mar 11, 2024

Do 0x28/0x29/0x2A have different values, or are they aliased to each other?

Good point. I didn’t actually think to test that. 🤨
I’m away this week, but I’ll take a look when I get back.

The aliasing of 0x30-0x3F is a bit weird though. One might expect aliases of 0x20-0x2F, but instead I got repeats of the single register 0x2F for all 16 locations. I guess it could somehow be the fall-through case of a full decode of the 0x20-0x2E range?

Thanks for the merge BTW.
