PCSX2 Documentation/Introduction to Dynamic Recompilation: Difference between revisions

{{DocTabs|Section=3}}
''Originally written by cottonvibes''


This blog post is an introduction to dynamic recompilers (dynarecs), and hopes to provide some insight into how they work and why PCSX2 uses them to speed up emulation. To understand why dynarecs are useful, you must first be familiar with a basic interpreter emulator.


Assume we are emulating a very simple processor. Processors have instruction sets, which are the sets of instructions they can execute.
Let's assume the processor we are emulating is a made-up chip I'll call SL3 (super lame 3), which has only these 3 instructions (each with a fixed width of 4 bytes):


<source lang="asm">
MOV dest_reg, src1_reg // Move source register to destination register
ADD dest_reg, src1_reg, src2_reg // Add source1 and source2 registers, and store the result in destination register
BR relative_address // Branch (jump) to relative address (PC += relative_address * 4)
</source>


Processors generally have what we call registers, which hold data, and the processor's instructions perform their operations on these registers.

To program for this processor, we can write the following code:


<source lang="asm">
MOV reg1, reg0
ADD reg4, reg2, reg3
BR 5
</source>


What this code does is:
#It moves register 0 to register 1 (so now register 1 holds a copy of register 0's data).
#It adds register 2 and register 3 together, and stores the result in register 4.
#It branches 5 instructions further away, so it jumps to some code further down (not shown in the example above).
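
To make step 3 concrete, here is a small C++ sketch of the branch arithmetic from the BR definition (PC += relative_address * 4). It assumes the three-instruction program is loaded at address 0, so the BR sits at address 8, and that PC still holds the address of the BR when the offset is added. Neither detail is stated above, and <code>branchTarget</code> and <code>kInstrSize</code> are names made up for this illustration.

```cpp
#include <cstdint>

// Every SL3 instruction is 4 bytes wide.
constexpr std::uint32_t kInstrSize = 4;

// Hypothetical helper: applies "PC += relative_address * 4".
// Assumption: `pc` is the address of the BR instruction itself.
constexpr std::uint32_t branchTarget(std::uint32_t pc, std::int32_t relativeAddress) {
    // Unsigned arithmetic wraps modulo 2^32, so negative offsets work too.
    return pc + static_cast<std::uint32_t>(relativeAddress) * kInstrSize;
}

// MOV at address 0, ADD at 4, BR 5 at 8: the branch lands on address 28.
static_assert(branchTarget(8, 5) == 28, "BR 5 from address 8 should reach address 28");
```

Under these assumptions the branch skips ahead 20 bytes, which is exactly 5 instruction slots.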


So that is how we can program for the SL3 processor in assembly code. But how do we emulate it?

To actually emulate this processor we can use an interpreter. An interpreter simply fetches each instruction opcode and executes it accordingly (e.g. by calling an emulated method for each different instruction). The rest of the emulator (emulating other processors/peripherals of our system) can then get updated sometime in between instructions or after a group of CPU instructions is run. Interpreters are a simple and complete way to emulate a system.


[http://forums.pcsx2.net/Thread-blog-Introduction-to-Dynamic-Recompilation?pid=102002#pid102002 Click here to see a C++ code example of a simple interpreter]
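
For a rough idea of the fetch-and-execute loop described above, here is a minimal C++ sketch. It is not the forum example linked above. It assumes 8 registers and works on already-decoded instructions, since this post fixes only the instruction width and not the bit layout; <code>Op</code>, <code>Instr</code>, <code>Cpu</code> and <code>interpret</code> are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical decoded form of an SL3 instruction (the bit layout is not
// defined in this post, so decoding from raw bytes is skipped).
enum class Op { MOV, ADD, BR };
struct Instr { Op op; int a; int b; int c; };

struct Cpu {
    std::uint32_t regs[8] = {};  // assumption: 8 registers of 32 bits each
    std::uint32_t pc = 0;        // byte address of the current instruction
};

// Fetches and executes `steps` instructions, one at a time.
void interpret(Cpu& cpu, const std::vector<Instr>& program, int steps) {
    for (int i = 0; i < steps; ++i) {
        const std::size_t index = cpu.pc / 4;  // fixed 4-byte instructions
        assert(index < program.size());
        const Instr& in = program[index];
        switch (in.op) {
        case Op::MOV:  // MOV dest, src1
            cpu.regs[in.a] = cpu.regs[in.b];
            cpu.pc += 4;
            break;
        case Op::ADD:  // ADD dest, src1, src2
            cpu.regs[in.a] = cpu.regs[in.b] + cpu.regs[in.c];
            cpu.pc += 4;
            break;
        case Op::BR:   // BR relative_address: PC += relative_address * 4
            cpu.pc += static_cast<std::uint32_t>(in.a) * 4;
            break;
        }
    }
}
```

Note that every single instruction goes through the fetch, the bounds check and the <code>switch</code>, no matter how many times the same code has already been executed.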


Using interpreters we constantly have to fetch and execute instructions one by one. There is a lot of overhead in this, and minimal room for optimization, since most special-case optimizations carry the overhead of checking for them (for example, extra if-statements and conditionals, which reduce the gain from the optimization). But there's a faster way to do processor emulation which doesn't have these drawbacks... using dynamic recompilation!

So for instance remember our above SL3 program:


<source lang="asm">
MOV reg1, reg0
ADD reg4, reg2, reg3
BR 5
</source>


Let's assume this code is part of some function and gets called hundreds of times a second (this could sound crazy, but games/applications commonly call the same code hundreds or thousands of times a second).
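
That repetition is what makes translating a block once and reusing it worthwhile. Here is a minimal C++ sketch of the caching idea only. It uses C++ closures as a stand-in for generated x86 code, so it is not a real recompiler and not how PCSX2 does it. The 8-register <code>Cpu</code>, the decoded <code>Instr</code> form, and the names <code>translate</code> and <code>BlockCache</code> are all assumptions made for this illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

enum class Op { MOV, ADD, BR };
struct Instr { Op op; int a; int b; int c; };  // hypothetical decoded form
struct Cpu {
    std::uint32_t regs[8] = {};  // assumption: 8 registers
    std::uint32_t pc = 0;
};

// A "translated block": here a closure, in a real dynarec native code.
using Block = std::function<void(Cpu&)>;

// Translates instructions starting at startPc, up to and including the next BR.
Block translate(const std::vector<Instr>& program, std::uint32_t startPc) {
    std::vector<Block> ops;
    for (std::size_t i = startPc / 4; i < program.size(); ++i) {
        const Instr in = program[i];
        switch (in.op) {
        case Op::MOV:
            ops.push_back([in](Cpu& cpu) { cpu.regs[in.a] = cpu.regs[in.b]; cpu.pc += 4; });
            break;
        case Op::ADD:
            ops.push_back([in](Cpu& cpu) { cpu.regs[in.a] = cpu.regs[in.b] + cpu.regs[in.c]; cpu.pc += 4; });
            break;
        case Op::BR:
            ops.push_back([in](Cpu& cpu) { cpu.pc += static_cast<std::uint32_t>(in.a) * 4; });
            break;
        }
        if (in.op == Op::BR) break;  // a branch ends the block
    }
    return [ops = std::move(ops)](Cpu& cpu) {
        for (const Block& op : ops) op(cpu);
    };
}

struct BlockCache {
    std::unordered_map<std::uint32_t, Block> blocks;  // keyed by start PC
    int translations = 0;                             // how often we translated

    // Runs the block at cpu.pc, translating it only on the first visit.
    void run(Cpu& cpu, const std::vector<Instr>& program) {
        auto it = blocks.find(cpu.pc);
        if (it == blocks.end()) {
            it = blocks.emplace(cpu.pc, translate(program, cpu.pc)).first;
            ++translations;
        }
        it->second(cpu);
    }
};
```

Calling <code>run</code> a hundred times with the PC at the start of our program translates the block once and then only executes the cached result; the per-instruction decode work is paid a single time.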
PCSX2 has a very cool emitter that looks very similar to x86-32 assembly, except the instruction names are prefixed with an 'x'.
So for example:
<source lang="asm">
mov eax, ecx;
</source>
is
<source lang="asm">
xMOV(eax, ecx);
</source>
with the PCSX2 emitter.
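
To give a feel for what an emitter call does under the hood, here is a toy C++ sketch that appends the machine-code bytes for a register-to-register <code>mov</code> to a buffer. This is not PCSX2's emitter: <code>ToyEmitter</code>, <code>Reg32</code> and this <code>xMOV</code> are made up for illustration and only borrow the naming style. The byte encoding itself (opcode 0x89 followed by a ModRM byte) is the standard 32-bit x86 one.

```cpp
#include <cstdint>
#include <vector>

// x86 32-bit register numbers as they appear in the ModRM byte.
enum Reg32 : std::uint8_t { eax = 0, ecx = 1, edx = 2, ebx = 3 };

// Hypothetical toy emitter: each call appends machine code to `code`.
struct ToyEmitter {
    std::vector<std::uint8_t> code;

    // Emits `mov dst, src` using opcode 0x89 (MOV r/m32, r32).
    void xMOV(Reg32 dst, Reg32 src) {
        code.push_back(0x89);
        // ModRM byte: mod = 11 (register-direct), reg = src, rm = dst.
        code.push_back(static_cast<std::uint8_t>(0xC0 | (src << 3) | dst));
    }
};
```

With this sketch, <code>xMOV(eax, ecx)</code> appends the two bytes <code>89 C8</code>, which is how <code>mov eax, ecx</code> is encoded. The point of an emitter is that recompiler code reads almost like the assembly it produces.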


The code for actually recompiling these blocks looks something like this:


<source lang="asm">
// This is our emulated MOV instruction
void MOV() {
// ...
     }
}
</source>


Note that the above code doesn't have any logic to exit cleanly once it starts executing recompiled blocks. I left this out to avoid complicating things, so assume that execution somehow ends and we can get back to running the other parts of the emulator.


I should also add that PCSX2 currently has the following dynarecs:
*EE recompiler (for MIPS R5900 ee-core processor)
*IOP recompiler (for MIPS R3000A i/o processor)
*microVU recompiler (for VU0/VU1, and COP2 instructions)
*Super VU recompiler (can be used instead of microVU for VU0/VU1)
*newVIF unpack recompiler (recompiles VIF unpack routines)
*r3000air (not yet finished, but should one day supersede the IOP recompiler)


{{PCSX2 Documentation Navbox}}