
Different Aspects Of Advanced Micro Devices And Instructions


PFACC mmreg1, mmreg2/mem64 (opcode 0Fh 0Fh/AEh): packed floating-point accumulate

PFADD is a vector instruction that adds the source operand to the destination operand (Advanced Micro Devices, Inc., 2000).

PFADD mmreg1, mmreg2/mem64 (opcode 0Fh 0Fh/9Eh): packed floating-point addition

PFCMPEQ is a vector instruction that compares the destination and source operands and generates all one bits or all zero bits based on the result (Advanced Micro Devices, Inc., 2000).

PFCMPEQ mmreg1, mmreg2/mem64 (opcode 0Fh 0Fh/B0h): packed floating-point comparison, equal to

PFCMPGE is a vector instruction that compares the destination and source operands and generates all one bits or all zero bits based on …
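The element-wise behaviour of PFADD and PFCMPEQ can be illustrated with a small software model. This is only a sketch, assuming a 64-bit register value that holds two single-precision floats (low element in bits 31..0); the helper names `unpack_ps`, `pack_ps`, `pfadd`, and `pfcmpeq` are hypothetical, and hardware details such as rounding and saturation are not modelled.

```python
import struct

def unpack_ps(reg64):
    """Split a 64-bit register value into (low, high) single-precision floats."""
    return struct.unpack("<2f", reg64.to_bytes(8, "little"))

def pack_ps(low, high):
    """Pack two floats into one 64-bit register value (low element first)."""
    # struct rounds each value to single precision; very large values raise OverflowError.
    return int.from_bytes(struct.pack("<2f", low, high), "little")

def pfadd(dest, src):
    """Model of PFADD: add each source element to the matching destination element."""
    d0, d1 = unpack_ps(dest)
    s0, s1 = unpack_ps(src)
    return pack_ps(d0 + s0, d1 + s1)

def pfcmpeq(dest, src):
    """Model of PFCMPEQ: each 32-bit element becomes all ones if equal, else all zeros."""
    d0, d1 = unpack_ps(dest)
    s0, s1 = unpack_ps(src)
    low = 0xFFFFFFFF if d0 == s0 else 0x00000000
    high = 0xFFFFFFFF if d1 == s1 else 0x00000000
    return (high << 32) | low

a = pack_ps(1.0, 2.0)
b = pack_ps(0.5, 2.0)
total = unpack_ps(pfadd(a, b))   # element-wise sum of a and b
mask = pfcmpeq(a, b)             # only the high elements are equal
```

The all-ones or all-zeros result of the comparison is what makes it usable as a bit mask for selecting elements without branching.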

This information is used to identify the boundaries between variable-length x86 instructions, distinguish DirectPath from VectorPath early-decode instructions, and locate the opcode byte in each instruction (Advanced Micro Devices, Inc., 2000). The predecode logic also detects code branches, such as CALLs, RETURNs, and short unconditional JMPs. When a branch is detected, predecoding begins at the target of the branch (Advanced Micro Devices, Inc., 2000).

Branch Prediction

The fetch logic accesses the branch prediction table at the same time as the instruction cache and uses the branch prediction table information to predict the direction of branch instructions (Advanced Micro Devices, Inc., 2000). The Athlon uses a combination of a branch target address buffer (BTB), a global history bimodal counter (GHBC) table, and return address stack (RAS) hardware to predict and accelerate branches (Advanced Micro Devices, Inc., 2000).

Predicted-taken branches incur only a single-cycle delay to redirect the instruction fetcher to the target instruction. The minimum penalty for a misprediction is ten cycles (Advanced Micro Devices, Inc., 2000). The BTB is a 2048-entry table that caches the predicted target address of a branch in each entry. The Athlon uses a 12-entry return address stack to predict return addresses from a call. As CALLs are fetched, the next extended instruction pointer is pushed onto the return
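The idea behind a return address stack can be sketched in a few lines of software. This is a minimal model, not the Athlon's actual hardware: the 12-entry depth comes from the text above, while the class name `ReturnAddressStack`, the method names, and the choice to drop the oldest entry on overflow are assumptions made purely for illustration.

```python
class ReturnAddressStack:
    """Toy model of a fixed-depth return address stack (RAS)."""

    def __init__(self, depth=12):
        self.depth = depth      # 12 entries, as stated for the Athlon
        self.entries = []       # most recent return address is last

    def on_call(self, call_eip, call_length):
        """When a CALL is fetched, push the address of the next instruction."""
        if len(self.entries) == self.depth:
            # Assumption: on overflow the oldest entry is discarded.
            self.entries.pop(0)
        self.entries.append((call_eip + call_length) & 0xFFFFFFFF)

    def on_return(self):
        """When a RETURN is fetched, pop the predicted return address."""
        # None means the stack is empty and no prediction is available.
        return self.entries.pop() if self.entries else None

ras = ReturnAddressStack()
ras.on_call(0x1000, 5)          # outer CALL, 5 bytes long
ras.on_call(0x2000, 5)          # nested CALL
inner = ras.on_return()         # predicted target for the inner RETURN
outer = ras.on_return()         # predicted target for the outer RETURN
```

Because calls and returns nest, the most recently pushed address is the one the next return should use, which is why a stack suits this kind of prediction.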
